From kbarrett at openjdk.org Sat Nov 1 07:34:03 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 1 Nov 2025 07:34:03 GMT Subject: RFR: 8369186: HotSpot Style Guide should permit some uses of the C++ Standard Library [v2] In-Reply-To: <4Yh4KeUItjdDYihe9d1u66tBROofb0pRLDKIdhw-XZo=.cf383363-c963-4871-941d-5e358f74c048@github.com> References: <6u0Ia6A2xQnv51Ti0Jchg6rnncfRRvtK5oGS963CeIA=.bc9c1481-60ad-443d-923b-dc86277dbb14@github.com> <4Yh4KeUItjdDYihe9d1u66tBROofb0pRLDKIdhw-XZo=.cf383363-c963-4871-941d-5e358f74c048@github.com> Message-ID: On Mon, 27 Oct 2025 08:24:41 GMT, Florian Weimer wrote: >> We (you and me, @fweimer-rh) discussed this a couple of years ago: >> https://mail.openjdk.org/pipermail/hotspot-dev/2023-December/082324.html >> >> Quoting from here: >> https://mail.openjdk.org/pipermail/hotspot-dev/2023-December/083142.html >> >> " >> Empirically, a recursive initialization attempt doesn't make any attempt to >> throw. Rather, it blocks forever waiting for a futex signal from a thread that >> succeeds in the initialization. Which of course will never come. >> >> And that makes sense, now that I've looked at the code. >> >> In __cxa_guard_acquire, with _GLIBCXX_USE_FUTEX, if the guard indicates >> initialization hasn't yet been completed, then it goes into a while loop. >> This while loop tries to claim initialization. Failing that, it checks >> whether initialization is complete. Failing that, it does a SYS_futex >> syscall, waiting for some other thread to perform the initialization. There's >> nothing there to check for recursion. >> >> throw_recursive_init_exception is only called if single-threaded (either by >> configuration or at runtime). >> " >> >> It doesn't look like there have been any relevant changes in that area since >> then. So I think there is still not a problem here. > > @kimbarrett Sorry, I forgot about the old thread. 
> You can get the exception in a single-threaded scenario, something like this:
>
>     struct S {
>       S() {
>         static S s;
>         *this = s;
>       }
>     } global;
>
> Maybe the actual rule is more like this?
>
>> Functions that may throw exceptions must not be used, unless individual calls ensure that these particular invocations cannot throw exceptions.

Recursively entering a block-scoped static is undefined behavior. That some configurations of glibc might throw an exception in that situation (even despite the caller being compiled with exceptions disabled) seems like a mistake in glibc, and not really our concern. Our code should avoid such a situation because it's UB, regardless of whether the actual behavior involves exceptions or nasal demons.

The exception only gets thrown when the application is single-threaded. But at least the common way to start java (via the launcher) is already multi-threaded on entry to Threads::create_vm(). So that case doesn't normally apply to us anyway.

Also, I really don't think we want people trying to figure out whether a particular call might or might not throw (neither when writing nor when reading code).

So no, I don't think the proposed rule should be changed.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27601#discussion_r2483178071

From aph at openjdk.org Sat Nov 1 09:02:11 2025
From: aph at openjdk.org (Andrew Haley)
Date: Sat, 1 Nov 2025 09:02:11 GMT
Subject: RFR: 8365047: Remove exception handler stub code in C2 [v9]
In-Reply-To:
References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com>
Message-ID:

On Tue, 28 Oct 2025 23:18:48 GMT, Ruben wrote:

>> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob.
>>
>> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only.
>
> Ruben has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains three new commits since the last revision:
>
> - Address review comments and fix a mistype
> - Check for NOP and MOVK separately in NativePostCallNop
> - Test for deoptimization in virtual threads
>
> Change-Id: I9ef51b426d34e9b44a3891f6a45307232f900e5a

Marked as reviewed by aph (Reviewer).

But I'm not going to push `SafeFetch` any further. I've said my piece.

-------------

PR Review: https://git.openjdk.org/jdk/pull/26678#pullrequestreview-3407023975
PR Comment: https://git.openjdk.org/jdk/pull/26678#issuecomment-3476011689

From aph at openjdk.org Sat Nov 1 09:05:08 2025
From: aph at openjdk.org (Andrew Haley)
Date: Sat, 1 Nov 2025 09:05:08 GMT
Subject: RFR: 8365047: Remove exception handler stub code in C2 [v9]
In-Reply-To:
References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com>
Message-ID:

On Thu, 30 Oct 2025 21:10:18 GMT, Dean Long wrote:

> Re: SafeFetch, it is probably OK to make NativePostCallNop_at slightly slower for uses like make_deoptimized(), but the oopmap optimizations like CodeCache::find_blob_and_oopmap() were highly optimized to make loom/VirtualThread performance reasonable. Adding a SafeFetch here might cause a regression.

Sure, but 2 things:

Loom doesn't need post-call NOPs as much as it used to.
We could fairly easily make SafeFetch much faster than it is, if needs be.

But anyway, I approved this patch.
------------- PR Comment: https://git.openjdk.org/jdk/pull/26678#issuecomment-3476017720 From fandreuzzi at openjdk.org Sat Nov 1 11:48:04 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Sat, 1 Nov 2025 11:48:04 GMT Subject: RFR: 8037914: Add JFR event for string deduplication In-Reply-To: References: Message-ID: <1OO6CrVzIrUtVeqvYA5rwGSuKsrybfUJUSN0B3AS8FM=.3edb6e0a-621c-455f-8191-7eb76d669243@github.com> On Thu, 30 Oct 2025 06:01:48 GMT, Erik Gahlin wrote: > It would be good if you could provide some ballpark figures on the number of events in a worst-case scenario, so we can determine what GC level is appropriate. I wrote a simple pathological test with multiple threads interning random strings, [this](https://github.com/user-attachments/files/23282465/out-parallel.txt) is the worst I've seen: 100 deduplication rounds within `3.698s` and `3.729s`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3476289368 From kvn at openjdk.org Sat Nov 1 15:06:06 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 1 Nov 2025 15:06:06 GMT Subject: RFR: 8369186: HotSpot Style Guide should permit some uses of the C++ Standard Library [v4] In-Reply-To: References: Message-ID: On Sat, 18 Oct 2025 17:11:48 GMT, Kim Barrett wrote: >> Please review this change to the HotSpot Style Guide to suggest that C++ >> Standard Library components may be used, after appropriate vetting and >> discussion, rather than just a blanket "no, don't use it" with a few very >> narrow exceptions. It provides some guidance on that vetting process and >> the criteria to use, along with usage patterns. >> >> In particular, it proposes that Standard Library headers should not be >> included directly, but instead through HotSpot-provided wrapper headers. This >> gives us a place to document usage, provide workarounds for platform issues in >> a single place, and so on. 
>> >> Such wrapper headers are provided by this PR for ``, ``, and >> ``, along with updates to use them. I have a separate change for >> `` that I plan to propose later, under JDK-8369187. There will be >> additional followups for other C compatibility headers besides ``. >> >> This PR also cleans up some nomenclature issues around forbid vs exclude and >> the like. >> >> Testing: mach5 tier1-5, GHA sanity tests > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains nine commits: > > - Merge branch 'master' into stdlib-header-wrappers > - Merge branch 'master' into stdlib-header-wrappers > - Merge branch 'master' into stdlib-header-wrappers > - jrose comments > - move tuple to undecided category > - add wrapper for > - add wrapper for > - add wrapper for > - style guide permits some standard library facilities Approved ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27601#pullrequestreview-3407565447 From duke at openjdk.org Sat Nov 1 16:00:22 2025 From: duke at openjdk.org (duke) Date: Sat, 1 Nov 2025 16:00:22 GMT Subject: Withdrawn: 8366122: Shenandoah: Implement efficient support for object count after gc events In-Reply-To: <7KUQJooZsasGtVU-HaCj7h8_rMFBX13d4yW3T4PfpBw=.07a12734-af51-45cc-9bbb-d6573806478a@github.com> References: <7KUQJooZsasGtVU-HaCj7h8_rMFBX13d4yW3T4PfpBw=.07a12734-af51-45cc-9bbb-d6573806478a@github.com> Message-ID: On Thu, 28 Aug 2025 01:30:39 GMT, pf0n wrote: > ### Summary > > The new implementation of ObjectCountAfterGC for Shenandoah piggybacks off of the existing marking phases and records strongly marked objects in a histogram. If the event is disabled, the original marking closures are used. When enabled new mark-and-count closures are used by the worker threads. Each worker thread updates its local histogram as it marks an object. These local histograms are merged at the conclusion of the marking phase under a mutex. 
> The event is emitted outside a safepoint. Because (most of) Shenandoah's marking is done concurrently, so is the object counting work.
>
> ### Performance
> The performance tests were run using the Extremem benchmark on a default and stress workload. (will edit this section to include data after average time and test for GenShen)
>
> #### Default workload:
> ObjectCountAfterGC disabled (master branch):
> `[807.216s][info][gc,stats ] Pause Init Mark (G) = 0.003 s (a = 264 us)`
> `[807.216s][info][gc,stats ] Pause Init Mark (N) = 0.001 s (a = 91 us)`
> `[807.216s][info][gc,stats ] Concurrent Mark Roots = 0.041 s (a = 4099 us)`
> `[807.216s][info][gc,stats ] Concurrent Marking = 1.660 s (a = 166035 us)`
> `[807.216s][info][gc,stats ] Pause Final Mark (G) = 0.004 s (a = 446 us)`
> `[807.216s][info][gc,stats ] Pause Final Mark (G) = 0.004 s (a = 446 us)`
> `[807.216s][info][gc,stats ] Pause Final Mark (N) = 0.004 s (a = 357 us)`
>
> ObjectCountAfterGC disabled (feature branch):
> `[807.104s][info][gc,stats ] Pause Init Mark (G) = 0.003 s (a = 302 us)`
> `[807.104s][info][gc,stats ] Pause Init Mark (N) = 0.001 s (a = 92 us)`
> `[807.104s][info][gc,stats ] Concurrent Mark Roots = 0.048 s (a = 4827 us)`
> `[807.104s][info][gc,stats ] Concurrent Marking = 1.666 s (a = 166638 us)`
> `[807.104s][info][gc,stats ] Pause Final Mark (G) = 0.006 s (a = 603 us)`
> `[807.104s][info][gc,stats ] Pause Final Mark (N) = 0.005 s (a = 516 us)`
>
> ObjectCountAfterGC enabled (feature branch)
> `[807.299s][info][gc,stats ] Pause Init Mark (G) = 0.002 s (a = 227 us)`
> `[807.299s][info][gc,stats ] Pause Init Mark (N) = 0.001 s (a = 89 us)`
> `[807.299s][info][gc,stats ] Concurrent Mark Roots = 0.053 s (a = 5279 us)`
> `[807.299s][info][gc,st...

This pull request has been closed without being integrated.
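
For readers unfamiliar with the pattern described in the summary above (each worker updates a local histogram while marking, and the local histograms are merged under a mutex when marking finishes), here is a minimal sketch in plain standard C++. It is not the Shenandoah code: `Histogram`, `count_objects`, and the use of `std::thread` and `std::mutex` are assumptions made purely for illustration, and class-name strings stand in for marked objects.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a per-class object count table.
using Histogram = std::map<std::string, std::size_t>;

// Each worker counts into its own local histogram without locking, then
// merges it into the shared histogram exactly once, under a mutex.
Histogram count_objects(const std::vector<std::vector<std::string>>& work_per_worker) {
  Histogram global;
  std::mutex merge_lock;
  std::vector<std::thread> workers;

  for (const auto& work : work_per_worker) {
    workers.emplace_back([&global, &merge_lock, &work] {
      Histogram local;
      for (const auto& klass_name : work) {
        ++local[klass_name];  // no lock needed: local to this worker
      }
      // Merge step: the only place where workers contend.
      std::lock_guard<std::mutex> guard(merge_lock);
      for (const auto& entry : local) {
        global[entry.first] += entry.second;
      }
    });
  }

  for (auto& worker : workers) {
    worker.join();
  }
  return global;
}
```

The point of the design is that the per-object cost is a thread-local update; the mutex is taken once per worker, so contention does not grow with the number of marked objects.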
-------------

PR: https://git.openjdk.org/jdk/pull/26977

From lucy at openjdk.org Sat Nov 1 20:56:02 2025
From: lucy at openjdk.org (Lutz Schmidt)
Date: Sat, 1 Nov 2025 20:56:02 GMT
Subject: RFR: 8370871: [s390x] consistently update top_frame_sp [v2]
In-Reply-To:
References:
Message-ID:

On Thu, 30 Oct 2025 06:48:43 GMT, Amit Kumar wrote:

>> Stores top_frame_sp for interpreter frames consistently. Although this is making one to assert fail in loom port but still I think this should be considered independent bug.
>>
>> Changes in `generate_throw_exception` also helped with another issue in loom which was resolved by this workaround: https://github.com/offamitkumar/jdk/blob/5cb1322f58a4c676aac59244dcb62d8b67f0ed93/src/hotspot/cpu/s390/continuationFreezeThaw_s390.inline.hpp#L302C1-L318C2, issue:
>>
>> In the `BasicExt.java` testcase there are two subtests:
>> ```java
>> new Continuation3Frames(TestCaseVariants.THROW_HANDLED_EXCEPTION).runTestCase(4, compPolicy);
>> new Continuation3Frames(TestCaseVariants.THROW_UNHANDLED_EXCEPTION).runTestCase(4, compPolicy);
>> ```
>>
>> These two test cases were failing because we were overwriting the interpreter frame content, as we have to make the native call which saves registers r6 to r15. So at that time the workaround was necessary, but it turns out that using `top_frame_sp` to extend the frame will fix the overwriting issue.
>
> Amit Kumar has updated the pull request incrementally with one additional commit since the last revision:
>
> adding assert and frame extension for exception

Looks good.
Consistency rules!

-------------

Marked as reviewed by lucy (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/28042#pullrequestreview-3407720708 From kbarrett at openjdk.org Sun Nov 2 07:05:24 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 2 Nov 2025 07:05:24 GMT Subject: RFR: 8369186: HotSpot Style Guide should permit some uses of the C++ Standard Library [v4] In-Reply-To: References: Message-ID: On Sat, 18 Oct 2025 17:11:48 GMT, Kim Barrett wrote: >> Please review this change to the HotSpot Style Guide to suggest that C++ >> Standard Library components may be used, after appropriate vetting and >> discussion, rather than just a blanket "no, don't use it" with a few very >> narrow exceptions. It provides some guidance on that vetting process and >> the criteria to use, along with usage patterns. >> >> In particular, it proposes that Standard Library headers should not be >> included directly, but instead through HotSpot-provided wrapper headers. This >> gives us a place to document usage, provide workarounds for platform issues in >> a single place, and so on. >> >> Such wrapper headers are provided by this PR for ``, ``, and >> ``, along with updates to use them. I have a separate change for >> `` that I plan to propose later, under JDK-8369187. There will be >> additional followups for other C compatibility headers besides ``. >> >> This PR also cleans up some nomenclature issues around forbid vs exclude and >> the like. >> >> Testing: mach5 tier1-5, GHA sanity tests > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains nine commits: > > - Merge branch 'master' into stdlib-header-wrappers > - Merge branch 'master' into stdlib-header-wrappers > - Merge branch 'master' into stdlib-header-wrappers > - jrose comments > - move tuple to undecided category > - add wrapper for > - add wrapper for > - add wrapper for > - style guide permits some standard library facilities Thanks for reviews and comments. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/27601#issuecomment-3477511265 From kbarrett at openjdk.org Sun Nov 2 07:05:26 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 2 Nov 2025 07:05:26 GMT Subject: Integrated: 8369186: HotSpot Style Guide should permit some uses of the C++ Standard Library In-Reply-To: References: Message-ID: On Thu, 2 Oct 2025 07:11:51 GMT, Kim Barrett wrote: > Please review this change to the HotSpot Style Guide to suggest that C++ > Standard Library components may be used, after appropriate vetting and > discussion, rather than just a blanket "no, don't use it" with a few very > narrow exceptions. It provides some guidance on that vetting process and > the criteria to use, along with usage patterns. > > In particular, it proposes that Standard Library headers should not be > included directly, but instead through HotSpot-provided wrapper headers. This > gives us a place to document usage, provide workarounds for platform issues in > a single place, and so on. > > Such wrapper headers are provided by this PR for ``, ``, and > ``, along with updates to use them. I have a separate change for > `` that I plan to propose later, under JDK-8369187. There will be > additional followups for other C compatibility headers besides ``. > > This PR also cleans up some nomenclature issues around forbid vs exclude and > the like. > > Testing: mach5 tier1-5, GHA sanity tests This pull request has now been integrated. 
Changeset: e8a1a870 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/e8a1a8707ee6192c85ac62a2a51c815e07613c38 Stats: 670 lines in 68 files changed: 430 ins; 134 del; 106 mod 8369186: HotSpot Style Guide should permit some uses of the C++ Standard Library Reviewed-by: jrose, lkorinth, iwalulya, kvn, stefank ------------- PR: https://git.openjdk.org/jdk/pull/27601 From epeter at openjdk.org Mon Nov 3 06:58:19 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 3 Nov 2025 06:58:19 GMT Subject: Integrated: 8370405: C2: mismatched store from MergeStores wrongly scalarized in allocation elimination In-Reply-To: References: Message-ID: On Mon, 27 Oct 2025 10:40:18 GMT, Emanuel Peter wrote: > Note: @oliviermattmann found this bug with his whitebox fuzzer. See also https://github.com/openjdk/jdk/pull/27991 > > **Analysis** > We run Escape Analysis, and see that a local array allocation could possibly be removed, we only have matching `StoreI` to the `int[]`. But there is one `StoreI` that is still in a loop, and so we wait with the actual allocation removal until later, hoping it may go away, or drop out of the loop. > During loop opts, the `StoreI` drops out of the loop, now there should be nothing in the way of allocation removal. > But now we run `MergeStores`, and merge two of the `StoreI` into a mismatched `StoreL`. > > Then, we eventually remove the allocation, but don't check again if any new mismatched store has appeared. > Instead of a `ConI`, we receive a `ConL`, for the first of the two merged `StoreI`. The second merged `StoreI` instead captures the state before the `StoreL`, and that is wrong. > > **Solution** > We should have some assert, that checks that the captured `field_val` corresponds to the expected `field_type`. 
> > But the real fix was suggested by @merykitty : apparently he just had a similar issue in Valhalla: > https://github.com/openjdk/valhalla/blame/60af17ff5995cfa5de075332355f7f475c163865/src/hotspot/share/opto/macro.cpp#L709-L713 > (the idea is to bail out of the elimination if any of the found stores are mismatched.) > > **Details** > > How the bad sequence develops, and which components are involved. > > 1) The `SafePoint` contains a `ConL` and 3 `ConI`. (Correct would have been 4 `ConI`) > > 6 ConI === 23 [[ 4 ]] #int:16777216 > 7 ConI === 23 [[ 4 ]] #int:256 > 8 ConI === 23 [[ 4 ]] #int:1048576 > 9 ConL === 23 [[ 4 ]] #long:68719476737 > 54 DefinitionSpillCopy === _ 27 [[ 16 12 4 ]] > 4 CallStaticJavaDirect === 47 29 30 26 32 33 0 34 0 54 9 8 7 6 [[ 5 3 52 ]] Static wrapper for: uncommon_trap(reason='unstable_if' action='reinterpret' debug_id='0') # void ( int ) C=0.000100 Test::test @ bci:38 (line 21) reexecute !jvms: Test::test @ bci:38 (line 21) > > > 2) This is then encoded into an `ObjectValue`. A `Type::Long` / `ConL` is converted into a `[int=0, long=ConL]` pair, see: > https://github.com/openjdk/jdk/blob/da7121aff9eccb046b82a75093034f1cdbd9b9e4/src/hotspot/share/opto/output.cpp#L920-L925 > If I understand it right, there zero is just a placeholder. > > And so we get: > > (rr) p sv->print_fields_on(tty) > Fields: 0, 68719476737, 1048576, 256, 16777216 > > We can see the `zero`, followed by the `ConL`, and then 3 `ConI`. > > This se... This pull request has now been integrated. 
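
As a side note on the numbers in the dump above: the `ConL` value `68719476737` is what one gets when two adjacent 32-bit stores are merged into one 64-bit store. The sketch below is only arithmetic, not C2 code; the helper names are hypothetical, and it assumes a little-endian layout, where the element at the lower index lands in the low 32 bits.

```cpp
#include <cstdint>

// Combine two adjacent 32-bit values into one 64-bit value, assuming
// little-endian layout: `lo` is the element at the lower array index.
constexpr std::uint64_t merge_int_stores(std::uint32_t lo, std::uint32_t hi) {
  return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Split a merged 64-bit value back into its two 32-bit halves.
constexpr std::uint32_t low_half(std::uint64_t v) {
  return static_cast<std::uint32_t>(v);
}
constexpr std::uint32_t high_half(std::uint64_t v) {
  return static_cast<std::uint32_t>(v >> 32);
}

// 68719476737 == 0x0000001000000001, i.e. the two ints 1 and 16.
static_assert(merge_int_stores(1, 16) == 68719476737ULL, "merged value");
static_assert(low_half(68719476737ULL) == 1, "low 32 bits");
static_assert(high_half(68719476737ULL) == 16, "high 32 bits");
```

Under that assumption, a correct scalarization would have recorded two `int` fields (1 and 16) in those slots rather than a single `long` constant, which matches the "4 `ConI`" expectation stated in the analysis.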
Changeset: 09a047f0 Author: Emanuel Peter URL: https://git.openjdk.org/jdk/commit/09a047f00c88d14505c42a966dedbc87b9be5bdf Stats: 375 lines in 5 files changed: 375 ins; 0 del; 0 mod 8370405: C2: mismatched store from MergeStores wrongly scalarized in allocation elimination Co-authored-by: Olivier Mattmann Co-authored-by: Quan Anh Mai Reviewed-by: kvn, qamai ------------- PR: https://git.openjdk.org/jdk/pull/27997 From epeter at openjdk.org Mon Nov 3 06:58:18 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 3 Nov 2025 06:58:18 GMT Subject: RFR: 8370405: C2: mismatched store from MergeStores wrongly scalarized in allocation elimination [v2] In-Reply-To: References: Message-ID: On Thu, 30 Oct 2025 07:03:51 GMT, Quan Anh Mai wrote: >> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 16 additional commits since the last revision: >> >> - Merge branch 'master' into JDK-8370405-alloc-elimination-and-MergeStores >> - only verify primitive types >> - Apply suggestions from code review >> - more assert adjustment >> - ignore debug flag >> - id for tests, and fix up the assert >> - pass int for short slot >> - another test >> - improve test >> - wip new IR test >> - ... and 6 more: https://git.openjdk.org/jdk/compare/6dd1ad30...b6e032c2 > > Regardless, I think this patch makes sense. Bailing out of scalar elimination when we are doing it is better than when we are running EA, and we should generally try to do it if we can. @merykitty @vnkozlov Thanks for the review and discussion! @dougxc Thanks for checking for Graal and getting us a quick response :) And thanks to Olivier Mattmann <olivier.mattmann at bluewin.ch> for finding the bug! 
@mhaessig I decided to file this RFE, in case someone wants to invest time in it: [JDK-8371122](https://bugs.openjdk.org/browse/JDK-8371122) C2 Allocation Elimination: handle some mismatched accesses to arrays ------------- PR Comment: https://git.openjdk.org/jdk/pull/27997#issuecomment-3479146291 From tschatzl at openjdk.org Mon Nov 3 07:31:37 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 3 Nov 2025 07:31:37 GMT Subject: RFR: 8369111: G1: Determining concurrent start uses inconsistent predicates [v2] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that fixes an inconsistency between requesting a concurrent start garbage collection during humongous object allocation and then actually starting it. > > I.e. in `G1CollectedHeap::attempt_allocation_humongous` we check whether the allocation would cross the IHOP threshold taking the current allocation into account, and if so, see if G1 should start a concurrent marking, eventually starting a GC pause. > > That GC pause did not take the prospective allocation into account, so we could do that GC for nothing (i.e. not start a concurrent marking although we already knew that the allocation would cause one). > > This, in conjunction with JDK-8368959 can cause hundreds of extra GCs for the test in the CR (without eager reclaim of humongous arrays with references); otherwise it could cause the marking starting too late. > > There is a second bug in the calculation whether G1 crossed the threshold: for humongous objects it only takes the actual size into account, not the size that is needed for allocating it. The same issue existed for determining to start a concurrent mark after any other collection too. > > The change also tries to unify naming of the parameter to pass the allocation size (`alloc_word_size` -> `allocation_word_size`) and the parameter order where this size is passed along in multiple related methods. 
> > Testing: mentioned test case now behaving correctly, tier1-5 > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: * walulyai review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27789/files - new: https://git.openjdk.org/jdk/pull/27789/files/c8aff5cb..b42f9b20 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27789&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27789&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/27789.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27789/head:pull/27789 PR: https://git.openjdk.org/jdk/pull/27789 From jsikstro at openjdk.org Mon Nov 3 07:53:38 2025 From: jsikstro at openjdk.org (Joel =?UTF-8?B?U2lrc3Ryw7Zt?=) Date: Mon, 3 Nov 2025 07:53:38 GMT Subject: RFR: 8370345: Parallel: Rework TLAB accounting in MutableNUMASpace [v4] In-Reply-To: References: Message-ID: > Hello, > > Parallel's MutableNUMASpace is the only GC interface that uses the Thread parameter passed through the general CollectedHeap interface to tlab_capacity, tlab_used, and unsafe_max_tlab_alloc. It would be nice if Parallel's MutableNUMASpace could do without the Thread and instead find a thread-agnostic approach. By removing the need for the thread, it becomes possible to clean up the shared CollectedHeap interface, which makes it easier to read and maintain all GCs. Also, the lgrp_id that is stored in the Thread class should really have been moved to GCThreadLocalData after that concept was created, but with a thread-agnostic approach, the field can be removed entirely. > > The current solution is not without problems. When a new allocation is made inside one of the LGRP spaces in MutableNUMASpace using cas_allocate(), the NUMA/LGRP id is polled and stored inside the Thread, and we only attempt to allocate on that LGRP. 
If allocation fails on the local LGRP, we do not try to allocate on any other (remote) LGRP(s). This fact is reflected in the TLAB accounting methods tlab_capacity, tlab_used, and unsafe_max_tlab_alloc, which only check how much memory is used, etc., for the LGRP matching the stored LGRP id in the Thread. This model breaks down when threads are allowed to migrate between different CPUs, and therefore also NUMA nodes, which might change the LGRP id. > > For example, a system with two NUMA nodes gives us two LGRPs with ids 0 and 1. If a thread allocates most of its memory on LGRP 0 and then migrates to a CPU on LGRP 1, the thread will show that it allocated a significant amount of memory, but the used memory on the LGRP it is currently on could be very low. This would give a disproportionate allocation fraction. This is not a problem as the TLAB code accounts for this, but for a different reason entirely. The other way around could also be problematic. If a thread allocates very little memory on LGRP 0 and then migrates to LGRP 1, where another thread has allocated a lot of memory, the allocation fraction will be very low, when it could have a really high fraction if accounting for the used memory on its original LGRP. > > A solution to both of these issues is to average the capacity, used, and available memory across all LGRPs for the TLAB accounting methods. This approach provides a more accurate and stable view of memory usage and availability, regardless of thread migration or imbalances in NUMA/LGRP allocation. However, there are trade-offs... Joel Sikström has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains four additional commits since the last revision: - Merge branch 'master' into JDK-8370345_mutablenumaspace_tlab_accounting - Comment on unsafe_max_tlab_alloc alignment - unsafe_max_tlab_alloc must be aligned to MinObjAlignmentInBytes - 8370345: Parallel: Rework TLAB accounting in MutableNUMASpace ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27935/files - new: https://git.openjdk.org/jdk/pull/27935/files/c4cd91a8..0e79d823 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27935&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27935&range=02-03 Stats: 62841 lines in 832 files changed: 34797 ins; 23295 del; 4749 mod Patch: https://git.openjdk.org/jdk/pull/27935.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27935/head:pull/27935 PR: https://git.openjdk.org/jdk/pull/27935 From azafari at openjdk.org Mon Nov 3 09:16:05 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 3 Nov 2025 09:16:05 GMT Subject: RFR: 8358957: [ubsan]: The assert in layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v3] In-Reply-To: <_zOzryyrcX3PRWGo7osHOKa9hHfWTsKUi_Sj_bhUMLo=.9c7313b8-268e-41b9-8567-325449c24784@github.com> References: <_zOzryyrcX3PRWGo7osHOKa9hHfWTsKUi_Sj_bhUMLo=.9c7313b8-268e-41b9-8567-325449c24784@github.com> Message-ID: <_5iAdylxNdYJ1Uq-_pSz1NUona3Io0P8Nt8MW8lN3Ao=.8264e2bc-c0c2-40ab-82da-2a6578f6ad4d@github.com> On Mon, 15 Sep 2025 10:20:32 GMT, Kim Barrett wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> post-cond > > src/hotspot/share/oops/klass.hpp line 515: > >> 513: >> 514: // Want a pattern to quickly diff against layout header in register >> 515: // find something less clever! > > Comment needs to be updated. It should describe what is being calculated, and the 2nd line > is presumably resolved by this change. 
Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27288#discussion_r2485780314 From iwalulya at openjdk.org Mon Nov 3 09:29:10 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Mon, 3 Nov 2025 09:29:10 GMT Subject: RFR: 8369111: G1: Determining concurrent start uses inconsistent predicates [v2] In-Reply-To: References: Message-ID: On Mon, 3 Nov 2025 07:31:37 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that fixes an inconsistency between requesting a concurrent start garbage collection during humongous object allocation and then actually starting it. >> >> I.e. in `G1CollectedHeap::attempt_allocation_humongous` we check whether the allocation would cross the IHOP threshold taking the current allocation into account, and if so, see if G1 should start a concurrent marking, eventually starting a GC pause. >> >> That GC pause did not take the prospective allocation into account, so we could do that GC for nothing (i.e. not start a concurrent marking although we already knew that the allocation would cause one). >> >> This, in conjunction with JDK-8368959 can cause hundreds of extra GCs for the test in the CR (without eager reclaim of humongous arrays with references); otherwise it could cause the marking starting too late. >> >> There is a second bug in the calculation whether G1 crossed the threshold: for humongous objects it only takes the actual size into account, not the size that is needed for allocating it. The same issue existed for determining to start a concurrent mark after any other collection too. >> >> The change also tries to unify naming of the parameter to pass the allocation size (`alloc_word_size` -> `allocation_word_size`) and the parameter order where this size is passed along in multiple related methods. 
>> >> Testing: mentioned test case now behaving correctly, tier1-5 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > * walulyai review Marked as reviewed by iwalulya (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/27789#pullrequestreview-3410126117 From azafari at openjdk.org Mon Nov 3 09:29:28 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 3 Nov 2025 09:29:28 GMT Subject: RFR: 8358957: [ubsan]: The assert in layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v4] In-Reply-To: References: Message-ID: <8F0nIHGwWbZ0Z6oRxP6rXYoS-GRQEh3-LuiCa2RGvfk=.553e56b9-93cb-4aac-b89e-4ebb2f1e2169@github.com> > Avoid using loop and UB in left-shift operation as suggested by Kim's comment in the JBS-issue. > > Tests: > mach5 tiers 1-5 {macosx-aarch64, linux-x64, windows-x64} x {debug, product} Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: comments and post-cond ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27288/files - new: https://git.openjdk.org/jdk/pull/27288/files/32db16d4..8a3d6d13 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27288&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27288&range=02-03 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/27288.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27288/head:pull/27288 PR: https://git.openjdk.org/jdk/pull/27288 From jsikstro at openjdk.org Mon Nov 3 09:34:23 2025 From: jsikstro at openjdk.org (Joel =?UTF-8?B?U2lrc3Ryw7Zt?=) Date: Mon, 3 Nov 2025 09:34:23 GMT Subject: RFR: 8370345: Parallel: Rework TLAB accounting in MutableNUMASpace [v4] In-Reply-To: References: Message-ID: On Mon, 27 Oct 2025 11:38:17 GMT, Albert Mingkun Yang wrote: >> Joel Sikström has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: >> >> - Merge branch 'master' into JDK-8370345_mutablenumaspace_tlab_accounting >> - Comment on unsafe_max_tlab_alloc alignment >> - unsafe_max_tlab_alloc must be aligned to MinObjAlignmentInBytes >> - 8370345: Parallel: Rework TLAB accounting in MutableNUMASpace > > Marked as reviewed by ayang (Reviewer). Thank you for the reviews! @albertnetymk @walulyai I re-ran testing locally tier1-3 after merging which looks good. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27935#issuecomment-3479622273 From jsikstro at openjdk.org Mon Nov 3 09:34:24 2025 From: jsikstro at openjdk.org (Joel =?UTF-8?B?U2lrc3Ryw7Zt?=) Date: Mon, 3 Nov 2025 09:34:24 GMT Subject: Integrated: 8370345: Parallel: Rework TLAB accounting in MutableNUMASpace In-Reply-To: References: Message-ID: On Wed, 22 Oct 2025 08:56:22 GMT, Joel Sikstr?m wrote: > Hello, > > Parallel's MutableNUMASpace is the only GC interface that uses the Thread parameter passed through the general CollectedHeap interface to tlab_capacity, tlab_used, and unsafe_max_tlab_alloc. It would be nice if Parallel's MutableNUMASpace could do without the Thread and instead find a thread-agnostic approach. By removing the need for the thread, it becomes possible to clean up the shared CollectedHeap interface, which makes it easier to read and maintain all GCs. Also, the lgrp_id that is stored in the Thread class should really have been moved to GCThreadLocalData after that concept was created, but with a thread-agnostic approach, the field can be removed entirely. > > The current solution is not without problems. When a new allocation is made inside one of the LGRP spaces in MutableNUMASpace using cas_allocate(), the NUMA/LGRP id is polled and stored inside the Thread, and we only attempt to allocate on that LGRP. 
If allocation fails on the local LGRP, we do not try to allocate on any other (remote) LGRP(s). This fact is reflected in the TLAB accounting methods tlab_capacity, tlab_used, and unsafe_max_tlab_alloc, which only check how much memory is used, etc., for the LGRP matching the stored LGRP id in the Thread. This model breaks down when threads are allowed to migrate between different CPUs, and therefore also NUMA nodes, which might change the LGRP id. > > For example, a system with two NUMA nodes gives us two LGRPs with ids 0 and 1. If a thread allocates most of its memory on LGRP 0 and then migrates to a CPU on LGRP 1, the thread will show that it allocated a significant amount of memory, but the used memory on the LGRP it is currently on could be very low. This would give a disproportionate allocation fraction. This is not a problem as the TLAB code accounts for this, but for a different reason entirely. The other way around could also be problematic. If a thread allocates very little memory on LGRP 0 and then migrates to LGRP 1, where another thread has allocated a lot of memory, the allocation fraction will be very low, when it could have a really high fraction if accounting for the used memory on its original LGRP. > > A solution to both of these issues is to average the capacity, used, and available memory across all LGRPs for the TLAB accounting methods. This approach provides a more accurate and stable view of memory usage and availability, regardless of thread migration or imbalances in NUMA/LGRP allocation. However, there are trade-offs... This pull request has now been integrated. 
Changeset: 10ea585b Author: Joel Sikstr?m URL: https://git.openjdk.org/jdk/commit/10ea585b5ca01dc0136fe76a11109d0f17828772 Stats: 71 lines in 5 files changed: 24 ins; 23 del; 24 mod 8370345: Parallel: Rework TLAB accounting in MutableNUMASpace Reviewed-by: ayang, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/27935 From duke at openjdk.org Mon Nov 3 09:41:26 2025 From: duke at openjdk.org (Ruben) Date: Mon, 3 Nov 2025 09:41:26 GMT Subject: RFR: 8365047: Remove exception handler stub code in C2 [v9] In-Reply-To: References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> Message-ID: On Sat, 1 Nov 2025 09:01:58 GMT, Andrew Haley wrote: >> Re: SafeFetch, it is probably OK to make NativePostCallNop_at slightly slower for uses like make_deoptimized(), but the oopmap optimizations like CodeCache::find_blob_and_oopmap() were highly optimized to make loom/VirtualThread performance reasonable. Adding a SafeFetch here might cause a regression. > >> Re: SafeFetch, it is probably OK to make NativePostCallNop_at slightly slower for uses like make_deoptimized(), but the oopmap optimizations like CodeCache::find_blob_and_oopmap() were highly optimized to make loom/VirtualThread performance reasonable. Adding a SafeFetch here might cause a regression. > > Sure, but 2 things: > Loom doesn't meed post-call NOPs as much as it used to. > We could fairly easily make SafeFetch much faster than it is, if needs be. > But anyway, I approved this patch. Thank you for the detailed advice, @theRealAph, I now see how `SafeFetch` can be valuable independently of whether false-positive matches with the post-call NOP pattern can happen during normal execution. I hadn't considered the stack corruption use case before. Reviewing the `SafeFetch` implementation, I believe in general case it relies on `sigsetjmp` on POSIX systems and exceptions on Windows. 
However, for AArch64, the `SafeFetch32` has an optimized implementation - avoiding `setjmp` or exceptions overhead. On the fast path, it performs just one load, so any extra performance cost would be due to that path cannot currently be inlined. There indeed seems to be a way to have it inlined, at least on Linux - via creating an extra ELF section containing addresses of all inlined `SafeFetch` loads and corresponding continuation points, which the signal handler can iterate through. I've not prototyped this, but if feasible, it could make the performance impact of using `SafeFetch` negligible. Since there isn't necessarily a consensus at this stage on whether `SafeFetch` should be added in this PR, I'd propose opening a separate JBS ticket for it to avoid blocking merge of the exception handler stub code cleanup. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26678#issuecomment-3479653271 From fbredberg at openjdk.org Mon Nov 3 10:05:35 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Mon, 3 Nov 2025 10:05:35 GMT Subject: RFR: 8367982: Unify ObjectSynchronizer and LightweightSynchronizer [v4] In-Reply-To: References: Message-ID: > This is the last PR in a series of PRs (see: [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261)) to obsolete the LockingMode flag and related code. > > The main focus is to to unify `ObjectSynchronizer` and `LightweightSynchronizer`. > There used to be a number of "dispatch functions" to redirect calls depending on the setting of the `LockingMode` flag. > Since we now only have lightweight locking, there is no longer any need for those dispatch functions, so I removed them. > To remove the dispatch functions I renamed the corresponding lightweight functions and call them directly. > This ultimately led me to remove "lightweight" from the function names and go back to "fast" instead, just to avoid having some with, and some without the "lightweight" part of the name. 
> > This PR also include a small simplification of `ObjectSynchronizer::FastHashCode`. > > Tested tier1-7 (on supported platforms) without seeing any problems that can be traced to this code change. > All other platforms (`arm`, `ppc`, `riscv`, `s390`) has been sanity checked using QEMU. Fredrik Bredberg has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: - Merge branch 'master' into 8367982_unify_object_and_lightweight_synchronizer - Update two, after the review - Update after review - Small arm32 fix - Small include line fix - 8367982: Unify ObjectSynchronizer and LightweightSynchronizer ------------- Changes: https://git.openjdk.org/jdk/pull/27915/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27915&range=03 Stats: 2972 lines in 80 files changed: 1259 ins; 1425 del; 288 mod Patch: https://git.openjdk.org/jdk/pull/27915.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27915/head:pull/27915 PR: https://git.openjdk.org/jdk/pull/27915 From alanb at openjdk.org Mon Nov 3 11:05:05 2025 From: alanb at openjdk.org (Alan Bateman) Date: Mon, 3 Nov 2025 11:05:05 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v4] In-Reply-To: References: Message-ID: On Wed, 29 Oct 2025 21:07:51 GMT, Mat Carter wrote: >> Add jdk.management.AOTCacheMXBean. The interface provides a single action that when called will cause any hosted JVM currently recording AOT information will stop recording. Existing functionality is preserved: when stopped the JVM will create the required artifacts based on the execution mode. Conveniently as the application running on the JVM has not stopped (as was previously the only way to stop recording), the application will resume execution after the artifacts have been generated. >> >> The interface will return TRUE if a recording was successfully stopped, in all other cases (not recording etc.) 
will return FALSE >> >> It follows that invoking the action on a JVM that is recording, twice in succession, should (baring internal errors) produce the following two responses: >> >> TRUE >> FALSE >> >> Passes tier1 on linux (x64) and windows (x64) > > Mat Carter has updated the pull request incrementally with one additional commit since the last revision: > > Updated test based on comments src/jdk.management/share/classes/jdk/management/HotSpotAOTCacheMXBean.java line 104: > 102: * > 103: * > 104: * @return {@code true} if a recording was in progress and has been ended successfully; {@code false} otherwise. Someone is bound to ask what happens if the "endRecording" operation is performed concurrently and there is recording in progress. Does one or all return true? I don't think it matters, the bigger issue here is that returning false means the recording has already ended or it failed. If it failed, why did it fail? I realize the intention is to add some properties and further operations to this MXBean but I think it would be good to think through if starting with a boolean returning operation is going to be problematic in the future. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28010#discussion_r2486085650 From alanb at openjdk.org Mon Nov 3 11:05:07 2025 From: alanb at openjdk.org (Alan Bateman) Date: Mon, 3 Nov 2025 11:05:07 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v4] In-Reply-To: References: <3DZMFG5pUixBip4O18gylfQpcCOTFxcwwVTWahRMBYo=.c9cb089e-6031-4b77-bb4a-775ed6cac818@github.com> Message-ID: <8XzXNt3iOeijZtZWB_zdMoWLPadJkgEbmaoqZQjEH1A=.a9fca7c1-9e42-4209-b21f-08af5554d344@github.com> On Wed, 29 Oct 2025 18:47:13 GMT, Mat Carter wrote: >> I see that now - fixing .... > > I also removed the nested {@code ..} from within the as that also caused an issue Good. You can move the example to a snippet too and that will allow the `
` tags to go away.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28010#discussion_r2486086959

From egahlin at openjdk.org  Mon Nov  3 11:46:04 2025
From: egahlin at openjdk.org (Erik Gahlin)
Date: Mon, 3 Nov 2025 11:46:04 GMT
Subject: RFR: 8369736 - Add management interface for AOT cache creation
 [v4]
In-Reply-To: 
References: 
 
Message-ID: <3w1fRC9HqhTnBzUPCsGGsw0q8H-wcY-km96h5W0S3To=.28f02299-e10b-4a20-a407-1a208e764a75@github.com>

On Wed, 29 Oct 2025 21:07:51 GMT, Mat Carter  wrote:

>> Add jdk.management.AOTCacheMXBean. The interface provides a single action that, when called, causes any hosted JVM currently recording AOT information to stop recording. Existing functionality is preserved: when stopped, the JVM will create the required artifacts based on the execution mode. Conveniently, as the application running on the JVM has not stopped (as was previously the only way to stop recording), the application will resume execution after the artifacts have been generated.
>> 
>> The interface will return TRUE if a recording was successfully stopped; in all other cases (not recording, etc.) it will return FALSE.
>> 
>> It follows that invoking the action on a JVM that is recording, twice in succession, should (barring internal errors) produce the following two responses:
>> 
>> TRUE
>> FALSE
>> 
>> Passes tier1 on linux (x64) and windows (x64)
>
> Mat Carter has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Updated test based on comments
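
The TRUE-then-FALSE behavior described above can be modeled with a single atomic flag. The following is only a minimal sketch, not the code in the PR: `RecordingState` and `end_recording` are hypothetical names, and the artifact generation that follows a successful stop is left out. It assumes that, if several callers race, exactly one of them should observe TRUE.

```cpp
#include <atomic>

// Hypothetical model of the "stop recording" action; not HotSpot code.
class RecordingState {
  std::atomic<bool> _recording{true};  // assumption: the JVM starts out recording

public:
  // Returns true only for the caller that actually ended the recording.
  // exchange() returns the previous value, so at most one caller sees true,
  // even when several threads call this concurrently.
  bool end_recording() {
    return _recording.exchange(false);
  }
};
```

A boolean result like this cannot distinguish "already ended" from "failed", which is the limitation raised in the review.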

Can this be done using a diagnostic command, e.g. AOT.stop? It would allow the recording to be stopped from jcmd and the DiagnosticCommandMBean, without the need for a separate MXBean.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28010#issuecomment-3480099956

From tschatzl at openjdk.org  Mon Nov  3 14:47:39 2025
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Mon, 3 Nov 2025 14:47:39 GMT
Subject: RFR: 8369111: G1: Determining concurrent start uses inconsistent
 predicates [v2]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Mon, 3 Nov 2025 09:26:49 GMT, Ivan Walulya  wrote:

>> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   * walulyai review
>
> Marked as reviewed by iwalulya (Reviewer).

Thanks @walulyai @albertnetymk for your reviews

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27789#issuecomment-3480915285

From tschatzl at openjdk.org  Mon Nov  3 14:47:41 2025
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Mon, 3 Nov 2025 14:47:41 GMT
Subject: Integrated: 8369111: G1: Determining concurrent start uses
 inconsistent predicates
In-Reply-To: 
References: 
Message-ID: 

On Tue, 14 Oct 2025 08:58:43 GMT, Thomas Schatzl  wrote:

> Hi all,
> 
>   please review this change that fixes an inconsistency between requesting a concurrent start garbage collection during humongous object allocation and then actually starting it.
> 
> I.e. in `G1CollectedHeap::attempt_allocation_humongous` we check whether the allocation would cross the IHOP threshold taking the current allocation into account, and if so, see if G1 should start a concurrent marking, eventually starting a GC pause.
> 
> That GC pause did not take the prospective allocation into account, so we could do that GC for nothing (i.e. not start a concurrent marking although we already knew that the allocation would cause one).
> 
> This, in conjunction with JDK-8368959, can cause hundreds of extra GCs for the test in the CR (without eager reclaim of humongous arrays with references); otherwise it could cause marking to start too late.
> 
> There is a second bug in the calculation of whether G1 crossed the threshold: for humongous objects it only takes the actual size into account, not the size that is needed for allocating it. The same issue existed when determining whether to start a concurrent mark after any other collection too.
> 
> The change also tries to unify naming of the parameter to pass the allocation size (`alloc_word_size` -> `allocation_word_size`) and the parameter order where this size is passed along in multiple related methods.
> 
> Testing: mentioned test case now behaving correctly, tier1-5
> 
> Thanks,
>   Thomas
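
The two fixes described above (counting the prospective allocation, and counting the space a humongous object really occupies) can be illustrated with a small predicate. This is a hedged sketch only: the region size, `humongous_footprint_words`, and `crosses_threshold` are made-up names and values for illustration, and the rounding to whole regions is an assumption about what "the size that is needed for allocating it" means, not a copy of the G1 code.

```cpp
#include <cstddef>

// Hypothetical region size in words; the real value depends on the heap configuration.
constexpr std::size_t kRegionWords = 1024;

// Assumption: a humongous object occupies whole regions, so its footprint is
// the allocation size rounded up to a multiple of the region size.
std::size_t humongous_footprint_words(std::size_t allocation_word_size) {
  return ((allocation_word_size + kRegionWords - 1) / kRegionWords) * kRegionWords;
}

// The same predicate is meant to be used both when requesting the collection
// and when deciding whether that collection starts concurrent marking.
bool crosses_threshold(std::size_t used_words,
                       std::size_t allocation_word_size,
                       std::size_t threshold_words,
                       bool is_humongous) {
  std::size_t needed = is_humongous
      ? humongous_footprint_words(allocation_word_size)
      : allocation_word_size;
  return used_words + needed > threshold_words;
}
```

With these illustrative numbers, an allocation of 1025 words stays under a 4096-word threshold at 3000 words used when only its actual size is counted, but crosses it once the two-region footprint is counted.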

This pull request has now been integrated.

Changeset: 18e8873c
Author:    Thomas Schatzl 
URL:       https://git.openjdk.org/jdk/commit/18e8873cadf3900139a6555d4a228148a10d2009
Stats:     111 lines in 9 files changed: 35 ins; 8 del; 68 mod

8369111: G1: Determining concurrent start uses inconsistent predicates

Reviewed-by: iwalulya, ayang

-------------

PR: https://git.openjdk.org/jdk/pull/27789

From iwalulya at openjdk.org  Mon Nov  3 15:41:15 2025
From: iwalulya at openjdk.org (Ivan Walulya)
Date: Mon, 3 Nov 2025 15:41:15 GMT
Subject: RFR: 8370774: Merge ModRefBarrierSet into CardTableBarrierSet
In-Reply-To: <4E4gACqFjlt5P5yJtCzgFFDk9GIamGuQriH4uoJX9Kc=.c9b53715-c582-4bef-8304-6d0e710cfcbd@github.com>
References: <4E4gACqFjlt5P5yJtCzgFFDk9GIamGuQriH4uoJX9Kc=.c9b53715-c582-4bef-8304-6d0e710cfcbd@github.com>
Message-ID: 

On Tue, 28 Oct 2025 09:20:46 GMT, Albert Mingkun Yang  wrote:

> Merge a class into its sole subclass.
> 
> Many files are changed in this PR, and they can be largely divided into two groups: moving the content of `ModRefBarrierSet` into `CardTableBarrierSet` (`src/hotspot/share/gc/`), and the platform-specific move of the content of `ModRefBarrierSetAssembler` into `CardTableBarrierSetAssembler` (`src/hotspot/cpu/`).
> 
> Test: tier1-5 for x64 and aarch64; GHA

LGTM!

-------------

Marked as reviewed by iwalulya (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/28013#pullrequestreview-3411621214

From duke at openjdk.org  Mon Nov  3 16:10:16 2025
From: duke at openjdk.org (Ruben)
Date: Mon, 3 Nov 2025 16:10:16 GMT
Subject: RFR: 8365147: AArch64: Replace DMB + LD + DMB with LDAR for C1
 volatile field loads [v2]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Tue, 7 Oct 2025 10:14:48 GMT, Andrew Haley  wrote:

>> Samuel Chee has updated the pull request incrementally with two additional commits since the last revision:
>> 
>>  - Address review comments
>>    
>>    Change-Id: Ica13be8094ac0f057066042ef0a5ec5927b98dfd
>>  - Refine code generation for mem2reg_volatile
>>    
>>    The patch is contributed by @theRealAph.
>>    
>>    Change-Id: I7ab1854dd238cdce72a4ab218b5b4ee84ad39586
>
>> See #27432
> 
> Now that this is integrated, you may proceed.
> 
> Will you also proceed with [8360654](https://github.com/openjdk/jdk/pull/26000)?

Hi @theRealAph,
I've pushed changes for this PR to a new branch https://github.com/openjdk/jdk/compare/master...ruben-arm:jdk:pr-8365147 as Samuel is currently not available. Once he is back, he can update this PR's branch.
In the meantime, I'm planning to run more of the `jcstress` testing. I'd appreciate your feedback on the version in the new branch.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26748#issuecomment-3481304989

From fbredberg at openjdk.org  Mon Nov  3 16:24:32 2025
From: fbredberg at openjdk.org (Fredrik Bredberg)
Date: Mon, 3 Nov 2025 16:24:32 GMT
Subject: RFR: 8369238: Allow virtual thread preemption on some common class
 initialization paths [v12]
In-Reply-To: 
References: 
 
Message-ID: 

On Thu, 30 Oct 2025 15:54:18 GMT, Patricio Chilano Mateo  wrote:

>> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized.
>> 
>> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too.
>> 
>> This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`.
>> 
>> ### Summary of implementation
>> 
>> The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too.
>> 
>> If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). 
>> 
>> ### Notes
>> 
>> `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::mon...
>
> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Improve comment and assert msg

A truly useful enhancement! Just had a few questions / suggestions.

src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 762:

> 760:   load_const_optimized(bad, 0xbad0101babe00000);
> 761:   for (uint32_t i = 1; i < (sizeof(regs) / sizeof(Register)); i++) {
> 762:     addi(regs[i], regs[0], regs[i]->encoding());

Guess it's a question for @reinrich, but why set up `bad = regs[0]` and then still use `regs[0]` instead of `bad`? 
I think using `bad` would make the code easier to understand than using `regs[0]`.
Suggestion:

    addi(regs[i], bad, regs[i]->encoding());

src/hotspot/share/interpreter/linkResolver.cpp line 1689:

> 1687:   EXCEPTION_MARK;
> 1688:   CallInfo info;
> 1689:   resolve_static_call(info, link_info, ClassInitMode::dont_init, THREAD);

Couldn't you just do `CHECK_AND_CLEAR_NULL` and skip the following `if (HAS_PENDING_EXCEPTION)` statement?

Suggestion:

  resolve_static_call(info, link_info, ClassInitMode::dont_init, CHECK_AND_CLEAR_NULL);

I see the same in functions both above and below this one, is there any reason for this?

-------------

Marked as reviewed by fbredberg (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/27802#pullrequestreview-3411212227
PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2487050754
PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2486618754

From pchilanomate at openjdk.org  Mon Nov  3 16:49:12 2025
From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
Date: Mon, 3 Nov 2025 16:49:12 GMT
Subject: RFR: 8370344: Arbitrary Java frames on stack during scoped access
 [v5]
In-Reply-To: 
References: 
 
Message-ID: 

On Thu, 30 Oct 2025 22:47:41 GMT, Jorn Vernee  wrote:

>> See the JBS issue for a problem description.
>> 
>> This patch changes the shared scope closure handshake to be able to handle 'arbitrary' Java frames on the stack during a scoped memory access.
>> 
>> For the purposes of this change, we assume that 'arbitrary' is limited to the following: 
>> 1. Frames added by calling the constructor of `InternalError` as a result of a faulting access to a truncated memory-mapped file (see `HandshakeState::handle_unsafe_access_error`). This is the only handshake operation (i.e. may be triggered during a scoped access) that calls back into Java.
>> 2. Frames added by a JVMTI agent that calls back into Java code while handling a JVMTI event that happens during a scoped access.
>> 3. Any other Java code that runs as part of the linking process.
>> 
>> For (1), we set a flag while we are creating the `InternalError`. If a thread has that flag set, we know it is in the process of crashing already, so we don't have to inspect the stack at all. For (2), all bets are off, so we have to walk the entire stack. For (3), this patch switches the hard limit of 10 frames for the stack walk to instead bail out at the first frame outside of the `java.base` module. In most cases this speeds up the stack walk as well, if threads are running other code.
>> 
>> The test `TestSharedCloseJFR` is added for scenario (1), and the test `TestSharedCloseJvmti` is added for scenario (2). Existing tests already cover scenario (3).
>> 
>> Testing: tier 1-4
>
> Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Update bug number in tests
>   
>   Co-authored-by: Chen Liang 
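
The stack walk policy for cases (2) and (3) above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `scopedMemoryAccess.cpp`: `Frame`, `finds_scoped_access`, and the `agents_possible` parameter are hypothetical, frames are listed top of stack first, and the flag check for case (1) is deliberately left out.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified view of a Java frame.
struct Frame {
  std::string module;     // module of the frame's declaring class
  bool is_scoped_access;  // true if the frame is a scoped memory access
};

// Walks the stack from the top frame down. If JVMTI agents may be present,
// the whole stack is inspected; otherwise the walk stops at the first frame
// outside java.base.
bool finds_scoped_access(const std::vector<Frame>& stack_top_first,
                         bool agents_possible) {
  for (const Frame& f : stack_top_first) {
    if (f.is_scoped_access) {
      return true;
    }
    if (!agents_possible && f.module != "java.base") {
      return false;  // bail out early: case (3)
    }
  }
  return false;
}
```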

Fix looks good to me. Only have some comments about the JVMTI agent case, thanks.

src/hotspot/share/prims/scopedMemoryAccess.cpp line 53:

> 51: #endif
> 52: 
> 53:   bool agents_loaded = JvmtiAgentList::has_agents();

I see that for dynamically loaded agents we add to the list after loading the agent. Maybe we should check `JvmtiEnvBase::environments_might_exist()`?

src/hotspot/share/prims/scopedMemoryAccess.cpp line 94:

> 92: 
> 93: static bool is_accessing_session(JavaThread* jt, oop session, bool& in_scoped) {
> 94:   if (jt->is_throwing_unsafe_access_error()) {

If we assume arbitrary Java code in JVMTI callbacks this might return true but the target could be in a different nested scoped access. I think we should check we are in the no agent case before bailing out.

src/hotspot/share/runtime/javaThread.hpp line 1364:

> 1362:   JavaThread* _thread;
> 1363: public:
> 1364:   ThrowingUnsafeAccessError(JavaThread* thread) : _thread(thread) {

If we assume arbitrary Java code in JVMTI callbacks this could be executed recursively and `_throwing_unsafe_access_error` be set to false while we are within the outer caller. Although it's fine since we will do a full stack walk in `is_accessing_session`, we should add a comment why this recursive case is okay (or save the old value as with `UnlockFlagSaver`).

-------------

PR Review: https://git.openjdk.org/jdk/pull/27919#pullrequestreview-3411931601
PR Review Comment: https://git.openjdk.org/jdk/pull/27919#discussion_r2487147887
PR Review Comment: https://git.openjdk.org/jdk/pull/27919#discussion_r2487150214
PR Review Comment: https://git.openjdk.org/jdk/pull/27919#discussion_r2487152847

From egahlin at openjdk.org  Mon Nov  3 17:02:33 2025
From: egahlin at openjdk.org (Erik Gahlin)
Date: Mon, 3 Nov 2025 17:02:33 GMT
Subject: RFR: 8037914: Add JFR event for string deduplication
In-Reply-To: <1OO6CrVzIrUtVeqvYA5rwGSuKsrybfUJUSN0B3AS8FM=.3edb6e0a-621c-455f-8191-7eb76d669243@github.com>
References: 
 
 
 
 
 
 
 
 <1OO6CrVzIrUtVeqvYA5rwGSuKsrybfUJUSN0B3AS8FM=.3edb6e0a-621c-455f-8191-7eb76d669243@github.com>
Message-ID: 

On Sat, 1 Nov 2025 11:45:27 GMT, Francesco Andreuzzi  wrote:

> > It would be good if you could provide some ballpark figures on the number of events in a worst-case scenario, so we can determine what GC level is appropriate.
> 
> I wrote a simple pathological test with multiple threads interning random strings, [this](https://github.com/user-attachments/files/23282465/out-parallel.txt) is the worst I've seen: 100 deduplication rounds within `3.698s` and `3.729s`.

Thanks for investigating this. It doesn't sound that bad, the event could probably be enabled by default. The elapsed fields, are they the total since the JVM started or from the last round?

We typically try to avoid using "Bytes" in field names, since that information is already available in the content type. Perhaps something else could be used, newSize?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3481567424

From jvernee at openjdk.org  Mon Nov  3 17:27:51 2025
From: jvernee at openjdk.org (Jorn Vernee)
Date: Mon, 3 Nov 2025 17:27:51 GMT
Subject: RFR: 8370344: Arbitrary Java frames on stack during scoped access
 [v5]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Mon, 3 Nov 2025 16:40:05 GMT, Patricio Chilano Mateo  wrote:

>> Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Update bug number in tests
>>   
>>   Co-authored-by: Chen Liang 
>
> src/hotspot/share/prims/scopedMemoryAccess.cpp line 94:
> 
>> 92: 
>> 93: static bool is_accessing_session(JavaThread* jt, oop session, bool& in_scoped) {
>> 94:   if (jt->is_throwing_unsafe_access_error()) {
> 
> If we assume arbitrary Java code in JVMTI callbacks this might return true but the target could be in a different nested scoped access. I think we should check we are in the no agent case before bailing out.

My assumption is that once an unsafe access error is thrown, we don't expect execution to continue. I suppose it is technically possible to catch the exception, either in Java code or in the native agent code, and then try to continue execution, but the program would be in an undefined state at that point already. In other words, I don't think anyone should be continuing execution after this exception happens.

Although, it doesn't seem like a bad idea to keep walking here as a fail safe.

> src/hotspot/share/runtime/javaThread.hpp line 1364:
> 
>> 1362:   JavaThread* _thread;
>> 1363: public:
>> 1364:   ThrowingUnsafeAccessError(JavaThread* thread) : _thread(thread) {
> 
> If we assume arbitrary Java code in JVMTI callbacks this could be executed recursively and `_throwing_unsafe_access_error` be set to false while we are within the outer caller. Although it's fine since we will do a full stack walk in `is_accessing_session`, we should add a comment why this recursive case is okay (or save the old value as with `UnlockFlagSaver`).

That's a good point. Since this is otherwise unrelated code, I'll make it safe for the re-entrant case as well.
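
One way to make such a guard re-entrant is to save the previous flag value in the constructor and restore it in the destructor, rather than unconditionally resetting it to false. The sketch below only illustrates that save/restore idea: `FakeThread` and `ThrowingFlagGuard` are hypothetical stand-ins, not the `JavaThread` and `ThrowingUnsafeAccessError` classes from the patch.

```cpp
// Hypothetical stand-in for the per-thread flag; not the real JavaThread.
struct FakeThread {
  bool throwing_unsafe_access_error = false;
};

// RAII guard that sets the flag and restores the previous value on exit,
// so a nested (recursive) use does not clear the flag for the outer scope.
class ThrowingFlagGuard {
  FakeThread* _thread;
  bool _saved;

public:
  explicit ThrowingFlagGuard(FakeThread* thread)
      : _thread(thread), _saved(thread->throwing_unsafe_access_error) {
    _thread->throwing_unsafe_access_error = true;
  }
  ~ThrowingFlagGuard() {
    _thread->throwing_unsafe_access_error = _saved;  // restore, do not reset
  }
  ThrowingFlagGuard(const ThrowingFlagGuard&) = delete;
  ThrowingFlagGuard& operator=(const ThrowingFlagGuard&) = delete;
};

// The flag is still set in the outer scope after the inner guard is gone.
bool flag_after_inner_scope() {
  FakeThread t;
  ThrowingFlagGuard outer(&t);
  {
    ThrowingFlagGuard inner(&t);
  }
  return t.throwing_unsafe_access_error;
}

// The flag is back to its initial value once every guard is gone.
bool flag_after_all_scopes() {
  FakeThread t;
  {
    ThrowingFlagGuard outer(&t);
    ThrowingFlagGuard inner(&t);
  }
  return t.throwing_unsafe_access_error;
}
```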

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27919#discussion_r2487300422
PR Review Comment: https://git.openjdk.org/jdk/pull/27919#discussion_r2487303301

From jvernee at openjdk.org  Mon Nov  3 17:30:46 2025
From: jvernee at openjdk.org (Jorn Vernee)
Date: Mon, 3 Nov 2025 17:30:46 GMT
Subject: RFR: 8370344: Arbitrary Java frames on stack during scoped access
 [v5]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Mon, 3 Nov 2025 16:39:23 GMT, Patricio Chilano Mateo  wrote:

>> Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Update bug number in tests
>>   
>>   Co-authored-by: Chen Liang 
>
> src/hotspot/share/prims/scopedMemoryAccess.cpp line 53:
> 
>> 51: #endif
>> 52: 
>> 53:   bool agents_loaded = JvmtiAgentList::has_agents();
> 
> I see that for dynamically loaded agents we add to the list after loading the agent. Maybe we should check `JvmtiEnvBase::environments_might_exist()`?

Ah, thanks for the suggestion. I thought there might have been a way to check this already, but I couldn't find it.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27919#discussion_r2487315255

From kbarrett at openjdk.org  Mon Nov  3 18:09:19 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Mon, 3 Nov 2025 18:09:19 GMT
Subject: RFR: 8371104: gtests should use wrappers for <limits> and
 <type_traits>
Message-ID: 

Please review this trivial change, updating HotSpot gtests to include the new
cppstdlib/{limits,type_traits}.hpp wrappers instead of including the Standard
Library headers directly.

Testing: mach5 tier1

-------------

Commit messages:
 - use type_traits wrapper
 - use limits wrapper

Changes: https://git.openjdk.org/jdk/pull/28114/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28114&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8371104
  Stats: 31 lines in 9 files changed: 11 ins; 20 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/28114.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28114/head:pull/28114

PR: https://git.openjdk.org/jdk/pull/28114

From jvernee at openjdk.org  Mon Nov  3 18:29:43 2025
From: jvernee at openjdk.org (Jorn Vernee)
Date: Mon, 3 Nov 2025 18:29:43 GMT
Subject: RFR: 8370344: Arbitrary Java frames on stack during scoped access
 [v6]
In-Reply-To: 
References: 
Message-ID: 

> See the JBS issue for a problem description.
> 
> This patch changes the shared scope closure handshake to be able to handle 'arbitrary' Java frames on the stack during a scoped memory access.
> 
> For the purposes of this change, we assume that 'arbitrary' is limited to the following: 
> 1. Frames added by calling the constructor of `InternalError` as a result of a faulting access to a truncated memory-mapped file (see `HandshakeState::handle_unsafe_access_error`). This is the only handshake operation (i.e. may be triggered during a scoped access) that calls back into Java.
> 2. Frames added by a JVMTI agent that calls back into Java code while handling a JVMTI event that happens during a scoped access.
> 3. Any other Java code that runs as part of the linking process.
> 
> For (1), we set a flag while we are creating the `InternalError`. If a thread has that flag set, we know it is in the process of crashing already, so we don't have to inspect the stack at all. For (2), all bets are off, so we have to walk the entire stack. For (3), this patch switches the hard limit of 10 frames for the stack walk to instead bail out at the first frame outside of the `java.base` module. In most cases this speeds up the stack walk as well, if threads are running other code.
> 
> The test `TestSharedCloseJFR` is added for scenario (1), and the test `TestSharedCloseJvmti` is added for scenario (2). Existing tests already cover scenario (3).
> 
> Testing: tier 1-4

Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision:

  Review comments Patricio

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/27919/files
  - new: https://git.openjdk.org/jdk/pull/27919/files/7a793468..9e7e59b8

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=27919&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27919&range=04-05

  Stats: 13 lines in 3 files changed: 5 ins; 3 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/27919.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/27919/head:pull/27919

PR: https://git.openjdk.org/jdk/pull/27919
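
The three cases in the description above can be pictured with a small toy model. This is only a sketch under stated assumptions, not the code in `scopedMemoryAccess.cpp`: `ToyFrame`, `ToyThread`, `WalkResult` and `check_thread` are hypothetical names, a frame is reduced to a module name plus a flag saying whether it is a scoped access on the session being closed, and the `Skipped` result merely stands for "the stack was not inspected".

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a Java frame: only the facts the walk needs.
struct ToyFrame {
  std::string module;   // module of the frame's method, e.g. "java.base"
  bool scoped_access;   // true if this frame is a scoped access on the closing session
};

// Hypothetical stand-in for a target thread of the handshake.
struct ToyThread {
  std::vector<ToyFrame> frames;               // frames[0] is the top of the stack
  bool creating_unsafe_access_error = false;  // case (1): flag set while the error is created
};

enum class WalkResult { Skipped, Found, NotFound };

// Sketch of the walk policy described in the PR text:
//  (1) flag set            -> do not inspect the stack at all
//  (2) agents might exist  -> walk the entire stack
//  (3) otherwise           -> bail out at the first frame outside java.base
static WalkResult check_thread(const ToyThread& t, bool agents_might_exist) {
  if (t.creating_unsafe_access_error) {
    return WalkResult::Skipped;
  }
  for (const ToyFrame& f : t.frames) {
    if (f.scoped_access) {
      return WalkResult::Found;
    }
    if (!agents_might_exist && f.module != "java.base") {
      return WalkResult::NotFound;  // early exit replaces the old fixed frame limit
    }
  }
  return WalkResult::NotFound;
}
```

In this model a scoped access hidden below a non-`java.base` frame is only found when agents might exist, which is the reason case (2) cannot use the early exit.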

From rrich at openjdk.org  Mon Nov  3 18:33:52 2025
From: rrich at openjdk.org (Richard Reingruber)
Date: Mon, 3 Nov 2025 18:33:52 GMT
Subject: RFR: 8369238: Allow virtual thread preemption on some common class
 initialization paths [v12]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Mon, 3 Nov 2025 16:11:38 GMT, Fredrik Bredberg  wrote:

>> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Improve comment and assert msg
>
> src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 762:
> 
>> 760:   load_const_optimized(bad, 0xbad0101babe00000);
>> 761:   for (uint32_t i = 1; i < (sizeof(regs) / sizeof(Register)); i++) {
>> 762:     addi(regs[i], regs[0], regs[i]->encoding());
> 
> Guess it's a question for @reinrich, but why set up `bad = regs[0]` and then still use `regs[0]` instead of `bad`? 
> I think using `bad` would make the code easier to understand than using `regs[0]`.
> Suggestion:
> 
>     addi(regs[i], bad, regs[i]->encoding());

Thanks for looking at the ppc part @fbredber 
Your suggestion is good. I think the loop should be reversed too. Then the addi after it can be removed.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2487475002

From rrich at openjdk.org  Mon Nov  3 18:38:06 2025
From: rrich at openjdk.org (Richard Reingruber)
Date: Mon, 3 Nov 2025 18:38:06 GMT
Subject: RFR: 8369238: Allow virtual thread preemption on some common class
 initialization paths [v12]
In-Reply-To: 
References: 
 
Message-ID: 

On Thu, 30 Oct 2025 15:54:18 GMT, Patricio Chilano Mateo  wrote:

>> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized.
>> 
>> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too.
>> 
>> This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`.
>> 
>> ### Summary of implementation
>> 
>> The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too.
>> 
>> If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). 
>> 
>> ### Notes
>> 
>> `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::mon...
>
> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Improve comment and assert msg

src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 764:

> 762:     addi(regs[i], regs[0], regs[i]->encoding());
> 763:   }
> 764:   addi(regs[0], regs[0], regs[0]->encoding());

Based on @fbredber's suggestion:
Suggestion:

  load_const_optimized(bad, 0xbad0101babe00000);
  for (int i = (sizeof(regs) / sizeof(Register)) - 1; i >= 0; i--) {
    addi(regs[i], bad, regs[i]->encoding());
  }

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2487483352
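
To see why reversing the loop makes the trailing `addi` unnecessary, here is a minimal simulation, assuming a toy register file where a register is just its encoding number. `RegFile`, `clobber_forward` and `clobber_reversed` are hypothetical names for illustration; only the loop shapes are taken from the review comments above. Because `bad` aliases `regs[0]`, it has to be the last register overwritten, and iterating downwards to index 0 does exactly that.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy register file: index = register encoding, value = register contents.
using RegFile = std::vector<uint64_t>;

static const uint64_t kPattern = 0xbad0101babe00000ULL;

// Models addi(dst, src, imm): dst = src + imm.
static void addi(RegFile& rf, size_t dst, size_t src, uint64_t imm) {
  rf[dst] = rf[src] + imm;
}

// Original shape: fill regs[1..n-1] from regs[0], then fix up regs[0] afterwards.
static RegFile clobber_forward(const std::vector<size_t>& regs, size_t nregs) {
  RegFile rf(nregs, 0);
  const size_t bad = regs[0];
  rf[bad] = kPattern;                      // stands in for load_const_optimized(bad, ...)
  for (size_t i = 1; i < regs.size(); i++) {
    addi(rf, regs[i], bad, regs[i]);
  }
  addi(rf, bad, bad, bad);                 // the extra instruction after the loop
  return rf;
}

// Suggested shape: walk downwards so regs[0] (== bad) is overwritten last.
static RegFile clobber_reversed(const std::vector<size_t>& regs, size_t nregs) {
  RegFile rf(nregs, 0);
  const size_t bad = regs[0];
  rf[bad] = kPattern;
  for (int i = static_cast<int>(regs.size()) - 1; i >= 0; i--) {
    addi(rf, regs[i], bad, regs[i]);
  }
  return rf;
}
```

Both functions leave every listed register holding the pattern plus its own encoding; the reversed form just needs one instruction fewer.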

From pchilanomate at openjdk.org  Mon Nov  3 18:54:15 2025
From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
Date: Mon, 3 Nov 2025 18:54:15 GMT
Subject: RFR: 8369238: Allow virtual thread preemption on some common class
 initialization paths [v12]
In-Reply-To: 
References: 
 
 
Message-ID: <-FOyxYPMnMfseoEVE0gqhuWFKT2s04ZLcKvyKF28zwE=.2c7eef23-1b9a-4339-89c6-58117a625848@github.com>

On Mon, 3 Nov 2025 16:22:13 GMT, Fredrik Bredberg  wrote:

> A truly useful enhancement! Just had a few questions / suggestions.
>
Thanks for the review Fred!

> src/hotspot/share/interpreter/linkResolver.cpp line 1689:
> 
>> 1687:   EXCEPTION_MARK;
>> 1688:   CallInfo info;
>> 1689:   resolve_static_call(info, link_info, ClassInitMode::dont_init, THREAD);
> 
> Couldn't you just do `CHECK_AND_CLEAR_NULL` and skip the following `if (HAS_PENDING_EXCEPTION)` statement?
> 
> Suggestion:
> 
>   resolve_static_call(info, link_info, ClassInitMode::dont_init, CHECK_AND_CLEAR_NULL);
> 
> I see the same in functions both above and below this one, is there any reason for this?

Yes, I agree. I see there are a couple of instances of this pattern in this file as you point out, so if you are okay I'd prefer to file a separate bug to clean them all up together.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27802#issuecomment-3482021139
PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2487524814

From pchilanomate at openjdk.org  Mon Nov  3 19:03:07 2025
From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
Date: Mon, 3 Nov 2025 19:03:07 GMT
Subject: RFR: 8369238: Allow virtual thread preemption on some common class
 initialization paths [v13]
In-Reply-To: 
References: 
Message-ID: 

> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized.
> 
> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too.
> 
> This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`.
> 
> ### Summary of implementation
> 
> The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too.
> 
> If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). 
> 
> ### Notes
> 
> `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::monitorenter`, was renamed to `In...

Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:

  Suggested fix in macroAssembler_ppc.cpp

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/27802/files
  - new: https://git.openjdk.org/jdk/pull/27802/files/ffcd92a6..4dff05a8

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=27802&range=12
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27802&range=11-12

  Stats: 3 lines in 1 file changed: 0 ins; 1 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/27802.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/27802/head:pull/27802

PR: https://git.openjdk.org/jdk/pull/27802

From pchilanomate at openjdk.org  Mon Nov  3 19:03:08 2025
From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
Date: Mon, 3 Nov 2025 19:03:08 GMT
Subject: RFR: 8369238: Allow virtual thread preemption on some common class
 initialization paths [v12]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Mon, 3 Nov 2025 18:34:59 GMT, Richard Reingruber  wrote:

>> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Improve comment and assert msg
>
> src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 764:
> 
>> 762:     addi(regs[i], regs[0], regs[i]->encoding());
>> 763:   }
>> 764:   addi(regs[0], regs[0], regs[0]->encoding());
> 
> Based on @fbredber's suggestion:
> Suggestion:
> 
>   load_const_optimized(bad, 0xbad0101babe00000);
>   for (int i = (sizeof(regs) / sizeof(Register)) - 1; i >= 0; i--) {
>     addi(regs[i], bad, regs[i]->encoding());
>   }

Done.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2487538760

From pchilanomate at openjdk.org  Mon Nov  3 19:46:11 2025
From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
Date: Mon, 3 Nov 2025 19:46:11 GMT
Subject: RFR: 8370344: Arbitrary Java frames on stack during scoped access
 [v6]
In-Reply-To: 
References: 
 
Message-ID: 

On Mon, 3 Nov 2025 18:29:43 GMT, Jorn Vernee  wrote:

>> See the JBS issue for a problem description.
>> 
>> This patch changes the shared scope closure handshake to be able to handle 'arbitrary' Java frames on the stack during a scoped memory access.
>> 
>> For the purposes of this change, we assume that 'arbitrary' is limited to the following: 
>> 1. Frames added by calling the constructor of `InternalError` as a result of a faulting access to a truncated memory-mapped file (see `HandshakeState::handle_unsafe_access_error`). This is the only handshake operation (i.e. may be triggered during a scoped access) that calls back into Java.
>> 2. Frames added by a JVMTI agent that calls back into Java code while handling a JVMTI event that happens during a scoped access.
>> 3. Any other Java code that runs as part of the linking process.
>> 
>> For (1), we set a flag while we are creating the `InternalError`. If a thread has that flag set, we know it is in the process of crashing already, so we don't have to inspect the stack at all. For (2), all bets are off, so we have to walk the entire stack. For (3), this patch switches the hard limit of 10 frames for the stack walk to instead bail out at the first frame outside of the `java.base` module. In most cases this speeds up the stack walk as well, if threads are running other code.
>> 
>> The test `TestSharedCloseJFR` is added for scenario (1), and the test `TestSharedCloseJvmti` is added for scenario (2). Existing tests already cover scenario (3).
>> 
>> Testing: tier 1-4
>
> Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Review comments Patricio

Fix looks good to me, thanks!

-------------

Marked as reviewed by pchilanomate (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/27919#pullrequestreview-3412633838

From pchilanomate at openjdk.org  Mon Nov  3 19:46:13 2025
From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
Date: Mon, 3 Nov 2025 19:46:13 GMT
Subject: RFR: 8370344: Arbitrary Java frames on stack during scoped access
 [v5]
In-Reply-To: 
References: 
 
 
 
Message-ID: 

On Mon, 3 Nov 2025 17:24:22 GMT, Jorn Vernee  wrote:

> My assumption is that once an unsafe access error is thrown, we don't expect execution to continue. I suppose it is technically possible to catch the exception, either in Java code or in the native agent code, and then try to continue execution, but the program would be in an undefined state at that point already. In other words, I don't think anyone should be continuing execution after this exception happens.
>
Yes, makes sense.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27919#discussion_r2487654019

From macarte at openjdk.org  Mon Nov  3 20:08:22 2025
From: macarte at openjdk.org (Mat Carter)
Date: Mon, 3 Nov 2025 20:08:22 GMT
Subject: RFR: 8369736 - Add management interface for AOT cache creation
 [v4]
In-Reply-To: <3w1fRC9HqhTnBzUPCsGGsw0q8H-wcY-km96h5W0S3To=.28f02299-e10b-4a20-a407-1a208e764a75@github.com>
References: 
 
 <3w1fRC9HqhTnBzUPCsGGsw0q8H-wcY-km96h5W0S3To=.28f02299-e10b-4a20-a407-1a208e764a75@github.com>
Message-ID: 

On Mon, 3 Nov 2025 11:42:16 GMT, Erik Gahlin  wrote:

> Can this be done using a diagnostic command, e.g. AOT.stop? It would allow the recording to be stopped from jcmd and the DiagnosticCommandMBean, without the need for a separate MXBean.

Thank you for the suggestion

To answer your first question: we do have a diagnostic command (AOT.end_recording). It would precede the AOT MXBean into mainline, and its PR is here: https://github.com/openjdk/jdk/pull/27965

The longer-term goal for this MXBean is to provide additional methods that would aid in monitoring (isRecording, currentRecordingLength, etc.). However, we decided to reduce the scope of the MXBean for mainline while we continue to test the monitoring functionality in leyden/premain.

Historically, the diagnostic command came after the MXBean in leyden/premain; however, I decided to implement the diagnostic command with the necessary JVM hooks first to simplify review.

So technically we could delay this PR and still have the required functionality in mainline. I'd like to hear from the other reviewers on this matter.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28010#issuecomment-3482353337

From eosterlund at openjdk.org  Mon Nov  3 21:10:00 2025
From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=)
Date: Mon, 3 Nov 2025 21:10:00 GMT
Subject: RFR: 8365932: Implementation of JEP 516: Ahead-of-Time Object
 Caching with Any GC [v7]
In-Reply-To: 
References: 
 
 
 
 
Message-ID: 

On Thu, 23 Oct 2025 16:06:11 GMT, Ioi Lam  wrote:

>>> Given that we now have support for CDS from all GCs is it time to replace all `INCLUDE_CDS_JAVA_HEAP` with just `INCLUDE_CDS`?
>> 
>> I think we could do that indeed. However, I would like that to be a follow-up cleanup, to avoid cluttering more files in this PR.
>
>> > Given that we now have support for CDS from all GCs is it time to replace all `INCLUDE_CDS_JAVA_HEAP` with just `INCLUDE_CDS`?
>> 
>> I think we could do that indeed. However, I would like that to be a follow-up cleanup, to avoid cluttering more files in this PR.
> 
> We have 
> 
> 
> #if INCLUDE_CDS && INCLUDE_G1GC && defined(_LP64)
> #define INCLUDE_CDS_JAVA_HEAP 1
> 
> 
> So we need to make sure it works for 32 bit as well.

@iklam does this look okay now?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27732#issuecomment-3482604226

From heidinga at openjdk.org  Mon Nov  3 21:35:01 2025
From: heidinga at openjdk.org (Dan Heidinga)
Date: Mon, 3 Nov 2025 21:35:01 GMT
Subject: RFR: 8369736 - Add management interface for AOT cache creation
 [v4]
In-Reply-To: 
References: 
 
 
Message-ID: <6LpN0Ae8kfDApFOPUYrbdqz2fRh8HVvvbCa8kiEEFS0=.31761484-c4c1-428c-9974-d2bdec3b587d@github.com>

On Mon, 3 Nov 2025 11:01:23 GMT, Alan Bateman  wrote:

>> Mat Carter has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Updated test based on comments
>
> src/jdk.management/share/classes/jdk/management/HotSpotAOTCacheMXBean.java line 104:
> 
>> 102:        * 
>> 103:        *
>> 104:        * @return {@code true} if a recording was in progress and has been ended successfully; {@code false} otherwise.
> 
> Someone is bound to ask what happens if the "endRecording" operation is performed concurrently and there is recording in progress. Does one or all return true? I don't think it matters, the bigger issue here is that returning false means the recording has already ended or it failed. If it failed, why did it fail? I realize the intention is to add some properties and further operations to this MXBean but I think it would be good to think through if starting with a boolean returning operation is going to be problematic in the future.

I see a couple of cases for when `endRecording` is called:
1) Within the same process to generate a cache - given the api will only return `true` to one caller, that caller is the only one who can be responsible for taking further action (copying the cache somewhere, etc). Already ended or failed makes no difference operationally.
2) From a monitoring process - again only the successful case matters. All failures (already ended or failed) are indistinguishable. No further action can be taken by the observer.

It's only the success case that matters

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28010#discussion_r2487912197

From iklam at openjdk.org  Tue Nov  4 00:09:15 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Tue, 4 Nov 2025 00:09:15 GMT
Subject: RFR: 8365932: Implementation of JEP 516: Ahead-of-Time Object
 Caching with Any GC [v14]
In-Reply-To: 
References: 
Message-ID: 

On Thu, 30 Oct 2025 08:43:45 GMT, Erik Österlund  wrote:

>> This is the implementation of JEP 516: Ahead-of-Time Object Caching with Any GC.
>> 
>> The current mechanism for the AOT cache to cache heap objects is by using mmap to place bytes from a file directly in the GC managed heap. This mechanism poses compatibility challenges that all GCs have to have bit by bit identical object and reference formats, as the layout decisions are offline. This has so far meant that AOT cache optimizations requiring heap objects are not available when using ZGC. This work ensures that all GCs, including ZGC, are able to use the more advanced AOT cache functionality going forward.
>> 
>> This JEP introduces a new mechanism for archiving a primordial heap, without such compatibility problems. It embraces online layouts and allocates objects one by one, linking them using the Access API, like normal objects. This way, archived objects quack like any other object to the GC, and the GC implementations are decoupled from the archiving mechanism.
>> 
>> The key to doing this GC agnostic object loading is to represent object references between objects as object indices (e.g. 1, 2, 3) instead of raw pointers that we hope all GCs will recognise the same. These object indices become the key way of identifying objects. One table maps object indices to archived objects, and another table maps object indices to heap objects that have been allocated at runtime. This allows online linking of the materialized heap objects.
>> 
>> The main interface to the cached heap is roots. Different components can register object roots at dump time. Each root gets assigned a root index. At runtime, requests can be made to get a reference to an object at a root index. The new implementation uses lazy materialization and concurrency. When a thread asks for a root object, it must ensure that the given root object and its transitively reachable objects are reachable. A new background thread called the AOTThread, tries to perform the bulk of the work, so that the startup impact of processing the objects one by one is not impacting the bootstrapping thread.
>> 
>> Since the background thread performs the bulk of the work, the archive is laid out to ensure it can run as fast as possible.
>> Objects are laid out in DFS pre order over the roots in the archive, such that the object indices and the DFS traversal orders are the same. This way, the DFS traversal that the background thread is performing is the same order as linearly materializing the objects one by one in the or...
>
> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Comment update

The updated version looks good to me.

-------------

Marked as reviewed by iklam (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/27732#pullrequestreview-3413414459

From duke at openjdk.org  Tue Nov  4 00:11:20 2025
From: duke at openjdk.org (duke)
Date: Tue, 4 Nov 2025 00:11:20 GMT
Subject: Withdrawn: 8354555: Add generic JFR events for TaskTerminator
In-Reply-To: <_7FP2wNe8p3N8SxKdmCN1x4zKO8TT5JWRcWEt51i35c=.4fbac292-3cb7-48b9-922e-1114f74e0549@github.com>
References: <_7FP2wNe8p3N8SxKdmCN1x4zKO8TT5JWRcWEt51i35c=.4fbac292-3cb7-48b9-922e-1114f74e0549@github.com>
Message-ID: <6C63KrwSNS7jTc5SHtpknktCbt6kCAx12FM6NDEDPt8=.0da94a30-3bae-40ac-b4c0-a40b55876123@github.com>

On Wed, 16 Apr 2025 08:24:15 GMT, Xiaolong Peng  wrote:

> The purpose of the PR is to add generic JFR events for TaskTerminator to track the attempts and timings that GC threads have tried to terminate GC tasks.
> 
> Today only G1 emits JFR event with name `Termination` from [G1ParEvacuateFollowersClosure](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/g1/g1YoungCollector.cpp#L555-L563), all other garbage collectors don't emit any JFR event for the termination attempt at all.
> 
> By adding this, it gives performance engineers the visibility to the termination attempts and termination time when GC threads trying to finish GC tasks, we could build tool to analyze the jfr events to determine if there is potential data structure issue in application code, e.g. very large LinkedList or LinkedBlockingQueue.
> 
> For the test, I have manually tested different GCs with Flight Recording enabled and verified the events:
> G1:
> 
> jdk.GCPhaseParallel {
>     startTime = 23:09:34.124 (2025-05-22)
>     duration = 0.0108 ms
>     gcId = 0
>     gcWorkerId = 8
>     name = "Termination"
>     eventThread = "GC Thread#4" (osThreadId = 20483)
> }
> 
> jdk.GCPhaseParallel {
>     startTime = 23:09:34.124 (2025-05-22)
>     duration = 0.0467 ms
>     gcId = 0
>     gcWorkerId = 2
>     name = "Termination"
>     eventThread = "GC Thread#2" (osThreadId = 21251)
> }
> 
> jdk.GCPhaseParallel {
>     startTime = 23:09:34.124 (2025-05-22)
>     duration = 0.0474 ms
>     gcId = 0
>     gcWorkerId = 1
>     name = "Termination"
>     eventThread = "GC Thread#8" (osThreadId = 36359)
> }
> jdk.GCPhaseParallel {
>     startTime = 23:09:41.925 (2025-05-22)
>     duration = 0.000834 ms
>     gcId = 14
>     gcWorkerId = 7
>     name = "Termination: Parallel Marking"
>     eventThread = "GC Thread#1" (osThreadId = 21507)
> }
> 
> jdk.GCPhaseParallel {
>     startTime = 23:09:41.925 (2025-05-22)
>     duration = 0.000166 ms
>     gcId = 14
>     gcWorkerId = 7
>     name = "Termination: Parallel Marking"
>     eventThread = "GC Thread#1" (osThreadId = 21507)
> }
> 
> 
> Shenandoah:
> 
> jdk.GCPhaseParallel {
>     startTime = 23:39:58.890 (2025-05-22)
>     duration = 0.0202 ms
>     gcId = 0
>     gcWorkerId = 0
>     name = "Termination: Concurrent Mark"
>     eventThread = "Shenandoah GC Threads#3" (osThreadId = 13827)
> }
> 
> jdk.GCPhaseParallel {
>     startTime = 23:39:58.890 (2025-05-22)
>     duration = 0.0205 ms
>     gcId = 0
>     gcWorkerId = 1
>     name = "Termination: Concurrent Mark"
>     eventThread = "Shenandoah GC Threads#1" (osThreadId = 14339)
> }
> 
> jdk.GCPhaseParallel {
>     startTime = 23:39:58.890 (2025-05-22)
>     duration = 0.0127 ms
>     gcId = 0
>     gcWorkerId = 5
>     name = "Termination: Final Mark"
>     eventThread = "Shenandoah G...

This pull request has been closed without being integrated.
-------------

PR: https://git.openjdk.org/jdk/pull/24676

From kbarrett at openjdk.org  Tue Nov  4 04:29:12 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Tue, 4 Nov 2025 04:29:12 GMT
Subject: RFR: 8358957: [ubsan]: The assert in
 layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v4]
In-Reply-To: <8F0nIHGwWbZ0Z6oRxP6rXYoS-GRQEh3-LuiCa2RGvfk=.553e56b9-93cb-4aac-b89e-4ebb2f1e2169@github.com>
References: <8F0nIHGwWbZ0Z6oRxP6rXYoS-GRQEh3-LuiCa2RGvfk=.553e56b9-93cb-4aac-b89e-4ebb2f1e2169@github.com>
Message-ID: 

On Mon, 3 Nov 2025 09:29:28 GMT, Afshin Zafari  wrote:

>> Avoid using loop and UB in left-shift operation as suggested by Kim's comment in the JBS-issue.
>> 
>> Tests:
>> mach5 tiers 1-5 {macosx-aarch64, linux-x64, windows-x64} x {debug, product}
>
> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision:
> 
>   comments and post-cond

src/hotspot/share/oops/klass.hpp line 514:

> 512:   }
> 513: 
> 514:   // Find the right-most non-zero (e.g., ...1000) bit of the diff of array-of-boolean and array-of-byte layout helpers.

Callers don't care whether it's the rightmost bit, only that it's a single
bit. (Some callers use log2_exact to get the bit position.) So a more
pedantically correct description might be something like "Return a value
containing a single set bit that is in the bitset difference between the
layout helpers for array-of-boolean and array-of-byte."

src/hotspot/share/oops/klass.hpp line 525:

> 523:     // So use alternate form of negation to avoid warning.
> 524:     uint result = candidates & (~candidates + 1);
> 525:     assert(((result - 1) & result) == 0, "post-condition");

Use `power_of_2(result)`. For completeness, also consider checking other
post-conditions - result is set in zlh and clear in blh.
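
The bit trick quoted above and the suggested post-conditions can be sketched in isolation. This is a minimal illustration, not the code in `klass.hpp`: `lowest_set_bit`, `is_single_bit` and `diffbit` are hypothetical names, and computing the candidates as `zlh & ~blh` is an assumption based on the "set in zlh and clear in blh" wording.

```cpp
#include <cassert>
#include <cstdint>

// Isolate the lowest set bit of x. (~x + 1) is two's-complement negation
// written without unary minus on an unsigned value, which some compilers
// warn about.
static inline uint32_t lowest_set_bit(uint32_t x) {
  return x & (~x + 1u);
}

// True only for values with exactly one bit set. Note that the plain
// ((x - 1) & x) == 0 test also accepts x == 0, so zero is excluded here.
static inline bool is_single_bit(uint32_t x) {
  return x != 0 && ((x - 1u) & x) == 0;
}

// Hypothetical stand-in for the function under review.
static uint32_t diffbit(uint32_t zlh, uint32_t blh) {
  uint32_t candidates = zlh & ~blh;  // assumption: bits set in zlh but not in blh
  assert(candidates != 0);           // the two helpers must differ in such a bit
  uint32_t result = lowest_set_bit(candidates);
  assert(is_single_bit(result));     // post-condition: exactly one bit
  assert((result & zlh) != 0);       // post-condition: set in zlh
  assert((result & blh) == 0);       // post-condition: clear in blh
  return result;
}
```

With made-up inputs `0x80000C00` and `0x80000800`, the only candidate is bit 10, so the sketch returns `0x400`; the real layout helper values are not reproduced here.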
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27288#discussion_r2488596063
PR Review Comment: https://git.openjdk.org/jdk/pull/27288#discussion_r2488602736

From kbarrett at openjdk.org  Tue Nov  4 04:29:13 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Tue, 4 Nov 2025 04:29:13 GMT
Subject: RFR: 8358957: [ubsan]: The assert in
 layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v4]
In-Reply-To: 
References: 
Message-ID: 

On Mon, 15 Sep 2025 19:10:14 GMT, Kim Barrett  wrote:

>> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   comments and post-cond
>
> src/hotspot/share/oops/klass.hpp line 518:
> 
>> 516:   static int layout_helper_boolean_diffbit() {
>> 517:     uint zlh = checked_cast(array_layout_helper(T_BOOLEAN));
>> 518:     uint blh = checked_cast(array_layout_helper(T_BYTE));
> 
> Use of check_cast is probably wrong. I think an alh is negative. Oops, my mistake. It probably doesn't fail currently because of [JDK-8314258](https://bugs.openjdk.org/browse/JDK-8314258).

Note that by "my mistake" I meant it was a mistake to use `checked_cast` here.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27288#discussion_r2488603639

From amitkumar at openjdk.org  Tue Nov  4 05:02:17 2025
From: amitkumar at openjdk.org (Amit Kumar)
Date: Tue, 4 Nov 2025 05:02:17 GMT
Subject: RFR: 8370871: [s390x] consistently update top_frame_sp [v2]
In-Reply-To: 
References: 
Message-ID: 

On Thu, 30 Oct 2025 06:48:43 GMT, Amit Kumar  wrote:

>> Stores top_frame_sp for interpreter frames consistently. Although this is making one to assert fail in loom port but still I think this should be considered independent bug.
>> 
>> Changes in `generate_throw_exception` also helped with another issue in loom which was resolved by this workaround: https://github.com/offamitkumar/jdk/blob/5cb1322f58a4c676aac59244dcb62d8b67f0ed93/src/hotspot/cpu/s390/continuationFreezeThaw_s390.inline.hpp#L302C1-L318C2, issue:
>> 
>> In `BasicExt.java` testcase there are two subtests :
>> ```java
>> new Continuation3Frames(TestCaseVariants.THROW_HANDLED_EXCEPTION).runTestCase(4, compPolicy);
>> new Continuation3Frames(TestCaseVariants.THROW_UNHANDLED_EXCEPTION).runTestCase(4, compPolicy);
>> 
>> 
>> these two testcase were failing because we were overwriting the interpreter frame content as we have to make the native call which saves the register from r6 to r15. So at that time the workaround was necessary, but it turns out that using `top_frame_sp` to extend the frame will fix the overwriting issue.
>
> Amit Kumar has updated the pull request incrementally with one additional commit since the last revision:
> 
>   adding assert and frame extension for exception

Thanks for the reviews and approval.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28042#issuecomment-3483862621

From amitkumar at openjdk.org  Tue Nov  4 05:02:18 2025
From: amitkumar at openjdk.org (Amit Kumar)
Date: Tue, 4 Nov 2025 05:02:18 GMT
Subject: Integrated: 8370871: [s390x] consistently update top_frame_sp
In-Reply-To: 
References: 
Message-ID: 

On Wed, 29 Oct 2025 09:40:34 GMT, Amit Kumar  wrote:

> Stores top_frame_sp for interpreter frames consistently. Although this is making one to assert fail in loom port but still I think this should be considered independent bug.
> > Changes in `generate_throw_exception` also helped with another issue in loom which was resolved by this workaround: https://github.com/offamitkumar/jdk/blob/5cb1322f58a4c676aac59244dcb62d8b67f0ed93/src/hotspot/cpu/s390/continuationFreezeThaw_s390.inline.hpp#L302C1-L318C2, issue: > > In `BasicExt.java` testcase there are two subtests : > ```java > new Continuation3Frames(TestCaseVariants.THROW_HANDLED_EXCEPTION).runTestCase(4, compPolicy); > new Continuation3Frames(TestCaseVariants.THROW_UNHANDLED_EXCEPTION).runTestCase(4, compPolicy); > > > these two testcase were failing because we were overwriting the interpreter frame content as we have to make the native call which saves the register from r6 to r15. So at that time the workaround was necessary, but it turns out that using `top_frame_sp` to extend the frame will fix the overwriting issue. This pull request has now been integrated. Changeset: 50bb92a3 Author: Amit Kumar URL: https://git.openjdk.org/jdk/commit/50bb92a33b32778a96b1823ff995889892bef890 Stats: 37 lines in 2 files changed: 32 ins; 0 del; 5 mod 8370871: [s390x] consistently update top_frame_sp Reviewed-by: rrich, lucy ------------- PR: https://git.openjdk.org/jdk/pull/28042 From eosterlund at openjdk.org Tue Nov 4 05:14:05 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 4 Nov 2025 05:14:05 GMT Subject: RFR: 8365932: Implementation of JEP 516: Ahead-of-Time Object Caching with Any GC [v14] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 00:05:59 GMT, Ioi Lam wrote: > The updated version looks good to me. Thank you for the review. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/27732#issuecomment-3483893308 From amitkumar at openjdk.org Tue Nov 4 05:20:15 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Tue, 4 Nov 2025 05:20:15 GMT Subject: RFR: 8371188: [s390x] Un-ProblemList TestUnreachableInnerLoop.java Message-ID: Trivial change, After [JDK-8288981](https://bugs.openjdk.org/browse/JDK-8288981) delivery test is now passing on s390x. So It can be removed from the problemlist. ------------- Commit messages: - remove Changes: https://git.openjdk.org/jdk/pull/28122/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28122&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371188 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28122.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28122/head:pull/28122 PR: https://git.openjdk.org/jdk/pull/28122 From sgehwolf at openjdk.org Tue Nov 4 07:09:47 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Tue, 4 Nov 2025 07:09:47 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> <9uVKpiWCXvxcxhyg6V1seeSxyLm14lHEdOL_I07uVQs=.de1b0ed8-d006-49d5-a982-27556759415e@github.com> Message-ID: On Mon, 27 Oct 2025 09:17:02 GMT, Andrew Haley wrote: > > it wasn't because we wanted to figure out the color of the bike shed but rather how to write safer code that makes it less likely to accidentally introduce bugs because of type conflation. > > This. A function that returns its value as a side effect on a reference parameter is (at best) a code smell. Thanks for the comments. So what's the consensus then? As far as API surface is concerned I've modelled it after [JDK-8357086](https://bugs.openjdk.org/browse/JDK-8357086). 
It [introduces](https://github.com/openjdk/jdk/commit/d5d94db12a6d82a6fe9da18b5f8ce3733a6ee7e7) the side-effect/code smell issue. Do we want to re-open this discussion or proceed with this here. It's not clear to me. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27743#issuecomment-3469549344 From duke at openjdk.org Tue Nov 4 07:14:09 2025 From: duke at openjdk.org (Ruben) Date: Tue, 4 Nov 2025 07:14:09 GMT Subject: RFR: 8365799: AArch64: Remove trailing DMB from cmpxchgptr for LSE [v2] In-Reply-To: <1rP2ebDy2PFLeEq4lYTks0cHZsvBKJsZFPtVZPsbH_g=.8aa13828-9eb6-4607-821d-9bbc7bd286a1@github.com> References: <8nJUYG6FECEghybXRFfeIsA0R9paX_AFr5IgiGO6Trs=.a3f75a8f-421d-4da8-9c53-03064a397b2b@github.com> <1rP2ebDy2PFLeEq4lYTks0cHZsvBKJsZFPtVZPsbH_g=.8aa13828-9eb6-4607-821d-9bbc7bd286a1@github.com> Message-ID: On Thu, 21 Aug 2025 08:13:50 GMT, Andrew Haley wrote: >> You're right, it is a little inaccurate to use the Java API to determine what `cmpxchgptr` does. Although the original intent to remove the membar is still functionality correct. >> >> Although looking into it more, it seems that `cmpxchgptr` can now be removed completely. The recent PR https://github.com/openjdk/jdk/pull/26594 removed two of its call sites, and the only other existing one is within `MacroAssembler::cmpxchg_obj_header` - a method which never gets called to. >> >> So I propose: >> - We close this pr >> - Open new one which removes the methods `MacroAssembler::cmpxchg_obj_header`, `MacroAssembler::cmpxchgptr` and `MacroAssembler::cmpxchgw` completely since they are no longer called to from anywhere. > >> * We close this pr >> >> * Open new one which removes the methods `MacroAssembler::cmpxchg_obj_header`, `MacroAssembler::cmpxchgptr` and `MacroAssembler::cmpxchgw` completely since they are no longer called to from anywhere. > > If you like, although we can (and we should) tidy up as we go along. 
Hi @theRealAph, I've pushed changes to remove the unused cmpxchg* functions to https://github.com/openjdk/jdk/compare/master...ruben-arm:jdk:pr-8365799 as Samuel is currently not available. Once he is back, he can update this PR's branch. However, as this is a different scope from the original JBS ticket https://bugs.openjdk.org/browse/JDK-8365799, should we create a new one? ------------- PR Comment: https://git.openjdk.org/jdk/pull/26845#issuecomment-3481358544 From ysuenaga at openjdk.org Tue Nov 4 07:41:13 2025 From: ysuenaga at openjdk.org (Yasumasa Suenaga) Date: Tue, 4 Nov 2025 07:41:13 GMT Subject: RFR: 8371093: Assert "section header string table should be loaded" failed on debug VM Message-ID: When the crash happens in the function in vDSO on Linux, native call stacks in hs_err log wouldn't be generated as following. See [hs_err log on JBS](https://bugs.openjdk.org/secure/attachment/116796/hs_err_pid4018.log) for details. Reproducer is also attached on JBS ([Test.java](https://bugs.openjdk.org/secure/attachment/116797/Test.java)) Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) C [linux-vdso.so.1+0xe69] [error occurred during error reporting (printing native stack (with source info)), id 0xe0000000, Internal Error (/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536)] Retrying call stack printing without source information... [error occurred during error reporting (retry printing native stack (no source info)), id 0xb, SIGSEGV (0xb) at pc=0x00007fba8075f791] When I checked back trace on GDB, it failed at `assert`. 
#12 0x00007fba7e76bd00 in report_vm_error (file=file@entry=0x7fba7fed7b40 "/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp",
    line=line@entry=536, error_msg=error_msg@entry=0x7fba80019575 "assert(false) failed",
    detail_fmt=detail_fmt@entry=0x7fba7fed7bf0 "section header string table should be loaded")
    at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/debug.cpp:196
#13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...)
    at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536
#14 ElfFile::read_debug_info (this=this@entry=0x7fba782a1650, debug_info=debug_info@entry=0x7fba7dd05150)
    at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:407

(gdb) f 13
#13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...)
    at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536
536 assert(false, "section header string table should be loaded");

vDSO is not a regular ELF, so it should be skipped here.
-------------

Commit messages:
 - 8371093: Assert "section header string table should be loaded" failed on debug VM

Changes: https://git.openjdk.org/jdk/pull/28102/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28102&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8371093
Stats: 7 lines in 1 file changed: 7 ins; 0 del; 0 mod
Patch: https://git.openjdk.org/jdk/pull/28102.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/28102/head:pull/28102

PR: https://git.openjdk.org/jdk/pull/28102

From duke at openjdk.org Tue Nov 4 08:34:34 2025
From: duke at openjdk.org (walkertest)
Date: Tue, 4 Nov 2025 08:34:34 GMT
Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v12]
In-Reply-To: <-FOyxYPMnMfseoEVE0gqhuWFKT2s04ZLcKvyKF28zwE=.2c7eef23-1b9a-4339-89c6-58117a625848@github.com>
References: <-FOyxYPMnMfseoEVE0gqhuWFKT2s04ZLcKvyKF28zwE=.2c7eef23-1b9a-4339-89c6-58117a625848@github.com>
Message-ID:

On Mon, 3 Nov 2025 18:51:52 GMT, Patricio Chilano Mateo wrote:

> > A truly useful enhancement! Just had a few questions / suggestions.
>
> Thanks for the review Fred!

Hello, I have run into a similar problem: [https://stackoverflow.com/questions/79808508/jdk24-tomcat-start-pinned-in-virtual-thead-env](https://stackoverflow.com/questions/79808508/jdk24-tomcat-start-pinned-in-virtual-thead-env)

I would like to know whether it is the same issue as https://bugs.openjdk.org/browse/JDK-8369238. Is there a temporary workaround?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27802#issuecomment-3484149341

From aph at openjdk.org Tue Nov 4 09:22:01 2025
From: aph at openjdk.org (Andrew Haley)
Date: Tue, 4 Nov 2025 09:22:01 GMT
Subject: RFR: 8371188: [s390x] Un-ProblemList TestUnreachableInnerLoop.java
In-Reply-To:
References:
Message-ID:

On Tue, 4 Nov 2025 05:10:04 GMT, Amit Kumar wrote:

> Trivial change,
> After [JDK-8288981](https://bugs.openjdk.org/browse/JDK-8288981) delivery test is now passing on s390x.
So It can be removed from the problemlist. Marked as reviewed by aph (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28122#pullrequestreview-3415020771 From ayang at openjdk.org Tue Nov 4 09:26:36 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 4 Nov 2025 09:26:36 GMT Subject: RFR: 8370774: Merge ModRefBarrierSet into CardTableBarrierSet In-Reply-To: <4E4gACqFjlt5P5yJtCzgFFDk9GIamGuQriH4uoJX9Kc=.c9b53715-c582-4bef-8304-6d0e710cfcbd@github.com> References: <4E4gACqFjlt5P5yJtCzgFFDk9GIamGuQriH4uoJX9Kc=.c9b53715-c582-4bef-8304-6d0e710cfcbd@github.com> Message-ID: On Tue, 28 Oct 2025 09:20:46 GMT, Albert Mingkun Yang wrote: > Merge a class into its sole subclass. > > Many files are changed in this PR, and they can be largely divided into two groups, moving content of `ModRefBarrierSet` into `CardTableBarrierSet` (`src/hotspot/share/gc/`) and platform specific moving content of `ModRefBarrierSetAssembler` into `CardTableBarrierSetAssembler` (`src/hotspot/cpu/`). > > Test: tier1-5 for x64 and aarch64; GHA Thanks for review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28013#issuecomment-3484837794 From ayang at openjdk.org Tue Nov 4 09:26:37 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 4 Nov 2025 09:26:37 GMT Subject: Integrated: 8370774: Merge ModRefBarrierSet into CardTableBarrierSet In-Reply-To: <4E4gACqFjlt5P5yJtCzgFFDk9GIamGuQriH4uoJX9Kc=.c9b53715-c582-4bef-8304-6d0e710cfcbd@github.com> References: <4E4gACqFjlt5P5yJtCzgFFDk9GIamGuQriH4uoJX9Kc=.c9b53715-c582-4bef-8304-6d0e710cfcbd@github.com> Message-ID: <6lsApuc2eXuLS2Thbiu_udBzifrewj5V5UQ2w62HUDE=.525ec5e3-b206-4049-9f0f-f20dd6549442@github.com> On Tue, 28 Oct 2025 09:20:46 GMT, Albert Mingkun Yang wrote: > Merge a class into its sole subclass. 
> > Many files are changed in this PR, and they can be largely divided into two groups, moving content of `ModRefBarrierSet` into `CardTableBarrierSet` (`src/hotspot/share/gc/`) and platform specific moving content of `ModRefBarrierSetAssembler` into `CardTableBarrierSetAssembler` (`src/hotspot/cpu/`). > > Test: tier1-5 for x64 and aarch64; GHA This pull request has now been integrated. Changeset: 21f41c5f Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/21f41c5f49cd3c5e6e4f29ed38701a4d92c16098 Stats: 2066 lines in 60 files changed: 633 ins; 1368 del; 65 mod 8370774: Merge ModRefBarrierSet into CardTableBarrierSet Reviewed-by: tschatzl, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/28013 From aph at openjdk.org Tue Nov 4 09:32:54 2025 From: aph at openjdk.org (Andrew Haley) Date: Tue, 4 Nov 2025 09:32:54 GMT Subject: RFR: 8365799: AArch64: Remove trailing DMB from cmpxchgptr for LSE [v2] In-Reply-To: <1rP2ebDy2PFLeEq4lYTks0cHZsvBKJsZFPtVZPsbH_g=.8aa13828-9eb6-4607-821d-9bbc7bd286a1@github.com> References: <8nJUYG6FECEghybXRFfeIsA0R9paX_AFr5IgiGO6Trs=.a3f75a8f-421d-4da8-9c53-03064a397b2b@github.com> <1rP2ebDy2PFLeEq4lYTks0cHZsvBKJsZFPtVZPsbH_g=.8aa13828-9eb6-4607-821d-9bbc7bd286a1@github.com> Message-ID: On Thu, 21 Aug 2025 08:13:50 GMT, Andrew Haley wrote: >> You're right, it is a little inaccurate to use the Java API to determine what `cmpxchgptr` does. Although the original intent to remove the membar is still functionality correct. >> >> Although looking into it more, it seems that `cmpxchgptr` can now be removed completely. The recent PR https://github.com/openjdk/jdk/pull/26594 removed two of its call sites, and the only other existing one is within `MacroAssembler::cmpxchg_obj_header` - a method which never gets called to. 
>> >> So I propose: >> - We close this pr >> - Open new one which removes the methods `MacroAssembler::cmpxchg_obj_header`, `MacroAssembler::cmpxchgptr` and `MacroAssembler::cmpxchgw` completely since they are no longer called to from anywhere. > >> * We close this pr >> >> * Open new one which removes the methods `MacroAssembler::cmpxchg_obj_header`, `MacroAssembler::cmpxchgptr` and `MacroAssembler::cmpxchgw` completely since they are no longer called to from anywhere. > > If you like, although we can (and we should) tidy up as we go along. > Hi @theRealAph, I've pushed changes to remove the unused cmpxchg* functions to [master...ruben-arm:jdk:pr-8365799](https://github.com/openjdk/jdk/compare/master...ruben-arm:jdk:pr-8365799) as Samuel is currently not available. Once he is back, he can update this PR's branch. However, as this is a different scope from the original JBS ticket https://bugs.openjdk.org/browse/JDK-8365799, should we create a new one? Yes, do that. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26845#issuecomment-3484873265 From epeter at openjdk.org Tue Nov 4 09:42:02 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 4 Nov 2025 09:42:02 GMT Subject: RFR: 8367341: C2: apply KnownBits and unsigned bounds to And / Or operations [v4] In-Reply-To: References: Message-ID: On Thu, 30 Oct 2025 00:24:26 GMT, Quan Anh Mai wrote: >> Hi, >> >> This PR improves the implementation of `AndNode/OrNode/XorNode::Value` by taking advantages of the additional information in `TypeInt`. The implementation is pretty straightforward. A clever trick is that by analyzing the negative and positive ranges of a `TypeInt` separately, we have better info for the leading bits. I also implement gtest unit tests to verify the correctness and monotonicity of the inference functions. >> >> Please take a look and leave your reviews, thanks a lot. 
> > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > Add assertion for the helper in CTPComparator > > Co-authored-by: Emanuel Peter Still good :) ------------- Marked as reviewed by epeter (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27618#pullrequestreview-3415158637 From duke at openjdk.org Tue Nov 4 09:48:20 2025 From: duke at openjdk.org (Ruben) Date: Tue, 4 Nov 2025 09:48:20 GMT Subject: RFR: 8365047: Remove exception handler stub code in C2 [v10] In-Reply-To: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> Message-ID: > The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. > > According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. Ruben has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: - Rename deoptHandlerOffsetField -> deoptHandlerEntryOffsetField - Merge from the main branch - Address review comments and fix a mistype - Check for NOP and MOVK separately in NativePostCallNop - Test for deoptimization in virtual threads Change-Id: I9ef51b426d34e9b44a3891f6a45307232f900e5a - Merge from the main branch - Address review comments - Address review comments - Address review comments - The patch is contributed by @TheRealMDoerr - ... 
and 5 more: https://git.openjdk.org/jdk/compare/1922c4fd...359c2f18 ------------- Changes: https://git.openjdk.org/jdk/pull/26678/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26678&range=09 Stats: 569 lines in 41 files changed: 268 ins; 216 del; 85 mod Patch: https://git.openjdk.org/jdk/pull/26678.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26678/head:pull/26678 PR: https://git.openjdk.org/jdk/pull/26678 From mdoerr at openjdk.org Tue Nov 4 09:58:54 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 4 Nov 2025 09:58:54 GMT Subject: RFR: 8365047: Remove exception handler stub code in C2 [v10] In-Reply-To: References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> Message-ID: On Tue, 4 Nov 2025 09:48:20 GMT, Ruben wrote: >> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. >> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > > Ruben has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: > > - Rename deoptHandlerOffsetField -> deoptHandlerEntryOffsetField > - Merge from the main branch > - Address review comments and fix a mistype > - Check for NOP and MOVK separately in NativePostCallNop > - Test for deoptimization in virtual threads > > Change-Id: I9ef51b426d34e9b44a3891f6a45307232f900e5a > - Merge from the main branch > - Address review comments > - Address review comments > - Address review comments > - The patch is contributed by @TheRealMDoerr > - ... and 5 more: https://git.openjdk.org/jdk/compare/1922c4fd...359c2f18 Marked as reviewed by mdoerr (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/26678#pullrequestreview-3415283652 From qamai at openjdk.org Tue Nov 4 10:09:07 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Tue, 4 Nov 2025 10:09:07 GMT Subject: RFR: 8367341: C2: apply KnownBits and unsigned bounds to And / Or operations [v5] In-Reply-To: References: Message-ID: > Hi, > > This PR improves the implementation of `AndNode/OrNode/XorNode::Value` by taking advantages of the additional information in `TypeInt`. The implementation is pretty straightforward. A clever trick is that by analyzing the negative and positive ranges of a `TypeInt` separately, we have better info for the leading bits. I also implement gtest unit tests to verify the correctness and monotonicity of the inference functions. > > Please take a look and leave your reviews, thanks a lot. Quan Anh Mai has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: - Merge branch 'master' into andorxor - Add assertion for the helper in CTPComparator Co-authored-by: Emanuel Peter - remove std::hash - remove unordered_map, add some comments for all_instances_size - Emanuel's reviews - Improve Value inferences of And, Or, Xor and implement gtest for general Value inferences ------------- Changes: https://git.openjdk.org/jdk/pull/27618/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27618&range=04 Stats: 964 lines in 9 files changed: 630 ins; 313 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/27618.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27618/head:pull/27618 PR: https://git.openjdk.org/jdk/pull/27618 From jsjolen at openjdk.org Tue Nov 4 10:57:47 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 4 Nov 2025 10:57:47 GMT Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v12] In-Reply-To: <_0HzhdWbRBZNJvB33qf8VXRnc70eYXm7NCmb6oSEllw=.482f6b91-c612-4be7-a007-29954f0f5080@github.com> 
References: <_0HzhdWbRBZNJvB33qf8VXRnc70eYXm7NCmb6oSEllw=.482f6b91-c612-4be7-a007-29954f0f5080@github.com>
Message-ID:

On Wed, 8 Oct 2025 20:35:41 GMT, Serguei Spitsyn wrote:

>> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Fix copyright
>
> src/hotspot/share/classfile/classFileParser.cpp line 3342:
>
>> 3340:
>> 3341: BSMAttributeEntry* entry = iter.reserve_new_entry(bootstrap_method_ref, num_bootstrap_arguments);
>> 3342: guarantee_property(entry != nullptr, "Invalid BootstrapMethods num_bootstrap_methods. The total amount of space reserved for the BootstrapMethod attribute was not sufficient", CHECK);
>
> Nit: This line is too big. It is a good idea to split the message. Also, would it be better to move this guarantee for `nullptr` into the `reserve_new_entry()`?

I think it's best to keep this null check here so that we can have a specialized error message. In this case, we are reading a classfile, so we're parsing untrusted input. That's when we might get a null result when attempting to reserve a new entry. In the other cases, we're working from trusted data and a presumed-to-be-correct algorithm. I've added assertions in these cases, as it shouldn't break.

> src/hotspot/share/oops/bsmAttribute.hpp line 97:
>
>> 95: int _cur_offset;
>> 96: // Current unused offset into BSMAEs bsm-data array
>> 97: int _cur_array;
>
> Nit: The declarations at lines 95, 97 will be more readable if comments above declarations trail declarations. The comment should not start with capital.

Do you mean that you want this?
```c++
int _cur_offset; // current unused offset into BSMAEs offset array
```

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27198#discussion_r2489955815
PR Review Comment: https://git.openjdk.org/jdk/pull/27198#discussion_r2489948444

From jsjolen at openjdk.org Tue Nov 4 10:57:50 2025
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Tue, 4 Nov 2025 10:57:50 GMT
Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v7]
In-Reply-To:
References:
Message-ID:

On Tue, 23 Sep 2025 04:22:49 GMT, Serguei Spitsyn wrote:

>> Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Merge remote-tracking branch 'origin/operands-again' into operands-again
>> - Fix BSM naming
>
> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ConstantPool.java line 482:
>
>> 480: int basePos = offs.at(bsmIndex);
>> 481: int argv = basePos + INDY_ARGV_OFFSET;
>> 482: int argc = getBootstrapMethodArgsCount(bsmIndex);
>
> Nit: Consider to make it shorter:
> `getBootstrapMethods` => `getBSMs`
> `getBootstrapMethodArgsCount` => `getBSMArgsCount`

I'm keeping it verbose, as that's the general style of this file.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27198#discussion_r2489941288

From jsjolen at openjdk.org Tue Nov 4 10:57:44 2025
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Tue, 4 Nov 2025 10:57:44 GMT
Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v14]
In-Reply-To:
References:
Message-ID:

> Hi,
>
> This is a refactoring of the way that we store the Bootstrap method attribute in the ConstantPool class. We used to have a single `Array` which was divided into a section of `u4` offsets and a section which was the actual data. In this refactoring we make this split more clear, by actually allocating an `Array` to store the offsets in and an `Array` to store the data in.
> These arrays are then put into a `BSMAttributeEntries` class, which allows us to separate out the API from that of the rest of the `ConstantPool`.
>
> We had multiple instances of the code knowing the layout of the operands array and using this to do 'clever' ways of copying and inserting data into it. See `ConstantPool::copy_operands` and `ConstantPool::resize_operands`. I felt like we could do things in a simpler way, so I added the `start_/end_extension` protocol and added the `InsertionIterator` for this. See `ClassFileParser::parse_classfile_bootstrap_methods_attribute` for how this works. I put several relevant definitions into the inline file in hopes of encouraging the compiler to optimize these appropriately.
>
> For the Java SA code, I had to add a `U4Array` class. I also had to fix the vmstructs definitions, etc.
>
> On the whole, while this code is a bit less terse, I think it's a good API improvement and the underlying implementation of splitting up the operands array is also an improvement.
>
> Testing: Oracle Tier1-Tier5 has been run successfully multiple times. Before integration, I will merge with master and run these tiers again.
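
[Illustration] The offsets/data split described in the PR text above can be pictured roughly as follows. This is only a minimal sketch and not the HotSpot `BSMAttributeEntries` class: the type name `BsmEntriesSketch`, the use of `std::vector`, the element widths, and the per-entry layout (bootstrap method reference, argument count, then the arguments, mirroring the classfile BootstrapMethods attribute) are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative only: two separate arrays instead of one array that holds
// an offsets section followed by a data section.
struct BsmEntriesSketch {
  std::vector<uint32_t> offsets;  // offsets[i] = start of entry i in `data`
  std::vector<uint16_t> data;     // per entry: bsm ref, argc, then argc args

  // Appends one entry at the end of `data` and returns its index.
  std::size_t append(uint16_t bsm_ref, const std::vector<uint16_t>& args) {
    offsets.push_back(static_cast<uint32_t>(data.size()));
    data.push_back(bsm_ref);
    data.push_back(static_cast<uint16_t>(args.size()));
    data.insert(data.end(), args.begin(), args.end());
    return offsets.size() - 1;
  }

  // Accessors that locate an entry through the offsets array.
  uint16_t bsm_ref(std::size_t i) const { return data[offsets[i]]; }
  uint16_t argc(std::size_t i) const { return data[offsets[i] + 1]; }
  uint16_t arg(std::size_t i, std::size_t n) const {
    return data[offsets[i] + 2 + n];
  }
};
```

With the two arrays kept apart, appending an entry never has to shift an offsets section that sits in front of the data, which is presumably the kind of layout knowledge the PR text wants to keep out of callers.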
Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:

 - Serguei comments
 - Revert change

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/27198/files
 - new: https://git.openjdk.org/jdk/pull/27198/files/5e72da4e..e3419823

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=27198&range=13
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27198&range=12-13

Stats: 6 lines in 4 files changed: 4 ins; 1 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/27198.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/27198/head:pull/27198

PR: https://git.openjdk.org/jdk/pull/27198

From jsjolen at openjdk.org Tue Nov 4 10:57:48 2025
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Tue, 4 Nov 2025 10:57:48 GMT
Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v12]
In-Reply-To:
References: <_0HzhdWbRBZNJvB33qf8VXRnc70eYXm7NCmb6oSEllw=.482f6b91-c612-4be7-a007-29954f0f5080@github.com>
Message-ID:

On Wed, 15 Oct 2025 17:58:41 GMT, Serguei Spitsyn wrote:

>> I believe that the style is to have an empty line between the code and the endif, so this is a style fix.
>
> The issue is this is the only fix in this file. Should we go and fix all style issues around?
> In fact, I'm okay with this tweak. :)

Ah, you're right. I responded to this in the 'Conversation' view and didn't see that this is the only change.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27198#discussion_r2489892804

From phubner at openjdk.org Tue Nov 4 10:59:07 2025
From: phubner at openjdk.org (Paul Hübner)
Date: Tue, 4 Nov 2025 10:59:07 GMT
Subject: RFR: 8371188: [s390x] Un-ProblemList TestUnreachableInnerLoop.java
In-Reply-To:
References:
Message-ID:

On Tue, 4 Nov 2025 05:10:04 GMT, Amit Kumar wrote:

> Trivial change,
> After [JDK-8288981](https://bugs.openjdk.org/browse/JDK-8288981) delivery test is now passing on s390x. So It can be removed from the problemlist.
Thanks!

-------------

Marked as reviewed by phubner (Author).

PR Review: https://git.openjdk.org/jdk/pull/28122#pullrequestreview-3415707571

From phubner at openjdk.org Tue Nov 4 11:02:28 2025
From: phubner at openjdk.org (Paul Hübner)
Date: Tue, 4 Nov 2025 11:02:28 GMT
Subject: RFR: 8371093: Assert "section header string table should be loaded" failed on debug VM
In-Reply-To:
References:
Message-ID:

On Sun, 2 Nov 2025 06:27:50 GMT, Yasumasa Suenaga wrote:

> When the crash happens in the function in vDSO on Linux, native call stacks in hs_err log wouldn't be generated as following. See [hs_err log on JBS](https://bugs.openjdk.org/secure/attachment/116796/hs_err_pid4018.log) for details. Reproducer is also attached on JBS ([Test.java](https://bugs.openjdk.org/secure/attachment/116797/Test.java))
>
>
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> C [linux-vdso.so.1+0xe69]
> [error occurred during error reporting (printing native stack (with source info)), id 0xe0000000, Internal Error (/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536)]
>
> Retrying call stack printing without source information...
>
> [error occurred during error reporting (retry printing native stack (no source info)), id 0xb, SIGSEGV (0xb) at pc=0x00007fba8075f791]
>
>
> When I checked back trace on GDB, it failed at `assert`.
>
> #12 0x00007fba7e76bd00 in report_vm_error (file=file@entry=0x7fba7fed7b40 "/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp",
> line=line@entry=536, error_msg=error_msg@entry=0x7fba80019575 "assert(false) failed",
> detail_fmt=detail_fmt@entry=0x7fba7fed7bf0 "section header string table should be loaded")
> at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/debug.cpp:196
> #13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...)
> at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536
> #14 ElfFile::read_debug_info (this=this@entry=0x7fba782a1650, debug_info=debug_info@entry=0x7fba7dd05150)
> at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:407
>
>
>
> (gdb) f 13
> #13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...)
> at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536
> 536 assert(false, "section header string table should be loaded");
>
>
> vDSO is not a regular ELF, so it should be skipped here.

From the manpage:

> The name of the vDSO varies across architectures. It will often show up in things like glibc's [ldd(1)](https://man7.org/linux/man-pages/man1/ldd.1.html) output. The exact name should not matter to any code, so do not hardcode it.

Just throwing an idea out there: since we are hardcoding the name, would it be worth adding an `else` clause to your `if`, in which we `assert` that `filepath` does not contain `vdso` or `linux-gate.so`?

-------------

PR Review: https://git.openjdk.org/jdk/pull/28102#pullrequestreview-3415738465

From duke at openjdk.org Tue Nov 4 11:35:04 2025
From: duke at openjdk.org (Ruben)
Date: Tue, 4 Nov 2025 11:35:04 GMT
Subject: RFR: 8365047: Remove exception handler stub code in C2 [v10]
In-Reply-To:
References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com>
Message-ID:

On Tue, 4 Nov 2025 09:48:20 GMT, Ruben wrote:

>> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob.
>>
>> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only.
> > Ruben has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: > > - Rename deoptHandlerOffsetField -> deoptHandlerEntryOffsetField > - Merge from the main branch > - Address review comments and fix a mistype > - Check for NOP and MOVK separately in NativePostCallNop > - Test for deoptimization in virtual threads > > Change-Id: I9ef51b426d34e9b44a3891f6a45307232f900e5a > - Merge from the main branch > - Address review comments > - Address review comments > - Address review comments > - The patch is contributed by @TheRealMDoerr > - ... and 5 more: https://git.openjdk.org/jdk/compare/1922c4fd...359c2f18 Thank you all for reviewing the PR and helping with testing. A separate JBS issue has been opened for `SafeFetch`: https://bugs.openjdk.org/browse/JDK-8371204. I plan to wait until tomorrow before issuing the `/integrate` request. Please let me know if you think this should wait longer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26678#issuecomment-3485520456 From ysuenaga at openjdk.org Tue Nov 4 12:21:02 2025 From: ysuenaga at openjdk.org (Yasumasa Suenaga) Date: Tue, 4 Nov 2025 12:21:02 GMT Subject: RFR: 8371093: Assert "section header string table should be loaded" failed on debug VM In-Reply-To: References: Message-ID: <9q6sEvrBzhWLGSnGHWkfZrKcfFea-mQ3MZknTI4-RAU=.16b44786-4e12-4175-8963-196315f7cfba@github.com> On Tue, 4 Nov 2025 11:00:03 GMT, Paul Hübner wrote: > would it be worth to add an else clause to your if, in which we assert that filepath does not contain vdso or linux-gate.so? It would be a false positive if "vdso" appears in an unrelated path such as "libfoo-vdsoon.so". `linux-gate.so` is for platforms that OpenJDK does not support, e.g. IA64, SH.
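To make the false-positive concern concrete, here is a minimal sketch. It is not the code from the PR: the helper names `mentions_vdso` and `is_linux_vdso` are hypothetical, and the exact name `linux-vdso.so.1` is simply the one that appears in the hs_err excerpt quoted in this thread.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: the substring test floated in the review.
// It returns true for ANY path that happens to contain "vdso".
inline bool mentions_vdso(const char* path) {
  assert(path != nullptr);
  return std::strstr(path, "vdso") != nullptr;
}

// Hypothetical helper: an exact-name test against the vDSO name that
// shows up in the hs_err log quoted in this thread (x86-64 Linux).
inline bool is_linux_vdso(const char* path) {
  assert(path != nullptr);
  return std::strcmp(path, "linux-vdso.so.1") == 0;
}
```

With these helpers, an ordinary library such as `/usr/lib64/libfoo-vdsoon.so` is flagged by the substring test but not by the exact-name test, which is the case described above.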
------------- PR Comment: https://git.openjdk.org/jdk/pull/28102#issuecomment-3485695366 From eosterlund at openjdk.org Tue Nov 4 12:23:06 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 4 Nov 2025 12:23:06 GMT Subject: RFR: 8367319: Add os interfaces to get machine and container values separately [v2] In-Reply-To: References: Message-ID: On Mon, 6 Oct 2025 14:48:29 GMT, Casper Norrbin wrote: >> Hi everyone, >> >> The current `os::` layer on Linux hides whether the JVM is running inside a container or not. When running inside a container, we replace machine values with container values where applicable, without telling the user of these methods. For most use cases, this is fine; users only care about the returned value. But for other use cases, it matters where the value originated. Two examples: >> >> - A user might need the physical CPU count of the machine, but `os::active_processor_count()` only returns the limited container value, which also represents something slightly different. >> - A user might want the container memory limit and the physical RAM size, but `os::physical_memory()` only gives one number. >> >> To solve this, every function that mixed container/machine values now has two explicit variants, prefixed with `machine_` and `container_`. These use the bool return + out-parameter interface, with the container functions only working on Linux. The original methods remain and continue to return the same mixed values. >> >> In addition, container-specific accessors for the memory soft limit and the memory throttle limit have been added, as these values matter when running in a containerized environment. >> >> `OSContainer::active_processor_count()` has also been changed to return `double` instead of `int`. The previous implementation rounded the quota/period ratio up to produce an integer for `os::active_processor_count()`.
Now, when the value is requested directly from the new container API it makes more sense to preserve this fraction rather than rounding it up. We can thus keep the exact value for those that want it, then round it up to keep the same behavior in `os::active_processor_count()`. >> >> Testing: >> - Oracle tiers 1-5 >> - Container tests on cgroup v1 and v2 hosts. > > Casper Norrbin has updated the pull request incrementally with one additional commit since the last revision: > > Fixed print type Looks good to me. ------------- Marked as reviewed by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27646#pullrequestreview-3416160902 From phubner at openjdk.org Tue Nov 4 12:50:36 2025 From: phubner at openjdk.org (Paul Hübner) Date: Tue, 4 Nov 2025 12:50:36 GMT Subject: RFR: 8371093: Assert "section header string table should be loaded" failed on debug VM In-Reply-To: <9q6sEvrBzhWLGSnGHWkfZrKcfFea-mQ3MZknTI4-RAU=.16b44786-4e12-4175-8963-196315f7cfba@github.com> References: <9q6sEvrBzhWLGSnGHWkfZrKcfFea-mQ3MZknTI4-RAU=.16b44786-4e12-4175-8963-196315f7cfba@github.com> Message-ID: On Tue, 4 Nov 2025 12:18:16 GMT, Yasumasa Suenaga wrote: > > would it be worth to add an else clause to your if, in which we assert that filepath does not contain vdso or linux-gate.so? > > It would be a false positive if "vdso" appears in an unrelated path such as "libfoo-vdsoon.so". `linux-gate.so` is for platforms that OpenJDK does not support, e.g. IA64, SH. Yes, it would be a false positive. If I understand correctly, right now we error due to an assertion. So after your patch, if someone (for whatever reason) has a different file name, we would error anyway in the same way we error now, via the same assertion, right? If we error anyway, I feel like it could be marginally more helpful to make the error location/message as precise as possible. That said, I don't know this area or vDSO very well, so I'm not sure if my proposal is very canonical.
I'm just brainstorming from the perspective of someone who wants to make debugging as straightforward as possible. I'm happy to approve the change as-is. What do you think? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28102#issuecomment-3485834347 From shade at openjdk.org Tue Nov 4 13:15:22 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 4 Nov 2025 13:15:22 GMT Subject: RFR: 8334866: Improve Speed of ElfDecoder source search [v8] In-Reply-To: References: Message-ID: On Thu, 30 Oct 2025 11:46:11 GMT, Kerem Kat wrote: >> Right now, looking up source file and line number info is slow because we do a full linear scan of the `.debug_aranges` section for every single call. This can be a major bottleneck on large binaries, especially during frequent native stack walking, e.g. while writing an hs_err. >> >> This change fixes that by caching the address ranges on the first lookup, and keeping it in memory for the lifetime of the `DwarfFile` object. >> >> All subsequent lookups on that object now use a binary search instead of the slow linear scan. If caching fails for any reason, it just falls back to the old method. > > Kerem Kat has updated the pull request incrementally with three additional commits since the last revision: > > - Revert "Erase obsolete comments" > > This reverts commit 860b6ee6faeb6e56e292abef1c85faad456729e2. > - fix space > - Erase obsolete comments Looks reasonable. A few questions/nits: src/hotspot/share/utilities/elfFile.cpp line 730: > 728: // Assume ~3% of the .debug_aranges is DebugArangesSetHeader and the rest is made up of AddressDescriptors. 
> 729: const uintptr_t estimatedSetHeaderSize = _size_bytes / 32; > 730: const size_t initial_capacity = (_size_bytes - estimatedSetHeaderSize) / sizeof(AddressDescriptor); Suggestion: const uintptr_t estimated_set_header_size = _size_bytes / 32; const size_t initial_capacity = (_size_bytes - estimated_set_header_size) / sizeof(AddressDescriptor); src/hotspot/share/utilities/elfFile.cpp line 931: > 929: bool DwarfFile::ArangesCache::add_entry(const AddressDescriptor& descriptor, uint32_t debug_info_offset) { > 930: if (_count >= _capacity && !grow()) { > 931: destroy(true); Why does it call `destroy` here? Can you call destroy multiple times? I wonder if it would be cleaner to move `destroy` to the caller, seeing how it already handles a similar failure. ------------- PR Review: https://git.openjdk.org/jdk/pull/27337#pullrequestreview-3416394946 PR Review Comment: https://git.openjdk.org/jdk/pull/27337#discussion_r2490465475 PR Review Comment: https://git.openjdk.org/jdk/pull/27337#discussion_r2490464716 From egahlin at openjdk.org Tue Nov 4 13:29:59 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Tue, 4 Nov 2025 13:29:59 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v4] In-Reply-To: References: Message-ID: On Wed, 29 Oct 2025 21:07:51 GMT, Mat Carter wrote: >> Add jdk.management.AOTCacheMXBean. The interface provides a single action that, when called, will cause any hosted JVM currently recording AOT information to stop recording. Existing functionality is preserved: when stopped the JVM will create the required artifacts based on the execution mode. Conveniently as the application running on the JVM has not stopped (as was previously the only way to stop recording), the application will resume execution after the artifacts have been generated. >> >> The interface will return TRUE if a recording was successfully stopped, in all other cases (not recording etc.)
will return FALSE >> >> It follows that invoking the action on a JVM that is recording, twice in succession, should (barring internal errors) produce the following two responses: >> >> TRUE >> FALSE >> >> Passes tier1 on linux (x64) and windows (x64) > > Mat Carter has updated the pull request incrementally with one additional commit since the last revision: > > Updated test based on comments > To answer your first question, we do have a diagnostic command (AOT.end_recording) and it would precede the AOT MXBean into mainline and its PR is here: #27965 Ah, I didn't know that. (I don't have a strong opinion on this, but for consistency you might want the jcmd commands for AOT recording to use the same naming convention for starting, stopping, and checking the status of a recording as JFR: JFR.start, JFR.stop, JFR.check, JFR.dump and JFR.configure, where applicable. Initially, we had start_recording, but we later shortened it to start to avoid unnecessary typing.) > The longer goal for this MXBean is to provide additional methods that would aid in monitoring (isRecording, currentRecordingLength etc.), however we decided to reduce the scope of the MXBean for main line while we continue to test the monitoring functionality in leyden/premain > > Historically the diagnostic command came after the MXBean in leyden/premain, however I decided to implement the diagnostic command with the necessary JVM hooks first to simplify review Ok.
------------- PR Comment: https://git.openjdk.org/jdk/pull/28010#issuecomment-3486004039 From coleenp at openjdk.org Tue Nov 4 13:44:54 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 4 Nov 2025 13:44:54 GMT Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v7] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 10:49:31 GMT, Johan Sjölen wrote: >> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ConstantPool.java line 482: >> >>> 480: int basePos = offs.at(bsmIndex); >>> 481: int argv = basePos + INDY_ARGV_OFFSET; >>> 482: int argc = getBootstrapMethodArgsCount(bsmIndex); >> >> Nit: Consider to make it shorter: >> `getBootstrapMethods` => `getBSMs` >> `getBootstrapMethodArgsCount` => `getBSMArgsCount` > > I'm keeping it verbose, as that's the general style of this file. Yes please, I like the full names if possible rather than abbreviations. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27198#discussion_r2490567923 From duke at openjdk.org Tue Nov 4 13:48:14 2025 From: duke at openjdk.org (Ruben) Date: Tue, 4 Nov 2025 13:48:14 GMT Subject: RFR: 8365799: AArch64: Remove trailing DMB from cmpxchgptr for LSE [v2] In-Reply-To: References: <8nJUYG6FECEghybXRFfeIsA0R9paX_AFr5IgiGO6Trs=.a3f75a8f-421d-4da8-9c53-03064a397b2b@github.com> <1rP2ebDy2PFLeEq4lYTks0cHZsvBKJsZFPtVZPsbH_g=.8aa13828-9eb6-4607-821d-9bbc7bd286a1@github.com> Message-ID: <2m6c5VL2EUrTuA3fqHTKxMa9CRPYSKsjvmxHzCcdFoM=.830507c6-be6b-4c4b-8a9e-41bfb38d5741@github.com> On Tue, 4 Nov 2025 09:30:26 GMT, Andrew Haley wrote: >Yes, do that.
Opened the ticket and corresponding PR: https://bugs.openjdk.org/browse/JDK-8371205 https://github.com/openjdk/jdk/pull/28131 ------------- PR Comment: https://git.openjdk.org/jdk/pull/26845#issuecomment-3486079877 From fbredberg at openjdk.org Tue Nov 4 13:49:23 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Tue, 4 Nov 2025 13:49:23 GMT Subject: RFR: 8367982: Unify ObjectSynchronizer and LightweightSynchronizer [v2] In-Reply-To: References: Message-ID: On Mon, 27 Oct 2025 00:04:11 GMT, David Holmes wrote: >> Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: >> >> Update after review > > src/hotspot/share/runtime/abstract_vm_version.hpp line 195: > >> 193: >> 194: // Is recursive fast locking implemented for this platform? >> 195: constexpr static bool supports_recursive_fast_locking() { return false; } > > Next cleanup: this is supported on all platforms now, so we can get rid of this migration aid. Not sure we can do that, since I don't find any implementation of recursive fast locking on ARM32. @bulasevich Any comment on this? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27915#discussion_r2490583182 From ysuenaga at openjdk.org Tue Nov 4 13:51:29 2025 From: ysuenaga at openjdk.org (Yasumasa Suenaga) Date: Tue, 4 Nov 2025 13:51:29 GMT Subject: RFR: 8371093: Assert "section header string table should be loaded" failed on debug VM In-Reply-To: References: Message-ID: On Sun, 2 Nov 2025 06:27:50 GMT, Yasumasa Suenaga wrote: > When the crash happens in the function in vDSO on Linux, native call stacks in hs_err log wouldn't be generated as following. See [hs_err log on JBS](https://bugs.openjdk.org/secure/attachment/116796/hs_err_pid4018.log) for details. 
Reproducer is also attached on JBS ([Test.java](https://bugs.openjdk.org/secure/attachment/116797/Test.java)) > > > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [linux-vdso.so.1+0xe69] > [error occurred during error reporting (printing native stack (with source info)), id 0xe0000000, Internal Error (/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536)] > > Retrying call stack printing without source information... > > [error occurred during error reporting (retry printing native stack (no source info)), id 0xb, SIGSEGV (0xb) at pc=0x00007fba8075f791] > > > When I checked back trace on GDB, it failed at `assert`. > > #12 0x00007fba7e76bd00 in report_vm_error (file=file@entry=0x7fba7fed7b40 "/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp", > line=line@entry=536, error_msg=error_msg@entry=0x7fba80019575 "assert(false) failed", > detail_fmt=detail_fmt@entry=0x7fba7fed7bf0 "section header string table should be loaded") > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/debug.cpp:196 > #13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...) > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536 > #14 ElfFile::read_debug_info (this=this@entry=0x7fba782a1650, debug_info=debug_info@entry=0x7fba7dd05150) > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:407 > > > > (gdb) f 13 > #13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...) > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536 > 536 assert(false, "section header string table should be loaded"); > > > vDSO is not a regular ELF, so it should be skipped here. vDSO is very special: it is part of the kernel, so we can't find its .so file in a lib directory (e.g. `/usr/lib64`).
The assertion mentioned in this issue is caused by this behavior: `ElfDecoder` attempts to load source information from the ELF file, but it can't because the vDSO file is not found. This should only happen for the vDSO, because regularly loaded ELFs should be located on the file system (otherwise the process wouldn't load the library, of course). The vDSO name is specified in the manpage, so we can skip vDSO only. Thus I think we do not need to add an `else` for other files named with `vdso`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28102#issuecomment-3486093770 From duke at openjdk.org Tue Nov 4 13:52:01 2025 From: duke at openjdk.org (Ruben) Date: Tue, 4 Nov 2025 13:52:01 GMT Subject: RFR: 8371205: AArch64: Remove unused cmpxchg* methods Message-ID: Since JDK-8364406, the AArch64 macroAssembler method cmpxchg_obj_header is no longer used. The method cmpxchgptr is used by cmpxchg_obj_header but not by any other method, so it can be removed alongside cmpxchg_obj_header. cmpxchgw is also unused and can be removed. ------------- Commit messages: - 8371205: AArch64: Remove unused cmpxchg* methods Changes: https://git.openjdk.org/jdk/pull/28131/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28131&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371205 Stats: 101 lines in 2 files changed: 0 ins; 101 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28131.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28131/head:pull/28131 PR: https://git.openjdk.org/jdk/pull/28131 From jsjolen at openjdk.org Tue Nov 4 14:27:57 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 4 Nov 2025 14:27:57 GMT Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v15] In-Reply-To: References: Message-ID: <-MyLNt-M1uWjNPTzmizfSNSHZjqO-MFkxJK_Dn22MQs=.68e3b8cf-10da-4c8d-a891-11a0ecf516be@github.com> > Hi, > > This is a refactoring of the way that we store the Bootstrap method attribute in the ConstantPool class.
We used to have a single `Array` which was divided into a section of `u4` offsets and a section which was the actual data. In this refactoring we make this split clearer, by actually allocating an `Array` to store the offsets in and an `Array` to store the data in. These arrays are then put into a `BSMAttributeEntries` class, which allows us to separate out the API from that of the rest of the `ConstantPool`. > > We had multiple instances of the code knowing the layout of the operands array and using this to do 'clever' ways of copying and inserting data into it. See `ConstantPool::copy_operands` and `ConstantPool::resize_operands`. I felt like we could do things in a simpler way, so I added the `start_/end_extension` protocol and added the `InsertionIterator` for this. See `ClassFileParser::parse_classfile_bootstrap_methods_attribute` for how this works. I put several relevant definitions into the inline file in hopes of encouraging the compiler to optimize these appropriately. > > For the Java SA code, I had to add a `U4Array` class. I also had to fix the vmstructs definitions, etc. > > On the whole, while this code is a bit less terse, I think it's a good API improvement and the underlying implementation of splitting up the operands array is also an improvement. > > Testing: Oracle Tier1-Tier5 has been run successfully multiple times. Before integration, I will merge with master and run these tiers again.
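As an illustration of the offsets/data split described above, here is a minimal sketch. It is not HotSpot's `BSMAttributeEntries`: the class name `BsmEntries`, its methods, and the use of `std::vector` are assumptions made for this example, and the per-entry layout (method reference, argument count, arguments) is modeled on the class-file BootstrapMethods attribute rather than taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: one array of 32-bit offsets, one array of 16-bit data.
// Entry i starts at _data[_offsets[i]] and is laid out as
//   [bootstrap method ref][argument count][arg 0][arg 1]...
class BsmEntries {
 public:
  // Appends one entry and returns its index.
  size_t add(uint16_t bsm_ref, const std::vector<uint16_t>& args) {
    assert(args.size() <= UINT16_MAX);
    _offsets.push_back(static_cast<uint32_t>(_data.size()));
    _data.push_back(bsm_ref);
    _data.push_back(static_cast<uint16_t>(args.size()));
    _data.insert(_data.end(), args.begin(), args.end());
    return _offsets.size() - 1;
  }

  size_t count() const { return _offsets.size(); }

  uint16_t bsm_ref(size_t i) const { return _data[_offsets.at(i)]; }
  uint16_t argc(size_t i) const { return _data[_offsets.at(i) + 1]; }

  // Returns the n-th argument of entry i.
  uint16_t arg(size_t i, uint16_t n) const {
    assert(n < argc(i));
    return _data[_offsets.at(i) + 2 + n];
  }

 private:
  std::vector<uint32_t> _offsets;  // start position of each entry in _data
  std::vector<uint16_t> _data;     // packed entries
};
```

Because callers go through the accessors, nothing outside the class needs to know where the offsets end and the data begins, which is the kind of layout knowledge the description says used to leak into `copy_operands` and `resize_operands`.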
Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: - It's fine to initialize the iterator with null, it's not fine to reserve an entry if it's null - Fix naming ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27198/files - new: https://git.openjdk.org/jdk/pull/27198/files/e3419823..219ef346 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27198&range=14 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27198&range=13-14 Stats: 16 lines in 3 files changed: 3 ins; 3 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/27198.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27198/head:pull/27198 PR: https://git.openjdk.org/jdk/pull/27198 From krk at openjdk.org Tue Nov 4 14:47:18 2025 From: krk at openjdk.org (Kerem Kat) Date: Tue, 4 Nov 2025 14:47:18 GMT Subject: RFR: 8334866: Improve Speed of ElfDecoder source search [v9] In-Reply-To: References: Message-ID: > Right now, looking up source file and line number info is slow because we do a full linear scan of the `.debug_aranges` section for every single call. This can be a major bottleneck on large binaries, especially during frequent native stack walking, e.g. while writing an hs_err. > > This change fixes that by caching the address ranges on the first lookup, and keeping it in memory for the lifetime of the `DwarfFile` object. > > All subsequent lookups on that object now use a binary search instead of the slow linear scan. If caching fails for any reason, it just falls back to the old method.
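The cache-then-binary-search idea described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: `RangeEntry`, `RangeCache` and their members are hypothetical names, `std::vector` stands in for whatever storage the real `ArangesCache` uses, and the ranges are assumed not to overlap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one cached address range.
struct RangeEntry {
  uint64_t begin;              // first address covered by the range
  uint64_t end;                // one past the last address covered
  uint32_t debug_info_offset;  // offset of the owning compilation unit
};

class RangeCache {
 public:
  // Collects one range; call finalize() after the last add().
  void add(uint64_t begin, uint64_t length, uint32_t debug_info_offset) {
    assert(length > 0);
    _entries.push_back({begin, begin + length, debug_info_offset});
  }

  // Sorts by start address so that find() can use a binary search.
  void finalize() {
    std::sort(_entries.begin(), _entries.end(),
              [](const RangeEntry& a, const RangeEntry& b) {
                return a.begin < b.begin;
              });
  }

  // Looks up pc in O(log n). Returns false if no cached range covers pc.
  bool find(uint64_t pc, uint32_t* debug_info_offset) const {
    // First entry that starts after pc; the candidate is the one before it.
    auto it = std::upper_bound(_entries.begin(), _entries.end(), pc,
                               [](uint64_t addr, const RangeEntry& e) {
                                 return addr < e.begin;
                               });
    if (it == _entries.begin()) {
      return false;
    }
    --it;
    if (pc >= it->end) {
      return false;
    }
    *debug_info_offset = it->debug_info_offset;
    return true;
  }

 private:
  std::vector<RangeEntry> _entries;
};
```

A caller would build the cache once on the first lookup and, if building it fails, keep using the linear scan, matching the fallback behavior the description mentions.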
Kerem Kat has updated the pull request incrementally with one additional commit since the last revision: move destroy and rename var ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27337/files - new: https://git.openjdk.org/jdk/pull/27337/files/d94025da..ab20ce3b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27337&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27337&range=07-08 Stats: 4 lines in 1 file changed: 1 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/27337.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27337/head:pull/27337 PR: https://git.openjdk.org/jdk/pull/27337 From krk at openjdk.org Tue Nov 4 14:47:21 2025 From: krk at openjdk.org (Kerem Kat) Date: Tue, 4 Nov 2025 14:47:21 GMT Subject: RFR: 8334866: Improve Speed of ElfDecoder source search [v8] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 13:09:16 GMT, Aleksey Shipilev wrote: >> Kerem Kat has updated the pull request incrementally with three additional commits since the last revision: >> >> - Revert "Erase obsolete comments" >> >> This reverts commit 860b6ee6faeb6e56e292abef1c85faad456729e2. >> - fix space >> - Erase obsolete comments > > src/hotspot/share/utilities/elfFile.cpp line 931: > >> 929: bool DwarfFile::ArangesCache::add_entry(const AddressDescriptor& descriptor, uint32_t debug_info_offset) { >> 930: if (_count >= _capacity && !grow()) { >> 931: destroy(true); > > Why it calls `destroy` here? Can you call destroy multiple times? I wonder if it would be cleaner to move `destroy` to caller, seeing how it already handles the similar failure. If the cache cannot grow, it becomes unusable as we will not use a partial cache. So we destroy the cache and fall back to the linear scan for the dwarf file. I am moving the `destroy` call into `ensure_cached` as all other destroy calls are in that method, thanks. Destroying a cache multiple times is not an error, as the heap pointer in `ArangesCache::free` is checked for null. 
On the other hand, when destroy is called, it is followed by a return in `DebugAranges::ensure_cached`. Did you have a particular scenario when `destroy` is called multiple times? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27337#discussion_r2490786974 From jvernee at openjdk.org Tue Nov 4 15:43:47 2025 From: jvernee at openjdk.org (Jorn Vernee) Date: Tue, 4 Nov 2025 15:43:47 GMT Subject: RFR: 8370344: Arbitrary Java frames on stack during scoped access [v6] In-Reply-To: References: Message-ID: <-y-NiC9PnzyWtkppcs3ffnYGeWucnYSOqrtWOChOFNs=.904d1e0f-dfb0-4427-af71-c2fad0355aba@github.com> On Mon, 3 Nov 2025 18:29:43 GMT, Jorn Vernee wrote: >> See the JBS issue for a problem description. >> >> This patch changes the shared scope closure handshake to be able to handle 'arbitrary' Java frames on the stack during a scoped memory access. >> >> For the purposes of this change, we assume that 'arbitrary' is limited to the following: >> 1. Frames added by calling the constructor of `InternalError` as a result of a faulting access to a truncated memory-mapped file (see `HandshakeState::handle_unsafe_access_error`). This is the only handshake operation (i.e. may be triggered during a scoped access) that calls back into Java. >> 2. Frames added by a JVMTI agent that calls back into Java code while handling a JVMTI event that happens during a scoped access. >> 3. Any other Java code that runs as part of the linking process. >> >> For (1), we set a flag while we create the `InternalError`. If a thread has that flag set, we know it is in the process of crashing already, so we don't have to inspect the stack at all. For (2), all bets are off, so we have to walk the entire stack. For (3), this patch switches the hard limit of 10 frames for the stack walk to instead bail out at the first frame outside of the `java.base` module. In most cases this speeds up the stack walk as well, if threads are running other code.
>> >> The test `TestSharedCloseJFR` is added for scenario (1), and the test `TestSharedCloseJvmti` is added for scenario (2). Existing tests already cover scenario (3). >> >> Testing: tier 1-4 > > Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision: > > Review comments Patricio Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/27919#issuecomment-3486638747 From jvernee at openjdk.org Tue Nov 4 15:43:50 2025 From: jvernee at openjdk.org (Jorn Vernee) Date: Tue, 4 Nov 2025 15:43:50 GMT Subject: Integrated: 8370344: Arbitrary Java frames on stack during scoped access In-Reply-To: References: Message-ID: On Tue, 21 Oct 2025 16:00:49 GMT, Jorn Vernee wrote: > See the JBS issue for a problem description. > > This patch changes the shared scope closure handshake to be able to handle 'arbitrary' Java frames on the stack during a scoped memory access. > > For the purposes of this change, we assume that 'arbitrary' is limited to the following: > 1. Frames added by calling the constructor of `InternalError` as a result of a faulting access to a truncated memory-mapped file (see `HandshakeState::handle_unsafe_access_error`). This is the only handshake operation (i.e. may be triggered during a scoped access) that calls back into Java. > 2. Frames added by a JVMTI agent that calls back into Java code while handling a JVMTI event that happens during a scoped access. > 3. Any other Java code that runs as part of the linking process. > > For (1), we set a flag while we create the `InternalError`. If a thread has that flag set, we know it is in the process of crashing already, so we don't have to inspect the stack at all. For (2), all bets are off, so we have to walk the entire stack. For (3), this patch switches the hard limit of 10 frames for the stack walk to instead bail out at the first frame outside of the `java.base` module.
In most cases this speeds up the stack walk as well, if threads are running other code. > > The test `TestSharedCloseJFR` is added for scenario (1), and the test `TestSharedCloseJvmti` is added for scenario (2). Existing tests already cover scenario (3). > > Testing: tier 1-4 This pull request has now been integrated. Changeset: a51a0bf5 Author: Jorn Vernee URL: https://git.openjdk.org/jdk/commit/a51a0bf57feaae0862fd7f3dbf305883d49781a0 Stats: 550 lines in 9 files changed: 534 ins; 5 del; 11 mod 8370344: Arbitrary Java frames on stack during scoped access Reviewed-by: pchilanomate, dholmes, liach ------------- PR: https://git.openjdk.org/jdk/pull/27919 From pchilanomate at openjdk.org Tue Nov 4 15:47:31 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 4 Nov 2025 15:47:31 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v14] In-Reply-To: References: Message-ID: > If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. > > As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. 
> > This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. > > ### Summary of implementation > > The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. > > If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). > > ### Notes > > `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::monitorenter`, was renamed to `In... 
Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: fix to JvmtiHideSingleStepping ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27802/files - new: https://git.openjdk.org/jdk/pull/27802/files/4dff05a8..55c89ad0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27802&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27802&range=12-13 Stats: 182 lines in 4 files changed: 164 ins; 5 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/27802.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27802/head:pull/27802 PR: https://git.openjdk.org/jdk/pull/27802 From pchilanomate at openjdk.org Tue Nov 4 15:47:33 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 4 Nov 2025 15:47:33 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v13] In-Reply-To: References: Message-ID: <1m9Kbr2qq1hRl4Sc4YG39hxJ0WFS5aAx-BDmiAaZ_Xk=.d9ba0f21-f06d-46b3-8b08-cf5cc3520906@github.com> On Mon, 3 Nov 2025 19:03:07 GMT, Patricio Chilano Mateo wrote: >> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. >> >> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. 
>> >> This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. >> >> ### Summary of implementation >> >> The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. >> >> If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). >> >> ### Notes >> >> `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::mon... 
> > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Suggested fix in macroAssembler_ppc.cpp

> Hello, I have run into a similar issue: https://stackoverflow.com/questions/79808508/jdk24-tomcat-start-pinned-in-virtual-thead-env
>
> I want to know whether this issue is the same as https://bugs.openjdk.org/browse/JDK-8369238 or not. How can I temporarily solve this problem?
>
In the stacktrace posted, virtual thread #157 is pinned not because of the `static synchronized` but because there are native frames in the stack due to initializing class `CoyoteOutputStream`. This PR doesn't address that pinning case. It addresses the case of virtual threads pinned waiting for a class to be initialized by another thread. Feel free to send any related questions to the loom-dev mailing list instead; it's the best place to discuss this. Thanks.
------------- PR Comment: https://git.openjdk.org/jdk/pull/27802#issuecomment-3486642618

From pchilanomate at openjdk.org Tue Nov 4 15:48:00 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 4 Nov 2025 15:48:00 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v13] In-Reply-To: References: Message-ID:
On Mon, 3 Nov 2025 19:03:07 GMT, Patricio Chilano Mateo wrote: >> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized.
>> >> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. >> >> This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. >> >> ### Summary of implementation >> >> The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. >> >> If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). >> >> ### Notes >> >> `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::mon... 
> > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Suggested fix in macroAssembler_ppc.cpp

I pushed a small fix to `JvmtiHideSingleStepping` for an issue found during pre-integration testing where we hit this assert: https://github.com/openjdk/jdk/blob/50bb92a33b32778a96b1823ff995889892bef890/src/hotspot/share/prims/jvmtiThreadState.hpp#L337
The problem is that for a preempting vthread, the `JvmtiThreadState` of the current `JavaThread` has already been rebound to the state of the carrier by the time `~JvmtiHideSingleStepping` executes. The fix is to remember the `JvmtiThreadState` originally used in the `JvmtiHideSingleStepping` constructor. The commit includes a new test that reproduces the issue.
@sspitsyn maybe you could take a look at this please?
------------- PR Comment: https://git.openjdk.org/jdk/pull/27802#issuecomment-3486666711

From duke at openjdk.org Tue Nov 4 16:31:05 2025 From: duke at openjdk.org (Nityanand Rai) Date: Tue, 4 Nov 2025 16:31:05 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v2] In-Reply-To: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: <1azA71s8sFRamYweW9fbMEv1Drl2dNVUT7qZmUPeQeU=.c9bbf5eb-fede-44ae-b767-549a74a5dd75@github.com>
> Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs
Nityanand Rai has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: - Merge branch 'openjdk:master' into macos-vm-tagging - 1. Test improved to incorporate os/zgc allocation routines 2.
Minor refactor to improve readability - remove extra whitespace - remove extra whitespace - Merge branch 'openjdk:master' into macos-vm-tagging - Covered ZGC direct mmap usage for JAVA tagging. Added unit test to validate changes - Merge branch 'openjdk:master' into macos-vm-tagging - Merge branch 'openjdk:master' into macos-vm-tagging - Merge branch 'openjdk:master' into macos-vm-tagging - Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27868/files - new: https://git.openjdk.org/jdk/pull/27868/files/1144ca92..1c8b3818 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=00-01 Stats: 73212 lines in 988 files changed: 39905 ins; 27282 del; 6025 mod Patch: https://git.openjdk.org/jdk/pull/27868.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27868/head:pull/27868 PR: https://git.openjdk.org/jdk/pull/27868 From duke at openjdk.org Tue Nov 4 16:46:28 2025 From: duke at openjdk.org (Nityanand Rai) Date: Tue, 4 Nov 2025 16:46:28 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v3] In-Reply-To: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: > Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: clean whitespaces ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27868/files - new: https://git.openjdk.org/jdk/pull/27868/files/1c8b3818..72a6e5fb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=01-02 Stats: 5 lines in 1 file changed: 0 ins; 0 del; 5 
mod Patch: https://git.openjdk.org/jdk/pull/27868.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27868/head:pull/27868 PR: https://git.openjdk.org/jdk/pull/27868

From vpaprotski at openjdk.org Tue Nov 4 16:46:36 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Tue, 4 Nov 2025 16:46:36 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements Message-ID:
- New AVX2 intrinsics are 1.6x-6.9x faster than the Java baseline
- `SignatureBench.MLDSA` is 1.2x-2.2x faster
- Note: there are no AVX2-SHA3 intrinsics yet (being reviewed in https://github.com/vpaprotsk/jdk/pull/7)
- AVX512 intrinsic improvements are 1.24x-1.5x faster than the current version
- `SignatureBench.MLDSA` is up to 5% faster, never slower

Note on intrinsic:
- The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of the NTT loop and stack spill.
- Code was refactored to allow reuse of the same assembler (as far as possible) for AVX512 and AVX2

Tests and benchmarks:
- Added a fuzz test to ensure that the Java code and the intrinsic produce exactly the same result
- Added a benchmark to measure the performance of the intrinsic itself

make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java"
make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2"
make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1"
make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1"

------------- Commit messages: - Merge remote-tracking branch 'origin/master' into avx2-ntt - add copyright, whitespace and test jtreg tags - Fixes
and comments from Anas - AVX2 and AVX512 intrinsics for MLDSA Changes: https://git.openjdk.org/jdk/pull/28136/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28136&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371259 Stats: 2247 lines in 7 files changed: 1546 ins; 257 del; 444 mod Patch: https://git.openjdk.org/jdk/pull/28136.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28136/head:pull/28136 PR: https://git.openjdk.org/jdk/pull/28136

From stefank at openjdk.org Tue Nov 4 17:06:52 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 4 Nov 2025 17:06:52 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v3] In-Reply-To: References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID:
On Tue, 4 Nov 2025 16:46:28 GMT, Nityanand Rai wrote: >> Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs > > Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: > > clean whitespaces

EOD here but got notified about this change. A few quick comments:
1) Could you fix the whitespace issues GH is complaining about?
2) There's a significant amount of duplication of these conspicuous additions. Could you create a constexpr function and use that instead?
3) It doesn't seem prudent to put the new testing code in our test_zForwarding.cpp file. Could this be in its own test file?
------------- Changes requested by stefank (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/27868#pullrequestreview-3417654140 From duke at openjdk.org Tue Nov 4 17:12:15 2025 From: duke at openjdk.org (Nityanand Rai) Date: Tue, 4 Nov 2025 17:12:15 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v4] In-Reply-To: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: > Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: more whitespace cleanup ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27868/files - new: https://git.openjdk.org/jdk/pull/27868/files/72a6e5fb..ab1a90d0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=02-03 Stats: 17 lines in 3 files changed: 0 ins; 0 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/27868.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27868/head:pull/27868 PR: https://git.openjdk.org/jdk/pull/27868 From shade at openjdk.org Tue Nov 4 17:17:47 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 4 Nov 2025 17:17:47 GMT Subject: RFR: 8334866: Improve Speed of ElfDecoder source search [v9] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 14:47:18 GMT, Kerem Kat wrote: >> Right now, looking up source file and line number info is slow because we do a full linear scan of the `.debug_aranges` section for every single call. This can be a major bottleneck on large binaries, especially during frequent native stack walking, e.g. while writing an hs_err. >> >> This change fixes that by caching the address ranges on the first lookup, and keeping it in memory for the lifetime of the `DwarfFile` object. 
>> >> All subsequent lookups on that object now use a binary search instead of the slow linear scan. If caching fails for any reason, it just falls back to the old method. > > Kerem Kat has updated the pull request incrementally with one additional commit since the last revision: > > move destroy and rename var Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/27337#pullrequestreview-3417718734 From shade at openjdk.org Tue Nov 4 17:17:49 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 4 Nov 2025 17:17:49 GMT Subject: RFR: 8334866: Improve Speed of ElfDecoder source search [v8] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 14:41:10 GMT, Kerem Kat wrote: > Did you have a particular scenario when destroy is called multiple times? No, just wondering. It was just suspicious that `add` manages global cache state (destroying it) somehow. With multiple `add`-s, one could have suspected it could destroy the cache multiple times. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27337#discussion_r2491385986 From duke at openjdk.org Tue Nov 4 17:42:25 2025 From: duke at openjdk.org (duke) Date: Tue, 4 Nov 2025 17:42:25 GMT Subject: RFR: 8334866: Improve Speed of ElfDecoder source search [v9] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 14:47:18 GMT, Kerem Kat wrote: >> Right now, looking up source file and line number info is slow because we do a full linear scan of the `.debug_aranges` section for every single call. This can be a major bottleneck on large binaries, especially during frequent native stack walking, e.g. while writing an hs_err. >> >> This change fixes that by caching the address ranges on the first lookup, and keeping it in memory for the lifetime of the `DwarfFile` object. >> >> All subsequent lookups on that object now use a binary search instead of the slow linear scan. If caching fails for any reason, it just falls back to the old method. 
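The caching idea in the quoted description can be sketched as follows. This is only an illustrative sketch, not the code from the PR: `AddressRange` and `RangeCache` are hypothetical names, the ranges are assumed not to overlap, and the fallback to the linear scan is left out.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for one .debug_aranges entry; not a HotSpot type.
struct AddressRange {
  uint64_t start;      // first address covered by the range
  uint64_t end;        // one past the last address covered
  uint32_t cu_offset;  // offset of the compilation unit that owns the range
};

// Hypothetical cache, built once and then queried many times.
class RangeCache {
 public:
  // Sort once by start address so that every later lookup can binary search.
  explicit RangeCache(std::vector<AddressRange> ranges) : _ranges(std::move(ranges)) {
    std::sort(_ranges.begin(), _ranges.end(),
              [](const AddressRange& a, const AddressRange& b) { return a.start < b.start; });
  }

  // Stores the owning compilation unit offset in *cu_offset and returns true
  // if some cached range covers pc; returns false otherwise.
  bool find(uint64_t pc, uint32_t* cu_offset) const {
    // Locate the first range that starts after pc; the only possible match
    // is the range immediately before it.
    auto it = std::upper_bound(_ranges.begin(), _ranges.end(), pc,
                               [](uint64_t addr, const AddressRange& r) { return addr < r.start; });
    if (it == _ranges.begin()) {
      return false;  // pc lies below every cached range
    }
    --it;
    if (pc >= it->end) {
      return false;  // pc falls into a gap between ranges
    }
    *cu_offset = it->cu_offset;
    return true;
  }

 private:
  std::vector<AddressRange> _ranges;
};
```

With n cached ranges, each lookup costs O(log n) comparisons instead of a scan over the whole section, at the price of keeping the sorted vector in memory for as long as the cache lives.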
> > Kerem Kat has updated the pull request incrementally with one additional commit since the last revision: > > move destroy and rename var

@krk Your change (at version ab20ce3b755f64d069f9f1f06e10bdcb7e42ca6b) is now ready to be sponsored by a Committer.
------------- PR Comment: https://git.openjdk.org/jdk/pull/27337#issuecomment-3487285662

From aph at openjdk.org Tue Nov 4 17:49:58 2025 From: aph at openjdk.org (Andrew Haley) Date: Tue, 4 Nov 2025 17:49:58 GMT Subject: RFR: 8371205: AArch64: Remove unused cmpxchg* methods In-Reply-To: References: Message-ID: <0xHF1oB8jYPfooZZqjxPT7i6VTAt6o0KN2FVpnUhUyo=.21fbcc0d-fc23-42d1-99e3-0f35d7760b63@github.com>
On Tue, 4 Nov 2025 13:41:04 GMT, Ruben wrote: > Since JDK-8364406, the AArch64 macroAssembler method cmpxchg_obj_header is no longer used. The method cmpxchgptr is used by cmpxchg_obj_header but by no other method, so it can be removed alongside cmpxchg_obj_header. > cmpxchgw is also unused and can be removed.
Trivial, obviously correct. Marked as reviewed by aph (Reviewer).
------------- Marked as reviewed by aph (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28131#pullrequestreview-3417917816 PR Review: https://git.openjdk.org/jdk/pull/28131#pullrequestreview-3417923671

From kbarrett at openjdk.org Tue Nov 4 17:49:59 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 4 Nov 2025 17:49:59 GMT Subject: RFR: 8371205: AArch64: Remove unused cmpxchg* methods In-Reply-To: References: Message-ID:
On Tue, 4 Nov 2025 13:41:04 GMT, Ruben wrote: > Since JDK-8364406, the AArch64 macroAssembler method cmpxchg_obj_header is no longer used. The method cmpxchgptr is used by cmpxchg_obj_header but by no other method, so it can be removed alongside cmpxchg_obj_header. > cmpxchgw is also unused and can be removed.
Looks good.
------------- Marked as reviewed by kbarrett (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/28131#pullrequestreview-3417927215 From duke at openjdk.org Tue Nov 4 18:24:01 2025 From: duke at openjdk.org (Nityanand Rai) Date: Tue, 4 Nov 2025 18:24:01 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v5] In-Reply-To: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: > Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: move is_memory_tagged_as_java to common to reduce duplication ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27868/files - new: https://git.openjdk.org/jdk/pull/27868/files/ab1a90d0..2322badf Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=03-04 Stats: 122 lines in 4 files changed: 43 ins; 74 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/27868.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27868/head:pull/27868 PR: https://git.openjdk.org/jdk/pull/27868 From duke at openjdk.org Tue Nov 4 19:07:32 2025 From: duke at openjdk.org (Nityanand Rai) Date: Tue, 4 Nov 2025 19:07:32 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v6] In-Reply-To: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: > Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: minor refactoring to reduce code duplication 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/27868/files - new: https://git.openjdk.org/jdk/pull/27868/files/2322badf..183927b0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=04-05 Stats: 42 lines in 3 files changed: 12 ins; 24 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/27868.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27868/head:pull/27868 PR: https://git.openjdk.org/jdk/pull/27868

From duke at openjdk.org Tue Nov 4 19:07:33 2025 From: duke at openjdk.org (Nityanand Rai) Date: Tue, 4 Nov 2025 19:07:33 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v3] In-Reply-To: References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID:
On Tue, 4 Nov 2025 17:03:56 GMT, Stefan Karlsson wrote:
> EOD here but got notified about this change. A few quick comments:
>
> 1. Could you fix the whitespace issues GH is complaining about?
Fixed.
> 2. There's a significant amount of duplication of these conspicuous additions. Could you create a constexpr function and use that instead?
Refactored code to remove duplication.
> 3. It doesn't seem prudent to put the new testing code in our test_zForwarding.cpp file. Could this be in its own test file?
I think the changes are minor: they only assert the tagging on allocation, at a point where the test is already allocating, and so do not require extra tests. Please let me know if you still think otherwise.
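------------- PR Comment: https://git.openjdk.org/jdk/pull/27868#issuecomment-3487603064

From mdoerr at openjdk.org Tue Nov 4 19:59:30 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 4 Nov 2025 19:59:30 GMT Subject: RFR: 8370344: Arbitrary Java frames on stack during scoped access [v6] In-Reply-To: <-y-NiC9PnzyWtkppcs3ffnYGeWucnYSOqrtWOChOFNs=.904d1e0f-dfb0-4427-af71-c2fad0355aba@github.com> References: <-y-NiC9PnzyWtkppcs3ffnYGeWucnYSOqrtWOChOFNs=.904d1e0f-dfb0-4427-af71-c2fad0355aba@github.com> Message-ID:
On Tue, 4 Nov 2025 15:39:12 GMT, Jorn Vernee wrote: >> Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision: >> >> Review comments Patricio > > Thanks for the reviews!

@JornVernee: The new test has failed on AIX:
[fork] FATAL ERROR in native method: Wrong object class or methodID passed to JNI call
[fork] at jdk.internal.foreign.MemorySessionImpl.checkValidStateRaw(java.base@26-internal/MemorySessionImpl.java:206)
[fork] at jdk.internal.foreign.MemorySessionImpl.checkValidState(java.base@26-internal/MemorySessionImpl.java:215)
[fork] at jdk.internal.foreign.SegmentFactories.allocateNativeInternal(java.base@26-internal/SegmentFactories.java:189)
[fork] at jdk.internal.foreign.SegmentFactories.allocateNativeSegment(java.base@26-internal/SegmentFactories.java:181)
[fork] at jdk.internal.foreign.ArenaImpl.allocate(java.base@26-internal/ArenaImpl.java:56)
[fork] at jdk.internal.foreign.ArenaImpl.allocate(java.base@26-internal/ArenaImpl.java:31)
[fork] at java.lang.foreign.SegmentAllocator.allocate(java.base@26-internal/SegmentAllocator.java:644)
[fork] at TestSharedCloseJvmti$EventDuringScopedAccessRunner.(TestSharedCloseJvmti.java:75)
Should I file a new issue?
------------- PR Comment: https://git.openjdk.org/jdk/pull/27919#issuecomment-3487797511

From sspitsyn at openjdk.org Tue Nov 4 20:45:37 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 4 Nov 2025 20:45:37 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v13] In-Reply-To: References: Message-ID: <0Z0x2IAQ1TayXTP7kAm9U3Yyx6A4rw88m7Kjgen6bAY=.15663b42-96db-444f-9e2d-2efcbe4dd94d@github.com>
On Tue, 4 Nov 2025 15:45:19 GMT, Patricio Chilano Mateo wrote: > @sspitsyn maybe you could take a look at this please?
It looks good in general. I'm still looking at it.
------------- PR Comment: https://git.openjdk.org/jdk/pull/27802#issuecomment-3487943741

From sspitsyn at openjdk.org Tue Nov 4 20:49:19 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 4 Nov 2025 20:49:19 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v14] In-Reply-To: References: Message-ID:
On Tue, 4 Nov 2025 15:47:31 GMT, Patricio Chilano Mateo wrote: >> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. >> >> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too.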
------------- PR Comment: https://git.openjdk.org/jdk/pull/27919#issuecomment-3487797511 From sspitsyn at openjdk.org Tue Nov 4 20:45:37 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 4 Nov 2025 20:45:37 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v13] In-Reply-To: References: Message-ID: <0Z0x2IAQ1TayXTP7kAm9U3Yyx6A4rw88m7Kjgen6bAY=.15663b42-96db-444f-9e2d-2efcbe4dd94d@github.com> On Tue, 4 Nov 2025 15:45:19 GMT, Patricio Chilano Mateo wrote: > @sspitsyn maybe you could take a look at this please? It looks good in general. I'm still looking at it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27802#issuecomment-3487943741 From sspitsyn at openjdk.org Tue Nov 4 20:49:19 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 4 Nov 2025 20:49:19 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v14] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 15:47:31 GMT, Patricio Chilano Mateo wrote: >> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. >> >> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. 
>> >> This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. >> >> ### Summary of implementation >> >> The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. >> >> If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). >> >> ### Notes >> >> `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::mon... 
> > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > fix to JvmtiHideSingleStepping test/hotspot/jtreg/serviceability/jvmti/vthread/SingleStepKlassInit/libSingleStepKlassInit.cpp line 38: > 36: SingleStep(jvmtiEnv *jvmti, JNIEnv* jni, jthread thread, > 37: jmethodID method, jlocation location) { > 38: } Q: Would it make sense to verify that `SingleStep` events are posted or not? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2491981248 From sspitsyn at openjdk.org Tue Nov 4 21:12:51 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 4 Nov 2025 21:12:51 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v14] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 15:47:31 GMT, Patricio Chilano Mateo wrote: >> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. >> >> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. >> >> This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. 
In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. >> >> ### Summary of implementation >> >> The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. >> >> If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). >> >> ### Notes >> >> `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::mon... > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > fix to JvmtiHideSingleStepping I've reviewed the latest incremental SVC related update. It is good and nice to have in general. Thank you for adding the test! ------------- Marked as reviewed by sspitsyn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/27802#pullrequestreview-3418644332 From fandreuzzi at openjdk.org Tue Nov 4 21:27:22 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Tue, 4 Nov 2025 21:27:22 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v4] In-Reply-To: References: Message-ID: > In this PR I introduce a new JFR event: `jdk.StringDeduplicationStatistics` > > The new event is emitted every time a deduplication cycle happens. > > Passes tier1 and tier2 (fastdebug). Francesco Andreuzzi has updated the pull request incrementally with two additional commits since the last revision: - enable - bytes to size ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28015/files - new: https://git.openjdk.org/jdk/pull/28015/files/c3c9a8db..e8644c68 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=02-03 Stats: 5 lines in 3 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/28015.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28015/head:pull/28015 PR: https://git.openjdk.org/jdk/pull/28015 From pchilanomate at openjdk.org Tue Nov 4 21:28:15 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 4 Nov 2025 21:28:15 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v14] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 21:10:03 GMT, Serguei Spitsyn wrote: > I've reviewed the latest incremental SVC related update. It is good and nice to have in general. Thank you for adding the test! > Great, thanks for the review Serguei! > test/hotspot/jtreg/serviceability/jvmti/vthread/SingleStepKlassInit/libSingleStepKlassInit.cpp line 38: > >> 36: SingleStep(jvmtiEnv *jvmti, JNIEnv* jni, jthread thread, >> 37: jmethodID method, jlocation location) { >> 38: } > > Q: Would it make sense to verify that `SingleStep` events are posted or not? 
Good idea, I added a check for it. Let me know if that works. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27802#issuecomment-3488063416 PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2492084640 From pchilanomate at openjdk.org Tue Nov 4 21:28:14 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 4 Nov 2025 21:28:14 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v15] In-Reply-To: References: Message-ID: > If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. > > As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. > > This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. 
> > ### Summary of implementation > > The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. > > If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). > > ### Notes > > `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::monitorenter`, was renamed to `In... Patricio Chilano Mateo has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: - Add check for SingleStep events - Merge branch 'master' into JDK-8369238 - fix to JvmtiHideSingleStepping - Suggested fix in macroAssembler_ppc.cpp - Improve comment and assert msg - More fixes from David's comments - Merge branch 'master' into JDK-8369238 - add const to references - Improve comment in anchor_mark_set_pd - More comments from Coleen - ... 
and 12 more: https://git.openjdk.org/jdk/compare/2f455ed1...06f85198 ------------- Changes: https://git.openjdk.org/jdk/pull/27802/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27802&range=14 Stats: 2373 lines in 102 files changed: 1928 ins; 125 del; 320 mod Patch: https://git.openjdk.org/jdk/pull/27802.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27802/head:pull/27802 PR: https://git.openjdk.org/jdk/pull/27802 From fandreuzzi at openjdk.org Tue Nov 4 21:34:45 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Tue, 4 Nov 2025 21:34:45 GMT Subject: RFR: 8037914: Add JFR event for string deduplication In-Reply-To: References: <1OO6CrVzIrUtVeqvYA5rwGSuKsrybfUJUSN0B3AS8FM=.3edb6e0a-621c-455f-8191-7eb76d669243@github.com> Message-ID: On Mon, 3 Nov 2025 16:59:55 GMT, Erik Gahlin wrote: > We typically try to avoid using "Bytes" in field names, since that information is already available in the content type. Perhaps something else could be used, newSize? Sure: 586f413571c7c0354e9663888c81113065d991bf > It doesn't sound that bad, the event could probably be enabled by default. I re-enabled the event in `default.jfc`: e8644c683a4290f0ae112b7c63a8a1fc1c85b27e > The elapsed fields, are they the total since the JVM started or from the last round? 
All fields in `EventStringDeduplicationStatistics` contain the diff since the last round: jdk.StringDeduplicationStatistics { startTime = 21:31:19.604 (2025-11-04) duration = 0.000020 ms inspected = 8424 known = 2898 shared = 1247 newStrings = 4279 newSize = 255.0 kB replaced = 0 deleted = 0 deduplicated = 3331 deduplicatedSize = 102.4 kB skippedDead = 6 skippedIncomplete = 0 skippedShared = 0 activeElapsed = 1.85 ms processElapsed = 1.85 ms idleElapsed = 191 ms resizeTableElapsed = 0 s cleanupTableElapsed = 0 s } jdk.StringDeduplicationStatistics { startTime = 21:31:19.604 (2025-11-04) duration = 0.000030 ms inspected = 1 known = 0 shared = 0 newStrings = 1 newSize = 24 bytes replaced = 0 deleted = 0 deduplicated = 0 deduplicatedSize = 0 bytes skippedDead = 0 skippedIncomplete = 0 skippedShared = 0 activeElapsed = 0.00124 ms processElapsed = 0.00100 ms idleElapsed = 0.000780 ms resizeTableElapsed = 0 s cleanupTableElapsed = 0 s } ------------- PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3488086974 From sspitsyn at openjdk.org Tue Nov 4 21:54:54 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 4 Nov 2025 21:54:54 GMT Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v7] In-Reply-To: References: Message-ID: <8Sx_Zy0yHDLxhRd-D9VdD4bsS2fnQmZd20RGlrCpsFg=.bf22f172-50de-4143-b6be-752ef899e4b8@github.com> On Tue, 4 Nov 2025 13:42:26 GMT, Coleen Phillimore wrote: >> I'm keeping it verbose, as that's the general style of this file. > > Yes please I like the full names if possible rather than abbreviations. > I'm keeping it verbose, as that's the general style of this file. Okay. Local naming consistency is important too. > Yes please I like the full names if possible rather than abbreviations. Here the problem is not about full names vs abbreviations. It is about naming inconsistency with all these `BSM` related code. There are already many places with the `BSM` abbreviation. 
But I agree it is better to maintain the local style here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27198#discussion_r2492144260 From sspitsyn at openjdk.org Tue Nov 4 22:02:52 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 4 Nov 2025 22:02:52 GMT Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v15] In-Reply-To: <-MyLNt-M1uWjNPTzmizfSNSHZjqO-MFkxJK_Dn22MQs=.68e3b8cf-10da-4c8d-a891-11a0ecf516be@github.com> References: <-MyLNt-M1uWjNPTzmizfSNSHZjqO-MFkxJK_Dn22MQs=.68e3b8cf-10da-4c8d-a891-11a0ecf516be@github.com> Message-ID: On Tue, 4 Nov 2025 14:27:57 GMT, Johan Sjölen wrote: >> Hi, >> >> This is a refactoring of the way that we store the Bootstrap method attribute in the ConstantPool class. We used to have a single `Array` which was divided into a section of `u4` offsets and a section which was the actual data. In this refactoring we make this split more clear, by actually allocating an `Array` to store the offsets in and an `Array` to store the data in. These arrays are then put into a `BSMAttributeEntries` class, which allows us to separate out the API from that of the rest of the `ConstantPool`. >> >> We had multiple instances of the code knowing the layout of the operands array and using this to do 'clever' ways of copying and inserting data into it. See `ConstantPool::copy_operands` and `ConstantPool::resize_operands`. I felt like we could do things in a simpler way, so I added the `start_/end_extension` protocol and added the `InsertionIterator` for this. See `ClassFileParser::parse_classfile_bootstrap_methods_attribute` for how this works. I put several relevant definitions into the inline file in hopes of encouraging the compiler to optimize these appropriately. >> >> For the Java SA code, I had to add a `U4Array` class. I also had to fix the vmstructs definitions, etc.
>> >> On the whole, while this code is a bit less terse, I think it's a good API improvement and the underlying implementation of splitting up the operands array is also an improvement. >> >> Testing: Oracle Tier1-Tier5 has been run successfully multiple times. Before integration, I will merge with master and run these tiers again. > > Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: > > - It's fine to initialize the iterator with null, it's not fine to reserve an entry if it's null > - Fix naming Thank you for the updates. This review became too long. I'm okay with the current state. The refactoring itself is nice to have. ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27198#pullrequestreview-3418770910 From jkratochvil at openjdk.org Tue Nov 4 22:11:09 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Tue, 4 Nov 2025 22:11:09 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v3] In-Reply-To: References: Message-ID: > With clang-20 using --with-toolchain-type=clang, resolvedFieldEntry.cpp and resolvedMethodEntry.cpp break the build with warnings like: > > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] > 49 | memset(this, 0, sizeof(*this)); > | ^ > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning > 49 | memset(this, 0, sizeof(*this)); > | ^ > | (void*) > > The patch follows the suggested fix. Jan Kratochvil has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains four additional commits since the last revision: - Merge branch 'master' into clangmemset - Revert "8361288: Fix build of JTReg: wget exited with exit code 4" This reverts commit 6e6b8f6a26f8e555f1e70544546b92bbafcae6cc. - 8361288: Fix build of JTReg: wget exited with exit code 4 - 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26098/files - new: https://git.openjdk.org/jdk/pull/26098/files/09a45c6d..3745a8af Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=01-02 Stats: 422061 lines in 6258 files changed: 284237 ins; 95996 del; 41828 mod Patch: https://git.openjdk.org/jdk/pull/26098.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26098/head:pull/26098 PR: https://git.openjdk.org/jdk/pull/26098 From sspitsyn at openjdk.org Tue Nov 4 22:36:41 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 4 Nov 2025 22:36:41 GMT Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v15] In-Reply-To: <-MyLNt-M1uWjNPTzmizfSNSHZjqO-MFkxJK_Dn22MQs=.68e3b8cf-10da-4c8d-a891-11a0ecf516be@github.com> References: <-MyLNt-M1uWjNPTzmizfSNSHZjqO-MFkxJK_Dn22MQs=.68e3b8cf-10da-4c8d-a891-11a0ecf516be@github.com> Message-ID: On Tue, 4 Nov 2025 14:27:57 GMT, Johan Sjölen wrote: >> Hi, >> >> This is a refactoring of the way that we store the Bootstrap method attribute in the ConstantPool class. We used to have a single `Array` which was divided into a section of `u4` offsets and a section which was the actual data. In this refactoring we make this split more clear, by actually allocating an `Array` to store the offsets in and an `Array` to store the data in. These arrays are then put into a `BSMAttributeEntries` class, which allows us to separate out the API from that of the rest of the `ConstantPool`.
>> >> We had multiple instances of the code knowing the layout of the operands array and using this to do 'clever' ways of copying and inserting data into it. See `ConstantPool::copy_operands` and `ConstantPool::resize_operands`. I felt like we could do things in a simpler way, so I added the `start_/end_extension` protocol and added the `InsertionIterator` for this. See `ClassFileParser::parse_classfile_bootstrap_methods_attribute` for how this works. I put several relevant definitions into the inline file in hopes of encouraging the compiler to optimize these appropriately. >> >> For the Java SA code, I had to add a `U4Array` class. I also had to fix the vmstructs definitions, etc. >> >> On the whole, while this code is a bit less terse, I think it's a good API improvement and the underlying implementation of splitting up the operands array is also an improvement. >> >> Testing: Oracle Tier1-Tier5 has been run successfully multiple times. Before integration, I will merge with master and run these tiers again. > > Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: > > - It's fine to initialize the iterator with null, it's not fine to reserve an entry if it's null > - Fix naming I'd also recommend running mach5 tier-6. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27198#issuecomment-3488242701 From sspitsyn at openjdk.org Tue Nov 4 22:41:18 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 4 Nov 2025 22:41:18 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v15] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 21:28:14 GMT, Patricio Chilano Mateo wrote: >> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier.
Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. >> >> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. >> >> This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. >> >> ### Summary of implementation >> >> The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. >> >> If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. 
When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). >> >> ### Notes >> >> `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::mon... > > Patricio Chilano Mateo has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - Add check for SingleStep events > - Merge branch 'master' into JDK-8369238 > - fix to JvmtiHideSingleStepping > - Suggested fix in macroAssembler_ppc.cpp > - Improve comment and assert msg > - More fixes from David's comments > - Merge branch 'master' into JDK-8369238 > - add const to references > - Improve comment in anchor_mark_set_pd > - More comments from Coleen > - ... and 12 more: https://git.openjdk.org/jdk/compare/2f455ed1...06f85198 Marked as reviewed by sspitsyn (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/27802#pullrequestreview-3418852425 From sspitsyn at openjdk.org Tue Nov 4 22:41:20 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 4 Nov 2025 22:41:20 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v14] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 21:23:06 GMT, Patricio Chilano Mateo wrote: >> test/hotspot/jtreg/serviceability/jvmti/vthread/SingleStepKlassInit/libSingleStepKlassInit.cpp line 38: >> >>> 36: SingleStep(jvmtiEnv *jvmti, JNIEnv* jni, jthread thread, >>> 37: jmethodID method, jlocation location) { >>> 38: } >> >> Q: Would it make sense to verify that `SingleStep` events are posted or not? > > Good idea, I added a check for it. Let me know if that works. Thanks! It is good. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27802#discussion_r2492228658 From pchilanomate at openjdk.org Tue Nov 4 23:35:55 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 4 Nov 2025 23:35:55 GMT Subject: RFR: 8369238: Allow virtual thread preemption on some common class initialization paths [v15] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 21:28:14 GMT, Patricio Chilano Mateo wrote: >> If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. >> >> As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. >> >> This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. >> >> ### Summary of implementation >> >> The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. 
This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. >> >> If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). >> >> ### Notes >> >> `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::mon... > > Patricio Chilano Mateo has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - Add check for SingleStep events > - Merge branch 'master' into JDK-8369238 > - fix to JvmtiHideSingleStepping > - Suggested fix in macroAssembler_ppc.cpp > - Improve comment and assert msg > - More fixes from David's comments > - Merge branch 'master' into JDK-8369238 > - add const to references > - Improve comment in anchor_mark_set_pd > - More comments from Coleen > - ... and 12 more: https://git.openjdk.org/jdk/compare/2f455ed1...06f85198 Thanks everyone for the reviews! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/27802#issuecomment-3488372018 From pchilanomate at openjdk.org Tue Nov 4 23:35:58 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 4 Nov 2025 23:35:58 GMT Subject: Integrated: 8369238: Allow virtual thread preemption on some common class initialization paths In-Reply-To: References: Message-ID: On Tue, 14 Oct 2025 13:42:18 GMT, Patricio Chilano Mateo wrote: > If a thread tries to initialize a class that is already being initialized by another thread, it will block until notified. Since at this blocking point there are native frames on the stack, a virtual thread cannot be unmounted and is pinned to its carrier. Besides harming scalability, this can, in some pathological cases, lead to a deadlock, for example, if the thread executing the class initialization method is blocked waiting for some unmounted virtual thread to run, but all carriers are blocked waiting for that class to be initialized. > > As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` operations can be unmounted. Since synchronization on class initialization is implemented using `ObjectLocker`, we can reuse the same mechanism to unmount virtual threads on these cases too. > > This patch adds support for unmounting virtual threads on some of the most common class initialization paths, specifically when calling `InterpreterRuntime::_new` (`new` bytecode), and `InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or `putstatic` bytecodes. In the future we might consider extending this mechanism to include initialization calls originating from native methods such as `Class.forName0`. > > ### Summary of implementation > > The ObjectLocker class was modified to not pin the continuation if we are coming from a preemptable path, which will be the case when calling `InstanceKlass::initialize_impl` from new method `InstanceKlass::initialize_preemptable`. 
This means that for these cases, a virtual thread can now be unmounted either when contending for the init_lock in the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, since the call to initialize a class includes a previous call to `link_class` which also uses `ObjectLocker` to protect concurrent calls from multiple threads, we will allow preemption there too. > > If preempted, we will throw a pre-allocated exception which will get propagated with the `TRAPS/CHECK` macros all the way back to the VM entry point. The exception will be cleared and on return back to Java the virtual thread will go through the preempt stub and unmount. When running again, at the end of the thaw call we will identify this preemption case and redo the original VM call (either `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache`). > > ### Notes > > `InterpreterRuntime::call_VM_preemptable` used previously only for `InterpreterRuntime::monitorenter`, was renamed to `In... This pull request has now been integrated. 
Changeset: c6a88155 Author: Patricio Chilano Mateo URL: https://git.openjdk.org/jdk/commit/c6a88155b519a5d0b22f6009e75a0e6388601756 Stats: 2373 lines in 102 files changed: 1928 ins; 125 del; 320 mod 8369238: Allow virtual thread preemption on some common class initialization paths Co-authored-by: Alan Bateman Co-authored-by: Fei Yang Co-authored-by: Richard Reingruber Reviewed-by: sspitsyn, dholmes, coleenp, fbredberg ------------- PR: https://git.openjdk.org/jdk/pull/27802 From dlong at openjdk.org Tue Nov 4 23:50:40 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 4 Nov 2025 23:50:40 GMT Subject: RFR: 8358957: [ubsan]: The assert in layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v4] In-Reply-To: <8F0nIHGwWbZ0Z6oRxP6rXYoS-GRQEh3-LuiCa2RGvfk=.553e56b9-93cb-4aac-b89e-4ebb2f1e2169@github.com> References: <8F0nIHGwWbZ0Z6oRxP6rXYoS-GRQEh3-LuiCa2RGvfk=.553e56b9-93cb-4aac-b89e-4ebb2f1e2169@github.com> Message-ID: On Mon, 3 Nov 2025 09:29:28 GMT, Afshin Zafari wrote: >> Avoid using loop and UB in left-shift operation as suggested by Kim's comment in the JBS-issue. >> >> Tests: >> mach5 tiers 1-5 {macosx-aarch64, linux-x64, windows-x64} x {debug, product} > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > comments and post-cond src/hotspot/share/oops/klass.hpp line 525: > 523: // So use alternate form of negation to avoid warning. > 524: uint result = candidates & (~candidates + 1); > 525: assert(((result - 1) & result) == 0, "post-condition"); Maybe use "must be power of 2" instead of "post-condition". Also, this value is never going to change. Can we make the function `constexpr`? 
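For readers following the `constexpr` suggestion above, here is a minimal standalone sketch of the same bit trick. It is not the code in klass.hpp: the names `lowest_set_bit` and `is_zero_or_power_of_2` are hypothetical, and `uint32_t` is assumed in place of HotSpot's `uint`. It only illustrates that `candidates & (~candidates + 1)` isolates the lowest set bit and that the check can be evaluated at compile time.

```cpp
#include <cstdint>

// Hypothetical helper, not the HotSpot function: isolate the lowest set bit.
// "~x + 1" is two's-complement negation spelled without unary minus, which
// avoids compiler warnings about negating an unsigned value.
constexpr uint32_t lowest_set_bit(uint32_t candidates) {
  return candidates & (~candidates + 1u);
}

// Same shape as the assert in the quoted diff. Note that this test also
// accepts 0, so "power of 2" is only implied when the input is nonzero.
constexpr bool is_zero_or_power_of_2(uint32_t v) {
  return ((v - 1u) & v) == 0u;
}

// Because both functions are constexpr, the checks run at compile time.
static_assert(lowest_set_bit(0x28u) == 0x8u, "lowest set bit of 0b101000 is 0b1000");
static_assert(lowest_set_bit(0x80000000u) == 0x80000000u, "top bit alone is returned unchanged");
static_assert(lowest_set_bit(0u) == 0u, "zero has no set bit");
static_assert(is_zero_or_power_of_2(lowest_set_bit(0xF0F0u)), "result has at most one bit set");
```

If the real function takes no runtime inputs, as the comment suggests, a `constexpr` version would let the assert become a `static_assert`; whether that fits the surrounding code in klass.hpp is for the PR author to judge.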
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27288#discussion_r2492344606 From qpzhang at openjdk.org Wed Nov 5 03:52:01 2025 From: qpzhang at openjdk.org (Patrick Zhang) Date: Wed, 5 Nov 2025 03:52:01 GMT Subject: RFR: 8365991: AArch64: Ignore BlockZeroingLowLimit when UseBlockZeroing is false [v4] In-Reply-To: References: <5Q-u2mZQot1qUYvX2QuOi2jGTxy8kb79Hb2e6d4zHx4=.6f7af4f7-1f65-43ac-a76d-ae8a4145c3b8@github.com> Message-ID: On Tue, 28 Oct 2025 12:08:01 GMT, Andrew Haley wrote: >>> > I would like to reiterate that I have no objection to the functions when the `-XX:+UseBlockZeroing` option is set, everything can keep as is. My point is that `BlockZeroingLowLimit` serves literally/specifically as a switch to control whether DC ZVA instructions are generated for clearing instances under a specified bytes size limitation, rather than for deciding between unrolling and callout. Therefore, it should NOT affect the code-gen results any longer when `-XX:-UseBlockZeroing` is set, should it? >>> >>> It does not. When `-XX:-UseBlockZeroing` is set, `BlockZeroingLowLimit` is ignored. >> >> >> zero_words does not check `UseBlockZeroing`, it directly compares `cnt` and `BlockZeroingLowLimit / BytesPerWord`. >> >> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L6198C1-L6204C16 >> >> >> address MacroAssembler::zero_words(Register base, uint64_t cnt) >> { >> assert(wordSize <= BlockZeroingLowLimit, >> "increase BlockZeroingLowLimit"); >> address result = nullptr; >> if (cnt <= (uint64_t)BlockZeroingLowLimit / BytesPerWord) { >> #ifndef PRODUCT >> >> >> In contrast, the inner stub function does so. 
>> >> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp#L669 >> >> address generate_zero_blocks() { >> Label done; >> Label base_aligned; >> >> Register base = r10, cnt = r11; >> >> __ align(CodeEntryAlignment); >> StubId stub_id = StubId::stubgen_zero_blocks_id; >> StubCodeMark mark(this, stub_id); >> address start = __ pc(); >> >> if (UseBlockZeroing) { > >> > > I would like to reiterate that I have no objection to the functions when the `-XX:+UseBlockZeroing` option is set, everything can keep as is. My point is that `BlockZeroingLowLimit` serves literally/specifically as a switch to control whether DC ZVA instructions are generated for clearing instances under a specified bytes size limitation, rather than for deciding between unrolling and callout. Therefore, it should NOT affect the code-gen results any longer when `-XX:-UseBlockZeroing` is set, should it? >> > >> > >> > It does not. When `-XX:-UseBlockZeroing` is set, `BlockZeroingLowLimit` is ignored. >> >> zero_words does not check `UseBlockZeroing`, it directly compares `cnt` and `BlockZeroingLowLimit / BytesPerWord`. > > It doesn't need to because > > > if (!UseBlockZeroing && !FLAG_IS_DEFAULT(BlockZeroingLowLimit)) { > warning("BlockZeroingLowLimit has been ignored because UseBlockZeroing is disabled"); > FLAG_SET_DEFAULT(BlockZeroingLowLimit, is_zva_enabled() ? (4 * VM_Version::zva_length()) : 256); > } > > > That is to say, if a user sets `BlockZeroingLowLimit` and `-XX:-UseBlockZeroing`, then the user's `BlockZeroingLowLimit` is, rightly, ignored. Hi @theRealAph and @adinn, please let me know if you have any additional comments on this PR, or advice to improve it. Thank you. 
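The argument quoted above can be modelled outside the VM. The sketch below is a simplified stand-in, not HotSpot code: `ZeroingFlags`, `normalize` and `takes_small_count_path` are hypothetical names, and the word size of 8 bytes is an assumption. It shows why a comparison against `BlockZeroingLowLimit` does not need its own `UseBlockZeroing` check if the limit has already been reset at startup.

```cpp
#include <cstdint>

// Simplified stand-ins for the VM flags (assumption: not the real flag machinery).
struct ZeroingFlags {
  bool     use_block_zeroing;        // models UseBlockZeroing
  bool     low_limit_is_default;     // models FLAG_IS_DEFAULT(BlockZeroingLowLimit)
  uint64_t block_zeroing_low_limit;  // models BlockZeroingLowLimit, in bytes
};

// Models the startup fix-up quoted in the thread: a user-supplied limit is
// discarded when block zeroing is disabled.
inline ZeroingFlags normalize(ZeroingFlags f, bool zva_enabled, uint64_t zva_length) {
  if (!f.use_block_zeroing && !f.low_limit_is_default) {
    f.block_zeroing_low_limit = zva_enabled ? 4 * zva_length : 256;
  }
  return f;
}

// Models only the size comparison at the top of zero_words(base, cnt).
inline bool takes_small_count_path(const ZeroingFlags& f, uint64_t cnt_words) {
  const uint64_t bytes_per_word = 8;  // assumed 64-bit word size
  return cnt_words <= f.block_zeroing_low_limit / bytes_per_word;
}
```

With `-XX:-UseBlockZeroing` and a user limit of 1 MB, `normalize` yields 512 bytes for a ZVA length of 128, so the comparison behaves exactly as it would with no user setting at all. Whether the reset value itself is the right threshold when block zeroing is off is the separate question this PR raises.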
------------- PR Comment: https://git.openjdk.org/jdk/pull/26917#issuecomment-3489152280 From duke at openjdk.org Wed Nov 5 05:08:57 2025 From: duke at openjdk.org (Zihao Lin) Date: Wed, 5 Nov 2025 05:08:57 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v8] In-Reply-To: References: Message-ID: > This patch removes the slice parameter from LoadNode::make. > > I have also removed the slice parameter from StoreNode::make. > > Mentioned in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 > > Hi team, I am new, so I'd appreciate any guidance. Thanks a lot! Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains nine additional commits since the last revision: - fix assert - add more assert - rid of access.addr().type() - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - Fix build - Fix test failed - 8344116: C2: remove slice parameter from LoadNode::make ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24258/files - new: https://git.openjdk.org/jdk/pull/24258/files/ea83736e..6d122039 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=06-07 Stats: 526337 lines in 7522 files changed: 349612 ins; 122587 del; 54138 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From jsjolen at openjdk.org Wed Nov 5 05:09:16 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Wed, 5 Nov 2025 05:09:16 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v6] In-Reply-To: References:
<3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: On Tue, 4 Nov 2025 19:07:32 GMT, Nityanand Rai wrote: >> Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of the Java process on macOS > > Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: > > minor refactoring to reduce code duplication Changes requested by jsjolen (Reviewer). test/hotspot/gtest/runtime/test_os.cpp line 744: > 742: EXPECT_TRUE(GtestUtils::is_memory_tagged_as_java(p, 1 * M)) > 743: << "JVM memory allocated via os::reserve_memory should be tagged with VM_MEMORY_JAVA on macOS"; > 744: #endif Move all of these snippets into one separate test. ------------- PR Review: https://git.openjdk.org/jdk/pull/27868#pullrequestreview-3419863275 PR Review Comment: https://git.openjdk.org/jdk/pull/27868#discussion_r2492971686 From duke at openjdk.org Wed Nov 5 05:09:05 2025 From: duke at openjdk.org (Zihao Lin) Date: Wed, 5 Nov 2025 05:09:05 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v6] In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 13:04:12 GMT, Roland Westrelin wrote: >> Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into 8344116 >> - Fix build >> - Fix test failed >> - 8344116: C2: remove slice parameter from LoadNode::make > > src/hotspot/share/gc/shared/c2/barrierSetC2.cpp line 223: > >> 221: MergeMemNode* mm = opt_access.mem(); >> 222: PhaseGVN& gvn = opt_access.gvn(); >> 223: Node* mem = mm->memory_at(gvn.C->get_alias_index(access.addr().type())); > > Can we get rid of all uses of `access.addr().type()`?
Got rid of all uses of `access.addr().type()`. > src/hotspot/share/gc/shared/c2/cardTableBarrierSetC2.cpp line 105: > >> 103: // stores. In theory we could relax the load from ctrl() to >> 104: // no_ctrl, but that doesn't buy much latitude. >> 105: Node* card_val = __ load( __ ctrl(), card_adr, TypeInt::BYTE, T_BYTE); > > We could assert that `C->get_alias_index(kit->type(card_adr)) == Compile::AliasIdxRaw`, that is, that the computed slice is the same as the hardcoded slice. Similar asserts could be added for every location where a slice/address type is removed in this patch. Sure, I added more asserts for this change. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2484816831 PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2492987998 From krk at openjdk.org Wed Nov 5 08:36:30 2025 From: krk at openjdk.org (Kerem Kat) Date: Wed, 5 Nov 2025 08:36:30 GMT Subject: Integrated: 8334866: Improve Speed of ElfDecoder source search In-Reply-To: References: Message-ID: On Wed, 17 Sep 2025 10:14:09 GMT, Kerem Kat wrote: > Right now, looking up source file and line number info is slow because we do a full linear scan of the `.debug_aranges` section for every single call. This can be a major bottleneck on large binaries, especially during frequent native stack walking, e.g. while writing an hs_err. > > This change fixes that by caching the address ranges on the first lookup, and keeping it in memory for the lifetime of the `DwarfFile` object. > > All subsequent lookups on that object now use a binary search instead of the slow linear scan. If caching fails for any reason, it just falls back to the old method. This pull request has now been integrated. 
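For readers of the archive, the caching idea in the 8334866 description above (sort the address ranges once, binary search on later lookups, fall back to a linear scan if no cache is available) can be sketched roughly as follows. This is a standalone illustration, not the HotSpot code: `AddressRange`, `RangeCache`, `linear_find` and `lookup` are hypothetical names, the sketch uses `std::vector` for brevity, and it assumes the ranges do not overlap.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for one address-range entry:
// [start, end) maps to the offset of a compilation unit.
struct AddressRange {
  uint64_t start;
  uint64_t end;        // exclusive
  uint32_t cu_offset;
};

// Hypothetical cache: built once, then queried with a binary search.
class RangeCache {
 public:
  // Sort by start address so later lookups can use binary search.
  void build(std::vector<AddressRange> ranges) {
    std::sort(ranges.begin(), ranges.end(),
              [](const AddressRange& a, const AddressRange& b) {
                return a.start < b.start;
              });
    _ranges = std::move(ranges);
    _valid = true;
  }

  bool valid() const { return _valid; }

  // O(log n): find the last range whose start is <= pc, then check its end.
  bool find(uint64_t pc, uint32_t* cu_offset) const {
    auto it = std::upper_bound(
        _ranges.begin(), _ranges.end(), pc,
        [](uint64_t addr, const AddressRange& r) { return addr < r.start; });
    if (it == _ranges.begin()) {
      return false;  // pc lies below every cached range
    }
    --it;
    if (pc < it->end) {
      *cu_offset = it->cu_offset;
      return true;
    }
    return false;
  }

 private:
  std::vector<AddressRange> _ranges;
  bool _valid = false;
};

// O(n) fallback, standing in for the old per-call scan.
bool linear_find(const std::vector<AddressRange>& ranges, uint64_t pc,
                 uint32_t* cu_offset) {
  for (const AddressRange& r : ranges) {
    if (r.start <= pc && pc < r.end) {
      *cu_offset = r.cu_offset;
      return true;
    }
  }
  return false;
}

// Use the cache when it was built successfully, otherwise fall back.
bool lookup(const RangeCache& cache, const std::vector<AddressRange>& raw,
            uint64_t pc, uint32_t* cu_offset) {
  return cache.valid() ? cache.find(pc, cu_offset)
                       : linear_find(raw, pc, cu_offset);
}
```

The one-time sort costs O(n log n), which pays off as soon as a handful of frames are resolved against the same file; the fallback keeps behaviour unchanged when the cache could not be built.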
Changeset: dddfcd03 Author: Kerem Kat Committer: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/dddfcd03aa30514d63eceff707d48bff35e93c56 Stats: 229 lines in 2 files changed: 216 ins; 4 del; 9 mod 8334866: Improve Speed of ElfDecoder source search Reviewed-by: shade, chagedorn ------------- PR: https://git.openjdk.org/jdk/pull/27337 From jsjolen at openjdk.org Wed Nov 5 08:37:22 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Wed, 5 Nov 2025 08:37:22 GMT Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v16] In-Reply-To: References: Message-ID: > Hi, > > This is a refactoring of the way that we store the Bootstrap method attribute in the ConstantPool class. We used to have a single `Array` which was divided into a section of `u4` offsets and a section which was the actual data. In this refactoring we make this split more clear, by actually allocating an `Array` to store the offsets in and an `Array` to store the data in. These arrays are then put into a `BSMAttributeEntries` class, which allows us to separate out the API from that of the rest of the `ConstantPool`. > > We had multiple instances of the code knowing the layout of the operands array and using this to do 'clever' ways of copying and inserting data into it. See `ConstantPool::copy_operands` and `ConstantPool::resize_operands`. I felt like we could do things in a simpler way, so I added the `start_/end_extension` protocol and added the `InsertionIterator` for this. See `ClassFileParser::parse_classfile_bootstrap_methods_attribute` for how this works. I put several relevant definitions into the inline file in hopes of encouraging the compiler to optimize these appropriately. > > For the Java SA code, I had to add a `U4Array` class. I also had to fix the vmstructs definitions, etc. 
> > On the whole, while this code is a bit less terse, I think it's a good API improvement and the underlying implementation of splitting up the operands array is also an improvement. > > Testing: Oracle Tier1-Tier5 has been run succesfully multiple times. Before integration, I will merge with master and run these tiers again. Johan Sj?len has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 31 additional commits since the last revision: - Merge remote-tracking branch 'openjdk/master' into operands-again - It's fine to initialize the iterator with null, it's not fine to reserve an entry if it's null - Fix naming - Serguei comments - Revert change - Some nits - Fix copyright - Move BSMAttribute BSMAttributeEntries to own header file - Merge remote-tracking branch 'origin/operands-again' into operands-again - Apply suggestions from code review Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> - ... 
and 21 more: https://git.openjdk.org/jdk/compare/1e0cd337...57f0093e ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27198/files - new: https://git.openjdk.org/jdk/pull/27198/files/219ef346..57f0093e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27198&range=15 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27198&range=14-15 Stats: 278590 lines in 3431 files changed: 200223 ins; 55822 del; 22545 mod Patch: https://git.openjdk.org/jdk/pull/27198.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27198/head:pull/27198 PR: https://git.openjdk.org/jdk/pull/27198 From haosun at openjdk.org Wed Nov 5 08:40:21 2025 From: haosun at openjdk.org (Hao Sun) Date: Wed, 5 Nov 2025 08:40:21 GMT Subject: RFR: 8371205: AArch64: Remove unused cmpxchg* methods In-Reply-To: References: Message-ID: <0Z-2cj3w01eEKIJBZEMZ7FH3Hl2ECI0lCX_fv77fO_4=.ef49a25c-d70c-4dfb-9233-eec4b633d5d9@github.com> On Tue, 4 Nov 2025 13:41:04 GMT, Ruben wrote: > Since JDK-8364406, the AArch64 macroAssembler method cmpxchg_obj_header is no longer used. The method cmpxchgptr is used only by cmpxchg_obj_header and by no other method, so it can be removed alongside cmpxchg_obj_header. > cmpxchgw is also unused and can be removed. Marked as reviewed by haosun (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28131#pullrequestreview-3420572157 From jsikstro at openjdk.org Wed Nov 5 09:32:10 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Wed, 5 Nov 2025 09:32:10 GMT Subject: RFR: 8370813: Deprecate AggressiveHeap Message-ID: <_tMyItJZEwt4YJq9EYQuDBVIs_1l4jFPXhncYIHy2TE=.337902e6-ed5d-46e5-8344-d858407acbee@github.com> Hello, This RFE deprecates the `AggressiveHeap` flag in JDK 26. Please see the CSR for specific details on why this flag is being deprecated and workarounds for users interested in keeping similar behavior in the future. 
------------- Commit messages: - 8370813: Deprecate AggressiveHeap Changes: https://git.openjdk.org/jdk/pull/28144/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28144&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8370813 Stats: 16 lines in 3 files changed: 8 ins; 6 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28144.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28144/head:pull/28144 PR: https://git.openjdk.org/jdk/pull/28144 From ayang at openjdk.org Wed Nov 5 10:16:39 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 5 Nov 2025 10:16:39 GMT Subject: RFR: 8371321: Remove unused last arg of BarrierSetAssembler::arraycopy_epilogue Message-ID: Removing effectively dead code. Test: tier1, GHA ------------- Commit messages: - remove-barrier-arg Changes: https://git.openjdk.org/jdk/pull/28146/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28146&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371321 Stats: 38 lines in 16 files changed: 0 ins; 6 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/28146.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28146/head:pull/28146 PR: https://git.openjdk.org/jdk/pull/28146 From jsikstro at openjdk.org Wed Nov 5 10:25:19 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Wed, 5 Nov 2025 10:25:19 GMT Subject: RFR: 8370843: Deprecate AlwaysActAsServerClassMachine and NeverActAsServerClassMachine Message-ID: Hello, This RFE deprecates the `AlwaysActAsServerClassMachine` and `NeverActAsServerClassMachine` flags in JDK 26. Please see the CSR for specific details on why these flags are being deprecated. 
------------- Commit messages: - 8370843: Deprecate AlwaysActAsServerClassMachine and NeverActAsServerClassMachine Changes: https://git.openjdk.org/jdk/pull/28148/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28148&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8370843 Stats: 44 lines in 3 files changed: 22 ins; 20 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28148.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28148/head:pull/28148 PR: https://git.openjdk.org/jdk/pull/28148 From fandreuzzi at openjdk.org Wed Nov 5 10:33:04 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Wed, 5 Nov 2025 10:33:04 GMT Subject: RFR: 8371321: Remove unused last arg of BarrierSetAssembler::arraycopy_epilogue In-Reply-To: References: Message-ID: <_jy1EXbNvAcvuv1R0jwOUXuE7dQX1YSLjsnM5ijyBNM=.20d22624-f6ff-4de7-8701-7ff1536ffa5b@github.com> On Wed, 5 Nov 2025 10:10:02 GMT, Albert Mingkun Yang wrote: > Removing effectively dead code. > > Test: tier1, GHA Marked as reviewed by fandreuzzi (Author). ------------- PR Review: https://git.openjdk.org/jdk/pull/28146#pullrequestreview-3421109827 From ayang at openjdk.org Wed Nov 5 10:34:03 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 5 Nov 2025 10:34:03 GMT Subject: RFR: 8370843: Deprecate AlwaysActAsServerClassMachine and NeverActAsServerClassMachine In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 10:16:40 GMT, Joel Sikström wrote: > Hello, > > This RFE deprecates the `AlwaysActAsServerClassMachine` and `NeverActAsServerClassMachine` flags in JDK 26. Please see the CSR for specific details on why these flags are being deprecated. Marked as reviewed by ayang (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/28148#pullrequestreview-3421115339 From ayang at openjdk.org Wed Nov 5 10:35:04 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 5 Nov 2025 10:35:04 GMT Subject: RFR: 8370813: Deprecate AggressiveHeap In-Reply-To: <_tMyItJZEwt4YJq9EYQuDBVIs_1l4jFPXhncYIHy2TE=.337902e6-ed5d-46e5-8344-d858407acbee@github.com> References: <_tMyItJZEwt4YJq9EYQuDBVIs_1l4jFPXhncYIHy2TE=.337902e6-ed5d-46e5-8344-d858407acbee@github.com> Message-ID: On Wed, 5 Nov 2025 09:24:51 GMT, Joel Sikström wrote: > Hello, > > This RFE deprecates the `AggressiveHeap` flag in JDK 26. Please see the CSR for specific details on why this flag is being deprecated and workarounds for users interested in keeping similar behavior in the future. Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28144#pullrequestreview-3421117544 From duke at openjdk.org Wed Nov 5 10:57:18 2025 From: duke at openjdk.org (duke) Date: Wed, 5 Nov 2025 10:57:18 GMT Subject: RFR: 8365047: Remove exception handler stub code in C2 [v10] In-Reply-To: References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> Message-ID: On Tue, 4 Nov 2025 09:48:20 GMT, Ruben wrote: >> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. >> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > > Ruben has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 15 commits: > > - Rename deoptHandlerOffsetField -> deoptHandlerEntryOffsetField > - Merge from the main branch > - Address review comments and fix a mistype > - Check for NOP and MOVK separately in NativePostCallNop > - Test for deoptimization in virtual threads > > Change-Id: I9ef51b426d34e9b44a3891f6a45307232f900e5a > - Merge from the main branch > - Address review comments > - Address review comments > - Address review comments > - The patch is contributed by @TheRealMDoerr > - ... and 5 more: https://git.openjdk.org/jdk/compare/1922c4fd...359c2f18 @ruben-arm Your change (at version 359c2f185c7add1cac98523f4325a7896e8bd3e0) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26678#issuecomment-3490519353 From duke at openjdk.org Wed Nov 5 11:58:32 2025 From: duke at openjdk.org (Ruben) Date: Wed, 5 Nov 2025 11:58:32 GMT Subject: Integrated: 8365047: Remove exception handler stub code in C2 In-Reply-To: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> Message-ID: On Thu, 7 Aug 2025 15:49:20 GMT, Ruben wrote: > The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. > > According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. This pull request has now been integrated. 
Changeset: 3e3822ad Author: Ruben Ayrapetyan Committer: Evgeny Astigeevich URL: https://git.openjdk.org/jdk/commit/3e3822ad7eadbb3d86a3b94a6bd858f8c8ef9364 Stats: 569 lines in 41 files changed: 268 ins; 216 del; 85 mod 8365047: Remove exception handler stub code in C2 Co-authored-by: Martin Doerr Reviewed-by: mdoerr, dlong, dfenacci, adinn, fyang, aph ------------- PR: https://git.openjdk.org/jdk/pull/26678 From jvernee at openjdk.org Wed Nov 5 12:17:32 2025 From: jvernee at openjdk.org (Jorn Vernee) Date: Wed, 5 Nov 2025 12:17:32 GMT Subject: RFR: 8370344: Arbitrary Java frames on stack during scoped access [v6] In-Reply-To: <-y-NiC9PnzyWtkppcs3ffnYGeWucnYSOqrtWOChOFNs=.904d1e0f-dfb0-4427-af71-c2fad0355aba@github.com> References: <-y-NiC9PnzyWtkppcs3ffnYGeWucnYSOqrtWOChOFNs=.904d1e0f-dfb0-4427-af71-c2fad0355aba@github.com> Message-ID: On Tue, 4 Nov 2025 15:39:12 GMT, Jorn Vernee wrote: >> Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision: >> >> Review comments Patricio > > Thanks for the reviews! 
> @JornVernee: The new test has failed on AIX: > > ``` > [fork] FATAL ERROR in native method: Wrong object class or methodID passed to JNI call > [fork] at jdk.internal.foreign.MemorySessionImpl.checkValidStateRaw(java.base at 26-internal/MemorySessionImpl.java:206) > [fork] at jdk.internal.foreign.MemorySessionImpl.checkValidState(java.base at 26-internal/MemorySessionImpl.java:215) > [fork] at jdk.internal.foreign.SegmentFactories.allocateNativeInternal(java.base at 26-internal/SegmentFactories.java:189) > [fork] at jdk.internal.foreign.SegmentFactories.allocateNativeSegment(java.base at 26-internal/SegmentFactories.java:181) > [fork] at jdk.internal.foreign.ArenaImpl.allocate(java.base at 26-internal/ArenaImpl.java:56) > [fork] at jdk.internal.foreign.ArenaImpl.allocate(java.base at 26-internal/ArenaImpl.java:31) > [fork] at java.lang.foreign.SegmentAllocator.allocate(java.base at 26-internal/SegmentAllocator.java:644) > [fork] at TestSharedCloseJvmti$EventDuringScopedAccessRunner.(TestSharedCloseJvmti.java:75) > ``` > > Should I file a new issue? Please file a new issue. We haven't seen this failure in our CI. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27919#issuecomment-3490884908 From coleenp at openjdk.org Wed Nov 5 12:50:26 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 5 Nov 2025 12:50:26 GMT Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v16] In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 08:37:22 GMT, Johan Sj?len wrote: >> Hi, >> >> This is a refactoring of the way that we store the Bootstrap method attribute in the ConstantPool class. We used to have a single `Array` which was divided into a section of `u4` offsets and a section which was the actual data. In this refactoring we make this split more clear, by actually allocating an `Array` to store the offsets in and an `Array` to store the data in. 
These arrays are then put into a `BSMAttributeEntries` class, which allows us to separate out the API from that of the rest of the `ConstantPool`. >> >> We had multiple instances of the code knowing the layout of the operands array and using this to do 'clever' ways of copying and inserting data into it. See `ConstantPool::copy_operands` and `ConstantPool::resize_operands`. I felt like we could do things in a simpler way, so I added the `start_/end_extension` protocol and added the `InsertionIterator` for this. See `ClassFileParser::parse_classfile_bootstrap_methods_attribute` for how this works. I put several relevant definitions into the inline file in hopes of encouraging the compiler to optimize these appropriately. >> >> For the Java SA code, I had to add a `U4Array` class. I also had to fix the vmstructs definitions, etc. >> >> On the whole, while this code is a bit less terse, I think it's a good API improvement and the underlying implementation of splitting up the operands array is also an improvement. >> >> Testing: Oracle Tier1-Tier5 has been run successfully multiple times. Before integration, I will merge with master and run these tiers again. > > Johan Sjölen has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 31 additional commits since the last revision: > > - Merge remote-tracking branch 'openjdk/master' into operands-again > - It's fine to initialize the iterator with null, it's not fine to reserve an entry if it's null > - Fix naming > - Serguei comments > - Revert change > - Some nits > - Fix copyright > - Move BSMAttribute BSMAttributeEntries to own header file > - Merge remote-tracking branch 'origin/operands-again' into operands-again > - Apply suggestions from code review > > Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > - ... 
and 21 more: https://git.openjdk.org/jdk/compare/edfdae41...57f0093e This is excellent work! I had a couple of small suggested changes, but am happy to approve it. src/hotspot/share/oops/bsmAttribute.hpp line 28: > 26: #define SHARE_OOPS_BSMATTRIBUTE_HPP > 27: > 28: #include "classfile/classLoaderData.hpp" I think you can forward declare ClassLoaderData rather than include the whole file here. ------------- PR Review: https://git.openjdk.org/jdk/pull/27198#pullrequestreview-3421758555 PR Review Comment: https://git.openjdk.org/jdk/pull/27198#discussion_r2494308893 From coleenp at openjdk.org Wed Nov 5 12:50:30 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 5 Nov 2025 12:50:30 GMT Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v12] In-Reply-To: <_0HzhdWbRBZNJvB33qf8VXRnc70eYXm7NCmb6oSEllw=.482f6b91-c612-4be7-a007-29954f0f5080@github.com> References: <_0HzhdWbRBZNJvB33qf8VXRnc70eYXm7NCmb6oSEllw=.482f6b91-c612-4be7-a007-29954f0f5080@github.com> Message-ID: <8WVVrT5cKKUY1wGnTvxzj-8FFM-dZnYtuActIQRXZUQ=.0f11587c-e06f-47ee-93e4-bd7a5e7fc16f@github.com> On Wed, 8 Oct 2025 21:09:23 GMT, Serguei Spitsyn wrote: >> Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix copyright > > src/hotspot/share/oops/bsmAttribute.inline.hpp line 34: > >> 32: _cur_array + BSMAttributeEntry::u2s_required(argc) > insert_into->bootstrap_methods()->length()) { >> 33: return nullptr; >> 34: } > > Nit: This check needs a comment. Also, I'd suggest to add a guarantee here instead of returning `nullptr`. I agree with this comment - is returning null going to crash somewhere down the line? Is this an overflow? > src/hotspot/share/oops/constantPool.hpp line 94: > >> 92: InstanceKlass* _pool_holder; // the corresponding class >> 93: >> 94: BSMAttributeEntries _bsmaentries; > > Nit: Suggestion to rename: `_bsmaentries` => `_bsm_entries`. This is a good suggestion for a minor change. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27198#discussion_r2494322202 PR Review Comment: https://git.openjdk.org/jdk/pull/27198#discussion_r2494323677 From coleenp at openjdk.org Wed Nov 5 12:50:31 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 5 Nov 2025 12:50:31 GMT Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v7] In-Reply-To: <8Sx_Zy0yHDLxhRd-D9VdD4bsS2fnQmZd20RGlrCpsFg=.bf22f172-50de-4143-b6be-752ef899e4b8@github.com> References: <8Sx_Zy0yHDLxhRd-D9VdD4bsS2fnQmZd20RGlrCpsFg=.bf22f172-50de-4143-b6be-752ef899e4b8@github.com> Message-ID: <9KAeG5WWY5hMK4Yjl0i3G9tCWreCYe-b8y-BeHnXSdY=.6c742a39-87f8-40e9-8725-be05bb5f36f4@github.com> On Tue, 4 Nov 2025 21:52:04 GMT, Serguei Spitsyn wrote: >> Yes please I like the full names if possible rather than abbreviations. > >> I'm keeping it verbose, as that's the general style of this file. > > Okay. Local naming consistency is important too. > >> Yes please I like the full names if possible rather than abbreviations. > > Here the problem is not about full names vs abbreviations. It is about naming inconsistency with all these `BSM` related code. There are already many places with the `BSM` abbreviation. But I agree it is better to maintain the local style here. The abbreviation in the middle of the word makes me find it hard to say in English in this case. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27198#discussion_r2494315140 From roland at openjdk.org Wed Nov 5 13:23:18 2025 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 5 Nov 2025 13:23:18 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v8] In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 05:08:57 GMT, Zihao Lin wrote: >> This patch remove slice parameter from LoadNode::make >> >> I have done more work which remove slice paramater from StoreNode::make. 
>> >> Mention in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 >> >> Hi team, I am new, I'd appreciate any guidance. Thank a lot! > > Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains nine additional commits since the last revision: > > - fix assert > - add more assert > - rid of access.addr().type() > - Merge branch 'openjdk:master' into 8344116 > - Merge branch 'openjdk:master' into 8344116 > - Merge branch 'openjdk:master' into 8344116 > - Fix build > - Fix test failed > - 8344116: C2: remove slice parameter from LoadNode::make Can we remove `C2AccessValuePtr` entirely and use: Node* _addr; where, currently, there's: C2AccessValuePtr& _addr; ? src/hotspot/share/opto/callnode.cpp line 1740: > 1738: Node* klass_node = in(AllocateNode::KlassNode); > 1739: Node* proto_adr = phase->transform(new AddPNode(klass_node, klass_node, phase->MakeConX(in_bytes(Klass::prototype_header_offset())))); > 1740: mark_node = LoadNode::make(*phase, control, mem, proto_adr, TypeX_X, TypeX_X->basic_type(), MemNode::unordered); We could assert that C->get_alias_index(kit->type(card_adr) == Compile::AliasIdxRaw ------------- PR Review: https://git.openjdk.org/jdk/pull/24258#pullrequestreview-3421940817 PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2494424924 From mdoerr at openjdk.org Wed Nov 5 13:30:15 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 5 Nov 2025 13:30:15 GMT Subject: RFR: 8370344: Arbitrary Java frames on stack during scoped access [v6] In-Reply-To: References: <-y-NiC9PnzyWtkppcs3ffnYGeWucnYSOqrtWOChOFNs=.904d1e0f-dfb0-4427-af71-c2fad0355aba@github.com> Message-ID: On Wed, 5 Nov 2025 12:14:38 GMT, Jorn Vernee wrote: > > @JornVernee: The new test has failed on AIX: > > ``` > > [fork] FATAL ERROR in native method: Wrong object class or methodID passed to JNI call > 
> [fork] at jdk.internal.foreign.MemorySessionImpl.checkValidStateRaw(java.base at 26-internal/MemorySessionImpl.java:206) > > [fork] at jdk.internal.foreign.MemorySessionImpl.checkValidState(java.base at 26-internal/MemorySessionImpl.java:215) > > [fork] at jdk.internal.foreign.SegmentFactories.allocateNativeInternal(java.base at 26-internal/SegmentFactories.java:189) > > [fork] at jdk.internal.foreign.SegmentFactories.allocateNativeSegment(java.base at 26-internal/SegmentFactories.java:181) > > [fork] at jdk.internal.foreign.ArenaImpl.allocate(java.base at 26-internal/ArenaImpl.java:56) > > [fork] at jdk.internal.foreign.ArenaImpl.allocate(java.base at 26-internal/ArenaImpl.java:31) > > [fork] at java.lang.foreign.SegmentAllocator.allocate(java.base at 26-internal/SegmentAllocator.java:644) > > [fork] at TestSharedCloseJvmti$EventDuringScopedAccessRunner.(TestSharedCloseJvmti.java:75) > > ``` > > > > > > > > > > > > > > > > > > > > > > > > Should I file a new issue? > > Please file a new issue. We haven't seen this failure in our CI. Filed [JDK-8371340](https://bugs.openjdk.org/browse/JDK-8371340). Is the test supposed to work on platforms other than linux? It passes on linux PPC64 (both, big and little endian). ------------- PR Comment: https://git.openjdk.org/jdk/pull/27919#issuecomment-3491224600 From duke at openjdk.org Wed Nov 5 13:55:26 2025 From: duke at openjdk.org (duke) Date: Wed, 5 Nov 2025 13:55:26 GMT Subject: RFR: 8371205: AArch64: Remove unused cmpxchg* methods In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 13:41:04 GMT, Ruben wrote: > Since JDK-8364406, the AArch64 macroAssembler method cmpxchg_obj_header is no longer used. The method cmpxchgptr is used by cmpxchg_obj_header however is not used by any other method so can be removed alongside cmpxchgptr. > cmpxchgw is also unused and can be removed. @ruben-arm Your change (at version d87d96222c8918f982313e84e79d48c476aa7728) is now ready to be sponsored by a Committer. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28131#issuecomment-3491336974 From duke at openjdk.org Wed Nov 5 13:59:58 2025 From: duke at openjdk.org (Ruben) Date: Wed, 5 Nov 2025 13:59:58 GMT Subject: Integrated: 8371205: AArch64: Remove unused cmpxchg* methods In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 13:41:04 GMT, Ruben wrote: > Since JDK-8364406, the AArch64 macroAssembler method cmpxchg_obj_header is no longer used. The method cmpxchgptr is used only by cmpxchg_obj_header and by no other method, so it can be removed alongside cmpxchg_obj_header. > cmpxchgw is also unused and can be removed. This pull request has now been integrated. Changeset: c9a98169 Author: Samuel Chee Committer: Fei Gao URL: https://git.openjdk.org/jdk/commit/c9a98169cb79df235316cb38a804d539044ea57e Stats: 101 lines in 2 files changed: 0 ins; 101 del; 0 mod 8371205: AArch64: Remove unused cmpxchg* methods Co-authored-by: Samuel Chee Reviewed-by: aph, kbarrett, haosun ------------- PR: https://git.openjdk.org/jdk/pull/28131 From egahlin at openjdk.org Wed Nov 5 14:18:17 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Wed, 5 Nov 2025 14:18:17 GMT Subject: RFR: 8037914: Add JFR event for string deduplication In-Reply-To: References: <1OO6CrVzIrUtVeqvYA5rwGSuKsrybfUJUSN0B3AS8FM=.3edb6e0a-621c-455f-8191-7eb76d669243@github.com> Message-ID: On Tue, 4 Nov 2025 21:32:19 GMT, Francesco Andreuzzi wrote: > > The elapsed fields, are they the total since the JVM started or from the last round? > > All fields in `EventStringDeduplicationStatistics` contain the diff since the last round: > Since the event has a duration, I wonder if the event should be called StringDeduplication, similar to Compilation or GarbageCollection? As I understand it, the event represents a round of deduplication. All other events called statistics are instantaneous events. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3491455070 From coleenp at openjdk.org Wed Nov 5 14:32:01 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 5 Nov 2025 14:32:01 GMT Subject: RFR: 8367982: Unify ObjectSynchronizer and LightweightSynchronizer [v4] In-Reply-To: References: Message-ID: On Mon, 3 Nov 2025 10:05:35 GMT, Fredrik Bredberg wrote: >> This is the last PR in a series of PRs (see: [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261)) to obsolete the LockingMode flag and related code. >> >> The main focus is to unify `ObjectSynchronizer` and `LightweightSynchronizer`. >> There used to be a number of "dispatch functions" to redirect calls depending on the setting of the `LockingMode` flag. >> Since we now only have lightweight locking, there is no longer any need for those dispatch functions, so I removed them. >> To remove the dispatch functions I renamed the corresponding lightweight functions and call them directly. >> This ultimately led me to remove "lightweight" from the function names and go back to "fast" instead, just to avoid having some with, and some without the "lightweight" part of the name. >> >> This PR also includes a small simplification of `ObjectSynchronizer::FastHashCode`. >> >> Tested tier1-7 (on supported platforms) without seeing any problems that can be traced to this code change. >> All other platforms (`arm`, `ppc`, `riscv`, `s390`) have been sanity checked using QEMU. > > Fredrik Bredberg has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains six commits: > > - Merge branch 'master' into 8367982_unify_object_and_lightweight_synchronizer > - Update two, after the review > - Update after review > - Small arm32 fix > - Small include line fix > - 8367982: Unify ObjectSynchronizer and LightweightSynchronizer I believe you've addressed all of David's comments and mine, and we can have a new PR for moving and additional work we're going to do with the ObjectMonitorTable. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27915#pullrequestreview-3422419032 From fbredberg at openjdk.org Wed Nov 5 14:39:22 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Wed, 5 Nov 2025 14:39:22 GMT Subject: RFR: 8367982: Unify ObjectSynchronizer and LightweightSynchronizer [v2] In-Reply-To: <_96H3JAsfT8uJ5oY7PEtsiX4tNZPy0VRgDRW6YjfBCo=.aa04fd8b-9520-463a-9677-a73a52244ef3@github.com> References: <_96H3JAsfT8uJ5oY7PEtsiX4tNZPy0VRgDRW6YjfBCo=.aa04fd8b-9520-463a-9677-a73a52244ef3@github.com> Message-ID: On Fri, 31 Oct 2025 13:52:53 GMT, Fredrik Bredberg wrote: >> src/hotspot/share/runtime/synchronizer.cpp line 287: >> >>> 285: _last_async_deflation_time_ns = os::javaTimeNanos(); >>> 286: >>> 287: ObjectSynchronizer::create_om_table(); >> >> The original code should effectively be inlined here: >> >> if (UseObjectMonitorTable) { >> ObjectMonitorTable::create(); >> } > > Tried to do a quick fix for this, but `ObjectMonitorTable` is not known at this point, and forward declaring it turned out to be a mess. So, since we agreed to move out `ObjectMonitorTable` to a separate file, I think this can be postponed until that file has been created. 
Created: [JDK-8371347](https://bugs.openjdk.org/browse/JDK-8371347) "Move the ObjectMonitorTable to a separate new file" >> src/hotspot/share/runtime/synchronizer.cpp line 1838: >> >>> 1836: if (!UseObjectMonitorTable) { >>> 1837: return; >>> 1838: } >> >> This should move to the caller and be replaced with an assertion at this level. Though you don't need to introduce this method at all as the caller can call `OMT::create` directly. > > Same as last answer, this is easily done once we have moved `ObjectMonitorTable` to a separate file, but not until then. Created: [JDK-8371347](https://bugs.openjdk.org/browse/JDK-8371347) "Move the ObjectMonitorTable to a separate new file" ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27915#discussion_r2494806413 PR Review Comment: https://git.openjdk.org/jdk/pull/27915#discussion_r2494808134 From fbredberg at openjdk.org Wed Nov 5 14:39:24 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Wed, 5 Nov 2025 14:39:24 GMT Subject: RFR: 8367982: Unify ObjectSynchronizer and LightweightSynchronizer [v4] In-Reply-To: References: <8Rwv4Cs35RINL9l1YVBYNZmbc6YZNE3C5lO21ACBR3c=.004cf158-a586-4bb2-b22b-81df349b1bdd@github.com> <00xzxB3fxjSbmCZCQCu_ZEClnyxq2yfPfF-9SKJXoIc=.b45bec0f-164d-4604-87e2-d69ce072533b@github.com> Message-ID: <7ZEE720IklvXdhfxVI_Bxz2IwcY_CoOwPJHtk9PZ1bw=.deca3dd9-c7d6-4fbc-865d-470cd2cf96b9@github.com> On Mon, 27 Oct 2025 12:47:56 GMT, Coleen Phillimore wrote: >> I agree on both counts: move it to a new file in a new PR. > > Follow-on cleanup would be fine for moving this. 
Created: [JDK-8371347](https://bugs.openjdk.org/browse/JDK-8371347) "Move the ObjectMonitorTable to a separate new file" ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27915#discussion_r2494797881 From stefank at openjdk.org Wed Nov 5 14:46:54 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 5 Nov 2025 14:46:54 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v6] In-Reply-To: References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: On Tue, 4 Nov 2025 19:07:32 GMT, Nityanand Rai wrote: >> Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs > > Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: > > minor refactoring to reduce code duplication > I think the changes are minor to assert the tagging on allocation while we are doing that and do not require extra tests, please let me know if you still think otherwise. I think otherwise. Please put the test somewhere else. ------------- Changes requested by stefank (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/27868#pullrequestreview-3422524897 From aboldtch at openjdk.org Wed Nov 5 15:03:57 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 5 Nov 2025 15:03:57 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v6] In-Reply-To: References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: On Tue, 4 Nov 2025 19:07:32 GMT, Nityanand Rai wrote: >> Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs > > Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: > > minor refactoring to reduce code duplication src/hotspot/os/bsd/os_bsd.hpp line 38: > 36: // Shared constant for mmap file descriptor used across BSD OS implementations > 37: static constexpr int bsd_mmap_fd = > 38: #ifdef __APPLE__ Are these defines always available (present and future)? Or should they be guarded? Suggestion: #if defined(__APPLE__) && defined(VM_MAKE_TAG) && defined(VM_MEMORY_JAVA) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27868#discussion_r2494939644 From kvn at openjdk.org Wed Nov 5 15:52:07 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 5 Nov 2025 15:52:07 GMT Subject: RFR: 8370843: Deprecate AlwaysActAsServerClassMachine and NeverActAsServerClassMachine In-Reply-To: References: Message-ID: <6hEjGUOSdQZLd59jx1J-HI5n0-54TlN469u5_2F5HBU=.00ea17df-a97f-457f-8040-fc2392d96075@github.com> On Wed, 5 Nov 2025 10:16:40 GMT, Joel Sikström wrote: > Hello, > > This RFE deprecates the `AlwaysActAsServerClassMachine` and `NeverActAsServerClassMachine` flags in JDK 26. Please see the CSR for specific details on why these flags are being deprecated. Good. @jsikstro do we have any test which use these flags? ------------- Marked as reviewed by kvn (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/28148#pullrequestreview-3422967622 PR Comment: https://git.openjdk.org/jdk/pull/28148#issuecomment-3492001894 From bulasevich at openjdk.org Wed Nov 5 16:03:41 2025 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Wed, 5 Nov 2025 16:03:41 GMT Subject: RFR: 8367982: Unify ObjectSynchronizer and LightweightSynchronizer [v2] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 13:46:22 GMT, Fredrik Bredberg wrote: >> src/hotspot/share/runtime/abstract_vm_version.hpp line 195: >> >>> 193: >>> 194: // Is recursive fast locking implemented for this platform? >>> 195: constexpr static bool supports_recursive_fast_locking() { return false; } >> >> Next cleanup: this is supported on all platforms now, so we can get rid of this migration aid. > > Not sure we can do that, since I don't find any implementation of recursive fast locking on ARM32. > @bulasevich Any comment on this? Recursive lightweight locking (JDK-8319796) was implemented for x86, AArch64, PPC64LE, RISC-V64, and S390, but not for ARM32. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27915#discussion_r2495179254 From jvernee at openjdk.org Wed Nov 5 16:11:14 2025 From: jvernee at openjdk.org (Jorn Vernee) Date: Wed, 5 Nov 2025 16:11:14 GMT Subject: RFR: 8370344: Arbitrary Java frames on stack during scoped access [v6] In-Reply-To: References: <-y-NiC9PnzyWtkppcs3ffnYGeWucnYSOqrtWOChOFNs=.904d1e0f-dfb0-4427-af71-c2fad0355aba@github.com> Message-ID: On Wed, 5 Nov 2025 12:14:38 GMT, Jorn Vernee wrote: >> Thanks for the reviews! 
> >> @JornVernee: The new test has failed on AIX: >> >> ``` >> [fork] FATAL ERROR in native method: Wrong object class or methodID passed to JNI call >> [fork] at jdk.internal.foreign.MemorySessionImpl.checkValidStateRaw(java.base at 26-internal/MemorySessionImpl.java:206) >> [fork] at jdk.internal.foreign.MemorySessionImpl.checkValidState(java.base at 26-internal/MemorySessionImpl.java:215) >> [fork] at jdk.internal.foreign.SegmentFactories.allocateNativeInternal(java.base at 26-internal/SegmentFactories.java:189) >> [fork] at jdk.internal.foreign.SegmentFactories.allocateNativeSegment(java.base at 26-internal/SegmentFactories.java:181) >> [fork] at jdk.internal.foreign.ArenaImpl.allocate(java.base at 26-internal/ArenaImpl.java:56) >> [fork] at jdk.internal.foreign.ArenaImpl.allocate(java.base at 26-internal/ArenaImpl.java:31) >> [fork] at java.lang.foreign.SegmentAllocator.allocate(java.base at 26-internal/SegmentAllocator.java:644) >> [fork] at TestSharedCloseJvmti$EventDuringScopedAccessRunner.(TestSharedCloseJvmti.java:75) >> ``` >> >> Should I file a new issue? > > Please file a new issue. We haven't seen this failure in our CI. 
> > > @JornVernee: The new test has failed on AIX: > > > ``` > > > [fork] FATAL ERROR in native method: Wrong object class or methodID passed to JNI call > > > [fork] at jdk.internal.foreign.MemorySessionImpl.checkValidStateRaw(java.base at 26-internal/MemorySessionImpl.java:206) > > > [fork] at jdk.internal.foreign.MemorySessionImpl.checkValidState(java.base at 26-internal/MemorySessionImpl.java:215) > > > [fork] at jdk.internal.foreign.SegmentFactories.allocateNativeInternal(java.base at 26-internal/SegmentFactories.java:189) > > > [fork] at jdk.internal.foreign.SegmentFactories.allocateNativeSegment(java.base at 26-internal/SegmentFactories.java:181) > > > [fork] at jdk.internal.foreign.ArenaImpl.allocate(java.base at 26-internal/ArenaImpl.java:56) > > > [fork] at jdk.internal.foreign.ArenaImpl.allocate(java.base at 26-internal/ArenaImpl.java:31) > > > [fork] at java.lang.foreign.SegmentAllocator.allocate(java.base at 26-internal/SegmentAllocator.java:644) > > > [fork] at TestSharedCloseJvmti$EventDuringScopedAccessRunner.(TestSharedCloseJvmti.java:75) > > > ``` > > > > > > Should I file a new issue? > > > > > > Please file a new issue. We haven't seen this failure in our CI. > > Filed [JDK-8371340](https://bugs.openjdk.org/browse/JDK-8371340). Is the test supposed to work on platforms other than linux? It passes on linux PPC64 (both, big and little endian). Yes, it is supposed to work on all platforms. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27919#issuecomment-3492077297 From fbredberg at openjdk.org Wed Nov 5 17:35:16 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Wed, 5 Nov 2025 17:35:16 GMT Subject: RFR: 8367982: Unify ObjectSynchronizer and LightweightSynchronizer [v5] In-Reply-To: References: Message-ID: > This is the last PR in a series of PRs (see: [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261)) to obsolete the LockingMode flag and related code.
> > The main focus is to to unify `ObjectSynchronizer` and `LightweightSynchronizer`. > There used to be a number of "dispatch functions" to redirect calls depending on the setting of the `LockingMode` flag. > Since we now only have lightweight locking, there is no longer any need for those dispatch functions, so I removed them. > To remove the dispatch functions I renamed the corresponding lightweight functions and call them directly. > This ultimately led me to remove "lightweight" from the function names and go back to "fast" instead, just to avoid having some with, and some without the "lightweight" part of the name. > > This PR also include a small simplification of `ObjectSynchronizer::FastHashCode`. > > Tested tier1-7 (on supported platforms) without seeing any problems that can be traced to this code change. > All other platforms (`arm`, `ppc`, `riscv`, `s390`) has been sanity checked using QEMU. Fredrik Bredberg has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains seven commits: - Merge branch 'master' into 8367982_unify_object_and_lightweight_synchronizer - Merge branch 'master' into 8367982_unify_object_and_lightweight_synchronizer - Update two, after the review - Update after review - Small arm32 fix - Small include line fix - 8367982: Unify ObjectSynchronizer and LightweightSynchronizer ------------- Changes: https://git.openjdk.org/jdk/pull/27915/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27915&range=04 Stats: 2981 lines in 80 files changed: 1263 ins; 1429 del; 289 mod Patch: https://git.openjdk.org/jdk/pull/27915.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27915/head:pull/27915 PR: https://git.openjdk.org/jdk/pull/27915 From mullan at openjdk.org Wed Nov 5 17:50:23 2025 From: mullan at openjdk.org (Sean Mullan) Date: Wed, 5 Nov 2025 17:50:23 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 16:38:49 GMT, Volodymyr Paprotski wrote: > - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline > - `SignatureBench.MLDSA` is 1.2x-2.2x faster > - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) > - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version > - `SignatureBench.MLDSA` is upto 5% faster, never slower > > Note on intrinsic: > - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. 
> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 > > Tests and benchmarks: > - Added a fuzz test to ensure Java and intrinsic produces exactly same result > - Added benchmark to measure the performance of intrinsic itself > > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" Nice speedup. This improvement seems worthy of a release note. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3492562302 From aboldtch at openjdk.org Wed Nov 5 19:25:58 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 5 Nov 2025 19:25:58 GMT Subject: RFR: 8371346: ZGC: Flexible heap base selection Message-ID: ZGC reserves a virtual address range for its heap with one high order bit set which is referred to as the heap base. Internally we then often represent heap addresses as offset from this heap base. Currently we select one specific heap base at the start based on MaxHeapSize and the current system properties. With instrumented builds, or custom launchers it may be that we are unable to reserve a usable address range using that heap base. Currently we just give up if this happens and exits the VM. This is problematic when using instrumented builds such as ASAN where there are certain address ranges it uses which often clash with the default ZGC heap base. 
I propose that we are more flexible when selecting the heap base, and we start as we do today at our preferred location, but are able to retry other compatible heap bases within some broader limits. The implementation will now start at the recommended or required heap base which ever is larger and try to first reserve the desired reservation size (normally 16 * MaxHeapSize). If no heap base can accommodate this desired size, it will attempt to find at least the required size and use that. On linux x86_64 we will now also probe for the heap base rather than hard coding the max heap base as we did previously. This is beneficial when there are address space restrictions (such as with ASAN), and when there are none, we only do a couple of extra system calls at most. There are some changes to the gc+init logging. The ZAddressOffsetMax is adjusted to always be a correct upper bound. And the exit path when reservation fails is clean up, so that we exit early when we know that the external virtual memory limits will prohibit the heap reservation. Performance testing show no significant differences. 
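The retry strategy described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the PR: `select_heap_base`, `Reservation`, and the `try_reserve` callback are hypothetical names, and trying candidate bases in increasing bit position is an assumption about the search order. The sketch also assumes a 64-bit `uintptr_t`.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <initializer_list>
#include <optional>

// Hypothetical result: the heap base that was chosen and the size reserved there.
struct Reservation {
  uintptr_t base;
  size_t    size;
};

// Hypothetical callback standing in for the OS-level reservation attempt.
using TryReserve = std::function<bool(uintptr_t base, size_t size)>;

// Try every candidate heap base (a single high-order bit) for the desired
// size first; only if none fits, retry every candidate with the required size.
std::optional<Reservation> select_heap_base(unsigned first_shift,
                                            unsigned max_shift,
                                            size_t desired,
                                            size_t required,
                                            const TryReserve& try_reserve) {
  for (size_t size : {desired, required}) {
    for (unsigned shift = first_shift; shift <= max_shift; shift++) {
      const uintptr_t base = uintptr_t(1) << shift;
      if (try_reserve(base, size)) {
        return Reservation{base, size};
      }
    }
  }
  return std::nullopt;  // No compatible heap base could be reserved.
}
```

With a callback that rejects the low candidates (as an instrumented build such as ASAN might), the search simply moves on to the next base instead of giving up after the first failure.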
Testing: * GHA * Running ZGC tier1-8 on Oracle supported platforms ------------- Commit messages: - Initial Test Implementation - Initial implementation flexible heap base - Constrain ZAddressOffsetMax correctly when multi-partition fails - Log reserved size correctly when multi-partition fails - Cleanup headers - Consistent types zAddress Changes: https://git.openjdk.org/jdk/pull/28161/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28161&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371346 Stats: 1234 lines in 24 files changed: 907 ins; 290 del; 37 mod Patch: https://git.openjdk.org/jdk/pull/28161.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28161/head:pull/28161 PR: https://git.openjdk.org/jdk/pull/28161 From lmesnik at openjdk.org Wed Nov 5 20:56:13 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 5 Nov 2025 20:56:13 GMT Subject: RFR: 8371367: Replace remaining JvmtiJavaThreadEventTransition with JVMTI_JAVA_THREAD_EVENT_CALLBACK_BLOCK Message-ID: The one JvmtiJavaThreadEventTransition mark should be replaced with macro. Grepped that there are not JvmtiJavaThreadEventTransition and JvmtiThreadEventTransition mark are used except vm_death. 
------------- Commit messages: - 8371367: Replace remaining JvmtiJavaThreadEventTransition with JVMTI_JAVA_THREAD_EVENT_CALLBACK_BLOCK Changes: https://git.openjdk.org/jdk/pull/28165/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28165&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371367 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28165.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28165/head:pull/28165 PR: https://git.openjdk.org/jdk/pull/28165 From pchilanomate at openjdk.org Wed Nov 5 21:05:11 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 5 Nov 2025 21:05:11 GMT Subject: RFR: 8367982: Unify ObjectSynchronizer and LightweightSynchronizer [v5] In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 17:35:16 GMT, Fredrik Bredberg wrote: >> This is the last PR in a series of PRs (see: [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261)) to obsolete the LockingMode flag and related code. >> >> The main focus is to to unify `ObjectSynchronizer` and `LightweightSynchronizer`. >> There used to be a number of "dispatch functions" to redirect calls depending on the setting of the `LockingMode` flag. >> Since we now only have lightweight locking, there is no longer any need for those dispatch functions, so I removed them. >> To remove the dispatch functions I renamed the corresponding lightweight functions and call them directly. >> This ultimately led me to remove "lightweight" from the function names and go back to "fast" instead, just to avoid having some with, and some without the "lightweight" part of the name. >> >> This PR also include a small simplification of `ObjectSynchronizer::FastHashCode`. >> >> Tested tier1-7 (on supported platforms) without seeing any problems that can be traced to this code change. >> All other platforms (`arm`, `ppc`, `riscv`, `s390`) has been sanity checked using QEMU. 
> > Fredrik Bredberg has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: > > - Merge branch 'master' into 8367982_unify_object_and_lightweight_synchronizer > - Merge branch 'master' into 8367982_unify_object_and_lightweight_synchronizer > - Update two, after the review > - Update after review > - Small arm32 fix > - Small include line fix > - 8367982: Unify ObjectSynchronizer and LightweightSynchronizer Looks good to me, thanks. ------------- Marked as reviewed by pchilanomate (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27915#pullrequestreview-3424388594 From mr at openjdk.org Wed Nov 5 21:55:02 2025 From: mr at openjdk.org (Mark Reinhold) Date: Wed, 5 Nov 2025 21:55:02 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v4] In-Reply-To: References: Message-ID: On Wed, 29 Oct 2025 21:07:51 GMT, Mat Carter wrote: >> Add jdk.management.AOTCacheMXBean. The interface provides a single action that when called will cause any hosted JVM currently recording AOT information will stop recording. Existing functionality is preserved: when stopped the JVM will create the required artifacts based on the execution mode. Conveniently as the application running on the JVM has not stopped (as was previously the only way to stop recording), the application will resume execution after the artifacts have been generated. >> >> The interface will return TRUE if a recording was successfully stopped, in all other cases (not recording etc.) will return FALSE >> >> It follows that invoking the action on a JVM that is recording, twice in succession, should (baring internal errors) produce the following two responses: >> >> TRUE >> FALSE >> >> Passes tier1 on linux (x64) and windows (x64) > > Mat Carter has updated the pull request incrementally with one additional commit since the last revision: > > Updated test based on comments Changes requested by mr (Lead). 
src/jdk.management/share/classes/jdk/management/HotSpotAOTCacheMXBean.java line 78: > 76: * specification of the corresponding JVM command-line options, please refer > 77: * to https://openjdk.org/jeps/483 and https://openjdk.org/jeps/514. > 78: * Please don't use bare URLs. Change these to ... please refer to JEPs 483 and 514. ------------- PR Review: https://git.openjdk.org/jdk/pull/28010#pullrequestreview-3424599728 PR Review Comment: https://git.openjdk.org/jdk/pull/28010#discussion_r2496294936 From jrose at openjdk.org Wed Nov 5 23:10:01 2025 From: jrose at openjdk.org (John R Rose) Date: Wed, 5 Nov 2025 23:10:01 GMT Subject: RFR: 8371104: gtests should use wrappers for <limits> and <type_traits> In-Reply-To: References: Message-ID: On Mon, 3 Nov 2025 18:01:36 GMT, Kim Barrett wrote: > Please review this trivial change, updating HotSpot gtests to include the new > cppstdlib/{limits,type_traits}.hpp wrappers instead of including the Standard > Library headers directly. > > Testing: mach5 tier1 Marked as reviewed by jrose (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28114#pullrequestreview-3424899657 From fandreuzzi at openjdk.org Thu Nov 6 01:25:00 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Thu, 6 Nov 2025 01:25:00 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v5] In-Reply-To: References: Message-ID: > In this PR I introduce a new JFR event: `jdk.StringDeduplicationStatistics` > > The new event is emitted every time a deduplication cycle happens. > > Passes tier1 and tier2 (fastdebug).
Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: no start ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28015/files - new: https://git.openjdk.org/jdk/pull/28015/files/e8644c68..3793befd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28015.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28015/head:pull/28015 PR: https://git.openjdk.org/jdk/pull/28015 From fandreuzzi at openjdk.org Thu Nov 6 01:59:41 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Thu, 6 Nov 2025 01:59:41 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v6] In-Reply-To: References: Message-ID: > In this PR I introduce a new JFR event: `jdk.StringDeduplication` > > The new event is emitted every time a deduplication cycle happens. > > Passes tier1 and tier2 (fastdebug). Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: rename. 
start/end time ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28015/files - new: https://git.openjdk.org/jdk/pull/28015/files/3793befd..090c02bc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=04-05 Stats: 12 lines in 6 files changed: 5 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/28015.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28015/head:pull/28015 PR: https://git.openjdk.org/jdk/pull/28015 From fandreuzzi at openjdk.org Thu Nov 6 01:59:42 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Thu, 6 Nov 2025 01:59:42 GMT Subject: RFR: 8037914: Add JFR event for string deduplication In-Reply-To: References: <1OO6CrVzIrUtVeqvYA5rwGSuKsrybfUJUSN0B3AS8FM=.3edb6e0a-621c-455f-8191-7eb76d669243@github.com> Message-ID: <9Ne0a7oeIyXjdA_VFbfbX52u7WRgTonCp_jI906V6DQ=.8e10796c-6160-4e08-9595-1b35491dcda0@github.com> On Wed, 5 Nov 2025 14:15:07 GMT, Erik Gahlin wrote: > > > The elapsed fields, are they the total since the JVM started or from the last round? > > > > > > All fields in `EventStringDeduplicationStatistics` contain the diff since the last round: > > Since the event has a duration, I wonder if the event should be called StringDeduplication, similar to Compilation or GarbageCollection? As I understand it, the event represents a round of deduplications. All other events called statistics are instantaneous events. Yeah this makes sense, thanks. 
See 090c02bce5ba79cff378bd48de0fc0849f532250 ------------- PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3494458707 From jpai at openjdk.org Thu Nov 6 07:03:01 2025 From: jpai at openjdk.org (Jaikiran Pai) Date: Thu, 6 Nov 2025 07:03:01 GMT Subject: RFR: 8371048: ImageFileReader::open fails to check return value of osSupport::map_memory In-Reply-To: References: Message-ID: On Fri, 31 Oct 2025 14:00:38 GMT, Justin King wrote: > Check whether `osSupport::map_memory` actually succeeded in all compilation modes, instead of crashing shortly after in non-debug builds. Ideally we should fall back to just reading the entire file into memory manually or use seek+read, but this is good enough for now to avoid crashing. Hello Justin, `jimage` code is maintained in core-libs area. I have now added that label to this PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28087#issuecomment-3495396990 From tschatzl at openjdk.org Thu Nov 6 08:07:03 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 6 Nov 2025 08:07:03 GMT Subject: RFR: 8371104: gtests should use wrappers for <limits> and <type_traits> In-Reply-To: References: Message-ID: On Mon, 3 Nov 2025 18:01:36 GMT, Kim Barrett wrote: > Please review this trivial change, updating HotSpot gtests to include the new > cppstdlib/{limits,type_traits}.hpp wrappers instead of including the Standard > Library headers directly. > > Testing: mach5 tier1 Marked as reviewed by tschatzl (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/28114#pullrequestreview-3426776962 From shade at openjdk.org Thu Nov 6 09:51:03 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 6 Nov 2025 09:51:03 GMT Subject: RFR: 8370813: Deprecate AggressiveHeap In-Reply-To: <_tMyItJZEwt4YJq9EYQuDBVIs_1l4jFPXhncYIHy2TE=.337902e6-ed5d-46e5-8344-d858407acbee@github.com> References: <_tMyItJZEwt4YJq9EYQuDBVIs_1l4jFPXhncYIHy2TE=.337902e6-ed5d-46e5-8344-d858407acbee@github.com> Message-ID: On Wed, 5 Nov 2025 09:24:51 GMT, Joel Sikström wrote: > Hello, > > This RFE deprecates the `AggressiveHeap` flag in JDK 26. Please see the CSR for specific details on why this flag is being deprecated and workarounds for users interested in keeping similar behavior in the future. Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28144#pullrequestreview-3427233045 From kbarrett at openjdk.org Thu Nov 6 10:04:04 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 6 Nov 2025 10:04:04 GMT Subject: RFR: 8371104: gtests should use wrappers for <limits> and <type_traits> In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 23:07:48 GMT, John R Rose wrote: >> Please review this trivial change, updating HotSpot gtests to include the new >> cppstdlib/{limits,type_traits}.hpp wrappers instead of including the Standard >> Library headers directly. >> >> Testing: mach5 tier1 > > Marked as reviewed by jrose (Reviewer).
Thanks for reviews @rose00 and @tschatzl ------------- PR Comment: https://git.openjdk.org/jdk/pull/28114#issuecomment-3496292862 From kbarrett at openjdk.org Thu Nov 6 10:17:31 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 6 Nov 2025 10:17:31 GMT Subject: Integrated: 8371104: gtests should use wrappers for <limits> and <type_traits> In-Reply-To: References: Message-ID: On Mon, 3 Nov 2025 18:01:36 GMT, Kim Barrett wrote: > Please review this trivial change, updating HotSpot gtests to include the new > cppstdlib/{limits,type_traits}.hpp wrappers instead of including the Standard > Library headers directly. > > Testing: mach5 tier1 This pull request has now been integrated. Changeset: 913c973c Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/913c973ca0ffdc19171a56550e8a8f03ac7f4771 Stats: 31 lines in 9 files changed: 11 ins; 20 del; 0 mod 8371104: gtests should use wrappers for <limits> and <type_traits> Reviewed-by: jrose, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/28114 From eosterlund at openjdk.org Thu Nov 6 11:38:35 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 6 Nov 2025 11:38:35 GMT Subject: RFR: 8365932: Implementation of JEP 516: Ahead-of-Time Object Caching with Any GC [v15] In-Reply-To: References: Message-ID: > This is the implementation of JEP 516: Ahead-of-Time Object Caching with Any GC. > > The current mechanism for the AOT cache to cache heap objects is by using mmap to place bytes from a file directly in the GC managed heap. This mechanism poses compatibility challenges that all GCs have to have bit by bit identical object and reference formats, as the layout decisions are offline. This has so far meant that AOT cache optimizations requiring heap objects are not available when using ZGC. This work ensures that all GCs, including ZGC, are able to use the more advanced AOT cache functionality going forward. > > This JEP introduces a new mechanism for archiving a primordial heap, without such compatibility problems.
It embraces online layouts and allocates objects one by one, linking them using the Access API, like normal objects. This way, archived objects quack like any other object to the GC, and the GC implementations are decoupled from the archiving mechanism. > > The key to doing this GC agnostic object loading is to represent object references between objects as object indices (e.g. 1, 2, 3) instead of raw pointers that we hope all GCs will recognise the same. These object indices become the key way of identifying objects. One table maps object indices to archived objects, and another table maps object indices to heap objects that have been allocated at runtime. This allows online linking of the materialized heap objects. > > The main interface to the cached heap is roots. Different components can register object roots at dump time. Each root gets assigned a root index. At runtime, requests can be made to get a reference to an object at a root index. The new implementation uses lazy materialization and concurrency. When a thread asks for a root object, it must ensure that the given root object and its transitively reachable objects are reachable. A new background thread called the AOTThread tries to perform the bulk of the work, so that the startup impact of processing the objects one by one is not impacting the bootstrapping thread. > > Since the background thread performs the bulk of the work, the archive is laid out to ensure it can run as fast as possible. > Objects are laid out in DFS pre-order over the roots in the archive, such that the object indices and the DFS traversal orders are the same. This way, the DFS traversal that the background thread is performing is the same order as linearly materializing the objects one by one in the order they are laid out in... Erik Österlund has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 31 commits: - Merge branch 'master' into 8326035_JEP_object_streaming_v6 - Comment update - Merge branch 'master' into 8326035_JEP_object_streaming_v6 - Merge branch 'master' into 8326035_JEP_object_streaming_v6 - remove include - Interned string value word accounting - Dont load all objects when JVMTI CFLH is on - Remove duplicate string dedup disabling when dumping - Accept interned strings sharing value with another string - Merge branch 'master' into 8326035_JEP_object_streaming_v6 - ... and 21 more: https://git.openjdk.org/jdk/compare/b0536f9c...afdb11ee ------------- Changes: https://git.openjdk.org/jdk/pull/27732/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27732&range=14 Stats: 8721 lines in 106 files changed: 5943 ins; 2318 del; 460 mod Patch: https://git.openjdk.org/jdk/pull/27732.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27732/head:pull/27732 PR: https://git.openjdk.org/jdk/pull/27732 From azafari at openjdk.org Thu Nov 6 12:09:06 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 6 Nov 2025 12:09:06 GMT Subject: RFR: 8358957: [ubsan]: The assert in layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v4] In-Reply-To: References: <8F0nIHGwWbZ0Z6oRxP6rXYoS-GRQEh3-LuiCa2RGvfk=.553e56b9-93cb-4aac-b89e-4ebb2f1e2169@github.com> Message-ID: On Tue, 4 Nov 2025 04:19:13 GMT, Kim Barrett wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> comments and post-cond > > src/hotspot/share/oops/klass.hpp line 514: > >> 512: } >> 513: >> 514: // Find the right-most non-zero (e.g., ...1000) bit of the diff of array-of-boolean and array-of-byte layout helpers. > > Callers don't care whether it's the rightmost bit, only that it's a single > bit. (Some callers use log2_exact to get the bit position.) 
So a more > pedantically correct description might be something like "Return a value > containing a single set bit that is in the bitset difference between the > layout helpers for array-of-boolean and array-of-byte." Comment of the function is replaced with this one. > src/hotspot/share/oops/klass.hpp line 525: > >> 523: // So use alternate form of negation to avoid warning. >> 524: uint result = candidates & (~candidates + 1); >> 525: assert(((result - 1) & result) == 0, "post-condition"); > > Use `power_of_2(result)`. For completeness, also consider checking other post-conditions - result is set > in zlh and clear in blh. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27288#discussion_r2498695917 PR Review Comment: https://git.openjdk.org/jdk/pull/27288#discussion_r2498696841 From azafari at openjdk.org Thu Nov 6 12:09:08 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 6 Nov 2025 12:09:08 GMT Subject: RFR: 8358957: [ubsan]: The assert in layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v4] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 04:26:06 GMT, Kim Barrett wrote: >> src/hotspot/share/oops/klass.hpp line 518: >> >>> 516: static int layout_helper_boolean_diffbit() { >>> 517: uint zlh = checked_cast(array_layout_helper(T_BOOLEAN)); >>> 518: uint blh = checked_cast(array_layout_helper(T_BYTE)); >> >> Use of check_cast is probably wrong. I think an alh is negative. Oops, my mistake. It probably doesn't fail currently because of [JDK-8314258](https://bugs.openjdk.org/browse/JDK-8314258). > > Note that by "my mistake" I meant it was a mistake to use `checked_cast` here. static_cast is used. 
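The expression `candidates & (~candidates + 1)` quoted earlier in this thread isolates the lowest set bit of an unsigned value without applying unary minus to an unsigned operand, and the suggested post-condition checks that exactly one bit survives. A minimal stand-alone sketch of both steps is shown below. The names `single_diff_bit` and `has_single_bit` are hypothetical stand-ins, the inputs are plain integers rather than real layout helper values, and computing `candidates` as `a & ~b` is an assumption based on the "set in zlh and clear in blh" post-condition mentioned in the review.

```cpp
#include <cstdint>

// Hypothetical stand-in for a power-of-2 check: true when exactly one bit is set.
constexpr bool has_single_bit(uint32_t v) {
  return v != 0 && (v & (v - 1)) == 0;
}

// Returns one bit that is set in `a` and clear in `b`, or 0 if there is none.
constexpr uint32_t single_diff_bit(uint32_t a, uint32_t b) {
  const uint32_t candidates = a & ~b;     // bits set in a but not in b
  return candidates & (~candidates + 1);  // two's-complement trick: keep the lowest set bit
}

// 0x58 has bits 3, 4 and 6 set and 0x08 has bit 3 set, so bit 4 is the lowest difference.
static_assert(single_diff_bit(0x58u, 0x08u) == 0x10u, "lowest differing bit");
static_assert(has_single_bit(single_diff_bit(0x58u, 0x08u)), "exactly one bit set");
```

Because these helpers are `constexpr`, the checks run at compile time. As noted in the thread, the real function can only be made `constexpr` if everything it calls is `constexpr` as well.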
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27288#discussion_r2498695108 From azafari at openjdk.org Thu Nov 6 12:09:11 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 6 Nov 2025 12:09:11 GMT Subject: RFR: 8358957: [ubsan]: The assert in layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v4] In-Reply-To: References: <8F0nIHGwWbZ0Z6oRxP6rXYoS-GRQEh3-LuiCa2RGvfk=.553e56b9-93cb-4aac-b89e-4ebb2f1e2169@github.com> Message-ID: <0Gv-2oiMM0k7lYJMZURyodUIa4wUTvzRAe361IqJOcc=.ddb98fce-788b-4d9e-88be-aed325f31f4a@github.com> On Tue, 4 Nov 2025 23:47:40 GMT, Dean Long wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> comments and post-cond > > src/hotspot/share/oops/klass.hpp line 525: > >> 523: // So use alternate form of negation to avoid warning. >> 524: uint result = candidates & (~candidates + 1); >> 525: assert(((result - 1) & result) == 0, "post-condition"); > > Maybe use "must be power of 2" instead of "post-condition". Also, this value is never going to change. Can we make the function `constexpr`? Assert's message updated. `constexpr` cannot be used unless all the called functions (alh and its descendents) are also `constexpr`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27288#discussion_r2498708371 From fbredberg at openjdk.org Thu Nov 6 12:19:20 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 6 Nov 2025 12:19:20 GMT Subject: RFR: 8367982: Unify ObjectSynchronizer and LightweightSynchronizer [v5] In-Reply-To: References: Message-ID: <0Bp5MHdJNB3D3XZhG85m_clr-urKBKJ4StjZifsDTRg=.1ecc9574-3493-4a3e-8d7f-4aba5b9495ef@github.com> On Wed, 5 Nov 2025 17:35:16 GMT, Fredrik Bredberg wrote: >> This is the last PR in a series of PRs (see: [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261)) to obsolete the LockingMode flag and related code. 
>>
>> The main focus is to unify `ObjectSynchronizer` and `LightweightSynchronizer`.
>> There used to be a number of "dispatch functions" to redirect calls depending on the setting of the `LockingMode` flag.
>> Since we now only have lightweight locking, there is no longer any need for those dispatch functions, so I removed them.
>> To remove the dispatch functions I renamed the corresponding lightweight functions and call them directly.
>> This ultimately led me to remove "lightweight" from the function names and go back to "fast" instead, just to avoid having some with, and some without the "lightweight" part of the name.
>>
>> This PR also includes a small simplification of `ObjectSynchronizer::FastHashCode`.
>>
>> Tested tier1-7 (on supported platforms) without seeing any problems that can be traced to this code change.
>> All other platforms (`arm`, `ppc`, `riscv`, `s390`) have been sanity checked using QEMU.
>
> Fredrik Bredberg has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits:
>
>  - Merge branch 'master' into 8367982_unify_object_and_lightweight_synchronizer
>  - Merge branch 'master' into 8367982_unify_object_and_lightweight_synchronizer
>  - Update two, after the review
>  - Update after review
>  - Small arm32 fix
>  - Small include line fix
>  - 8367982: Unify ObjectSynchronizer and LightweightSynchronizer

Thank you for the reviews. Now let's...

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27915#issuecomment-3496959325

From fbredberg at openjdk.org Thu Nov 6 12:19:21 2025
From: fbredberg at openjdk.org (Fredrik Bredberg)
Date: Thu, 6 Nov 2025 12:19:21 GMT
Subject: Integrated: 8367982: Unify ObjectSynchronizer and LightweightSynchronizer
In-Reply-To:
References:
Message-ID:

On Tue, 21 Oct 2025 13:11:45 GMT, Fredrik Bredberg wrote:

> This is the last PR in a series of PRs (see: [JDK-8344261](https://bugs.openjdk.org/browse/JDK-8344261)) to obsolete the LockingMode flag and related code.
>
> The main focus is to unify `ObjectSynchronizer` and `LightweightSynchronizer`.
> There used to be a number of "dispatch functions" to redirect calls depending on the setting of the `LockingMode` flag.
> Since we now only have lightweight locking, there is no longer any need for those dispatch functions, so I removed them.
> To remove the dispatch functions I renamed the corresponding lightweight functions and call them directly.
> This ultimately led me to remove "lightweight" from the function names and go back to "fast" instead, just to avoid having some with, and some without the "lightweight" part of the name.
>
> This PR also includes a small simplification of `ObjectSynchronizer::FastHashCode`.
>
> Tested tier1-7 (on supported platforms) without seeing any problems that can be traced to this code change.
> All other platforms (`arm`, `ppc`, `riscv`, `s390`) have been sanity checked using QEMU.

This pull request has now been integrated.
Changeset: 3930b1d4 Author: Fredrik Bredberg URL: https://git.openjdk.org/jdk/commit/3930b1d4ddda9d56d0fb3626421283c72f4ad7f9 Stats: 2981 lines in 80 files changed: 1263 ins; 1429 del; 289 mod 8367982: Unify ObjectSynchronizer and LightweightSynchronizer Reviewed-by: pchilanomate, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/27915 From bulasevich at openjdk.org Thu Nov 6 12:59:24 2025 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Thu, 6 Nov 2025 12:59:24 GMT Subject: RFR: 8359256: AArch64: Use SHA3 GPR intrinsic where it's faster [v2] In-Reply-To: References: <9KmPqhIXB8_05Tu4NQP7TECtTqtQKs0ZvXWxI58C1TU=.5c488724-4d57-4d4f-aa2e-bdecab6a91af@github.com> Message-ID: On Thu, 30 Oct 2025 11:29:40 GMT, Boris Ulasevich wrote: >> This change adjusts the default selection of SHA-3 intrinsics on AArch64 based on observed performance across CPUs. In our measurements, the SHA-3 SIMD path (using SHA3 instructions) is consistently faster on Apple silicon, while on Neoverse and several older cores the GPR implementation performs better. On CPUs without SHA-3 instructions, the GPR path is the only viable option and behaves as expected. >> >> Accordingly, `UseSIMDForSHA3Intrinsic` now defaults to false globally. The SIMD variant is auto-enabled only on Apple silicon; elsewhere the default remains the GPR path. >> >> _The attached raw data also includes observations about `UseFPUForSpilling`. Back in #27350 we discussed whether the option is entirely useless. While orthogonal to this change, the MessageDigests benchmark is a convenient probe of register-spilling behavior because the SHA-3 (Keccak) algorithm is highly register-hungry, which adds a significant number of spills to the generated assembly sequence. 
In the provided results, at least one CPU benefits from enabling UseFPUForSpilling, so the option seems worth keeping for now._
>>
>> **Cortex-A53 (RPi3)**
>>
>> $ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:-UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest
>> Benchmark                    (digesterName)  (length)  Cnt    Score    Error   Units
>> MessageDigests.digest              SHA3-512        64  150  345.010 ±  0.473  ops/ms
>> MessageDigests.digest              SHA3-512     16384  150    1.817 ±  0.001  ops/ms
>>
>> $ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:+UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest
>> MessageDigests.digest              SHA3-512        64  150  352.247 ±  0.279  ops/ms   +UseFPUForSpilling: +2%
>> MessageDigests.digest              SHA3-512     16384  150    1.855 ±  0.001  ops/ms   +UseFPUForSpilling: +2%
>>
>> $ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics" 2>&1 | tail -n5
>> Benchmark                    (digesterName)  (length)  Cnt    Score    Error   Units
>> MessageDigests.digest              SHA3-512        64   15  345.552 ±  0.291  ops/ms
>> MessageDigests.digest              SHA3-512     16384   15    1.818 ±  0.001  ops/ms
>> MessageDigests.getAndDigest        SHA3-512        64   15  265.744 ± 56.591  ops/ms
>> MessageD...
>
> Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision:
>
>   minor review corrections

thanks for review!
------------- PR Comment: https://git.openjdk.org/jdk/pull/27726#issuecomment-3497141039 From bulasevich at openjdk.org Thu Nov 6 12:59:26 2025 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Thu, 6 Nov 2025 12:59:26 GMT Subject: Integrated: 8359256: AArch64: Use SHA3 GPR intrinsic where it's faster In-Reply-To: <9KmPqhIXB8_05Tu4NQP7TECtTqtQKs0ZvXWxI58C1TU=.5c488724-4d57-4d4f-aa2e-bdecab6a91af@github.com> References: <9KmPqhIXB8_05Tu4NQP7TECtTqtQKs0ZvXWxI58C1TU=.5c488724-4d57-4d4f-aa2e-bdecab6a91af@github.com> Message-ID: On Thu, 9 Oct 2025 13:26:51 GMT, Boris Ulasevich wrote: > This change adjusts the default selection of SHA-3 intrinsics on AArch64 based on observed performance across CPUs. In our measurements, the SHA-3 SIMD path (using SHA3 instructions) is consistently faster on Apple silicon, while on Neoverse and several older cores the GPR implementation performs better. On CPUs without SHA-3 instructions, the GPR path is the only viable option and behaves as expected. > > Accordingly, `UseSIMDForSHA3Intrinsic` now defaults to false globally. The SIMD variant is auto-enabled only on Apple silicon; elsewhere the default remains the GPR path. > > _The attached raw data also includes observations about `UseFPUForSpilling`. Back in #27350 we discussed whether the option is entirely useless. While orthogonal to this change, the MessageDigests benchmark is a convenient probe of register-spilling behavior because the SHA-3 (Keccak) algorithm is highly register-hungry, which adds a significant number of spills to the generated assembly sequence. 
In the provided results, at least one CPU benefits from enabling UseFPUForSpilling, so the option seems worth keeping for now._
>
> **Cortex-A53 (RPi3)**
>
> $ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:-UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest
> Benchmark                    (digesterName)  (length)  Cnt    Score    Error   Units
> MessageDigests.digest              SHA3-512        64  150  345.010 ±  0.473  ops/ms
> MessageDigests.digest              SHA3-512     16384  150    1.817 ±  0.001  ops/ms
>
> $ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:+UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest
> MessageDigests.digest              SHA3-512        64  150  352.247 ±  0.279  ops/ms   +UseFPUForSpilling: +2%
> MessageDigests.digest              SHA3-512     16384  150    1.855 ±  0.001  ops/ms   +UseFPUForSpilling: +2%
>
> $ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics" 2>&1 | tail -n5
> Benchmark                    (digesterName)  (length)  Cnt    Score    Error   Units
> MessageDigests.digest              SHA3-512        64   15  345.552 ±  0.291  ops/ms
> MessageDigests.digest              SHA3-512     16384   15    1.818 ±  0.001  ops/ms
> MessageDigests.getAndDigest        SHA3-512        64   15  265.744 ± 56.591  ops/ms
> MessageDigests.getAndDigest        SHA3-512     16384    1...

This pull request has now been integrated.
Changeset: c173d416 Author: Boris Ulasevich URL: https://git.openjdk.org/jdk/commit/c173d416f749348bee42e1a9436a999700d0f0e8 Stats: 19 lines in 2 files changed: 6 ins; 0 del; 13 mod 8359256: AArch64: Use SHA3 GPR intrinsic where it's faster Reviewed-by: eastigeevich, phh ------------- PR: https://git.openjdk.org/jdk/pull/27726 From azafari at openjdk.org Thu Nov 6 13:06:41 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 6 Nov 2025 13:06:41 GMT Subject: RFR: 8358957: [ubsan]: The assert in layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v5] In-Reply-To: References: Message-ID: <2tqtSNmhY0bGDrqu06wBvUWw_bpdv311BSM-ij5EEGY=.90ef7eb5-8d88-439c-b0df-f917f3543cc2@github.com> > Avoid using loop and UB in left-shift operation as suggested by Kim's comment in the JBS-issue. > > Tests: > mach5 tiers 1-5 {macosx-aarch64, linux-x64, windows-x64} x {debug, product} Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: review comments applied ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27288/files - new: https://git.openjdk.org/jdk/pull/27288/files/8a3d6d13..cc3831fa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27288&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27288&range=03-04 Stats: 7 lines in 1 file changed: 3 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/27288.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27288/head:pull/27288 PR: https://git.openjdk.org/jdk/pull/27288 From eosterlund at openjdk.org Thu Nov 6 13:33:04 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 6 Nov 2025 13:33:04 GMT Subject: RFR: 8371346: ZGC: Flexible heap base selection In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 19:19:23 GMT, Axel Boldt-Christmas wrote: > ZGC reserves a virtual address range for its heap with one high order bit set which is referred to as the heap base. 
Internally we then often represent heap addresses as offsets from this heap base.
>
> Currently we select one specific heap base at the start based on MaxHeapSize and the current system properties.
>
> With instrumented builds or custom launchers, it may be that we are unable to reserve a usable address range using that heap base. Currently we just give up if this happens and exit the VM.
>
> This is problematic when using instrumented builds such as ASAN where there are certain address ranges it uses which often clash with the default ZGC heap base.
>
> I propose that we are more flexible when selecting the heap base, and we start as we do today at our preferred location, but are able to retry other compatible heap bases within some broader limits.
>
> The implementation will now start at the recommended or required heap base, whichever is larger, and try to first reserve the desired reservation size (normally 16 * MaxHeapSize). If no heap base can accommodate this desired size, it will attempt to find at least the required size and use that.
>
> On Linux x86_64 we will now also probe for the heap base rather than hard coding the max heap base as we did previously. This is beneficial when there are address space restrictions (such as with ASAN), and when there are none, we only do a couple of extra system calls at most.
>
> There are some changes to the gc+init logging. The ZAddressOffsetMax is adjusted to always be a correct upper bound. And the exit path when reservation fails is cleaned up, so that we exit early when we know that the external virtual memory limits will prohibit the heap reservation.
>
> Performance testing shows no significant differences.
>
> Testing:
>  * GHA
>  * Running ZGC tier1-8 on Oracle supported platforms

Looks good.

-------------

Marked as reviewed by eosterlund (Reviewer).
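
The retry strategy described in the quoted PR text can be sketched roughly as follows. This is not the ZGC implementation: `select_heap_base`, `Reservation`, the shift range, and the `try_reserve` callback are all hypothetical, and the sketch simplifies "at least the required size" to a single second pass with exactly the required size.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <initializer_list>
#include <optional>

// Hypothetical result type: the chosen heap base and the size reserved there.
struct Reservation {
  uintptr_t base;
  size_t size;
};

// Hypothetical callback: returns true if [base, base + size) could be reserved.
using TryReserve = std::function<bool(uintptr_t, size_t)>;

// Try every candidate heap base (a single high-order bit, 1 << shift),
// starting at the preferred one. The first pass asks for the desired size;
// only if no base can hold it does the second pass fall back to the
// required size.
std::optional<Reservation> select_heap_base(unsigned first_shift,
                                            unsigned last_shift,
                                            size_t desired_size,
                                            size_t required_size,
                                            const TryReserve& try_reserve) {
  for (size_t size : {desired_size, required_size}) {
    for (unsigned shift = first_shift; shift <= last_shift; shift++) {
      const uintptr_t base = uintptr_t(1) << shift;
      if (try_reserve(base, size)) {
        return Reservation{base, size};
      }
    }
  }
  return std::nullopt;  // The caller decides how to report the failure.
}
```

Keeping the reservation attempt behind a callback makes the selection order easy to test without touching real virtual memory.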

PR Review: https://git.openjdk.org/jdk/pull/28161#pullrequestreview-3428202976

From duke at openjdk.org Thu Nov 6 13:43:33 2025
From: duke at openjdk.org (Zihao Lin)
Date: Thu, 6 Nov 2025 13:43:33 GMT
Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v9]
In-Reply-To:
References:
Message-ID:

> This patch removes the slice parameter from LoadNode::make.
>
> I have also done further work that removes the slice parameter from StoreNode::make.
>
> Mentioned in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805
>
> Hi team, I am new, I'd appreciate any guidance. Thanks a lot!

Zihao Lin has updated the pull request incrementally with one additional commit since the last revision:

  remove C2AccessValuePtr

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24258/files
  - new: https://git.openjdk.org/jdk/pull/24258/files/6d122039..e89910c2

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=08
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=07-08

  Stats: 58 lines in 8 files changed: 0 ins; 21 del; 37 mod
  Patch: https://git.openjdk.org/jdk/pull/24258.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258

PR: https://git.openjdk.org/jdk/pull/24258

From duke at openjdk.org Thu Nov 6 13:58:53 2025
From: duke at openjdk.org (Zihao Lin)
Date: Thu, 6 Nov 2025 13:58:53 GMT
Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v10]
In-Reply-To:
References:
Message-ID: <1zyQq98OPsZ-2nzYz21X_5v2RgKhWaZrZaJQevDMzo4=.138599b1-4797-42b0-a48a-829a112dfbe7@github.com>

> This patch removes the slice parameter from LoadNode::make.
>
> I have also done further work that removes the slice parameter from StoreNode::make.
>
> Mentioned in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805
>
> Hi team, I am new, I'd appreciate any guidance. Thanks a lot!

Zihao Lin has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 12 commits: - fix conflict - Merge master - remove C2AccessValuePtr - fix assert - add more assert - rid of access.addr().type() - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - Fix build - ... and 2 more: https://git.openjdk.org/jdk/compare/c173d416...36e024db ------------- Changes: https://git.openjdk.org/jdk/pull/24258/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=09 Stats: 230 lines in 18 files changed: 33 ins; 55 del; 142 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From mablakatov at openjdk.org Thu Nov 6 17:59:41 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Thu, 6 Nov 2025 17:59:41 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v8] In-Reply-To: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Message-ID: > Modify the C2 compiler to share trampoline stubs between static calls that resolve to the same callee method. Since the relocation target for all static calls is initially set to the static call resolver stub, the call's target alone cannot be used to distinguish between different static method calls. Instead, trampoline stubs should be shared based on the actual callee. > > The `SharedTrampolineTest.java` was designed to verify the sharing of trampolines among static calls. However, due to imprecise log analysis, the test currently passes even when trampolines are not shared. Additionally, comments within the test suggest ambiguity regarding whether it was intended to assess trampoline sharing for static calls or runtime calls. 
To address these issues and eliminate ambiguity, this patch renames and updates the existing test. Furthermore, a new test is introduced, using the existing one as a foundation, to accurately evaluate trampoline sharing for both static and runtime calls. > > This has passed tier1-3 and jcstress testing on AArch64. Mikhail Ablakatov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 18 commits: - Merge commit 'f6f87bb6759c86d941453a1776e8abfdffc48183' into 8359359 - the only trampoline in ArrayCopyStub is never shared - fixup: a shared trampoline must branch to a statically bound method - share static call trampolines generated by C1 as well - assert callee is nullptr for runtime calls - assert that call sites offsets aren't missing - cleanup: rephrase comments in macroAssembler_aarch64.hpp - Merge commit 'fd29677479797956e0d205b5ce6e7cb9ad407bd1' into 8359359 - Merge commit '41520998aa8808452ee384b213b2a77c7bad668d' - remove implementation-dependent logic from emit_shared_trampolines() - ... 
and 8 more: https://git.openjdk.org/jdk/compare/f6f87bb6...871903f4 ------------- Changes: https://git.openjdk.org/jdk/pull/25954/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25954&range=07 Stats: 447 lines in 12 files changed: 320 ins; 114 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/25954.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25954/head:pull/25954 PR: https://git.openjdk.org/jdk/pull/25954 From mablakatov at openjdk.org Thu Nov 6 17:59:43 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Thu, 6 Nov 2025 17:59:43 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v7] In-Reply-To: References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Message-ID: On Thu, 30 Oct 2025 13:11:27 GMT, Evgeny Astigeevich wrote: >> Mikhail Ablakatov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: >> >> - share static call trampolines generated by C1 as well >> - assert callee is nullptr for runtime calls >> - assert that call sites offsets aren't missing >> - cleanup: rephrase comments in macroAssembler_aarch64.hpp >> - Merge commit 'fd29677479797956e0d205b5ce6e7cb9ad407bd1' into 8359359 >> - Merge commit '41520998aa8808452ee384b213b2a77c7bad668d' >> - remove implementation-dependent logic from emit_shared_trampolines() >> - cleanup: update copyright headers >> - Make the value type of the dictionary a struct instead of Pair typedef >> - Remove share_rc_trampoline_for and share_sc_trampoline_for >> - ... 
and 5 more: https://git.openjdk.org/jdk/compare/fd296774...e3ad440b > > src/hotspot/cpu/aarch64/c1_CodeStubs_aarch64.cpp line 336: > >> 334: Address resolve(SharedRuntime::get_resolve_static_call_stub(), >> 335: relocInfo::static_call_type); >> 336: address call = __ trampoline_call(resolve, info()->method()); > > This does not save any space because it is one call in the stub. reverted, please see https://github.com/openjdk/jdk/pull/25954/commits/a5c665520088ca7a3f282b684bb58701733af83e ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2500160541 From mablakatov at openjdk.org Thu Nov 6 17:59:45 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Thu, 6 Nov 2025 17:59:45 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v7] In-Reply-To: References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Message-ID: On Thu, 30 Oct 2025 17:38:19 GMT, Evgeny Astigeevich wrote: >> Would it be correct to use `op->method()->can_be_statically_bound()` instead, similarly to how it's done for static call stubs in https://github.com/openjdk/jdk/blob/ed36b9bb6f3d429db6accfb3b096e50e7f2217ff/src/hotspot/share/c1/c1_LIRAssembler.cpp#L456? > > `can_be_statically_bound()` should definitely work. 
Fixed, please see https://github.com/openjdk/jdk/pull/25954/commits/ba317d4ca96af87fbc53155d1fcc33d88a6c2349 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2500161764 From mablakatov at openjdk.org Thu Nov 6 17:59:46 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Thu, 6 Nov 2025 17:59:46 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v6] In-Reply-To: <2pGRA1xI61iqh51Hi1HkJCT7uDR1j0i4r__ceZQGsYk=.903836e5-d3b3-4481-b48b-f1f3c4b0c148@github.com> References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> <9idT2wdo-uuG1seABSR_6Mr0S_ygFBmeAbL6hK2QwCg=.343f6868-486a-4407-ab43-f6236a504e37@github.com> <8_LF342KuV09coGe-_uebB5ZjIyS9GLzJ9u6Xac8o-I=.69ab3c54-dd12-40d6-9ff3-123615ba91ac@github.com> <2pGRA1xI61iqh51Hi1HkJCT7uDR1j0i4r__ceZQGsYk=.903836e5-d3b3-4481-b48b-f1f3c4b0c148@github.com> Message-ID: <0lC1c-0kf6Ea-_a4j-cjxD4LFvwLLFZtmAIwWxATwPA=.a196f9fd-27b9-463e-bd13-6953b49af174@github.com> On Thu, 30 Oct 2025 15:38:54 GMT, Mikhail Ablakatov wrote: >> Are you sure this is OK in LIR_Assembler::call()? Calls of type lir_dynamic_call use relocInfo::static_call_type, but can't they resolve to different targets depending on the call site info? I don't think op->method() is unique for invokedynamic calls. @iwanowww what do you think? > > Perhaps we should check if the callee method can be statically bound here using `op->method()->can_be_statically_bound()`? > Are you sure this is OK in LIR_Assembler::call()? 
Fixed, please see https://github.com/openjdk/jdk/pull/25954/commits/ba317d4ca96af87fbc53155d1fcc33d88a6c2349 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2500165019 From alanb at openjdk.org Thu Nov 6 19:17:15 2025 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 6 Nov 2025 19:17:15 GMT Subject: RFR: 8359256: AArch64: Use SHA3 GPR intrinsic where it's faster [v2] In-Reply-To: References: <9KmPqhIXB8_05Tu4NQP7TECtTqtQKs0ZvXWxI58C1TU=.5c488724-4d57-4d4f-aa2e-bdecab6a91af@github.com> Message-ID: On Thu, 30 Oct 2025 11:29:40 GMT, Boris Ulasevich wrote: >> This change adjusts the default selection of SHA-3 intrinsics on AArch64 based on observed performance across CPUs. In our measurements, the SHA-3 SIMD path (using SHA3 instructions) is consistently faster on Apple silicon, while on Neoverse and several older cores the GPR implementation performs better. On CPUs without SHA-3 instructions, the GPR path is the only viable option and behaves as expected. >> >> Accordingly, `UseSIMDForSHA3Intrinsic` now defaults to false globally. The SIMD variant is auto-enabled only on Apple silicon; elsewhere the default remains the GPR path. >> >> _The attached raw data also includes observations about `UseFPUForSpilling`. Back in #27350 we discussed whether the option is entirely useless. While orthogonal to this change, the MessageDigests benchmark is a convenient probe of register-spilling behavior because the SHA-3 (Keccak) algorithm is highly register-hungry, which adds a significant number of spills to the generated assembly sequence. 
In the provided results, at least one CPU benefits from enabling UseFPUForSpilling, so the option seems worth keeping for now._
>>
>> **Cortex-A53 (RPi3)**
>>
>> $ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:-UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest
>> Benchmark                    (digesterName)  (length)  Cnt    Score    Error   Units
>> MessageDigests.digest              SHA3-512        64  150  345.010 ±  0.473  ops/ms
>> MessageDigests.digest              SHA3-512     16384  150    1.817 ±  0.001  ops/ms
>>
>> $ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:+UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest
>> MessageDigests.digest              SHA3-512        64  150  352.247 ±  0.279  ops/ms   +UseFPUForSpilling: +2%
>> MessageDigests.digest              SHA3-512     16384  150    1.855 ±  0.001  ops/ms   +UseFPUForSpilling: +2%
>>
>> $ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics" 2>&1 | tail -n5
>> Benchmark                    (digesterName)  (length)  Cnt    Score    Error   Units
>> MessageDigests.digest              SHA3-512        64   15  345.552 ±  0.291  ops/ms
>> MessageDigests.digest              SHA3-512     16384   15    1.818 ±  0.001  ops/ms
>> MessageDigests.getAndDigest        SHA3-512        64   15  265.744 ± 56.591  ops/ms
>> MessageD...
>
> Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision:
>
>   minor review corrections

There are 5 tests failing in tier2 on aarch64 that I assume are related to this change: [JDK-8371432](https://bugs.openjdk.org/browse/JDK-8371432)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27726#issuecomment-3499007823

From jkratochvil at openjdk.org Thu Nov 6 19:37:49 2025
From: jkratochvil at openjdk.org (Jan Kratochvil)
Date: Thu, 6 Nov 2025 19:37:49 GMT
Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v4]
In-Reply-To:
References:
Message-ID:

> With clang-20 using --with-toolchain-type=clang, resolvedFieldEntry.cpp and resolvedMethodEntry.cpp break the build with warnings like:
>
> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall]
>    49 |   memset(this, 0, sizeof(*this));
>       |          ^
> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning
>    49 |   memset(this, 0, sizeof(*this));
>       |          ^
>       |          (void*)
>
> The patch follows the suggested fix.

Jan Kratochvil has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:

 - Use padding fields for ResolvedFieldEntry and ResolvedMethodEntry
 - Merge branch 'master' into clangmemset
 - Merge branch 'master' into clangmemset
 - Revert "8361288: Fix build of JTReg: wget exited with exit code 4"

   This reverts commit 6e6b8f6a26f8e555f1e70544546b92bbafcae6cc.
 - 8361288: Fix build of JTReg: wget exited with exit code 4
 - 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/26098/files
  - new: https://git.openjdk.org/jdk/pull/26098/files/3745a8af..3d4eccc1

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=02-03

  Stats: 14222 lines in 372 files changed: 8523 ins; 4457 del; 1242 mod
  Patch: https://git.openjdk.org/jdk/pull/26098.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26098/head:pull/26098

PR: https://git.openjdk.org/jdk/pull/26098

From jkratochvil at openjdk.org Thu Nov 6 19:47:40 2025
From: jkratochvil at openjdk.org (Jan Kratochvil)
Date: Thu, 6 Nov 2025 19:47:40 GMT
Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v5]
In-Reply-To:
References:
Message-ID:

> With clang-20 using --with-toolchain-type=clang, resolvedFieldEntry.cpp and resolvedMethodEntry.cpp break the build with warnings like:
>
> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall]
>    49 |   memset(this, 0, sizeof(*this));
>       |          ^
> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning
>    49 |   memset(this, 0, sizeof(*this));
>       |          ^
>       |          (void*)
>
> The patch follows the suggested fix.
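
For readers unfamiliar with the diagnostic, here is a minimal sketch of the "explicitly cast the pointer" fix that the compiler note suggests. The class `Entry` and its fields are hypothetical and are not the real `ResolvedFieldEntry`; the cast only silences the warning, so this is only reasonable when an all-zero byte pattern is a valid state for every member.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for an entry type. The user-provided copy
// constructor makes it non-trivially copyable, which is what triggers
// the memset diagnostic on a plain `this` pointer.
class Entry {
  uint32_t _index;
  uint16_t _flags;
  uint8_t  _tos;
  uint8_t  _padding;  // explicit padding so every byte belongs to a field

 public:
  Entry() : _index(1), _flags(2), _tos(3), _padding(0) {}
  Entry(const Entry& other)
      : _index(other._index), _flags(other._flags),
        _tos(other._tos), _padding(0) {}

  // Zero the whole object. Casting to void* tells the compiler the byte-wise
  // write is intentional.
  void clear() {
    memset(static_cast<void*>(this), 0, sizeof(*this));
  }

  bool is_clear() const {
    return _index == 0 && _flags == 0 && _tos == 0 && _padding == 0;
  }
};
```

The later revisions of the PR mention using padding fields for the entry types, which the explicit `_padding` member above loosely mirrors.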

Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision:

  Import some code from Ioi Lam's patch

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/26098/files
  - new: https://git.openjdk.org/jdk/pull/26098/files/3d4eccc1..98bb03eb

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=03-04

  Stats: 89 lines in 2 files changed: 6 ins; 77 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/26098.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26098/head:pull/26098

PR: https://git.openjdk.org/jdk/pull/26098

From jkratochvil at openjdk.org Thu Nov 6 19:52:17 2025
From: jkratochvil at openjdk.org (Jan Kratochvil)
Date: Thu, 6 Nov 2025 19:52:17 GMT
Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v6]
In-Reply-To:
References:
Message-ID:

> With clang-20 using --with-toolchain-type=clang, resolvedFieldEntry.cpp and resolvedMethodEntry.cpp break the build with warnings like:
>
> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall]
>    49 |   memset(this, 0, sizeof(*this));
>       |          ^
> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning
>    49 |   memset(this, 0, sizeof(*this));
>       |          ^
>       |          (void*)
>
> The patch follows the suggested fix.
Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision:

  More code improvements from Ioi Lam's patch

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/26098/files
  - new: https://git.openjdk.org/jdk/pull/26098/files/98bb03eb..77814bf9

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=04-05

  Stats: 48 lines in 4 files changed: 24 ins; 22 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/26098.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26098/head:pull/26098

PR: https://git.openjdk.org/jdk/pull/26098

From lmesnik at openjdk.org Thu Nov 6 21:33:38 2025
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Thu, 6 Nov 2025 21:33:38 GMT
Subject: RFR: 8371103: vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t006/TestDescription.java failing
Message-ID: <0TSSeNB540PtvusDCXFoYLc4TJm8GJMeBlExphuFDic=.5ff05b43-91c3-497f-b17a-1887c538c1ce@github.com>

The problem happens because jvmti events are posted while holding JvmtiThreadState_lock. The fix is just to move
flushing out of the lock, as is already done in the `JvmtiEventController::set_user_enabled(..)` method.

The problem started reproducing after the fix for https://bugs.openjdk.org/browse/JDK-8370732, which replaced GC triggering via the slow and unreliable `ClassUnloader.eatMemory();` with the fast and robust `WhiteBox.fullGC()`.

The posting of jvmti events is not synchronized with enabling/disabling events and setting callbacks. So even if new events appear in the jvmti tagmap after flushing, it is not a bug to not post them or to use the new callback handler.

Also, it might make sense to flush object events before vm_death and post all deferred events from the ServiceThread queue.
I am going to file a separate RFE for this.
Also, I am going to file an RFE to replace all GC-provoking `eatMemory()` calls with `WB.fullGC()` to improve test stability and reduce test execution time.
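The idea of moving the flush out of the lock can be illustrated with a minimal, self-contained sketch. This is not the HotSpot code: `DeferredEvents`, its `std::mutex`, and the callback type are hypothetical stand-ins chosen for illustration. The pending events are detached while the lock is held and posted only after it is released, so a callback that needs the same lock does not deadlock.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lock-protected queue of deferred events.
class DeferredEvents {
 public:
  using Callback = std::function<void()>;

  void enqueue(Callback cb) {
    std::lock_guard<std::mutex> guard(_lock);
    _pending.push_back(std::move(cb));
  }

  // Detach the pending events while holding the lock, then post them
  // after the lock has been released.
  void flush() {
    std::vector<Callback> to_post;
    {
      std::lock_guard<std::mutex> guard(_lock);
      to_post.swap(_pending);
    }
    for (Callback& cb : to_post) {
      cb();  // may call enqueue() again without deadlocking
    }
  }

  std::size_t pending_count() {
    std::lock_guard<std::mutex> guard(_lock);
    return _pending.size();
  }

 private:
  std::mutex _lock;
  std::vector<Callback> _pending;
};
```

In this sketch, events enqueued by a callback during a flush stay queued until the next flush, which mirrors the point above that events appearing after flushing need not be posted by that flush.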
------------- Commit messages: - 8371103: vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t006/TestDescription.java failing Changes: https://git.openjdk.org/jdk/pull/28184/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28184&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371103 Stats: 8 lines in 2 files changed: 4 ins; 4 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28184.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28184/head:pull/28184 PR: https://git.openjdk.org/jdk/pull/28184 From amenkov at openjdk.org Thu Nov 6 21:50:04 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Thu, 6 Nov 2025 21:50:04 GMT Subject: RFR: 8371103: vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t006/TestDescription.java failing In-Reply-To: <0TSSeNB540PtvusDCXFoYLc4TJm8GJMeBlExphuFDic=.5ff05b43-91c3-497f-b17a-1887c538c1ce@github.com> References: <0TSSeNB540PtvusDCXFoYLc4TJm8GJMeBlExphuFDic=.5ff05b43-91c3-497f-b17a-1887c538c1ce@github.com> Message-ID: On Thu, 6 Nov 2025 21:26:24 GMT, Leonid Mesnik wrote: > The problem happens because jvmti events are posted while handling JvmtiThreadState_lock. The fix just to move > flushing out of lock like it is already done in `JvmtiEventController::set_user_enabled(..)` method. > > The problem start reproducing after fix for https://bugs.openjdk.org/browse/JDK-8370732 that replaced GC triggering from slow and unreliable `ClassUnloader.eatMemory();` to fast and robust`WhiteBox.fullGC()`. > > The jvmti events posting is not synchronized with enabling/disabling events and setting callbacks. So even if there are new events appear in the jvmti tagmap after flushing it is not a bug to don't post them or use new callback handler. > > Also, it might be makes sense to flush object events before vm_death and post all deferred events from SerrviceThread queue. > I am going to file separate RFE for this. 
> Also, I am going to file RFE to replace all GC provoking the `eatMemory()` calls with `WB.fullGC()` to improve test stability and reduce test execution time. Marked as reviewed by amenkov (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28184#pullrequestreview-3430693825 From sspitsyn at openjdk.org Thu Nov 6 22:31:02 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 6 Nov 2025 22:31:02 GMT Subject: RFR: 8371103: vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t006/TestDescription.java failing In-Reply-To: <0TSSeNB540PtvusDCXFoYLc4TJm8GJMeBlExphuFDic=.5ff05b43-91c3-497f-b17a-1887c538c1ce@github.com> References: <0TSSeNB540PtvusDCXFoYLc4TJm8GJMeBlExphuFDic=.5ff05b43-91c3-497f-b17a-1887c538c1ce@github.com> Message-ID: <_-7XKb0c-O5l0FmUztsbDj2NAAXeHhANS25nZFviFHg=.fd1ac36a-9450-4121-8a05-88a4e3593753@github.com> On Thu, 6 Nov 2025 21:26:24 GMT, Leonid Mesnik wrote: > The problem happens because jvmti events are posted while handling JvmtiThreadState_lock. The fix just to move > flushing out of lock like it is already done in `JvmtiEventController::set_user_enabled(..)` method. > > The problem start reproducing after fix for https://bugs.openjdk.org/browse/JDK-8370732 that replaced GC triggering from slow and unreliable `ClassUnloader.eatMemory();` to fast and robust`WhiteBox.fullGC()`. > > The jvmti events posting is not synchronized with enabling/disabling events and setting callbacks. So even if there are new events appear in the jvmti tagmap after flushing it is not a bug to don't post them or use new callback handler. > > Also, it might be makes sense to flush object events before vm_death and post all deferred events from SerrviceThread queue. > I am going to file separate RFE for this. > Also, I am going to file RFE to replace all GC provoking the `eatMemory()` calls with `WB.fullGC()` to improve test stability and reduce test execution time. looks good. ------------- Marked as reviewed by sspitsyn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28184#pullrequestreview-3430797067 From jkratochvil at openjdk.org Thu Nov 6 22:44:40 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Thu, 6 Nov 2025 22:44:40 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v7] In-Reply-To: References: Message-ID: > With clang-20 using --with-toolchain-type=clang resolveFieldEntry.cpp and resolveMethodEntry.cpp break the build with similar warnings like: > > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] > 49 | memset(this, 0, sizeof(*this)); > | ^ > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning > 49 | memset(this, 0, sizeof(*this)); > | ^ > | (void*) > > The patch follows the suggested fix. Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: Fix 32-bit compilation error ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26098/files - new: https://git.openjdk.org/jdk/pull/26098/files/77814bf9..b789941f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=05-06 Stats: 5 lines in 1 file changed: 3 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/26098.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26098/head:pull/26098 PR: https://git.openjdk.org/jdk/pull/26098 From dlong at openjdk.org Fri Nov 7 00:01:29 2025 From: dlong at openjdk.org (Dean Long) Date: Fri, 7 Nov 2025 00:01:29 GMT Subject: RFR: 8365047: Remove exception handler stub code in C2 [v10] In-Reply-To: References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> Message-ID: On Tue, 4 Nov 2025 09:48:20 GMT, Ruben wrote: 
>> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. >> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > > Ruben has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: > > - Rename deoptHandlerOffsetField -> deoptHandlerEntryOffsetField > - Merge from the main branch > - Address review comments and fix a mistype > - Check for NOP and MOVK separately in NativePostCallNop > - Test for deoptimization in virtual threads > > Change-Id: I9ef51b426d34e9b44a3891f6a45307232f900e5a > - Merge from the main branch > - Address review comments > - Address review comments > - Address review comments > - The patch is contributed by @TheRealMDoerr > - ... and 5 more: https://git.openjdk.org/jdk/compare/1922c4fd...359c2f18 We are seeing some new crashes ([JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388)) trying to access a PC that is just past the end of the nmethod and the page is unmapped because it also happens to be the last page of the CodeHeap. Could it be related to the changes in this PR? 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/26678#issuecomment-3499890263

From duke at openjdk.org Fri Nov 7 00:20:19 2025
From: duke at openjdk.org (Ruben)
Date: Fri, 7 Nov 2025 00:20:19 GMT
Subject: RFR: 8365047: Remove exception handler stub code in C2 [v10]
In-Reply-To:
References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com>
Message-ID:

On Thu, 6 Nov 2025 23:58:46 GMT, Dean Long wrote:

> We are seeing some new crashes ([JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388)) trying to access a PC that is just past the end of the nmethod and the page is unmapped because it also happens to be the last page of the CodeHeap. Could it be related to the changes in this PR?

Yes, I think it could be similar to the case fixed for the AArch64 post-call NOP check earlier: https://github.com/openjdk/jdk/blob/e34a831814996be3e0a2df86b11b1718a76ea558/src/hotspot/cpu/x86/nativeInst_x86.hpp#L584 reads a 32-bit integer from the perceived call site. In the case of the deoptimization handler, which is potentially located at the end of the code blob, the read would happen past the end of the code blob, which might cause an access to an unmapped page.

It could be replaced with a two-step comparison: first a comparison matching the size of the `jmp` instruction (I believe that's 2 bytes), and if that's successful, a comparison of the third byte as the second step.
Alternatively, the specific deoptimization stub code could be extended by a `nop` in the `emit_deopt_handler`.

Would either of these options be suitable?
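The two-step comparison suggested above might look roughly like the following sketch. It is not the HotSpot implementation: `starts_with_marker`, `kMarker`, and `kShortInsnSize` are hypothetical names, and the marker bytes are assumed here to be the first three bytes of the 8-byte x86 multi-byte NOP (`0f 1f 84 ...`). The third byte is read only after the first two matched, so a 2-byte short `jmp` (opcode `0xeb`) sitting at the very end of a mapped region is never read past.

```cpp
#include <cassert>
#include <cstdint>

// Assumed 3-byte marker prefix; the real check may use a different encoding.
constexpr uint8_t kMarker[3] = {0x0f, 0x1f, 0x84};
// Size of the short jmp, as stated in the message above.
constexpr int kShortInsnSize = 2;

// Returns true if the bytes at pc start with kMarker.
// Step 1 reads only the first kShortInsnSize bytes; step 2 reads the third
// byte, and only if step 1 matched.
inline bool starts_with_marker(const uint8_t* pc) {
  for (int i = 0; i < kShortInsnSize; i++) {
    if (pc[i] != kMarker[i]) {
      return false;
    }
  }
  return pc[2] == kMarker[2];
}
```

With a short `jmp` the very first byte already differs from the marker, so the function returns before touching anything beyond the instruction itself.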
------------- PR Comment: https://git.openjdk.org/jdk/pull/26678#issuecomment-3499929974 From duke at openjdk.org Fri Nov 7 00:50:27 2025 From: duke at openjdk.org (Ruben) Date: Fri, 7 Nov 2025 00:50:27 GMT Subject: RFR: 8365047: Remove exception handler stub code in C2 [v10] In-Reply-To: References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> Message-ID: On Tue, 4 Nov 2025 09:48:20 GMT, Ruben wrote: >> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. >> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > > Ruben has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: > > - Rename deoptHandlerOffsetField -> deoptHandlerEntryOffsetField > - Merge from the main branch > - Address review comments and fix a mistype > - Check for NOP and MOVK separately in NativePostCallNop > - Test for deoptimization in virtual threads > > Change-Id: I9ef51b426d34e9b44a3891f6a45307232f900e5a > - Merge from the main branch > - Address review comments > - Address review comments > - Address review comments > - The patch is contributed by @TheRealMDoerr > - ... and 5 more: https://git.openjdk.org/jdk/compare/1922c4fd...359c2f18 Indeed, the `jmp` size is `2` - I had incorrectly assumed it is `5` as specified here https://github.com/openjdk/jdk/blob/e34a831814996be3e0a2df86b11b1718a76ea558/src/hotspot/cpu/x86/nativeInst_x86.hpp#L412 however that's for a different case. The `10` as size of the deopt handler stub code at https://github.com/openjdk/jdk/blob/e34a831814996be3e0a2df86b11b1718a76ea558/src/hotspot/cpu/x86/x86.ad#L2774 is not correct either - it should be `7`. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/26678#issuecomment-3499993366

From iklam at openjdk.org Fri Nov 7 04:44:02 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Fri, 7 Nov 2025 04:44:02 GMT
Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v7]
In-Reply-To:
References:
Message-ID:

On Thu, 6 Nov 2025 22:44:40 GMT, Jan Kratochvil wrote:

>> With clang-20 using --with-toolchain-type=clang resolveFieldEntry.cpp and resolveMethodEntry.cpp break the build with similar warnings like:
>>
>> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall]
>>    49 |   memset(this, 0, sizeof(*this));
>>       |          ^
>> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning
>>    49 |   memset(this, 0, sizeof(*this));
>>       |          ^
>>       |          (void*)
>>
>> The patch follows the suggested fix.
>
> Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision:
>
>   Fix 32-bit compilation error

Thanks for the update. I will run it through our CI to see if the deterministic CDS tests pass.

I think we should replace the memset with the following, per @kimbarrett's comment on [JBS](https://bugs.openjdk.org/browse/JDK-8357579?focusedId=14831517&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14831517)

    ::new (this) ResolvedFieldEntry();

I tried it on Linux and it worked. I will test on all platforms.
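As a minimal sketch of that suggestion, the following shows a reset that re-runs the default constructor in place rather than calling `memset` on `this`. `Entry` and its fields are hypothetical stand-ins, not the real `ResolvedFieldEntry`; the sketch assumes the default constructor initializes every field, including an explicit padding field, and that the class has a trivial destructor.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical stand-in for an entry class with a user-provided constructor.
class Entry {
 public:
  Entry() : _offset(0), _index(0), _flags(0), _padding(0) {}

  void set(int32_t offset, uint16_t index, uint8_t flags) {
    _offset = offset;
    _index = index;
    _flags = flags;
  }

  // Re-initialize in place via placement new instead of
  // memset(this, 0, sizeof(*this)).
  void reset() { ::new (this) Entry(); }

  int32_t offset() const { return _offset; }
  uint16_t index() const { return _index; }
  uint8_t flags() const { return _flags; }
  uint8_t padding() const { return _padding; }

 private:
  int32_t _offset;
  uint16_t _index;
  uint8_t _flags;
  uint8_t _padding;  // explicit padding so every byte belongs to a named field
};
```

Because the constructor names every field, there is no byte left for the compiler to leave uninitialized, which is the property the deterministic CDS tests mentioned above would care about.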
------------- PR Comment: https://git.openjdk.org/jdk/pull/26098#issuecomment-3500680077 From duke at openjdk.org Fri Nov 7 04:57:52 2025 From: duke at openjdk.org (Nityanand Rai) Date: Fri, 7 Nov 2025 04:57:52 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v7] In-Reply-To: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: > Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: consolidate unit tests for vm memory tagging ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27868/files - new: https://git.openjdk.org/jdk/pull/27868/files/183927b0..59f4722c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=05-06 Stats: 53 lines in 3 files changed: 20 ins; 32 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/27868.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27868/head:pull/27868 PR: https://git.openjdk.org/jdk/pull/27868 From duke at openjdk.org Fri Nov 7 04:57:53 2025 From: duke at openjdk.org (Nityanand Rai) Date: Fri, 7 Nov 2025 04:57:53 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v6] In-Reply-To: References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: <-_XVImRDDvjsMvf_7YNG_V-dFYfNqtdFC-xVrylQpPI=.8738f2f9-454e-4114-b7da-7c51667e580e@github.com> On Wed, 5 Nov 2025 14:43:39 GMT, Stefan Karlsson wrote: > > I think the changes are minor to assert the tagging on allocation while we are doing that and do not require extra tests, please let me know if you still think otherwise. > > I think otherwise. 
Please put the test somewhere else. Done ------------- PR Comment: https://git.openjdk.org/jdk/pull/27868#issuecomment-3500704669 From duke at openjdk.org Fri Nov 7 05:02:22 2025 From: duke at openjdk.org (Nityanand Rai) Date: Fri, 7 Nov 2025 05:02:22 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v8] In-Reply-To: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: > Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: remote testutils.hpp from test_zForwarding.cpp ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27868/files - new: https://git.openjdk.org/jdk/pull/27868/files/59f4722c..0fbcdb09 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/27868.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27868/head:pull/27868 PR: https://git.openjdk.org/jdk/pull/27868 From jsjolen at openjdk.org Fri Nov 7 05:02:23 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 7 Nov 2025 05:02:23 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v7] In-Reply-To: References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: On Fri, 7 Nov 2025 04:57:52 GMT, Nityanand Rai wrote: >> Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs > > Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: > > consolidate unit tests for vm memory 
tagging Changes requested by jsjolen (Reviewer). test/hotspot/gtest/runtime/test_os.cpp line 1133: > 1131: char* base = os::reserve_memory(size, mtTest, false); > 1132: ASSERT_NOT_NULL(base); > 1133: Remove this change ------------- PR Review: https://git.openjdk.org/jdk/pull/27868#pullrequestreview-3431645714 PR Review Comment: https://git.openjdk.org/jdk/pull/27868#discussion_r2501685500 From iklam at openjdk.org Fri Nov 7 05:04:09 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 7 Nov 2025 05:04:09 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v7] In-Reply-To: References: Message-ID: On Thu, 6 Nov 2025 22:44:40 GMT, Jan Kratochvil wrote: >> With clang-20 using --with-toolchain-type=clang resolveFieldEntry.cpp and resolveMethodEntry.cpp break the build with similar warnings like: >> >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> | (void*) >> >> The patch follows the suggested fix. > > Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: > > Fix 32-bit compilation error I think adding some manual paddings might be OK. I am hoping the following will work on all reasonable C++ compilers, so we don't need to hard code any size values. 
    STATIC_ASSERT(sizeof(ResolvedMethodEntryWithExtra) > sizeof(ResolvedMethodEntry));

In the worst case, we may have to add extra paddings for weird compilers

    #ifdef _SOME_COMPILER
    u8 _more_paddings;
    #endif

There should be no performance impact as the C++ compiler should be smart enough to combine the init/copy operations of the padding with the trailing "real" fields. E.g., the following can be compiled to a single 64-bit move:

      _get_code(0),
      _put_code(0)
    #ifdef _LP64
      , _padding(0)
    #endif

There's a trade off with my other proposal, https://github.com/openjdk/jdk/pull/28172, which is arguably more portable, but it's more verbose as you need to copy each field by hand, so it's less maintainable.

I will be fine with either approach, although if we can find a clean *and* portable solution that would be best, but we shouldn't lose our mind doing it :-)

src/hotspot/share/oops/resolvedMethodEntry.cpp line 43:

> 41: STATIC_ASSERT(sizeof(ResolvedMethodEntry) == 16);
> 42: # endif
> 43: #endif

I think this can be cleaned up as:

    #ifdef _LP64
    STATIC_ASSERT(sizeof(ResolvedMethodEntry) == DEBUG_ONLY(32) NOT_DEBUG(24));
    #else
    STATIC_ASSERT(sizeof(ResolvedMethodEntry) == DEBUG_ONLY(20) NOT_DEBUG(16));
    #endif

But I think this will be better without the need to hard code numbers:

    // There should be no more padding at the end of ResolvedMethodEntry
    class ResolvedMethodEntryWithExtra : public ResolvedMethodEntry {
      u1 _extra_field;
    };
    STATIC_ASSERT(sizeof(ResolvedMethodEntryWithExtra) > sizeof(ResolvedMethodEntry));

I tested by changing `u4 _padding2` to `u4 _padding2` in `ResolvedMethodEntry` and the static assert fails.
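The size-free check described above can be tried in isolation with a small sketch. `PackedEntry` and `PackedEntryWithExtra` are hypothetical classes, not the real `ResolvedMethodEntry`, and plain `static_assert` stands in for HotSpot's `STATIC_ASSERT`. The sketch relies on an assumption about the ABI: the check can only detect implicit tail padding where the compiler is allowed to place a derived member into the base class's tail padding; on an ABI that never does so, the assert would pass regardless.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical entry whose fields fill the object completely: 4 + 2 + 1 + 1 bytes.
class PackedEntry {
 public:
  PackedEntry() : _a(0), _b(0), _c(0), _padding(0) {}

 private:
  uint32_t _a;
  uint16_t _b;
  uint8_t _c;
  uint8_t _padding;  // explicit field occupying what would otherwise be tail padding
};

// Adding one more byte must grow the object if PackedEntry has no
// implicit tail padding left for the extra field to slip into.
class PackedEntryWithExtra : public PackedEntry {
  uint8_t _extra_field;
};

static_assert(sizeof(PackedEntryWithExtra) > sizeof(PackedEntry),
              "PackedEntry must not end in implicit tail padding");
```

Since `PackedEntry` has no unused trailing bytes, the derived class has to grow, so the assert holds without hard-coding any size value.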
------------- PR Review: https://git.openjdk.org/jdk/pull/26098#pullrequestreview-3431620301 PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2501670635 From duke at openjdk.org Fri Nov 7 05:08:20 2025 From: duke at openjdk.org (Nityanand Rai) Date: Fri, 7 Nov 2025 05:08:20 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v9] In-Reply-To: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: > Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: remove blanck line ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27868/files - new: https://git.openjdk.org/jdk/pull/27868/files/0fbcdb09..d8d09007 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=07-08 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/27868.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27868/head:pull/27868 PR: https://git.openjdk.org/jdk/pull/27868 From duke at openjdk.org Fri Nov 7 05:22:10 2025 From: duke at openjdk.org (Nityanand Rai) Date: Fri, 7 Nov 2025 05:22:10 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v9] In-Reply-To: <2yddz-oh3R_8sZlR6zagp-aoev24wvWTGw_7yFCL1yo=.9ad83098-76c8-41d5-9645-63335cb0b2a2@github.com> References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> <2yddz-oh3R_8sZlR6zagp-aoev24wvWTGw_7yFCL1yo=.9ad83098-76c8-41d5-9645-63335cb0b2a2@github.com> Message-ID: On Wed, 29 Oct 2025 17:41:35 GMT, Evgeny Astigeevich wrote: > The PR needs gtest/jtreg testing results. 
Also it needs to be checks the new code is covered by existing gtests or new tests are needed. Done ------------- PR Comment: https://git.openjdk.org/jdk/pull/27868#issuecomment-3500768402 From kbarrett at openjdk.org Fri Nov 7 06:37:02 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 7 Nov 2025 06:37:02 GMT Subject: RFR: 8358957: [ubsan]: The assert in layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v5] In-Reply-To: <2tqtSNmhY0bGDrqu06wBvUWw_bpdv311BSM-ij5EEGY=.90ef7eb5-8d88-439c-b0df-f917f3543cc2@github.com> References: <2tqtSNmhY0bGDrqu06wBvUWw_bpdv311BSM-ij5EEGY=.90ef7eb5-8d88-439c-b0df-f917f3543cc2@github.com> Message-ID: On Thu, 6 Nov 2025 13:06:41 GMT, Afshin Zafari wrote: >> Avoid using loop and UB in left-shift operation as suggested by Kim's comment in the JBS-issue. >> >> Tests: >> mach5 tiers 1-5 {macosx-aarch64, linux-x64, windows-x64} x {debug, product} > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > review comments applied I have no further comments or issues. Since it's still essentially the implementation I proposed, I'll leave it to others to approve. ------------- PR Review: https://git.openjdk.org/jdk/pull/27288#pullrequestreview-3431878809 From thartmann at openjdk.org Fri Nov 7 07:43:41 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 7 Nov 2025 07:43:41 GMT Subject: RFR: 8371388: [BACKOUT] JDK-8365047: Remove exception handler stub code in C2 Message-ID: Clean backout of [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) due to massive failures with different tests in our CI (see JBS for details). 
Thanks, Tobias ------------- Commit messages: - Revert "8365047: Remove exception handler stub code in C2" Changes: https://git.openjdk.org/jdk/pull/28187/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28187&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371388 Stats: 569 lines in 41 files changed: 216 ins; 268 del; 85 mod Patch: https://git.openjdk.org/jdk/pull/28187.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28187/head:pull/28187 PR: https://git.openjdk.org/jdk/pull/28187 From thartmann at openjdk.org Fri Nov 7 07:45:28 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 7 Nov 2025 07:45:28 GMT Subject: RFR: 8365047: Remove exception handler stub code in C2 [v10] In-Reply-To: References: <4R-933Vw15ku03kFonQY4msXdmzkNVnawVWFB7Uu4k0=.1653f2ec-79d1-4076-aa03-4bae566d74c2@github.com> Message-ID: On Tue, 4 Nov 2025 09:48:20 GMT, Ruben wrote: >> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. >> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > > Ruben has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: > > - Rename deoptHandlerOffsetField -> deoptHandlerEntryOffsetField > - Merge from the main branch > - Address review comments and fix a mistype > - Check for NOP and MOVK separately in NativePostCallNop > - Test for deoptimization in virtual threads > > Change-Id: I9ef51b426d34e9b44a3891f6a45307232f900e5a > - Merge from the main branch > - Address review comments > - Address review comments > - Address review comments > - The patch is contributed by @TheRealMDoerr > - ... 
and 5 more: https://git.openjdk.org/jdk/compare/1922c4fd...359c2f18 Backing out with https://github.com/openjdk/jdk/pull/28187. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26678#issuecomment-3501135218 From chagedorn at openjdk.org Fri Nov 7 07:54:00 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Fri, 7 Nov 2025 07:54:00 GMT Subject: RFR: 8371388: [BACKOUT] JDK-8365047: Remove exception handler stub code in C2 In-Reply-To: References: Message-ID: <12t7qgPeVTYCh-u65W5b9CG7deELNKNQj_3jkcy9fbw=.3cb3afb1-4c99-4aea-9287-24dfb3ff5d2a@github.com> On Fri, 7 Nov 2025 07:36:16 GMT, Tobias Hartmann wrote: > Clean backout of [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) due to massive failures with different tests in our CI (see JBS for details). > > Thanks, > Tobias Looks good and trivial. ------------- Marked as reviewed by chagedorn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28187#pullrequestreview-3432114139 From epeter at openjdk.org Fri Nov 7 07:57:04 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 7 Nov 2025 07:57:04 GMT Subject: RFR: 8371388: [BACKOUT] JDK-8365047: Remove exception handler stub code in C2 In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 07:36:16 GMT, Tobias Hartmann wrote: > Clean backout of [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) due to massive failures with different tests in our CI (see JBS for details). > > Thanks, > Tobias Looks good to me, thanks for taking care of this! ------------- Marked as reviewed by epeter (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28187#pullrequestreview-3432123803 From epeter at openjdk.org Fri Nov 7 08:01:13 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 7 Nov 2025 08:01:13 GMT Subject: RFR: 8359256: AArch64: Use SHA3 GPR intrinsic where it's faster [v2] In-Reply-To: References: <9KmPqhIXB8_05Tu4NQP7TECtTqtQKs0ZvXWxI58C1TU=.5c488724-4d57-4d4f-aa2e-bdecab6a91af@github.com> Message-ID: On Thu, 30 Oct 2025 11:29:40 GMT, Boris Ulasevich wrote: >> This change adjusts the default selection of SHA-3 intrinsics on AArch64 based on observed performance across CPUs. In our measurements, the SHA-3 SIMD path (using SHA3 instructions) is consistently faster on Apple silicon, while on Neoverse and several older cores the GPR implementation performs better. On CPUs without SHA-3 instructions, the GPR path is the only viable option and behaves as expected. >> >> Accordingly, `UseSIMDForSHA3Intrinsic` now defaults to false globally. The SIMD variant is auto-enabled only on Apple silicon; elsewhere the default remains the GPR path. >> >> _The attached raw data also includes observations about `UseFPUForSpilling`. Back in #27350 we discussed whether the option is entirely useless. While orthogonal to this change, the MessageDigests benchmark is a convenient probe of register-spilling behavior because the SHA-3 (Keccak) algorithm is highly register-hungry, which adds a significant number of spills to the generated assembly sequence. In the provided results, at least one CPU benefits from enabling UseFPUForSpilling, so the option seems worth keeping for now._ >> >> **Cortex-A53 (RPi3)** >> >> $ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:-UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest >> Benchmark (digesterName) (length) Cnt Score Error Units >> MessageDigests.digest SHA3-512 64 150 345.010 ? 
0.473 ops/ms >> MessageDigests.digest SHA3-512 16384 150 1.817 ? 0.001 ops/ms >> >> $ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:+UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest >> MessageDigests.digest SHA3-512 64 150 352.247 ? 0.279 ops/ms +UseFPUForSpilling: +2% >> MessageDigests.digest SHA3-512 16384 150 1.855 ? 0.001 ops/ms +UseFPUForSpilling: +2% >> >> $ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics" 2>&1 | tail -n5 >> Benchmark (digesterName) (length) Cnt Score Error Units >> MessageDigests.digest SHA3-512 64 15 345.552 ? 0.291 ops/ms >> MessageDigests.digest SHA3-512 16384 15 1.818 ? 0.001 ops/ms >> MessageDigests.getAndDigest SHA3-512 64 15 265.744 ? 56.591 ops/ms >> MessageD... > > Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: > > minor review corrections This change is backed out by https://github.com/openjdk/jdk/pull/28189 ------------- PR Comment: https://git.openjdk.org/jdk/pull/27726#issuecomment-3501174362 From thartmann at openjdk.org Fri Nov 7 08:03:17 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 7 Nov 2025 08:03:17 GMT Subject: RFR: 8371432: [BACKOUT] 8359256: AArch64: Use SHA3 GPR intrinsic where it's faster Message-ID: <_UwQTwO9rCnvkMLH1hcJdkV_rYtMJUJVcQe-Z4pgHuM=.d33d04bd-46bb-4ca9-87a8-342d1f5aec59@github.com> Clean backout of [JDK-8359256](https://bugs.openjdk.org/browse/JDK-8359256) due to failures with different tests in our CI (see JBS for details). 
Thanks, Tobias ------------- Commit messages: - Revert "8359256: AArch64: Use SHA3 GPR intrinsic where it's faster" Changes: https://git.openjdk.org/jdk/pull/28189/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28189&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371432 Stats: 19 lines in 2 files changed: 0 ins; 6 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/28189.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28189/head:pull/28189 PR: https://git.openjdk.org/jdk/pull/28189 From mchevalier at openjdk.org Fri Nov 7 08:03:18 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Fri, 7 Nov 2025 08:03:18 GMT Subject: RFR: 8371432: [BACKOUT] 8359256: AArch64: Use SHA3 GPR intrinsic where it's faster In-Reply-To: <_UwQTwO9rCnvkMLH1hcJdkV_rYtMJUJVcQe-Z4pgHuM=.d33d04bd-46bb-4ca9-87a8-342d1f5aec59@github.com> References: <_UwQTwO9rCnvkMLH1hcJdkV_rYtMJUJVcQe-Z4pgHuM=.d33d04bd-46bb-4ca9-87a8-342d1f5aec59@github.com> Message-ID: On Fri, 7 Nov 2025 07:55:24 GMT, Tobias Hartmann wrote: > Clean backout of [JDK-8359256](https://bugs.openjdk.org/browse/JDK-8359256) due to failures with different tests in our CI (see JBS for details). > > Thanks, > Tobias Marked as reviewed by mchevalier (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28189#pullrequestreview-3432129457 From epeter at openjdk.org Fri Nov 7 08:03:19 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 7 Nov 2025 08:03:19 GMT Subject: RFR: 8371432: [BACKOUT] 8359256: AArch64: Use SHA3 GPR intrinsic where it's faster In-Reply-To: <_UwQTwO9rCnvkMLH1hcJdkV_rYtMJUJVcQe-Z4pgHuM=.d33d04bd-46bb-4ca9-87a8-342d1f5aec59@github.com> References: <_UwQTwO9rCnvkMLH1hcJdkV_rYtMJUJVcQe-Z4pgHuM=.d33d04bd-46bb-4ca9-87a8-342d1f5aec59@github.com> Message-ID: On Fri, 7 Nov 2025 07:55:24 GMT, Tobias Hartmann wrote: > Clean backout of [JDK-8359256](https://bugs.openjdk.org/browse/JDK-8359256) due to failures with different tests in our CI (see JBS for details). 
> > Thanks, > Tobias Looks good to me. Thanks for taking care of this! ------------- Marked as reviewed by epeter (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28189#pullrequestreview-3432131157 From thartmann at openjdk.org Fri Nov 7 08:08:01 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 7 Nov 2025 08:08:01 GMT Subject: RFR: 8371388: [BACKOUT] JDK-8365047: Remove exception handler stub code in C2 In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 07:36:16 GMT, Tobias Hartmann wrote: > Clean backout of [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) due to massive failures with different tests in our CI (see JBS for details). > > Thanks, > Tobias Thanks for the quick reviews, Christian and Emanuel. Running sanity testing before integration. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28187#issuecomment-3501194557 From syan at openjdk.org Fri Nov 7 08:10:02 2025 From: syan at openjdk.org (SendaoYan) Date: Fri, 7 Nov 2025 08:10:02 GMT Subject: RFR: 8371432: [BACKOUT] 8359256: AArch64: Use SHA3 GPR intrinsic where it's faster In-Reply-To: <_UwQTwO9rCnvkMLH1hcJdkV_rYtMJUJVcQe-Z4pgHuM=.d33d04bd-46bb-4ca9-87a8-342d1f5aec59@github.com> References: <_UwQTwO9rCnvkMLH1hcJdkV_rYtMJUJVcQe-Z4pgHuM=.d33d04bd-46bb-4ca9-87a8-342d1f5aec59@github.com> Message-ID: <0j1qe0pLv6w1Cs2SL4vjIVEx-q4RVJ91LrxdnPJe1S4=.a0114d3f-a6a5-45ab-825e-d0e34c341f94@github.com> On Fri, 7 Nov 2025 07:55:24 GMT, Tobias Hartmann wrote: > Clean backout of [JDK-8359256](https://bugs.openjdk.org/browse/JDK-8359256) due to failures with different tests in our CI (see JBS for details). > > Thanks, > Tobias Marked as reviewed by syan (Committer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/28189#pullrequestreview-3432159197 From thartmann at openjdk.org Fri Nov 7 08:10:03 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 7 Nov 2025 08:10:03 GMT Subject: RFR: 8371432: [BACKOUT] 8359256: AArch64: Use SHA3 GPR intrinsic where it's faster In-Reply-To: <_UwQTwO9rCnvkMLH1hcJdkV_rYtMJUJVcQe-Z4pgHuM=.d33d04bd-46bb-4ca9-87a8-342d1f5aec59@github.com> References: <_UwQTwO9rCnvkMLH1hcJdkV_rYtMJUJVcQe-Z4pgHuM=.d33d04bd-46bb-4ca9-87a8-342d1f5aec59@github.com> Message-ID: On Fri, 7 Nov 2025 07:55:24 GMT, Tobias Hartmann wrote: > Clean backout of [JDK-8359256](https://bugs.openjdk.org/browse/JDK-8359256) due to failures with different tests in our CI (see JBS for details). > > Thanks, > Tobias Thanks for the quick reviews, Marc and Emanuel. Running sanity testing before integration. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28189#issuecomment-3501195780 From jsikstro at openjdk.org Fri Nov 7 08:49:02 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Fri, 7 Nov 2025 08:49:02 GMT Subject: RFR: 8370843: Deprecate AlwaysActAsServerClassMachine and NeverActAsServerClassMachine In-Reply-To: <6hEjGUOSdQZLd59jx1J-HI5n0-54TlN469u5_2F5HBU=.00ea17df-a97f-457f-8040-fc2392d96075@github.com> References: <6hEjGUOSdQZLd59jx1J-HI5n0-54TlN469u5_2F5HBU=.00ea17df-a97f-457f-8040-fc2392d96075@github.com> Message-ID: <64m1-OK62CN2it4vnJ6j0hzQSFU7JoQv_zTL3W67MDw=.775a407e-5dee-4eec-b721-85ca495939c3@github.com> On Wed, 5 Nov 2025 15:49:58 GMT, Vladimir Kozlov wrote: > @jsikstro do we have any test which use these flags? Yes, the following tests use one or both of the `AlwaysActAsServerClassMachine` and `NeverActAsServerClassMachine` flags. I've run them all locally to see that they are not affected by the deprecation warning that is printed when using the flag after this patch.
test/hotspot/jtreg/gc/arguments/TestSelectDefaultGC.java test/jdk/jdk/jfr/event/compiler/TestCompilerPhase.java test/hotspot/gtest/runtime/test_globals.cpp I didn't mention it in the PR summary, but I've also run tier1-2 for sanity. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28148#issuecomment-3501335945 From duke at openjdk.org Fri Nov 7 09:03:03 2025 From: duke at openjdk.org (Ruben) Date: Fri, 7 Nov 2025 09:03:03 GMT Subject: RFR: 8371388: [BACKOUT] JDK-8365047: Remove exception handler stub code in C2 In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 07:36:16 GMT, Tobias Hartmann wrote: > Clean backout of [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) due to massive failures with different tests in our CI (see JBS for details). > > Thanks, > Tobias I have a candidate patch that should address the reported failure from JDK-8371388. If preferred, I can open a PR with this fix as an alternative to reverting the original change. Please let me know. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28187#issuecomment-3501379345 From phubner at openjdk.org Fri Nov 7 09:16:32 2025 From: phubner at openjdk.org (Paul Hübner) Date: Fri, 7 Nov 2025 09:16:32 GMT Subject: RFR: 8371216: oopDesc::print_value_on breaks if klass is garbage Message-ID: Hi all, The `oopDesc::print_value_on` function checks if an oop is a string, and if so just prints the raw string. To do this, it needs to read the `klass()`. If the `klass()` reads garbage, one of many assertion errors is likely triggered. For example, if G1's verification finds problematic oops, it will attempt to print them. If these oops have garbage (incorrect or racy) klasses, this will cause an assertion error, fail fast, and crash the VM. G1 never finishes printing, which may make debugging more difficult. The developer can/will be made aware in other ways if the `klass()` is garbage, for example by being told that it is not in the metaspace.
We observed the above in Valhalla and already patched it there. Testing: tiers 1-5 on Linux (x64, AArch64), macOS (x64, AArch64), Windows (x64). ------------- Commit messages: - Print oop without klass asserts. Changes: https://git.openjdk.org/jdk/pull/28190/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28190&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371216 Stats: 9 lines in 3 files changed: 8 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28190.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28190/head:pull/28190 PR: https://git.openjdk.org/jdk/pull/28190 From thartmann at openjdk.org Fri Nov 7 09:20:17 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 7 Nov 2025 09:20:17 GMT Subject: RFR: 8371388: [BACKOUT] JDK-8365047: Remove exception handler stub code in C2 In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 09:00:48 GMT, Ruben wrote: >> Clean backout of [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) due to massive failures with different tests in our CI (see JBS for details). >> >> Thanks, >> Tobias > > I have a candidate patch that should address the reported failure from JDK-8371388. If preferred, I can open a PR with this fix as an alternative to reverting the original change. Please let me know. Thanks @ruben-arm. Let's go with this clean back out for now - testing just came back clean. Once you have the REDO ready, I can submit some more sophisticated testing to make sure it addresses the issues we're seeing. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28187#issuecomment-3501435282 From thartmann at openjdk.org Fri Nov 7 09:20:18 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 7 Nov 2025 09:20:18 GMT Subject: Integrated: 8371388: [BACKOUT] JDK-8365047: Remove exception handler stub code in C2 In-Reply-To: References: Message-ID: <1naK9sqlndcBD4rKhJm8zNEn1dMr-vwWOo4Qv4M2cxg=.17d0ade7-6c5f-4453-9fbf-cea4372ad1f8@github.com> On Fri, 7 Nov 2025 07:36:16 GMT, Tobias Hartmann wrote: > Clean backout of [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) due to massive failures with different tests in our CI (see JBS for details). > > Thanks, > Tobias This pull request has now been integrated. Changeset: 48bbc950 Author: Tobias Hartmann URL: https://git.openjdk.org/jdk/commit/48bbc950f11113a57ea03f877bc3e526982c0eef Stats: 569 lines in 41 files changed: 216 ins; 268 del; 85 mod 8371388: [BACKOUT] JDK-8365047: Remove exception handler stub code in C2 Reviewed-by: chagedorn, epeter ------------- PR: https://git.openjdk.org/jdk/pull/28187 From thartmann at openjdk.org Fri Nov 7 09:22:13 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 7 Nov 2025 09:22:13 GMT Subject: Integrated: 8371432: [BACKOUT] 8359256: AArch64: Use SHA3 GPR intrinsic where it's faster In-Reply-To: <_UwQTwO9rCnvkMLH1hcJdkV_rYtMJUJVcQe-Z4pgHuM=.d33d04bd-46bb-4ca9-87a8-342d1f5aec59@github.com> References: <_UwQTwO9rCnvkMLH1hcJdkV_rYtMJUJVcQe-Z4pgHuM=.d33d04bd-46bb-4ca9-87a8-342d1f5aec59@github.com> Message-ID: On Fri, 7 Nov 2025 07:55:24 GMT, Tobias Hartmann wrote: > Clean backout of [JDK-8359256](https://bugs.openjdk.org/browse/JDK-8359256) due to failures with different tests in our CI (see JBS for details). > > Thanks, > Tobias This pull request has now been integrated. 
Changeset: 3d6824e8 Author: Tobias Hartmann URL: https://git.openjdk.org/jdk/commit/3d6824e802bda6efed40f7613eda7c8c0d84e673 Stats: 19 lines in 2 files changed: 0 ins; 6 del; 13 mod 8371432: [BACKOUT] 8359256: AArch64: Use SHA3 GPR intrinsic where it's faster Reviewed-by: mchevalier, epeter, syan ------------- PR: https://git.openjdk.org/jdk/pull/28189 From stuefe at openjdk.org Fri Nov 7 10:07:12 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 7 Nov 2025 10:07:12 GMT Subject: RFR: 8371216: oopDesc::print_value_on breaks if klass is garbage In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 09:08:33 GMT, Paul Hübner wrote: > Hi all, > > The `oopDesc::print_value_on` function checks if an oop is a string, and if so just prints the raw string. To do this, it needs to read the `klass()`. If the `klass()` reads garbage, one of many assertion errors is likely triggered. > > For example, if G1's verification finds problematic oops, it will attempt to print them. If these oops have garbage (incorrect or racy) klasses, this will cause an assertion error, fail fast, and crash the VM. G1 never finishes printing, which may make debugging more difficult. The developer can/will be made aware in other ways if the `klass()` is garbage, for example by being told that it is not in the metaspace. > > We observed the above in Valhalla and already patched it there. > > Testing: tiers 1-5 on Linux (x64, AArch64), macOS (x64, AArch64), Windows (x64). I would not add the helper function for only one case. We already have a bunch of helpers like these. Just test for oop != NULL at the call site.
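The call-site check suggested above can be illustrated with a minimal sketch. This is not the HotSpot code: `FakeKlass`, `FakeObj` and `print_field` are hypothetical stand-ins invented for illustration, and the sketch only covers the null case, not the garbage-klass case the PR is about.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-ins; these are NOT the HotSpot Klass/oopDesc types.
struct FakeKlass {
  std::string name;
};

struct FakeObj {
  const FakeKlass* klass;

  // Reads the klass, so it must only be called on a usable object.
  void print_value_on(std::ostream& st) const {
    st << "a " << klass->name;
  }
};

// The null test lives at the call site, so no extra
// null-tolerant helper function is needed.
void print_field(std::ostream& st, const FakeObj* obj) {
  if (obj != nullptr) {
    obj->print_value_on(st);
  } else {
    st << "null";
  }
}
```

The trade-off is the one raised in the review: a one-line check at the single call site avoids growing the set of printing helpers.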
------------- PR Review: https://git.openjdk.org/jdk/pull/28190#pullrequestreview-3432741260 From eosterlund at openjdk.org Fri Nov 7 10:11:08 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 7 Nov 2025 10:11:08 GMT Subject: RFR: 8365932: Implementation of JEP 516: Ahead-of-Time Object Caching with Any GC [v16] In-Reply-To: References: Message-ID: > This is the implementation of JEP 516: Ahead-of-Time Object Caching with Any GC. > > The current mechanism for the AOT cache to cache heap objects is by using mmap to place bytes from a file directly in the GC managed heap. This mechanism poses compatibility challenges that all GCs have to have bit by bit identical object and reference formats, as the layout decisions are offline. This has so far meant that AOT cache optimizations requiring heap objects are not available when using ZGC. This work ensures that all GCs, including ZGC, are able to use the more advanced AOT cache functionality going forward. > > This JEP introduces a new mechanism for archiving a primordial heap, without such compatibility problems. It embraces online layouts and allocates objects one by one, linking them using the Access API, like normal objects. This way, archived objects quack like any other object to the GC, and the GC implementations are decoupled from the archiving mechanism. > > The key to doing this GC agnostic object loading is to represent object references between objects as object indices (e.g. 1, 2, 3) instead of raw pointers that we hope all GCs will recognise the same. These object indices become the key way of identifying objects. One table maps object indices to archived objects, and another table maps object indices to heap objects that have been allocated at runtime. This allows online linking of the materialized heap objects. > > The main interface to the cached heap is roots. Different components can register object roots at dump time. Each root gets assigned a root index.
At runtime, requests can be made to get a reference to an object at a root index. The new implementation uses lazy materialization and concurrency. When a thread asks for a root object, it must ensure that the given root object and its transitively reachable objects are reachable. A new background thread called the AOTThread tries to perform the bulk of the work, so that the startup impact of processing the objects one by one is not impacting the bootstrapping thread. > > Since the background thread performs the bulk of the work, the archive is laid out to ensure it can run as fast as possible. > Objects are laid out in DFS pre-order over the roots in the archive, such that the object indices and the DFS traversal orders are the same. This way, the DFS traversal that the background thread is performing is the same order as linearly materializing the objects one by one in the order they are laid out in... Erik Österlund has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 32 commits: - Merge branch 'master' into 8326035_JEP_object_streaming_v6 - Merge branch 'master' into 8326035_JEP_object_streaming_v6 - Comment update - Merge branch 'master' into 8326035_JEP_object_streaming_v6 - Merge branch 'master' into 8326035_JEP_object_streaming_v6 - remove include - Interned string value word accounting - Dont load all objects when JVMTI CFLH is on - Remove duplicate string dedup disabling when dumping - Accept interned strings sharing value with another string - ...
and 22 more: https://git.openjdk.org/jdk/compare/e34a8318...7e3dee89 ------------- Changes: https://git.openjdk.org/jdk/pull/27732/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27732&range=15 Stats: 8722 lines in 106 files changed: 5941 ins; 2318 del; 463 mod Patch: https://git.openjdk.org/jdk/pull/27732.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27732/head:pull/27732 PR: https://git.openjdk.org/jdk/pull/27732 From jsjolen at openjdk.org Fri Nov 7 10:17:30 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 7 Nov 2025 10:17:30 GMT Subject: RFR: 8358957: [ubsan]: The assert in layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v5] In-Reply-To: <2tqtSNmhY0bGDrqu06wBvUWw_bpdv311BSM-ij5EEGY=.90ef7eb5-8d88-439c-b0df-f917f3543cc2@github.com> References: <2tqtSNmhY0bGDrqu06wBvUWw_bpdv311BSM-ij5EEGY=.90ef7eb5-8d88-439c-b0df-f917f3543cc2@github.com> Message-ID: On Thu, 6 Nov 2025 13:06:41 GMT, Afshin Zafari wrote: >> Avoid using loop and UB in left-shift operation as suggested by Kim's comment in the JBS-issue. >> >> Tests: >> mach5 tiers 1-5 {macosx-aarch64, linux-x64, windows-x64} x {debug, product} > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > review comments applied Seems fine, and I did learn a new bit hack. src/hotspot/share/oops/klass.hpp line 524: > 522: // Use well known bit hack to isolate the low bit of candidates. > 523: // The usual form is (x & -x), but VS warns (C4146) about unary minus of unsigned. > 524: // So use alternate form of negation to avoid warning. "So explicitly use two's complement to avoid warning" ------------- Marked as reviewed by jsjolen (Reviewer).
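For readers unfamiliar with the bit hack mentioned in the quoted comment, here is a minimal sketch. The function name `lowest_set_bit` is hypothetical, and the exact expression used in klass.hpp is not shown in this thread; the sketch assumes the "alternate form of negation" is the explicit two's complement `~x + 1`.

```cpp
#include <cassert>
#include <cstdint>

// Isolate the lowest set bit of x (returns 0 when x == 0).
// The usual form is (x & -x); writing the negation as (~x + 1)
// computes the same two's complement value without applying
// unary minus to an unsigned operand.
constexpr std::uint32_t lowest_set_bit(std::uint32_t x) {
  return x & (~x + 1u);
}

// Compile-time check: 0x28 is binary 101000, lowest set bit is 0x8.
static_assert(lowest_set_bit(0x28u) == 0x8u, "lowest set bit of 0x28");
```

Unsigned arithmetic wraps modulo 2^32, so `~x + 1u` is well defined for every input, including 0 and 0x80000000.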
PR Review: https://git.openjdk.org/jdk/pull/27288#pullrequestreview-3432805526 PR Review Comment: https://git.openjdk.org/jdk/pull/27288#discussion_r2502572890 From azafari at openjdk.org Fri Nov 7 11:07:52 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 7 Nov 2025 11:07:52 GMT Subject: RFR: 8358957: [ubsan]: The assert in layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v6] In-Reply-To: References: Message-ID: > Avoid using loop and UB in left-shift operation as suggested by Kim's comment in the JBS-issue. > > Tests: > mach5 tiers 1-5 {macosx-aarch64, linux-x64, windows-x64} x {debug, product} Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: comment replaced. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27288/files - new: https://git.openjdk.org/jdk/pull/27288/files/cc3831fa..f0d4dfbd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27288&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27288&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/27288.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27288/head:pull/27288 PR: https://git.openjdk.org/jdk/pull/27288 From azafari at openjdk.org Fri Nov 7 11:07:56 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 7 Nov 2025 11:07:56 GMT Subject: RFR: 8358957: [ubsan]: The assert in layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v5] In-Reply-To: References: <2tqtSNmhY0bGDrqu06wBvUWw_bpdv311BSM-ij5EEGY=.90ef7eb5-8d88-439c-b0df-f917f3543cc2@github.com> Message-ID: On Fri, 7 Nov 2025 10:14:06 GMT, Johan Sjölen wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> review comments applied > > src/hotspot/share/oops/klass.hpp line 524: > >> 522: // Use well known bit hack to isolate the low bit of candidates.
>> 523: // The usual form is (x & -x), but VS warns (C4146) about unary minus of unsigned. >> 524: // So use alternate form of negation to avoid warning. > > "So explicitly use two's complement to avoid warning" Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27288#discussion_r2502823874 From aboldtch at openjdk.org Fri Nov 7 11:10:10 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 7 Nov 2025 11:10:10 GMT Subject: RFR: 8365932: Implementation of JEP 516: Ahead-of-Time Object Caching with Any GC [v16] In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 10:11:08 GMT, Erik Österlund wrote: >> This is the implementation of JEP 516: Ahead-of-Time Object Caching with Any GC. >> >> The current mechanism for the AOT cache to cache heap objects is by using mmap to place bytes from a file directly in the GC managed heap. This mechanism poses compatibility challenges that all GCs have to have bit by bit identical object and reference formats, as the layout decisions are offline. This has so far meant that AOT cache optimizations requiring heap objects are not available when using ZGC. This work ensures that all GCs, including ZGC, are able to use the more advanced AOT cache functionality going forward. >> >> This JEP introduces a new mechanism for archiving a primordial heap, without such compatibility problems. It embraces online layouts and allocates objects one by one, linking them using the Access API, like normal objects. This way, archived objects quack like any other object to the GC, and the GC implementations are decoupled from the archiving mechanism. >> >> The key to doing this GC agnostic object loading is to represent object references between objects as object indices (e.g. 1, 2, 3) instead of raw pointers that we hope all GCs will recognise the same. These object indices become the key way of identifying objects.
One table maps object indices to archived objects, and another table maps object indices to heap objects that have been allocated at runtime. This allows online linking of the materialized heap objects. >> >> The main interface to the cached heap is roots. Different components can register object roots at dump time. Each root gets assigned a root index. At runtime, requests can be made to get a reference to an object at a root index. The new implementation uses lazy materialization and concurrency. When a thread asks for a root object, it must ensure that the given root object and its transitively reachable objects are reachable. A new background thread called the AOTThread tries to perform the bulk of the work, so that the startup impact of processing the objects one by one is not impacting the bootstrapping thread. >> >> Since the background thread performs the bulk of the work, the archive is laid out to ensure it can run as fast as possible. >> Objects are laid out in DFS pre-order over the roots in the archive, such that the object indices and the DFS traversal orders are the same. This way, the DFS traversal that the background thread is performing is the same order as linearly materializing the objects one by one in the or... > > Erik Österlund has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 32 commits: > > - Merge branch 'master' into 8326035_JEP_object_streaming_v6 > - Merge branch 'master' into 8326035_JEP_object_streaming_v6 > - Comment update > - Merge branch 'master' into 8326035_JEP_object_streaming_v6 > - Merge branch 'master' into 8326035_JEP_object_streaming_v6 > - remove include > - Interned string value word accounting > - Dont load all objects when JVMTI CFLH is on > - Remove duplicate string dedup disabling when dumping > - Accept interned strings sharing value with another string > - ... and 22 more: https://git.openjdk.org/jdk/compare/e34a8318...7e3dee89 Final merges look good.
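The index-based linking described in the quoted PR summary can be pictured with a small sketch. Everything below is hypothetical and greatly simplified: `ArchivedObj`, `HeapObj` and `Materializer` are names invented for illustration, index 0 is assumed to stand for null, and the real implementation allocates through the GC and the Access API rather than `std::make_unique`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical archived form: references are object indices, not pointers.
// Assumption for this sketch: index 0 means "null".
struct ArchivedObj {
  int payload;
  std::vector<std::size_t> field_indices;
};

// Hypothetical runtime form with real pointers.
struct HeapObj {
  int payload = 0;
  std::vector<HeapObj*> fields;
};

class Materializer {
 public:
  explicit Materializer(const std::vector<ArchivedObj>& archive)
      : _archive(archive), _heap(archive.size()) {}

  // Returns the runtime object for an index, lazily materializing it
  // and everything transitively reachable from it on first use.
  HeapObj* get(std::size_t index) {
    if (index == 0) return nullptr;
    if (_heap[index] != nullptr) return _heap[index].get();
    const ArchivedObj& src = _archive[index];
    // Register the object before linking its fields so cycles terminate.
    _heap[index] = std::make_unique<HeapObj>();
    HeapObj* obj = _heap[index].get();
    obj->payload = src.payload;
    for (std::size_t ref : src.field_indices) {
      obj->fields.push_back(get(ref));
    }
    return obj;
  }

 private:
  const std::vector<ArchivedObj>& _archive;     // index -> archived object
  std::vector<std::unique_ptr<HeapObj>> _heap;  // index -> runtime object
};
```

The two vectors play the role of the two tables in the description: one maps an index to the archived object, the other maps the same index to the object allocated at runtime. The sketch is single-threaded and leaves out the background thread and the DFS layout optimization.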
------------- Marked as reviewed by aboldtch (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27732#pullrequestreview-3433114312 From duke at openjdk.org Fri Nov 7 11:14:58 2025 From: duke at openjdk.org (Ruben) Date: Fri, 7 Nov 2025 11:14:58 GMT Subject: RFR: 8371458: [REDO] - Remove exception handler stub code in C2 Message-ID: The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. ------------- Commit messages: - x86: Fix post-call NOP check access outside code blob - Rename deoptHandlerOffsetField -> deoptHandlerEntryOffsetField - Merge from the main branch - Address review comments and fix a mistype - Check for NOP and MOVK separately in NativePostCallNop - Test for deoptimization in virtual threads - Merge from the main branch - Address review comments - Address review comments - Address review comments - ... 
and 6 more: https://git.openjdk.org/jdk/compare/1922c4fd...7bb43523 Changes: https://git.openjdk.org/jdk/pull/28192/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28192&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371458 Stats: 571 lines in 42 files changed: 269 ins; 216 del; 86 mod Patch: https://git.openjdk.org/jdk/pull/28192.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28192/head:pull/28192 PR: https://git.openjdk.org/jdk/pull/28192 From duke at openjdk.org Fri Nov 7 11:19:03 2025 From: duke at openjdk.org (Ruben) Date: Fri, 7 Nov 2025 11:19:03 GMT Subject: RFR: 8371458: [REDO] - Remove exception handler stub code in C2 In-Reply-To: References: Message-ID: <4Cyqe6dY27J8AnLfhLTyrliGYxnMdCJtAdGfJy2Ypxw=.cd37b689-667d-4e55-888b-fc4b7d440792@github.com> On Fri, 7 Nov 2025 11:07:40 GMT, Ruben wrote: > The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. > > The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. > > According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > Thanks @ruben-arm. Let's go with this clean back out for now - testing just came back clean. Once you have the REDO ready, I can submit some more sophisticated testing to make sure it addresses the issues we're seeing. @TobiHartmann, I would appreciate it if you could initiate the extended testing for this PR. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28192#issuecomment-3501950023 From bulasevich at openjdk.org Fri Nov 7 12:01:14 2025 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Fri, 7 Nov 2025 12:01:14 GMT Subject: RFR: 8371432: [BACKOUT] 8359256: AArch64: Use SHA3 GPR intrinsic where it's faster In-Reply-To: <_UwQTwO9rCnvkMLH1hcJdkV_rYtMJUJVcQe-Z4pgHuM=.d33d04bd-46bb-4ca9-87a8-342d1f5aec59@github.com> References: <_UwQTwO9rCnvkMLH1hcJdkV_rYtMJUJVcQe-Z4pgHuM=.d33d04bd-46bb-4ca9-87a8-342d1f5aec59@github.com> Message-ID: On Fri, 7 Nov 2025 07:55:24 GMT, Tobias Hartmann wrote: > Clean backout of [JDK-8359256](https://bugs.openjdk.org/browse/JDK-8359256) due to failures with different tests in our CI (see JBS for details). > > Thanks, > Tobias Ouch. Sorry for the trouble. I focused on performance testing across multiple machines and missed an obvious jtreg failure in the process. My bad. I'll come back with an updated PR next week. Thanks for fixing it! ------------- PR Comment: https://git.openjdk.org/jdk/pull/28189#issuecomment-3502149712 From thartmann at openjdk.org Fri Nov 7 12:03:01 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 7 Nov 2025 12:03:01 GMT Subject: RFR: 8371458: [REDO] - Remove exception handler stub code in C2 In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 11:07:40 GMT, Ruben wrote: > The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. > > The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. > > According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. Sure, I submitted testing.
------------- PR Comment: https://git.openjdk.org/jdk/pull/28192#issuecomment-3502158718 From eosterlund at openjdk.org Fri Nov 7 12:57:49 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 7 Nov 2025 12:57:49 GMT Subject: RFR: 8365932: Implementation of JEP 516: Ahead-of-Time Object Caching with Any GC [v17] In-Reply-To: References: Message-ID: > This is the implementation of JEP 516: Ahead-of-Time Object Caching with Any GC. > > The current mechanism for the AOT cache to cache heap objects is by using mmap to place bytes from a file directly in the GC managed heap. This mechanism poses compatibility challenges that all GCs have to have bit by bit identical object and reference formats, as the layout decisions are offline. This has so far meant that AOT cache optimizations requiring heap objects are not available when using ZGC. This work ensures that all GCs, including ZGC, are able to use the more advanced AOT cache functionality going forward. > > This JEP introduces a new mechanism for archiving a primordial heap, without such compatibility problems. It embraces online layouts and allocates objects one by one, linking them using the Access API, like normal objects. This way, archived objects quack like any other object to the GC, and the GC implementations are decoupled from the archiving mechanism. > > The key to doing this GC agnostic object loading is to represent object references between objects as object indices (e.g. 1, 2, 3) instead of raw pointers that we hope all GCs will recognise the same. These object indices become the key way of identifying objects. One table maps object indices to archived objects, and another table maps object indices to heap objects that have been allocated at runtime. This allows online linking of the materialized heap objects. > > The main interface to the cached heap is roots. Different components can register object roots at dump time. Each root gets assigned a root index.
At runtime, requests can be made to get a reference to an object at a root index. The new implementation uses lazy materialization and concurrency. When a thread asks for a root object, it must ensure that the given root object and its transitively reachable objects are reachable. A new background thread called the AOTThread tries to perform the bulk of the work, so that the startup impact of processing the objects one by one is not impacting the bootstrapping thread. > > Since the background thread performs the bulk of the work, the archive is laid out to ensure it can run as fast as possible. > Objects are laid out in DFS pre-order over the roots in the archive, such that the object indices and the DFS traversal orders are the same. This way, the DFS traversal that the background thread is performing is the same order as linearly materializing the objects one by one in the order they are laid out in... Erik Österlund has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits: - Fix test group anomaly - Merge branch 'master' into 8326035_JEP_object_streaming_v6 - Merge branch 'master' into 8326035_JEP_object_streaming_v6 - Merge branch 'master' into 8326035_JEP_object_streaming_v6 - Comment update - Merge branch 'master' into 8326035_JEP_object_streaming_v6 - Merge branch 'master' into 8326035_JEP_object_streaming_v6 - remove include - Interned string value word accounting - Dont load all objects when JVMTI CFLH is on - ...
and 24 more: https://git.openjdk.org/jdk/compare/167c952b...754006b5 ------------- Changes: https://git.openjdk.org/jdk/pull/27732/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27732&range=16 Stats: 8724 lines in 107 files changed: 5942 ins; 2318 del; 464 mod Patch: https://git.openjdk.org/jdk/pull/27732.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27732/head:pull/27732 PR: https://git.openjdk.org/jdk/pull/27732 From aboldtch at openjdk.org Fri Nov 7 12:57:49 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 7 Nov 2025 12:57:49 GMT Subject: RFR: 8365932: Implementation of JEP 516: Ahead-of-Time Object Caching with Any GC [v17] In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 12:54:27 GMT, Erik Österlund wrote: >> This is the implementation of JEP 516: Ahead-of-Time Object Caching with Any GC. >> >> The current mechanism for the AOT cache to cache heap objects is by using mmap to place bytes from a file directly in the GC managed heap. This mechanism poses compatibility challenges that all GCs have to have bit by bit identical object and reference formats, as the layout decisions are offline. This has so far meant that AOT cache optimizations requiring heap objects are not available when using ZGC. This work ensures that all GCs, including ZGC, are able to use the more advanced AOT cache functionality going forward. >> >> This JEP introduces a new mechanism for archiving a primordial heap, without such compatibility problems. It embraces online layouts and allocates objects one by one, linking them using the Access API, like normal objects. This way, archived objects quack like any other object to the GC, and the GC implementations are decoupled from the archiving mechanism. >> >> The key to doing this GC agnostic object loading is to represent object references between objects as object indices (e.g. 1, 2, 3) instead of raw pointers that we hope all GCs will recognise the same. 
These object indices become the key way of identifying objects. One table maps object indices to archived objects, and another table maps object indices to heap objects that have been allocated at runtime. This allows online linking of the materialized heap objects. >> >> The main interface to the cached heap is roots. Different components can register object roots at dump time. Each root gets assigned a root index. At runtime, requests can be made to get a reference to an object at a root index. The new implementation uses lazy materialization and concurrency. When a thread asks for a root object, it must ensure that the given root object and its transitively reachable objects are reachable. A new background thread called the AOTThread, tries to perform the bulk of the work, so that the startup impact of processing the objects one by one is not impacting the bootstrapping thread. >> >> Since the background thread performs the bulk of the work, the archive is laid out to ensure it can run as fast as possible. >> Objects are laid out in DFS pre order over the roots in the archive, such that the object indices and the DFS traversal orders are the same. This way, the DFS traversal that the background thread is performing is the same order as linearly materializing the objects one by one in the or... > > Erik Österlund has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits: > > - Fix test group anomaly > - Merge branch 'master' into 8326035_JEP_object_streaming_v6 > - Merge branch 'master' into 8326035_JEP_object_streaming_v6 > - Merge branch 'master' into 8326035_JEP_object_streaming_v6 > - Comment update > - Merge branch 'master' into 8326035_JEP_object_streaming_v6 > - Merge branch 'master' into 8326035_JEP_object_streaming_v6 > - remove include > - Interned string value word accounting > - Dont load all objects when JVMTI CFLH is on > - ... 
and 24 more: https://git.openjdk.org/jdk/compare/167c952b...754006b5 Marked as reviewed by aboldtch (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/27732#pullrequestreview-3433727151 From phubner at openjdk.org Fri Nov 7 13:31:04 2025 From: phubner at openjdk.org (Paul Hübner) Date: Fri, 7 Nov 2025 13:31:04 GMT Subject: RFR: 8371216: oopDesc::print_value_on breaks if klass is garbage In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 10:04:35 GMT, Thomas Stuefe wrote: > I would not add the helper function for only one case. We already have a bunch of helpers like these. Just test for oop != NULL at the call site. In the scenario I am referring to, we had an always-non-null oop and went through `CompressedKlassPointers::decode_not_null` (which has like 12 assertions). Most of the time, we failed because of the klass range check. In rare cases for this particular instance, but quite often in other cases like [JDK-8366794](https://bugs.openjdk.org/browse/JDK-8366794), the klass was null. I don't think it's sufficient to just check for oop nullness. I agree with you, it'd be nicer to check at the call site. How do you feel about inlining the proposed function into: if (obj != nullptr && obj->klass_without_asserts() == vmClasses::String_klass()) { java_lang_String::print(obj, st); print_address_on(st); } else // [...] ------------- PR Comment: https://git.openjdk.org/jdk/pull/28190#issuecomment-3502572955 From coleenp at openjdk.org Fri Nov 7 13:44:03 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 7 Nov 2025 13:44:03 GMT Subject: RFR: 8371216: oopDesc::print_value_on breaks if klass is garbage In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 09:08:33 GMT, Paul Hübner wrote: > Hi all, > > The `oopDesc::print_value_on` function checks if an oop is a string, and if so just prints the raw string. To do this, it needs to read the `klass()`. 
If the `klass()` reads garbage, one of many assertion errors is likely triggered. > > For example, if G1's verification finds problematic oops, it will attempt to print them. If these oops have garbage (incorrect or racy) klasses, this will cause an assertion error, fail fast, and VM crash. G1 never finishes printing, which may make debugging more difficult. The developer can/will be made aware in other ways if the `klass()` is garbage, for example by being told that it is not in the metaspace. > > We observed the above in Valhalla and already patched it there. > > Testing: tiers 1-5 on Linux (x64, AArch64), macOS (x64, AArch64), Windows (x64). I like the inlining option. ------------- PR Review: https://git.openjdk.org/jdk/pull/28190#pullrequestreview-3434008644 From phubner at openjdk.org Fri Nov 7 14:35:23 2025 From: phubner at openjdk.org (Paul Hübner) Date: Fri, 7 Nov 2025 14:35:23 GMT Subject: RFR: 8371216: oopDesc::print_value_on breaks if klass is garbage [v2] In-Reply-To: References: Message-ID: > Hi all, > > The `oopDesc::print_value_on` function checks if an oop is a string, and if so just prints the raw string. To do this, it needs to read the `klass()`. If the `klass()` reads garbage, one of many assertion errors is likely triggered. > > For example, if G1's verification finds problematic oops, it will attempt to print them. If these oops have garbage (incorrect or racy) klasses, this will cause an assertion error, fail fast, and VM crash. G1 never finishes printing, which may make debugging more difficult. The developer can/will be made aware in other ways if the `klass()` is garbage, for example by being told that it is not in the metaspace. > > We observed the above in Valhalla and already patched it there. > > Testing: tiers 1-5 on Linux (x64, AArch64), macOS (x64, AArch64), Windows (x64). Paul Hübner has updated the pull request incrementally with one additional commit since the last revision: Don't make a new function. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/28190/files - new: https://git.openjdk.org/jdk/pull/28190/files/e8be53d0..b5991b7f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28190&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28190&range=00-01 Stats: 12 lines in 3 files changed: 3 ins; 8 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28190.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28190/head:pull/28190 PR: https://git.openjdk.org/jdk/pull/28190 From coleenp at openjdk.org Fri Nov 7 14:44:04 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 7 Nov 2025 14:44:04 GMT Subject: RFR: 8371216: oopDesc::print_value_on breaks if klass is garbage [v2] In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 14:35:23 GMT, Paul Hübner wrote: >> Hi all, >> >> The `oopDesc::print_value_on` function checks if an oop is a string, and if so just prints the raw string. To do this, it needs to read the `klass()`. If the `klass()` reads garbage, one of many assertion errors is likely triggered. >> >> For example, if G1's verification finds problematic oops, it will attempt to print them. If these oops have garbage (incorrect or racy) klasses, this will cause an assertion error, fail fast, and VM crash. G1 never finishes printing, which may make debugging more difficult. The developer can/will be made aware in other ways if the `klass()` is garbage, for example by being told that it is not in the metaspace. >> >> We observed the above in Valhalla and already patched it there. >> >> Testing: tiers 1-5 on Linux (x64, AArch64), macOS (x64, AArch64), Windows (x64). > > Paul Hübner has updated the pull request incrementally with one additional commit since the last revision: > > Don't make a new function. This looks good! ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28190#pullrequestreview-3434385522 From asmehra at openjdk.org Fri Nov 7 14:44:32 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Fri, 7 Nov 2025 14:44:32 GMT Subject: RFR: 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly Message-ID: The closure passed to `HashTable::iterate` in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` returns an incorrect value. If the search is successful, it should return false to terminate the iteration, but it is returning true. This patch fixes the return value of these closures. In addition, it also updates `CompactHashTable::iterate` to be the same as `HashTable::iterate` by using the return value of the closure to determine if the iteration should continue or abort. It also adds `CompactHashTable::iterate_all` to iterate all the values unconditionally. Users of `CompactHashTable::iterate` are updated to use `CompactHashTable::iterate_all`. ------------- Commit messages: - 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly Changes: https://git.openjdk.org/jdk/pull/28197/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28197&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371418 Stats: 72 lines in 7 files changed: 35 ins; 10 del; 27 mod Patch: https://git.openjdk.org/jdk/pull/28197.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28197/head:pull/28197 PR: https://git.openjdk.org/jdk/pull/28197 From eosterlund at openjdk.org Fri Nov 7 14:50:52 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 7 Nov 2025 14:50:52 GMT Subject: RFR: 8365932: Implementation of JEP 516: Ahead-of-Time Object Caching with Any GC [v18] In-Reply-To: References: Message-ID: > This is the implementation of JEP 516: Ahead-of-Time Object Caching with Any GC. 
> > The current mechanism for the AOT cache to cache heap objects is by using mmap to place bytes from a file directly in the GC managed heap. This mechanism poses compatibility challenges that all GCs have to have bit by bit identical object and reference formats, as the layout decisions are offline. This has so far meant that AOT cache optimizations requiring heap objects are not available when using ZGC. This work ensures that all GCs, including ZGC, are able to use the more advanced AOT cache functionality going forward. > > This JEP introduces a new mechanism for archiving a primordial heap, without such compatibility problems. It embraces online layouts and allocates objects one by one, linking them using the Access API, like normal objects. This way, archived objects quack like any other object to the GC, and the GC implementations are decoupled from the archiving mechanism. > > The key to doing this GC agnostic object loading is to represent object references between objects as object indices (e.g. 1, 2, 3) instead of raw pointers that we hope all GCs will recognise the same. These object indices become the key way of identifying objects. One table maps object indices to archived objects, and another table maps object indices to heap objects that have been allocated at runtime. This allows online linking of the materialized heap objects. > > The main interface to the cached heap is roots. Different components can register object roots at dump time. Each root gets assigned a root index. At runtime, requests can be made to get a reference to an object at a root index. The new implementation uses lazy materialization and concurrency. When a thread asks for a root object, it must ensure that the given root object and its transitively reachable objects are reachable. A new background thread called the AOTThread, tries to perform the bulk of the work, so that the startup impact of processing the objects one by one is not impacting the bootstrapping thread. 
> > Since the background thread performs the bulk of the work, the archive is laid out to ensure it can run as fast as possible. > Objects are laid out in DFS pre order over the roots in the archive, such that the object indices and the DFS traversal orders are the same. This way, the DFS traversal that the background thread is performing is the same order as linearly materializing the objects one by one in the order they are laid out in... Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Remove -server in test for static GHA build ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27732/files - new: https://git.openjdk.org/jdk/pull/27732/files/754006b5..84d79301 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27732&range=17 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27732&range=16-17 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/27732.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27732/head:pull/27732 PR: https://git.openjdk.org/jdk/pull/27732 From aboldtch at openjdk.org Fri Nov 7 14:50:53 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 7 Nov 2025 14:50:53 GMT Subject: RFR: 8365932: Implementation of JEP 516: Ahead-of-Time Object Caching with Any GC [v18] In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 14:47:07 GMT, Erik Österlund wrote: >> This is the implementation of JEP 516: Ahead-of-Time Object Caching with Any GC. >> >> The current mechanism for the AOT cache to cache heap objects is by using mmap to place bytes from a file directly in the GC managed heap. This mechanism poses compatibility challenges that all GCs have to have bit by bit identical object and reference formats, as the layout decisions are offline. This has so far meant that AOT cache optimizations requiring heap objects are not available when using ZGC. 
This work ensures that all GCs, including ZGC, are able to use the more advanced AOT cache functionality going forward. >> >> This JEP introduces a new mechanism for archiving a primordial heap, without such compatibility problems. It embraces online layouts and allocates objects one by one, linking them using the Access API, like normal objects. This way, archived objects quack like any other object to the GC, and the GC implementations are decoupled from the archiving mechanism. >> >> The key to doing this GC agnostic object loading is to represent object references between objects as object indices (e.g. 1, 2, 3) instead of raw pointers that we hope all GCs will recognise the same. These object indices become the key way of identifying objects. One table maps object indices to archived objects, and another table maps object indices to heap objects that have been allocated at runtime. This allows online linking of the materialized heap objects. >> >> The main interface to the cached heap is roots. Different components can register object roots at dump time. Each root gets assigned a root index. At runtime, requests can be made to get a reference to an object at a root index. The new implementation uses lazy materialization and concurrency. When a thread asks for a root object, it must ensure that the given root object and its transitively reachable objects are reachable. A new background thread called the AOTThread, tries to perform the bulk of the work, so that the startup impact of processing the objects one by one is not impacting the bootstrapping thread. >> >> Since the background thread performs the bulk of the work, the archive is laid out to ensure it can run as fast as possible. >> Objects are laid out in DFS pre order over the roots in the archive, such that the object indices and the DFS traversal orders are the same. 
This way, the DFS traversal that the background thread is performing is the same order as linearly materializing the objects one by one in the or... > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Remove -server in test for static GHA build Marked as reviewed by aboldtch (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/27732#pullrequestreview-3434398193 From jkratochvil at openjdk.org Fri Nov 7 14:59:47 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Fri, 7 Nov 2025 14:59:47 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v8] In-Reply-To: References: Message-ID: > With clang-20 using --with-toolchain-type=clang resolvedFieldEntry.cpp and resolvedMethodEntry.cpp break the build with warnings like: > > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] > 49 | memset(this, 0, sizeof(*this)); > | ^ > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning > 49 | memset(this, 0, sizeof(*this)); > | ^ > | (void*) > > The patch follows the suggested fix. 
Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: Simplify STATIC_ASSERT according to Ioi Lam ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26098/files - new: https://git.openjdk.org/jdk/pull/26098/files/b789941f..f3a0928d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=06-07 Stats: 10 lines in 1 file changed: 0 ins; 8 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/26098.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26098/head:pull/26098 PR: https://git.openjdk.org/jdk/pull/26098 From jkratochvil at openjdk.org Fri Nov 7 15:11:15 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Fri, 7 Nov 2025 15:11:15 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v7] In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 04:48:18 GMT, Ioi Lam wrote: >> Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix 32-bit compilation error > > src/hotspot/share/oops/resolvedMethodEntry.cpp line 43: > >> 41: STATIC_ASSERT(sizeof(ResolvedMethodEntry) == 16); >> 42: # endif >> 43: #endif > > I think this can be cleaned up as: > > > #ifdef _LP64 > STATIC_ASSERT(sizeof(ResolvedMethodEntry) == DEBUG_ONLY(32) NOT_DEBUG(24)); > #else > STATIC_ASSERT(sizeof(ResolvedMethodEntry) == DEBUG_ONLY(20) NOT_DEBUG(16)); > #endif > > > But I think this will be better without the need to hard code numbers: > > > // There should be no more padding at the end of ResolvedMethodEntry > class ResolvedMethodEntryWithExtra : public ResolvedMethodEntry { > u1 _extra_field; > }; > STATIC_ASSERT(sizeof(ResolvedMethodEntryWithExtra) > sizeof(ResolvedMethodEntry)); > > > I tested by changing `u4 _padding2` to `u4 _padding2` in `ResolvedMethodEntry` and the static assert fails. 
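The derived-class check suggested in the quoted review can be sketched in isolation as follows. This is only a minimal illustration: `Entry`, `WithExtraByte` and `has_no_reusable_trailing_padding` are hypothetical names, not HotSpot code. The assumption is that, on ABIs which reuse a base class's tail padding for derived members (to my understanding the Itanium C++ ABI does this for non-POD bases), an extra one-byte field would fit into any trailing padding and leave the size unchanged, so a strictly larger derived size suggests there is no trailing padding to reuse.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an entry type: the user-provided copy
// constructor makes it non-trivially copyable, and the explicit
// _padding field is meant to leave no implicit trailing padding.
class Entry {
 public:
  Entry() : _index(0), _flags(0), _padding(0) {}
  Entry(const Entry& other)
      : _index(other._index), _flags(other._flags), _padding(other._padding) {}

  uint32_t _index;
  uint16_t _flags;
  uint16_t _padding;  // explicit padding, fills the object up to 8 bytes
};

// Adds one extra byte after the fields of T.
template <typename T>
class WithExtraByte : public T {
 public:
  uint8_t _extra_field;
};

// True when the extra byte made the object grow, i.e. it could not be
// placed into trailing padding of T.
template <typename T>
constexpr bool has_no_reusable_trailing_padding() {
  return sizeof(WithExtraByte<T>) > sizeof(T);
}

static_assert(has_no_reusable_trailing_padding<Entry>(),
              "Entry is expected to have no trailing padding");
```

As the follow-up messages point out, a check of this kind says nothing about padding between fields inside the struct.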
This STATIC_ASSERT does work for the trailing padding but not for potential padding introduced inside the struct. My sizeof(x)==number check is also not foolproof, but IMHO it is slightly safer. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2504053345 From jkratochvil at openjdk.org Fri Nov 7 15:27:48 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Fri, 7 Nov 2025 15:27:48 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v9] In-Reply-To: References: Message-ID: <5unw3lzFuTKj9pZnKiVXZozP5M6y1oGApADg_bHgILo=.b4b4a98e-66f5-43d7-9b3f-4e5cb18c8d1c@github.com> > With clang-20 using --with-toolchain-type=clang resolvedFieldEntry.cpp and resolvedMethodEntry.cpp break the build with warnings like: > > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] > 49 | memset(this, 0, sizeof(*this)); > | ^ > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning > 49 | memset(this, 0, sizeof(*this)); > | ^ > | (void*) > > The patch follows the suggested fix. 
Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: Replace the memsets ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26098/files - new: https://git.openjdk.org/jdk/pull/26098/files/f3a0928d..82643c0c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=07-08 Stats: 9 lines in 2 files changed: 0 ins; 6 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/26098.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26098/head:pull/26098 PR: https://git.openjdk.org/jdk/pull/26098 From eosterlund at openjdk.org Fri Nov 7 15:28:46 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 7 Nov 2025 15:28:46 GMT Subject: RFR: 8365932: Implementation of JEP 516: Ahead-of-Time Object Caching with Any GC [v18] In-Reply-To: References: Message-ID: <4O9v08uY1viSeMh_w821RNfKj67p74y2PqDrB8GdZCs=.e21a3d53-4a00-4f4a-99dc-589b1044d7bd@github.com> On Fri, 7 Nov 2025 14:50:52 GMT, Erik Österlund wrote: >> This is the implementation of JEP 516: Ahead-of-Time Object Caching with Any GC. >> >> The current mechanism for the AOT cache to cache heap objects is by using mmap to place bytes from a file directly in the GC managed heap. This mechanism poses compatibility challenges that all GCs have to have bit by bit identical object and reference formats, as the layout decisions are offline. This has so far meant that AOT cache optimizations requiring heap objects are not available when using ZGC. This work ensures that all GCs, including ZGC, are able to use the more advanced AOT cache functionality going forward. >> >> This JEP introduces a new mechanism for archiving a primordial heap, without such compatibility problems. It embraces online layouts and allocates objects one by one, linking them using the Access API, like normal objects. 
This way, archived objects quack like any other object to the GC, and the GC implementations are decoupled from the archiving mechanism. >> >> The key to doing this GC agnostic object loading is to represent object references between objects as object indices (e.g. 1, 2, 3) instead of raw pointers that we hope all GCs will recognise the same. These object indices become the key way of identifying objects. One table maps object indices to archived objects, and another table maps object indices to heap objects that have been allocated at runtime. This allows online linking of the materialized heap objects. >> >> The main interface to the cached heap is roots. Different components can register object roots at dump time. Each root gets assigned a root index. At runtime, requests can be made to get a reference to an object at a root index. The new implementation uses lazy materialization and concurrency. When a thread asks for a root object, it must ensure that the given root object and its transitively reachable objects are reachable. A new background thread called the AOTThread, tries to perform the bulk of the work, so that the startup impact of processing the objects one by one is not impacting the bootstrapping thread. >> >> Since the background thread performs the bulk of the work, the archive is laid out to ensure it can run as fast as possible. >> Objects are laid out in DFS pre order over the roots in the archive, such that the object indices and the DFS traversal orders are the same. This way, the DFS traversal that the background thread is performing is the same order as linearly materializing the objects one by one in the or... > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Remove -server in test for static GHA build Thank you for the reviews everyone! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/27732#issuecomment-3503184650 From eosterlund at openjdk.org Fri Nov 7 15:32:03 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 7 Nov 2025 15:32:03 GMT Subject: Integrated: 8365932: Implementation of JEP 516: Ahead-of-Time Object Caching with Any GC In-Reply-To: References: Message-ID: On Thu, 9 Oct 2025 15:18:16 GMT, Erik Österlund wrote: > This is the implementation of JEP 516: Ahead-of-Time Object Caching with Any GC. > > The current mechanism for the AOT cache to cache heap objects is by using mmap to place bytes from a file directly in the GC managed heap. This mechanism poses compatibility challenges that all GCs have to have bit by bit identical object and reference formats, as the layout decisions are offline. This has so far meant that AOT cache optimizations requiring heap objects are not available when using ZGC. This work ensures that all GCs, including ZGC, are able to use the more advanced AOT cache functionality going forward. > > This JEP introduces a new mechanism for archiving a primordial heap, without such compatibility problems. It embraces online layouts and allocates objects one by one, linking them using the Access API, like normal objects. This way, archived objects quack like any other object to the GC, and the GC implementations are decoupled from the archiving mechanism. > > The key to doing this GC agnostic object loading is to represent object references between objects as object indices (e.g. 1, 2, 3) instead of raw pointers that we hope all GCs will recognise the same. These object indices become the key way of identifying objects. One table maps object indices to archived objects, and another table maps object indices to heap objects that have been allocated at runtime. This allows online linking of the materialized heap objects. > > The main interface to the cached heap is roots. 
Different components can register object roots at dump time. Each root gets assigned a root index. At runtime, requests can be made to get a reference to an object at a root index. The new implementation uses lazy materialization and concurrency. When a thread asks for a root object, it must ensure that the given root object and its transitively reachable objects are reachable. A new background thread called the AOTThread, tries to perform the bulk of the work, so that the startup impact of processing the objects one by one is not impacting the bootstrapping thread. > > Since the background thread performs the bulk of the work, the archive is laid out to ensure it can run as fast as possible. > Objects are laid out in DFS pre order over the roots in the archive, such that the object indices and the DFS traversal orders are the same. This way, the DFS traversal that the background thread is performing is the same order as linearly materializing the objects one by one in the order they are laid out in... This pull request has now been integrated. 
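The index-based linking described in the quoted PR text can be illustrated with a toy model. This is a sketch under stated assumptions, not the JEP 516 implementation: `ArchivedObject`, `HeapObject` and `materialize` are hypothetical names, object index 0 is assumed here to stand for null, and the lazy and concurrent parts are left out. It only shows how references stored as object indices can be turned into real pointers through a table that maps indices to objects allocated at runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical archived form: references are object indices, not pointers.
// Index 0 is assumed to mean "null"; archive[i] describes object index i + 1.
struct ArchivedObject {
  int payload;
  std::vector<std::size_t> refs;
};

// Hypothetical runtime form: references are ordinary pointers.
struct HeapObject {
  int payload = 0;
  std::vector<HeapObject*> refs;
};

// Allocates one runtime object per archived object, in archive order, and
// then links them by looking up each object index in the runtime table.
std::vector<std::unique_ptr<HeapObject>> materialize(
    const std::vector<ArchivedObject>& archive) {
  std::vector<std::unique_ptr<HeapObject>> table;
  table.reserve(archive.size());

  // Pass 1: allocate objects one by one; table[i] holds object index i + 1.
  for (const ArchivedObject& archived : archive) {
    auto obj = std::make_unique<HeapObject>();
    obj->payload = archived.payload;
    table.push_back(std::move(obj));
  }

  // Pass 2: translate object indices into pointers to the allocated objects.
  for (std::size_t i = 0; i < archive.size(); ++i) {
    for (std::size_t index : archive[i].refs) {
      assert(index <= archive.size());
      table[i]->refs.push_back(index == 0 ? nullptr : table[index - 1].get());
    }
  }
  return table;
}
```

The two-pass structure is a simplification chosen so that forward and backward references are handled the same way; the PR text instead describes lazy materialization driven by root requests and a background thread.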
Changeset: c8656449 Author: Erik Österlund URL: https://git.openjdk.org/jdk/commit/c8656449c28581ae9c3d815105e338e42253bb43 Stats: 8726 lines in 108 files changed: 5942 ins; 2318 del; 466 mod 8365932: Implementation of JEP 516: Ahead-of-Time Object Caching with Any GC Co-authored-by: Axel Boldt-Christmas Co-authored-by: Joel Sikström Co-authored-by: Stefan Karlsson Reviewed-by: aboldtch, iklam, kvn ------------- PR: https://git.openjdk.org/jdk/pull/27732 From kvn at openjdk.org Fri Nov 7 15:34:27 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 7 Nov 2025 15:34:27 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 In-Reply-To: <4Cyqe6dY27J8AnLfhLTyrliGYxnMdCJtAdGfJy2Ypxw=.cd37b689-667d-4e55-888b-fc4b7d440792@github.com> References: <4Cyqe6dY27J8AnLfhLTyrliGYxnMdCJtAdGfJy2Ypxw=.cd37b689-667d-4e55-888b-fc4b7d440792@github.com> Message-ID: On Fri, 7 Nov 2025 11:16:20 GMT, Ruben wrote: >> The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. >> >> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. >> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > >> Thanks @ruben-arm. Let's go with this clean back out for now - testing just came back clean. Once you have the REDO ready, I can submit some more sophisticated testing to make sure it addresses the issues we're seeing. > > @TobiHartmann, I would appreciate it if you could initiate the extended testing for this PR. @ruben-arm what was the issue in the original changes? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28192#issuecomment-3503214174 From duke at openjdk.org Fri Nov 7 15:50:04 2025 From: duke at openjdk.org (Ruben) Date: Fri, 7 Nov 2025 15:50:04 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 In-Reply-To: References: <4Cyqe6dY27J8AnLfhLTyrliGYxnMdCJtAdGfJy2Ypxw=.cd37b689-667d-4e55-888b-fc4b7d440792@github.com> Message-ID: On Fri, 7 Nov 2025 15:31:27 GMT, Vladimir Kozlov wrote: >>> Thanks @ruben-arm. Let's go with this clean back out for now - testing just came back clean. Once you have the REDO ready, I can submit some more sophisticated testing to make sure it addresses the issues we're seeing. >> >> @TobiHartmann, I would appreciate it if you could initiate the extended testing for this PR. > > @ruben-arm what was the issue in the original changes? @vnkozlov, the `NativePostCallNop::check` https://github.com/openjdk/jdk/blob/e34a831814996be3e0a2df86b11b1718a76ea558/src/hotspot/cpu/x86/nativeInst_x86.hpp#L584 was reading 4 bytes from the perceived call site. That works when the check happens for an actual call site, as the post-call NOP sequence is always longer than 4 bytes. However, it fails when the return address in the stack frame is patched, during deoptimization, to point to the deoptimization stub code entry point. With the change, the distance between the entry point and the end of the code blob can now be just 2 bytes, so the 4-byte read would read outside the code blob. The proposed fix is to split the read in the post-call NOP check into two 2-byte reads. If the first read and comparison don't confirm that it might be a post-call NOP sequence (they never will for the deoptimization stub code entry point), the second read doesn't happen. I initially missed this issue, having incorrectly concluded that the `jmp` at the entry point would take 5 bytes instead of 2 bytes. 
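The split-read idea described above can be sketched generically. This is only an illustration with made-up names and byte values (`kMarkerFirstHalf`, `kMarkerSecondHalf`, `looks_like_marker` are hypothetical, and the constants are not the real post-call NOP encoding); it is not the code in `nativeInst_x86.hpp`. The point is that the second 2-byte read is only attempted once the first 2 bytes already match, so a location with only 2 readable bytes is not over-read as long as those bytes differ from the first half of the marker.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 4-byte marker, split into two 2-byte halves.
// The values are placeholders for illustration only.
constexpr uint16_t kMarkerFirstHalf  = 0x1F0F;
constexpr uint16_t kMarkerSecondHalf = 0x0084;

// Reads a 2-byte value without assuming any alignment of 'p'.
inline uint16_t read_u2(const uint8_t* p) {
  assert(p != nullptr);
  uint16_t value;
  std::memcpy(&value, p, sizeof(value));
  return value;
}

// Returns true if the 4 bytes at 'addr' match the marker.
// Bytes addr[2..3] are only touched when addr[0..1] already match.
inline bool looks_like_marker(const uint8_t* addr) {
  if (read_u2(addr) != kMarkerFirstHalf) {
    return false;  // early exit: no second read
  }
  return read_u2(addr + 2) == kMarkerSecondHalf;
}
```

A single 4-byte read would have touched `addr[2..3]` unconditionally, which is the out-of-bounds access the message describes.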
https://github.com/openjdk/jdk/pull/28192/commits/7bb43523b3c9d1495d72a5bc75c3912c3f51e64c should address this issue by splitting the faulting read into two. It also adjusts the expected size of deoptimization handler stub code to 7 bytes in total instead of the incorrect estimate of 10 bytes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28192#issuecomment-3503297153 From iklam at openjdk.org Fri Nov 7 16:03:19 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 7 Nov 2025 16:03:19 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v7] In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 15:08:07 GMT, Jan Kratochvil wrote: >> src/hotspot/share/oops/resolvedMethodEntry.cpp line 43: >> >>> 41: STATIC_ASSERT(sizeof(ResolvedMethodEntry) == 16); >>> 42: # endif >>> 43: #endif >> >> I think this can be cleaned up as: >> >> >> #ifdef _LP64 >> STATIC_ASSERT(sizeof(ResolvedMethodEntry) == DEBUG_ONLY(32) NOT_DEBUG(24)); >> #else >> STATIC_ASSERT(sizeof(ResolvedMethodEntry) == DEBUG_ONLY(20) NOT_DEBUG(16)); >> #endif >> >> >> But I think this will be better without the need to hard code numbers: >> >> >> // There should be no more padding at the end of ResolvedMethodEntry >> class ResolvedMethodEntryWithExtra : public ResolvedMethodEntry { >> u1 _extra_field; >> }; >> STATIC_ASSERT(sizeof(ResolvedMethodEntryWithExtra) > sizeof(ResolvedMethodEntry)); >> >> >> I tested by changing `u4 _padding2` to `u4 _padding2` in `ResolvedMethodEntry` and the static assert fails. > > This STATIC_ASSERT does work for the trailing padding but not for potential padding introduced inside the struct. That sizeof(x)==number of mine is also not foolproof, but IMHO it is slightly safer. I don't think either approach can detect internal padding. E.g., I changed this: # ifdef _LP64 - u2 _padding1; + u1 _padding1; u4 _padding2; Neither static assert failed. 
But if I changed this: # ifdef _LP64 u2 _padding1; - u4 _padding2; + u2 _padding2; My assert fails but yours doesn't. So hard coding a number cannot detect trailing paddings. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2504302756 From mdoerr at openjdk.org Fri Nov 7 16:01:01 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 7 Nov 2025 16:01:01 GMT Subject: RFR: 8371216: oopDesc::print_value_on breaks if klass is garbage [v2] In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 14:35:23 GMT, Paul Hübner wrote: >> Hi all, >> >> The `oopDesc::print_value_on` function checks if an oop is a string, and if so just prints the raw string. To do this, it needs to read the `klass()`. If the `klass()` reads garbage, one of many assertion errors is likely triggered. >> >> For example, if G1's verification finds problematic oops, it will attempt to print them. If these oops have garbage (incorrect or racy) klasses, this will cause an assertion error, fail fast, and VM crash. G1 never finishes printing, which may make debugging more difficult. The developer can/will be made aware in other ways if the `klass()` is garbage, for example by being told that it is not in the metaspace. >> >> We observed the above in Valhalla and already patched it there. >> >> Testing: tiers 1-5 on Linux (x64, AArch64), macOS (x64, AArch64), Windows (x64). > > Paul Hübner has updated the pull request incrementally with one additional commit since the last revision: > > Don't make a new function. Thanks for fixing it! I think there are more places which cause "error occurred during error reporting" due to assertions, but they can get addressed in separate issues. ------------- Marked as reviewed by mdoerr (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/28190#pullrequestreview-3434852888 From kvn at openjdk.org Fri Nov 7 16:06:04 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 7 Nov 2025 16:06:04 GMT Subject: RFR: 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 14:38:48 GMT, Ashutosh Mehra wrote: > The closure passed to `HashTable::iterate` in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` is returning incorrect value. If the search is successful, it should return false to terminate the iteration, but it is returning true. This patch fixes the return value of these closures. > In addition, I noticed `CompactHashTable::iterate` goes through all entries unconditionally, which is not optimal for cases where we may want to terminate the iteration when some condition is met. This is the case in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` when it iterates over `_aot_adapter_handler_table`. This patch updates `CompactHashTable::iterate` to be the same as `HashTAble::iterate` by using return value of the closure to determine if the iteration should continue or abort. It also adds `CompactHashTable::iterate_all` to iterate all the values unconditionally and the users of `CompactHashTable::iterate` are updated to use `CompactHashTable::iterate_all`. src/hotspot/share/classfile/compactHashtable.hpp line 309: > 307: > 308: template > 309: inline void iterate(Function& function) const { // lambda enabled API Add comment explaining when it is exiting, when iteration is interrupted to show difference from `iterate_all()` src/hotspot/share/runtime/sharedRuntime.cpp line 3415: > 3413: #endif // INCLUDE_CDS > 3414: if (!found) { > 3415: auto findblob_runtime_table = [&] (AdapterFingerPrint* key, AdapterHandlerEntry* handler) { Why do you need to pass `AdapterFingerPrint* key` argument which is not used here? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28197#discussion_r2504305349 PR Review Comment: https://git.openjdk.org/jdk/pull/28197#discussion_r2504312085 From phubner at openjdk.org Fri Nov 7 16:15:06 2025 From: phubner at openjdk.org (Paul Hübner) Date: Fri, 7 Nov 2025 16:15:06 GMT Subject: RFR: 8371216: oopDesc::print_value_on breaks if klass is garbage [v2] In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 10:04:35 GMT, Thomas Stuefe wrote: >> Paul Hübner has updated the pull request incrementally with one additional commit since the last revision: >> >> Don't make a new function. > > I would not add the helper function for only one case. We already have a bunch of helpers like these. Just test for oop != NULL at the call site. Thanks for the feedback and reviews @tstuefe @coleenp @TheRealMDoerr. I'll integrate this on Monday when the 24h period has passed and my test re-runs are finished. > I think there are more places which cause "error occurred during error reporting" due to assertions, but they can get addressed in separate issues. If you have any examples, either now or down the road, I'd be happy to take a look. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28190#issuecomment-3503445526 From asmehra at openjdk.org Fri Nov 7 16:15:06 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Fri, 7 Nov 2025 16:15:06 GMT Subject: RFR: 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly In-Reply-To: References: Message-ID: <1P_abhv-WWTdNWTe3mF8eMth7MqvuacUYHftwSquzGE=.d62972b4-ee6d-413e-bfa6-454042da55a9@github.com> On Fri, 7 Nov 2025 16:00:35 GMT, Vladimir Kozlov wrote: >> The closure passed to `HashTable::iterate` in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` is returning incorrect value. If the search is successful, it should return false to terminate the iteration, but it is returning true.
This patch fixes the return value of these closures. >> In addition, I noticed `CompactHashTable::iterate` goes through all entries unconditionally, which is not optimal for cases where we may want to terminate the iteration when some condition is met. This is the case in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` when it iterates over `_aot_adapter_handler_table`. This patch updates `CompactHashTable::iterate` to be the same as `HashTAble::iterate` by using return value of the closure to determine if the iteration should continue or abort. It also adds `CompactHashTable::iterate_all` to iterate all the values unconditionally and the users of `CompactHashTable::iterate` are updated to use `CompactHashTable::iterate_all`. > > src/hotspot/share/runtime/sharedRuntime.cpp line 3415: > >> 3413: #endif // INCLUDE_CDS >> 3414: if (!found) { >> 3415: auto findblob_runtime_table = [&] (AdapterFingerPrint* key, AdapterHandlerEntry* handler) { > > Why do you need to pass `AdapterFingerPrint* key` argument which is not used here? This is because the `function` passed to `HashTable::iterate` is called with both key and value: https://github.com/openjdk/jdk/blob/c8656449c28581ae9c3d815105e338e42253bb43/src/hotspot/share/utilities/hashTable.hpp#L272 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28197#discussion_r2504370778 From mdoerr at openjdk.org Fri Nov 7 16:19:05 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 7 Nov 2025 16:19:05 GMT Subject: RFR: 8371216: oopDesc::print_value_on breaks if klass is garbage [v2] In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 10:04:35 GMT, Thomas Stuefe wrote: >> Paul Hübner has updated the pull request incrementally with one additional commit since the last revision: >> >> Don't make a new function. > > I would not add the helper function for only one case. We already have a bunch of helpers like these. Just test for oop != NULL at the call site.
> Thanks for the feedback and reviews @tstuefe @coleenp @TheRealMDoerr. I'll integrate this on Monday when the 24h period has passed and my test re-runs are finished. > > > I think there are more places which cause "error occurred during error reporting" due to assertions, but they can get addressed in separate issues. > > If you have any examples, either now or down the road, I'd be happy to take a look. A prominent example in the debug build is the following assertion: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/oops/compressedKlass.inline.hpp#L91 The code is used by CompressedKlassPointers::decode_not_null. The error reporting code shouldn't use the version with assertions while investigating if a value may be a compressed class pointer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28190#issuecomment-3503465658 From asmehra at openjdk.org Fri Nov 7 16:22:39 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Fri, 7 Nov 2025 16:22:39 GMT Subject: RFR: 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly [v2] In-Reply-To: References: Message-ID: <0Qe86N42BrG0cQ0dA1940SRqpp-0DD68UHYD3Gz3YAg=.d6bcb486-5c7b-4738-a74d-b34c9be2a70c@github.com> > The closure passed to `HashTable::iterate` in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` is returning incorrect value. If the search is successful, it should return false to terminate the iteration, but it is returning true. This patch fixes the return value of these closures. > In addition, I noticed `CompactHashTable::iterate` goes through all entries unconditionally, which is not optimal for cases where we may want to terminate the iteration when some condition is met. This is the case in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` when it iterates over `_aot_adapter_handler_table`.
This patch updates `CompactHashTable::iterate` to be the same as `HashTAble::iterate` by using return value of the closure to determine if the iteration should continue or abort. It also adds `CompactHashTable::iterate_all` to iterate all the values unconditionally and the users of `CompactHashTable::iterate` are updated to use `CompactHashTable::iterate_all`. Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: Add comments Signed-off-by: Ashutosh Mehra ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28197/files - new: https://git.openjdk.org/jdk/pull/28197/files/b827e012..f46d3dae Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28197&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28197&range=00-01 Stats: 10 lines in 1 file changed: 6 ins; 3 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28197.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28197/head:pull/28197 PR: https://git.openjdk.org/jdk/pull/28197 From asmehra at openjdk.org Fri Nov 7 16:22:41 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Fri, 7 Nov 2025 16:22:41 GMT Subject: RFR: 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly [v2] In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 15:59:21 GMT, Vladimir Kozlov wrote: >> Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: >> >> Add comments >> >> Signed-off-by: Ashutosh Mehra > > src/hotspot/share/classfile/compactHashtable.hpp line 309: > >> 307: >> 308: template >> 309: inline void iterate(Function& function) const { // lambda enabled API > > Add comment explaining when it is exiting, when iteration is interrupted to show difference from `iterate_all()` Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28197#discussion_r2504396799 From kvn at openjdk.org Fri Nov 7 16:31:07 2025 From: kvn at openjdk.org (Vladimir 
Kozlov) Date: Fri, 7 Nov 2025 16:31:07 GMT Subject: RFR: 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly [v2] In-Reply-To: <1P_abhv-WWTdNWTe3mF8eMth7MqvuacUYHftwSquzGE=.d62972b4-ee6d-413e-bfa6-454042da55a9@github.com> References: <1P_abhv-WWTdNWTe3mF8eMth7MqvuacUYHftwSquzGE=.d62972b4-ee6d-413e-bfa6-454042da55a9@github.com> Message-ID: On Fri, 7 Nov 2025 16:12:13 GMT, Ashutosh Mehra wrote: >> src/hotspot/share/runtime/sharedRuntime.cpp line 3415: >> >>> 3413: #endif // INCLUDE_CDS >>> 3414: if (!found) { >>> 3415: auto findblob_runtime_table = [&] (AdapterFingerPrint* key, AdapterHandlerEntry* handler) { >> >> Why do you need to pass `AdapterFingerPrint* key` argument which is not used here? > > This is because the `function` passed to `HashTable::iterate` is called with both key and value: https://github.com/openjdk/jdk/blob/c8656449c28581ae9c3d815105e338e42253bb43/src/hotspot/share/utilities/hashTable.hpp#L272 Thank you for pointing this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28197#discussion_r2504449829 From jkratochvil at openjdk.org Fri Nov 7 16:31:50 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Fri, 7 Nov 2025 16:31:50 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v10] In-Reply-To: References: Message-ID: <6U-2ntDEtH64ZZsBldjP-tVWC17WOjEhLH1Ywj51zJQ=.b55260d5-ef7f-49fb-8a58-5ee1ec776161@github.com> > With clang-20 using --with-toolchain-type=clang resolveFieldEntry.cpp and resolveMethodEntry.cpp break the build with similar warnings like: > > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] > 49 | memset(this, 0, sizeof(*this)); > | ^ > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning > 49 | 
memset(this, 0, sizeof(*this)); > | ^ > | (void*) > > The patch follows the suggested fix. Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: Add trailing padding detection by Ioi Lam ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26098/files - new: https://git.openjdk.org/jdk/pull/26098/files/82643c0c..5b1edb77 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=08-09 Stats: 16 lines in 2 files changed: 16 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26098.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26098/head:pull/26098 PR: https://git.openjdk.org/jdk/pull/26098 From jkratochvil at openjdk.org Fri Nov 7 16:31:53 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Fri, 7 Nov 2025 16:31:53 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v7] In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 15:58:56 GMT, Ioi Lam wrote: >> This STATIC_ASSERT does work for the trailing padding but not for potential padding introduced inside the struct. That sizeof(x)==number of mine is also not foolproof, but IMHO it is slightly safer. > > I don't think either approach can detect internal padding. E.g., I changed this: > > > # ifdef _LP64 > - u2 _padding1; > + u1 _padding1; > u4 _padding2; > > > Neither static assert failed. > > But if I changed this: > > > # ifdef _LP64 > u2 _padding1; > - u4 _padding2; > + u2 _padding2; > > > My assert fails but yours doesn't. So hard coding a number cannot detect trailing paddings. True. So I have added both. 
That sizeof check is for the case of: } _entry_specific; + void *_new_pointer; u2 _cpool_index; // Constant pool index ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2504453294 From kvn at openjdk.org Fri Nov 7 16:41:12 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 7 Nov 2025 16:41:12 GMT Subject: RFR: 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly [v2] In-Reply-To: <0Qe86N42BrG0cQ0dA1940SRqpp-0DD68UHYD3Gz3YAg=.d6bcb486-5c7b-4738-a74d-b34c9be2a70c@github.com> References: <0Qe86N42BrG0cQ0dA1940SRqpp-0DD68UHYD3Gz3YAg=.d6bcb486-5c7b-4738-a74d-b34c9be2a70c@github.com> Message-ID: On Fri, 7 Nov 2025 16:22:39 GMT, Ashutosh Mehra wrote: >> The closure passed to `HashTable::iterate` in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` is returning incorrect value. If the search is successful, it should return false to terminate the iteration, but it is returning true. This patch fixes the return value of these closures. >> In addition, I noticed `CompactHashTable::iterate` goes through all entries unconditionally, which is not optimal for cases where we may want to terminate the iteration when some condition is met. This is the case in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` when it iterates over `_aot_adapter_handler_table`. This patch updates `CompactHashTable::iterate` to be the same as `HashTAble::iterate` by using return value of the closure to determine if the iteration should continue or abort. It also adds `CompactHashTable::iterate_all` to iterate all the values unconditionally and the users of `CompactHashTable::iterate` are updated to use `CompactHashTable::iterate_all`. > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Add comments > > Signed-off-by: Ashutosh Mehra Good. I submitted testing. 
@ashu-mehra, do you know what issue current code (before these changes) could cause? ------------- PR Review: https://git.openjdk.org/jdk/pull/28197#pullrequestreview-3435100962 PR Comment: https://git.openjdk.org/jdk/pull/28197#issuecomment-3503574345 From jkratochvil at openjdk.org Fri Nov 7 17:07:37 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Fri, 7 Nov 2025 17:07:37 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v11] In-Reply-To: References: Message-ID: > With clang-20 using --with-toolchain-type=clang resolveFieldEntry.cpp and resolveMethodEntry.cpp break the build with similar warnings like: > > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] > 49 | memset(this, 0, sizeof(*this)); > | ^ > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning > 49 | memset(this, 0, sizeof(*this)); > | ^ > | (void*) > > The patch follows the suggested fix. 
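A minimal sketch of the kind of change the clang note suggests, assuming a made-up `Entry` type rather than the real `ResolvedFieldEntry`: the user-provided copy constructor makes the type non-trivially copyable, and the explicit cast to `void*` is what the diagnostic asks for. Zeroing such an object with `memset` is only reasonable when all members are plain scalars, as assumed here.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical entry type, for illustration only.
struct Entry {
  uint32_t index;
  uint16_t flags;
  uint16_t pad;

  Entry() { clear(); }

  // A user-provided copy constructor makes Entry non-trivially copyable.
  Entry(const Entry& other)
    : index(other.index), flags(other.flags), pad(0) {}

  void clear() {
    // Explicit cast to void*, as the clang note proposes, so that
    // -Wnontrivial-memcall is not reported for this call.
    std::memset(static_cast<void*>(this), 0, sizeof(*this));
  }
};
```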
Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: Remove the sizeof == const assertions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26098/files - new: https://git.openjdk.org/jdk/pull/26098/files/5b1edb77..73ae03bd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=09-10 Stats: 14 lines in 2 files changed: 0 ins; 14 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26098.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26098/head:pull/26098 PR: https://git.openjdk.org/jdk/pull/26098 From iklam at openjdk.org Fri Nov 7 17:07:38 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 7 Nov 2025 17:07:38 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v7] In-Reply-To: References: Message-ID: <2Rv42JnWXTtA6fyhCoQ2XrQK9z2Jmb6fUiT1cyY7Kz0=.6f7c7568-da7f-48c0-9443-88b67f8ea815@github.com> On Fri, 7 Nov 2025 16:28:35 GMT, Jan Kratochvil wrote: >> I don't think either approach can detect internal padding. E.g., I changed this: >> >> >> # ifdef _LP64 >> - u2 _padding1; >> + u1 _padding1; >> u4 _padding2; >> >> >> Neither static assert failed. >> >> But if I changed this: >> >> >> # ifdef _LP64 >> u2 _padding1; >> - u4 _padding2; >> + u2 _padding2; >> >> >> My assert fails but yours doesn't. So hard coding a number cannot detect trailing paddings. > > True. So I have added both. That sizeof check is for the case of: > > class ResolvedFieldEntry { > friend class VMStructs; > > + u4 _new_field; > InstanceKlass* _field_holder; // Field holder klass > > Although it is a bit difficult to introduce an accidental padding inside the struct which is caught by my assertion and not by your assertion. So I can also remove the assertion of mine if you think so. My assert also fails for your new scenario. 
I think we shouldn't have the hard coded numbers unless you find a case not covered by my assert. Otherwise when someone changes the class they need to update 4 hard coded numbers, which can be error prone (most people don't have 32-bit build environments). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2504597822 From jkratochvil at openjdk.org Fri Nov 7 17:07:39 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Fri, 7 Nov 2025 17:07:39 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v7] In-Reply-To: <2Rv42JnWXTtA6fyhCoQ2XrQK9z2Jmb6fUiT1cyY7Kz0=.6f7c7568-da7f-48c0-9443-88b67f8ea815@github.com> References: <2Rv42JnWXTtA6fyhCoQ2XrQK9z2Jmb6fUiT1cyY7Kz0=.6f7c7568-da7f-48c0-9443-88b67f8ea815@github.com> Message-ID: On Fri, 7 Nov 2025 16:59:18 GMT, Ioi Lam wrote: >> True. So I have added both. That sizeof check is for the case of: >> >> class ResolvedFieldEntry { >> friend class VMStructs; >> >> + u4 _new_field; >> InstanceKlass* _field_holder; // Field holder klass >> >> Although it is a bit difficult to introduce an accidental padding inside the struct which is caught by my assertion and not by your assertion. So I can also remove the assertion of mine if you think so. > > My assert also fails for your new scenario. I think we shouldn't have the hard coded numbers unless you find a case not covered by my assert. Otherwise when someone changes the class they need to update 4 hard coded numbers, which can be error prone (most people don't have 32-bit build environments). Such a case I have provided above (after editing it) but I have removed the assertion as its usefulness is not big. A 32-bit build is being tested by arm32 Github GHA. 
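The trailing-padding check discussed in this thread can be sketched as follows. This is an illustration with a made-up `PackedEntry`, not the HotSpot classes; `WithExtraByte` and `no_reusable_tail_padding` are hypothetical names. The idea is that a derived class adding one byte must grow if the base has no tail padding to absorb that byte.

```cpp
#include <cstdint>

// Hypothetical entry with no tail padding: 4 + 2 + 2 bytes.
struct PackedEntry {
  PackedEntry() : a(0), b(0), c(0) {}
  uint32_t a;
  uint16_t b;
  uint16_t c;
};

// Derived helper that adds a single byte after the base subobject.
template <typename T>
struct WithExtraByte : public T {
  uint8_t extra;
};

// True when the extra byte could not be placed inside T's tail padding.
template <typename T>
constexpr bool no_reusable_tail_padding() {
  return sizeof(WithExtraByte<T>) > sizeof(T);
}

static_assert(no_reusable_tail_padding<PackedEntry>(),
              "PackedEntry should have no trailing padding");
```

Whether a padded class actually trips such an assert depends on the ABI placing derived members into the base's tail padding (the Itanium C++ ABI does this for non-POD bases), so the check is best treated as a compiler-specific safeguard. Padding between members is not detected by it, as noted in the thread.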
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2504615209 From sgehwolf at openjdk.org Fri Nov 7 17:43:18 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Fri, 7 Nov 2025 17:43:18 GMT Subject: RFR: 8370966: Create regression test for the hierarchical memory limit fix in JDK-8370572 Message-ID: Please review this test-only enhancement. It creates a regression test for the Amazon ECS setup on cgroups v1 where the parent memory limit isn't visible inside the container and, thus, needs to rely on the cg v1 specific `hierarchical_memory_limit` token in `memory.stat`. The proposed test is cg v1 only and needs to be run as root. It's skipped otherwise. It's useful to have when working on refactorings like #27743 so as not to regress. The other changes are an effort to reduce code duplication in the test code where similar patterns have been used in other container tests. Testing (all on Linux x86_64): - [x] CG version 2, run as root. Engine: docker. Test is skipped. - [x] CG version 1, run as root. Engine: docker. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) - [x] CG version 1, run as root. Engine: podman. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) - [X] CG version 1, run as non-root. Test skipped. - [x] GHA, though I don't think this is very useful for this change. Thoughts? 
------------- Commit messages: - 8370966: Create regression test for hierarchical memory limit fix in JDK-8370572 Changes: https://git.openjdk.org/jdk/pull/28201/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28201&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8370966 Stats: 191 lines in 8 files changed: 154 ins; 32 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/28201.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28201/head:pull/28201 PR: https://git.openjdk.org/jdk/pull/28201 From kbarrett at openjdk.org Fri Nov 7 19:07:03 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 7 Nov 2025 19:07:03 GMT Subject: RFR: 8358957: [ubsan]: The assert in layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v5] In-Reply-To: References: <2tqtSNmhY0bGDrqu06wBvUWw_bpdv311BSM-ij5EEGY=.90ef7eb5-8d88-439c-b0df-f917f3543cc2@github.com> Message-ID: On Fri, 7 Nov 2025 11:01:48 GMT, Afshin Zafari wrote: >> src/hotspot/share/oops/klass.hpp line 524: >> >>> 522: // Use well known bit hack to isolate the low bit of candidates. >>> 523: // The usual form is (x & -x), but VS warns (C4146) about unary minus of unsigned. >>> 524: // So use alternate form of negation to avoid warning. >> >> "So explicitly use two's complement to avoid warning" > > Done. Alternatively, use `PRAGMA_DISABLE_MSVC_WARNING(4146)` to kill the warning. I think there are a couple of places where we've done that. 
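For reference, a small sketch of the low-bit isolation being discussed, assuming a free function `lowest_set_bit` that is not part of klass.hpp. It spells out the two's complement negation as `~x + 1`, which avoids applying unary minus to an unsigned value.

```cpp
#include <cstdint>

// Isolate the lowest set bit of x. Equivalent to (x & -x), but written
// with an explicit two's complement so no unary minus is applied to an
// unsigned operand.
constexpr uint32_t lowest_set_bit(uint32_t x) {
  return x & (~x + 1u);
}

static_assert(lowest_set_bit(0x28u) == 0x08u, "lowest set bit of 0b101000");
static_assert(lowest_set_bit(0u) == 0u, "zero has no set bit");
```

Suppressing the warning with a pragma, as suggested above, keeps the familiar `x & -x` form; the explicit form trades that familiarity for warning-free code without a pragma.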
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27288#discussion_r2505047897 From asmehra at openjdk.org Fri Nov 7 19:38:04 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Fri, 7 Nov 2025 19:38:04 GMT Subject: RFR: 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly [v2] In-Reply-To: References: <0Qe86N42BrG0cQ0dA1940SRqpp-0DD68UHYD3Gz3YAg=.d6bcb486-5c7b-4738-a74d-b34c9be2a70c@github.com> Message-ID: On Fri, 7 Nov 2025 16:38:54 GMT, Vladimir Kozlov wrote: > do you know what issue current code (before these changes) could cause? `AdapterHandlerLibrary::print_handler_on` is called from `os::print_location() -> codeBlob::dump_for_addr()`. `os::print_location()` is only used in error reporting to print the location of the address. With the current code (before this patch) `AdapterHandlerLibrary::print_handler_on` could have returned without printing anything, even if the `CodeBlob` passed as parameter is of type `AdapterBlob`. In debug builds it could also trigger the assert: assert(found, "Should have found handler"); btw the change that introduced this bug was made more than 3 years ago in https://bugs.openjdk.org/browse/JDK-8292384 and it went in JDK 20. 
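The closure contract at the heart of this bug can be sketched with a toy table. This is an illustration only: `ToyTable` and `contains_value` are hypothetical and do not mirror HotSpot's `HashTable` internals. The closure returns `true` to keep iterating and `false` to stop, so a successful search must return `false`.

```cpp
#include <utility>
#include <vector>

// Toy stand-in for a table with a conditional and an unconditional walk.
template <typename K, typename V>
class ToyTable {
  std::vector<std::pair<K, V>> _entries;

 public:
  void put(K k, V v) { _entries.emplace_back(k, v); }

  // Calls f(key, value) for each entry; stops as soon as f returns false.
  template <typename F>
  void iterate(F f) const {
    for (const auto& e : _entries) {
      if (!f(e.first, e.second)) return;
    }
  }

  // Visits every entry regardless of what f returns.
  template <typename F>
  void iterate_all(F f) const {
    for (const auto& e : _entries) f(e.first, e.second);
  }
};

// Searches for wanted and counts how many entries were visited.
inline bool contains_value(const ToyTable<int, int>& t, int wanted,
                           int* visited) {
  bool found = false;
  *visited = 0;
  t.iterate([&](int /*key*/, int value) {
    ++*visited;
    if (value == wanted) {
      found = true;
      return false;  // found: terminate the iteration
    }
    return true;     // not found yet: keep going
  });
  return found;
}
```

Returning `true` on a match, as the code before the fix did, would keep the walk going past the match, and returning `false` on a non-match would stop it at the first entry.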
------------- PR Comment: https://git.openjdk.org/jdk/pull/28197#issuecomment-3504526393 From asmehra at openjdk.org Fri Nov 7 20:19:07 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Fri, 7 Nov 2025 20:19:07 GMT Subject: RFR: 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly [v2] In-Reply-To: References: <0Qe86N42BrG0cQ0dA1940SRqpp-0DD68UHYD3Gz3YAg=.d6bcb486-5c7b-4738-a74d-b34c9be2a70c@github.com> Message-ID: <-CPyR66QCD3NHKuXrwEpqM4jcC1ZJ1LkbVfUiRDSNYU=.a7503d80-ce00-44c3-92ef-04e5e198e183@github.com> On Fri, 7 Nov 2025 16:38:54 GMT, Vladimir Kozlov wrote: >> Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: >> >> Add comments >> >> Signed-off-by: Ashutosh Mehra > > @ashu-mehra, do you know what issue current code (before these changes) could cause? @vnkozlov fyi - I also opened https://bugs.openjdk.org/browse/JDK-8371493 which is going to touch the same code as this patch. I didn't includes the changes in this patch to make it easier to backport this patch if needed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28197#issuecomment-3504726833 From phh at openjdk.org Fri Nov 7 22:13:07 2025 From: phh at openjdk.org (Paul Hohensee) Date: Fri, 7 Nov 2025 22:13:07 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v9] In-Reply-To: References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: On Fri, 7 Nov 2025 05:08:20 GMT, Nityanand Rai wrote: >> Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs > > Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: > > remove blanck line Changes requested by phh (Reviewer). src/hotspot/os/bsd/os_bsd.cpp line 107: > 105: #include > 106: #include > 107: #include You don't need this here because you included it in os_bsd.hpp. 
------------- PR Review: https://git.openjdk.org/jdk/pull/27868#pullrequestreview-3436641793 PR Review Comment: https://git.openjdk.org/jdk/pull/27868#discussion_r2505700791 From kvn at openjdk.org Fri Nov 7 22:38:03 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 7 Nov 2025 22:38:03 GMT Subject: RFR: 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly [v2] In-Reply-To: References: <0Qe86N42BrG0cQ0dA1940SRqpp-0DD68UHYD3Gz3YAg=.d6bcb486-5c7b-4738-a74d-b34c9be2a70c@github.com> Message-ID: On Fri, 7 Nov 2025 16:38:54 GMT, Vladimir Kozlov wrote: >> Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: >> >> Add comments >> >> Signed-off-by: Ashutosh Mehra > > @ashu-mehra, do you know what issue current code (before these changes) could cause? > @vnkozlov fyi - I also opened https://bugs.openjdk.org/browse/JDK-8371493 which is going to touch the same code as this patch. I didn't includes the changes in this patch to make it easier to backport this patch if needed. Good. We usually don't port enhancement but we can consider it since its simplification of printing code which should not affect code execution. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28197#issuecomment-3505261288 From duke at openjdk.org Sat Nov 8 00:05:38 2025 From: duke at openjdk.org (Nityanand Rai) Date: Sat, 8 Nov 2025 00:05:38 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v10] In-Reply-To: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: > Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: remove unncessary included ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27868/files - new: https://git.openjdk.org/jdk/pull/27868/files/d8d09007..83f63ce5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=08-09 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/27868.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27868/head:pull/27868 PR: https://git.openjdk.org/jdk/pull/27868 From duke at openjdk.org Sat Nov 8 00:05:40 2025 From: duke at openjdk.org (Nityanand Rai) Date: Sat, 8 Nov 2025 00:05:40 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v9] In-Reply-To: References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: On Fri, 7 Nov 2025 22:09:40 GMT, Paul Hohensee wrote: >> Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: >> >> remove blanck line > > src/hotspot/os/bsd/os_bsd.cpp line 107: > >> 105: #include >> 106: #include >> 107: #include > > You don't need this here because you included it in os_bsd.hpp. Yes, fixed. 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27868#discussion_r2505882636

From jbhateja at openjdk.org Sat Nov 8 02:22:17 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Sat, 8 Nov 2025 02:22:17 GMT
Subject: RFR: 8370691: Add new HalffloatVector type and enable intrinsification of float16 vector operations
Message-ID: 

Add new HalffloatVector type and corresponding concrete vector classes in addition to existing primitive vector types, maintaining operation parity with the FloatVector type.
- Add necessary inline expander support.
- Enable intrinsification for a few vector operations, namely ADD/SUB/MUL/DIV/MAX/MIN/FMA.
- Use existing Float16 vector IR and backend support.
- Extend the existing VectorAPI JTREG test suite for the newly added HalffloatVector operations.

The idea here is to first be on par with Float16 auto-vectorization support before intrinsifying new operations (conversions, reduction, etc).

The following are the performance numbers for some of the selected HalffloatVector benchmarking kernels compared to equivalent Float16OperationsBenchmark kernels.

[Image: benchmark results, not preserved in this archive]

Initial RFP[1] was floated on the panama-dev mailing list.

Kindly review the draft PR and share your feedback.

Best Regards,
Jatin

[1] https://mail.openjdk.org/pipermail/panama-dev/2025-August/021100.html

-------------

Commit messages:
 - Some cleanups
 - Fix some JTREG failures
 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8370691
 - Revamped JTreg test generation and bug fixes
 - Cleanups
 - Removing redundant warmup constraint
 - Adding a HalffloatVectorBenchmark having benchmarking kernel parity with Float16OperationsBenchmark
 - Adding IR Framework test
 - Fix JTREG failures
 - Build failure fixes
 - ...
and 1 more: https://git.openjdk.org/jdk/compare/e34a8318...c60d533c

Changes: https://git.openjdk.org/jdk/pull/28002/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28002&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8370691
  Stats: 66541 lines in 134 files changed: 54467 ins; 460 del; 11614 mod
  Patch: https://git.openjdk.org/jdk/pull/28002.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28002/head:pull/28002

PR: https://git.openjdk.org/jdk/pull/28002

From lmesnik at openjdk.org Sat Nov 8 21:34:11 2025
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Sat, 8 Nov 2025 21:34:11 GMT
Subject: RFR: 8371103: vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t006/TestDescription.java failing
In-Reply-To: <0TSSeNB540PtvusDCXFoYLc4TJm8GJMeBlExphuFDic=.5ff05b43-91c3-497f-b17a-1887c538c1ce@github.com>
References: <0TSSeNB540PtvusDCXFoYLc4TJm8GJMeBlExphuFDic=.5ff05b43-91c3-497f-b17a-1887c538c1ce@github.com>
Message-ID: 

On Thu, 6 Nov 2025 21:26:24 GMT, Leonid Mesnik wrote:

> The problem happens because JVMTI events are posted while holding JvmtiThreadState_lock. The fix is just to move
> flushing out of the lock, as is already done in the `JvmtiEventController::set_user_enabled(..)` method.
>
> The problem started reproducing after the fix for https://bugs.openjdk.org/browse/JDK-8370732, which replaced GC triggering via the slow and unreliable `ClassUnloader.eatMemory();` with the fast and robust `WhiteBox.fullGC()`.
>
> JVMTI event posting is not synchronized with enabling/disabling events and setting callbacks. So even if new events appear in the JVMTI tagmap after flushing, it is not a bug to not post them or to use the new callback handler.
>
> Also, it might make sense to flush object events before vm_death and post all deferred events from the ServiceThread queue.
> I am going to file a separate RFE for this.
> Also, I am going to file an RFE to replace all GC-provoking `eatMemory()` calls with `WB.fullGC()` to improve test stability and reduce test execution time.
@sspitsyn, @alexmenkov Thank you for the review!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28184#issuecomment-3506944399

From lmesnik at openjdk.org Sat Nov 8 21:34:11 2025
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Sat, 8 Nov 2025 21:34:11 GMT
Subject: Integrated: 8371103: vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t006/TestDescription.java failing
In-Reply-To: <0TSSeNB540PtvusDCXFoYLc4TJm8GJMeBlExphuFDic=.5ff05b43-91c3-497f-b17a-1887c538c1ce@github.com>
References: <0TSSeNB540PtvusDCXFoYLc4TJm8GJMeBlExphuFDic=.5ff05b43-91c3-497f-b17a-1887c538c1ce@github.com>
Message-ID: 

On Thu, 6 Nov 2025 21:26:24 GMT, Leonid Mesnik wrote:

> The problem happens because JVMTI events are posted while holding JvmtiThreadState_lock. The fix is just to move
> flushing out of the lock, as is already done in the `JvmtiEventController::set_user_enabled(..)` method.
>
> The problem started reproducing after the fix for https://bugs.openjdk.org/browse/JDK-8370732, which replaced GC triggering via the slow and unreliable `ClassUnloader.eatMemory();` with the fast and robust `WhiteBox.fullGC()`.
>
> JVMTI event posting is not synchronized with enabling/disabling events and setting callbacks. So even if new events appear in the JVMTI tagmap after flushing, it is not a bug to not post them or to use the new callback handler.
>
> Also, it might make sense to flush object events before vm_death and post all deferred events from the ServiceThread queue.
> I am going to file a separate RFE for this.
> Also, I am going to file an RFE to replace all GC-provoking `eatMemory()` calls with `WB.fullGC()` to improve test stability and reduce test execution time.

This pull request has now been integrated.
Changeset: 88c4678e Author: Leonid Mesnik URL: https://git.openjdk.org/jdk/commit/88c4678eed818cbe9380f35352e90883fed27d33 Stats: 8 lines in 2 files changed: 4 ins; 4 del; 0 mod 8371103: vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t006/TestDescription.java failing Reviewed-by: amenkov, sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/28184 From swen at openjdk.org Mon Nov 10 00:46:29 2025 From: swen at openjdk.org (Shaojin Wen) Date: Mon, 10 Nov 2025 00:46:29 GMT Subject: RFR: 8371431: Warning message when turning off CompactStrings Message-ID: A warning message should be given before removing the CompactStrings off option. ------------- Commit messages: - Merge branch 'master' into compact_str_warn_2510 - from @liach - add warnings Changes: https://git.openjdk.org/jdk/pull/27995/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27995&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371431 Stats: 13 lines in 3 files changed: 1 ins; 9 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/27995.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27995/head:pull/27995 PR: https://git.openjdk.org/jdk/pull/27995 From liach at openjdk.org Mon Nov 10 00:46:29 2025 From: liach at openjdk.org (Chen Liang) Date: Mon, 10 Nov 2025 00:46:29 GMT Subject: RFR: 8371431: Warning message when turning off CompactStrings In-Reply-To: References: Message-ID: On Mon, 27 Oct 2025 01:26:18 GMT, Shaojin Wen wrote: > A warning message should be given before removing the CompactStrings off option. I think the code change looks like: diff --git a/src/hotspot/cpu/arm/globals_arm.hpp b/src/hotspot/cpu/arm/globals_arm.hpp index 363a9a2c25c..d64615dd142 100644 --- a/src/hotspot/cpu/arm/globals_arm.hpp +++ b/src/hotspot/cpu/arm/globals_arm.hpp @@ -69,7 +69,7 @@ define_pd_global(bool, PreserveFramePointer, false); define_pd_global(uintx, TypeProfileLevel, 0); // No performance work done here yet. 
-define_pd_global(bool, CompactStrings, false); +define_pd_global(bool, CompactStrings, true); define_pd_global(intx, InitArrayShortSize, 8*BytesPerLong); diff --git a/src/hotspot/share/runtime/arguments.cpp b/src/hotspot/share/runtime/arguments.cpp index 0d92f22af79..ac9eb7b6932 100644 --- a/src/hotspot/share/runtime/arguments.cpp +++ b/src/hotspot/share/runtime/arguments.cpp @@ -541,6 +541,7 @@ static SpecialFlag const special_jvm_flags[] = { // -------------- Obsolete Flags - sorted by expired_in -------------- + { "CompactStrings", JDK_Version::jdk(25), JDK_Version::jdk(27), JDK_Version::jdk(28) }, #ifdef LINUX { "UseOprofile", JDK_Version::jdk(25), JDK_Version::jdk(26), JDK_Version::jdk(27) }, #endif And backport the arguments.cpp but not globals_arm back to 25 updates. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27995#issuecomment-3453150014 From dholmes at openjdk.org Mon Nov 10 00:46:29 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 10 Nov 2025 00:46:29 GMT Subject: RFR: 8371431: Warning message when turning off CompactStrings In-Reply-To: References: Message-ID: <0Pyo3XIHelfYv00yWKxmqUPxwxJTYJkh2G2lZYqVNB8=.b66e01b5-cfb5-4b55-8d53-657fc63137b8@github.com> On Mon, 27 Oct 2025 20:21:03 GMT, Chen Liang wrote: >> A warning message should be given before removing the CompactStrings off option. > > I think the code change looks like: > > diff --git a/src/hotspot/cpu/arm/globals_arm.hpp b/src/hotspot/cpu/arm/globals_arm.hpp > index 363a9a2c25c..d64615dd142 100644 > --- a/src/hotspot/cpu/arm/globals_arm.hpp > +++ b/src/hotspot/cpu/arm/globals_arm.hpp > @@ -69,7 +69,7 @@ define_pd_global(bool, PreserveFramePointer, false); > define_pd_global(uintx, TypeProfileLevel, 0); > > // No performance work done here yet. 
> -define_pd_global(bool, CompactStrings, false); > +define_pd_global(bool, CompactStrings, true); > > define_pd_global(intx, InitArrayShortSize, 8*BytesPerLong); > > diff --git a/src/hotspot/share/runtime/arguments.cpp b/src/hotspot/share/runtime/arguments.cpp > index 0d92f22af79..ac9eb7b6932 100644 > --- a/src/hotspot/share/runtime/arguments.cpp > +++ b/src/hotspot/share/runtime/arguments.cpp > @@ -541,6 +541,7 @@ static SpecialFlag const special_jvm_flags[] = { > > // -------------- Obsolete Flags - sorted by expired_in -------------- > > + { "CompactStrings", JDK_Version::jdk(25), JDK_Version::jdk(27), JDK_Version::jdk(28) }, > #ifdef LINUX > { "UseOprofile", JDK_Version::jdk(25), JDK_Version::jdk(26), JDK_Version::jdk(27) }, > #endif > > > And backport the arguments.cpp but not globals_arm back to 25 updates. @liach we can't just enable CS for ARM-32-bit. Unclear if it even works if you try to do that, but there are no C2 intrinsics related to CS so performance could take quite a hit. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27995#issuecomment-3465595986 From stuefe at openjdk.org Mon Nov 10 04:30:07 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 10 Nov 2025 04:30:07 GMT Subject: RFR: 8371216: oopDesc::print_value_on breaks if klass is garbage [v2] In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 14:35:23 GMT, Paul H?bner wrote: >> Hi all, >> >> The `oopDesc::print_value_on` function checks if an oop is a string, and if so just prints the raw string. To do this, it needs to read the `klass()`. If the `klass()` reads garbage, one of many assertion errors is likely triggered. >> >> For example, if G1's verification finds problematic oops, it will attempt to print them. If these oops have garbage (incorrect or racey) klasses, this will cause an assertion error, fail fast, and VM crash. G1 never finishes printing, which may make debugging more difficult. 
>> The developer can/will be made aware in other ways if the `klass()` is garbage, for example by being told that it is not in the metaspace.
>>
>> We observed the above in Valhalla and already patched it there.
>>
>> Testing: tiers 1-5 on Linux (x64, AArch64), macOS (x64, AArch64), Windows (x64).
>
> Paul Hübner has updated the pull request incrementally with one additional commit since the last revision:
>
>   Don't make a new function.

> > I would not add the helper function for only one case. We already have a bunch of helpers like these. Just test for oop != NULL at the call site.
>
> In the scenario I am referring to, we had an always-non-null oop and went through `CompressedKlassPointers::decode_not_null` (which has like 12 assertions). Most of the time, we failed because of the klass range check. In rare cases for this particular instance, but quite often in other cases like [JDK-8366794](https://bugs.openjdk.org/browse/JDK-8366794), the klass was null. I don't think it's sufficient to just check for oop nullness.
>
> I agree with you, it'd be nicer to check at the call site. How do you feel about inlining the proposed function into:
>
> ```c++
> if (obj != nullptr && obj->klass_without_asserts() == vmClasses::String_klass()) {
>   java_lang_String::print(obj, st);
>   print_address_on(st);
> } else // [...]
> ```

Yes, that is what I meant.
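
For readers outside the thread, the pattern agreed on above can be sketched in isolation. This is only an illustrative toy, not HotSpot code: `Klass`, `Obj`, and `describe` are hypothetical stand-ins, and the only thing borrowed from the discussion is the idea of pairing a checked accessor with an unchecked one for diagnostic paths.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration only; these are not HotSpot types.
struct Klass {
  const char* name;
};

struct Obj {
  const Klass* _klass;

  // Checked accessor: appropriate on normal paths, but it aborts a debug
  // build when the header holds garbage (modeled here as a null klass).
  const Klass* klass() const {
    assert(_klass != nullptr && "klass must not be null");
    return _klass;
  }

  // Unchecked accessor: returns whatever is stored, so diagnostic printing
  // can keep going even when the object is already known to be suspect.
  const Klass* klass_without_asserts() const { return _klass; }
};

// Decide how to print an object without ever tripping an assertion.
inline std::string describe(const Obj* obj, const Klass* string_klass) {
  if (obj != nullptr && obj->klass_without_asserts() == string_klass) {
    return "string";
  }
  return "other";
}
```

The comparison only looks at the pointer value and never dereferences it, which is why a bogus klass falls through to the generic branch instead of stopping the printout.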
------------- PR Comment: https://git.openjdk.org/jdk/pull/28190#issuecomment-3509341804 From iklam at openjdk.org Mon Nov 10 04:32:04 2025 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 10 Nov 2025 04:32:04 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v11] In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 17:07:37 GMT, Jan Kratochvil wrote: >> With clang-20 using --with-toolchain-type=clang resolveFieldEntry.cpp and resolveMethodEntry.cpp break the build with similar warnings like: >> >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> | (void*) >> >> The patch follows the suggested fix. > > Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: > > Remove the sizeof == const assertions Looks good to me. Thanks for your patience in finding a good solution for this issue. I just have a minor nit about the comments. src/hotspot/share/oops/resolvedFieldEntry.hpp line 47: > 45: > 46: // Verify no compiler paddings are present, check STATIC_ASSERTs in the .cpp file. > 47: I think this comment is not necessary. src/hotspot/share/oops/resolvedMethodEntry.hpp line 65: > 63: > 64: // Verify no compiler paddings are present, check STATIC_ASSERTs in the .cpp file. > 65: Comment not necessary. ------------- Marked as reviewed by iklam (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/26098#pullrequestreview-3440751612 PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2508674536 PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2508675810 From jkratochvil at openjdk.org Mon Nov 10 05:18:57 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Mon, 10 Nov 2025 05:18:57 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v12] In-Reply-To: References: Message-ID: > With clang-20 using --with-toolchain-type=clang resolveFieldEntry.cpp and resolveMethodEntry.cpp break the build with similar warnings like: > > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] > 49 | memset(this, 0, sizeof(*this)); > | ^ > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning > 49 | memset(this, 0, sizeof(*this)); > | ^ > | (void*) > > The patch follows the suggested fix. 
Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: emove comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26098/files - new: https://git.openjdk.org/jdk/pull/26098/files/73ae03bd..131eb451 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=10-11 Stats: 4 lines in 2 files changed: 0 ins; 4 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26098.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26098/head:pull/26098 PR: https://git.openjdk.org/jdk/pull/26098 From jkratochvil at openjdk.org Mon Nov 10 05:19:00 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Mon, 10 Nov 2025 05:19:00 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v11] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 04:28:02 GMT, Ioi Lam wrote: >> Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove the sizeof == const assertions > > src/hotspot/share/oops/resolvedFieldEntry.hpp line 47: > >> 45: >> 46: // Verify no compiler paddings are present, check STATIC_ASSERTs in the .cpp file. >> 47: > > I think this comment is not necessary. I have removed it, but I disagree, as there?s no way to make such a foolproof assertion in C++. > src/hotspot/share/oops/resolvedMethodEntry.hpp line 65: > >> 63: >> 64: // Verify no compiler paddings are present, check STATIC_ASSERTs in the .cpp file. >> 65: > > Comment not necessary. I have removed it, but I disagree, as there?s no way to make such a foolproof assertion in C++. 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2508742761
PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2508742947

From iklam at openjdk.org Mon Nov 10 05:33:03 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Mon, 10 Nov 2025 05:33:03 GMT
Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v11]
In-Reply-To: 
References: 
Message-ID: <46Pj-XHgdp-dhA3MM2RuRWEzPjGqe3pyjYXT967axcM=.dffde177-a0ec-44d7-99bf-855341cd5206@github.com>

On Mon, 10 Nov 2025 05:15:01 GMT, Jan Kratochvil wrote:

>> src/hotspot/share/oops/resolvedFieldEntry.hpp line 47:
>>
>>> 45:
>>> 46: // Verify no compiler paddings are present, check STATIC_ASSERTs in the .cpp file.
>>> 47:
>>
>> I think this comment is not necessary.
>
> I have removed it, but I disagree, as there's no way to make such a foolproof assertion in C++.

My objection to the original comments is that they don't explain the purpose of the explicit paddings, or how to "verify no compiler paddings are present". I think it's better to add a comment like:

    // The explicit paddings are necessary for generating deterministic CDS archives. They prevent
    // the C++ compiler from potentially inserting random values in unused gaps.

So if someone changes this file and the CDS tests fail, the comment will give them some ideas about how to fix the problem.
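
To make the point about explicit paddings concrete, here is a small standalone sketch. It is an assumption-laden illustration rather than the real `ResolvedFieldEntry`: the `Entry` type and its fields are hypothetical, and the size check is just one possible way to notice compiler-inserted gaps. The `static_cast<void*>` mirrors the cast that the clang note quoted in this thread suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical entry type for illustration; NOT the real ResolvedFieldEntry layout.
struct Entry {
  std::uint32_t offset;
  std::uint16_t index;
  std::uint8_t  flags;
  std::uint8_t  padding;  // explicit filler: every byte of the object is a named member

  Entry() { clear(); }

  // A user-provided copy constructor makes the type non-trivially copyable,
  // which is the kind of type -Wnontrivial-memcall complains about.
  Entry(const Entry& other)
      : offset(other.offset), index(other.index), flags(other.flags), padding(0) {}

  // Byte-wise clear. The cast to void* states that the raw write is intended.
  void clear() {
    std::memset(static_cast<void*>(this), 0, sizeof(*this));
    assert(all_bytes_zero());
  }

  // Inspect the object representation byte by byte.
  bool all_bytes_zero() const {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(this);
    for (std::size_t i = 0; i < sizeof(*this); i++) {
      if (p[i] != 0) return false;
    }
    return true;
  }
};

// If the compiler had to insert padding of its own, the sizes would differ.
static_assert(sizeof(Entry) == sizeof(std::uint32_t) + sizeof(std::uint16_t) +
                               2 * sizeof(std::uint8_t),
              "unexpected compiler-inserted padding in Entry");
```

Because the filler is a named member, the copy constructor can set it to a known value; bytes in a compiler-inserted gap have no name and could not be controlled this way, which is the determinism concern raised above.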
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2508774649

From aboldtch at openjdk.org Mon Nov 10 06:19:11 2025
From: aboldtch at openjdk.org (Axel Boldt-Christmas)
Date: Mon, 10 Nov 2025 06:19:11 GMT
Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v10]
In-Reply-To: 
References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com>
Message-ID: 

On Sat, 8 Nov 2025 00:05:38 GMT, Nityanand Rai wrote:

>> Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs
>
> Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision:
>
>   remove unncessary included

src/hotspot/os/bsd/os_bsd.hpp line 38:

> 36: // Shared constant for mmap file descriptor used across BSD OS implementations
> 37: static constexpr int bsd_mmap_fd =
> 38: #if defined(__APPLE__) && defined(VM_MAKE_TAG) && defined(VM_MEMORY_JAVA)

This did create some asymmetry now where this and the test are guarded, but the test utility, which also requires `VM_MEMORY_JAVA` to be defined, is not. It might be better to use this condition once in this header file and introduce a new define which we use as the condition for the tagged-memory-related code.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27868#discussion_r2508888148

From aboldtch at openjdk.org Mon Nov 10 06:22:17 2025
From: aboldtch at openjdk.org (Axel Boldt-Christmas)
Date: Mon, 10 Nov 2025 06:22:17 GMT
Subject: RFR: 8367149: Add convenient construction for creating ad-hoc VMErrorCallback [v6]
In-Reply-To: 
References: 
Message-ID: 

On Fri, 10 Oct 2025 07:54:34 GMT, Axel Boldt-Christmas wrote:

>> Add a class OnVMError, built on the VMErrorCallback mechanism, as a convenient construction for creating an ad-hoc VMErrorCallback that automatically calls the provided invocable f if a VM crash occurs within its lifetime.
>> Can be used to instrument a build for more detailed contextual information gathering. Especially useful when hunting down intermittent bugs, or issues only reproducible in environments where access to a debugger is not readily available. Example use:
>> ```C++
>> {
>>   // Note the lambda is invoked after an error occurs within this thread,
>>   // and during on_error's lifetime. If state prior to the crash is required,
>>   // capture a copy of it first.
>>   auto important_value = get_the_value();
>>
>>   OnVMError on_error([&](outputStream* st) {
>>     // Dump the important bits.
>>     st->print("Prior value: ");
>>     important_value.print_on(st);
>>     st->print("During crash: ");
>>     get_the_value().print_on(st);
>>     // Dump the whole state.
>>     this->print_on(st);
>>   });
>>
>>   // Sometimes doing a thing will crash the VM.
>>   do_a_thing();
>> }
>> ```
>>
>> C++17 class template argument deduction finally makes these sorts of constructions ergonomic to use without the need for auto and helper construction methods.
>
> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision:
>
>  - Merge tag 'jdk-26+19' into JDK-8367149
>
>    Added tag jdk-26+19 for changeset b37a1a33
>  - Merge tag 'jdk-26+18' into JDK-8367149
>
>    Added tag jdk-26+18 for changeset 5251405c
>  - Merge tag 'jdk-26+17' into JDK-8367149
>
>    Added tag jdk-26+17 for changeset 2aafda19
>  - Replace ergonomic with convenient
>  - Add a comment explaining the deduction rules
>  - Skip multiple inheritance and allow more than lambda like callables.
>  - Update doc example
>  - 8367149: Add ergonomic construction for creating ad-hoc VMErrorCallback

Thanks for the reviews.
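
The remark about C++17 class template argument deduction can be shown with a self-contained sketch. Nothing below is taken from the actual patch: `ErrorCallback`, `OnError`, `active_callbacks`, and `report_error` are hypothetical names, and a plain vector stands in for whatever registration the VM really uses. The sketch only demonstrates how a class template that stores a callable can be constructed from a lambda without naming the lambda's type.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical callback interface; not HotSpot's VMErrorCallback.
class ErrorCallback {
public:
  virtual ~ErrorCallback() = default;
  virtual void call(std::ostream& st) = 0;
};

// Callbacks whose scope is currently active, innermost last.
inline std::vector<ErrorCallback*>& active_callbacks() {
  static std::vector<ErrorCallback*> callbacks;
  return callbacks;
}

// Registers the callable for the lifetime of the object (scope-guard style).
template <typename F>
class OnError final : public ErrorCallback {
  F _f;

public:
  explicit OnError(F f) : _f(std::move(f)) { active_callbacks().push_back(this); }
  ~OnError() override {
    assert(!active_callbacks().empty() && active_callbacks().back() == this);
    active_callbacks().pop_back();
  }
  OnError(const OnError&) = delete;
  OnError& operator=(const OnError&) = delete;

  void call(std::ostream& st) override { _f(st); }
};

// Stand-in for error reporting: runs every active callback.
inline std::string report_error() {
  std::ostringstream st;
  for (ErrorCallback* cb : active_callbacks()) {
    cb->call(st);
  }
  return st.str();
}

inline std::string demo() {
  int important_value = 42;
  const int prior = important_value;  // copy of the state before the "crash"

  // F is deduced from the lambda; no helper function or explicit type needed.
  OnError on_error([&, prior](std::ostream& st) {
    st << "prior=" << prior << " now=" << important_value;
  });

  important_value = 43;   // state changes before the error is reported
  return report_error();  // simulate an error inside on_error's lifetime
}
```

As in the quoted example, the by-reference capture observes the state at reporting time, while the explicit copy preserves the earlier value. Compiling the sketch requires C++17 or later.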
-------------

PR Comment: https://git.openjdk.org/jdk/pull/27159#issuecomment-3509619870

From aboldtch at openjdk.org Mon Nov 10 06:22:18 2025
From: aboldtch at openjdk.org (Axel Boldt-Christmas)
Date: Mon, 10 Nov 2025 06:22:18 GMT
Subject: Integrated: 8367149: Add convenient construction for creating ad-hoc VMErrorCallback
In-Reply-To: 
References: 
Message-ID: 

On Tue, 9 Sep 2025 06:26:32 GMT, Axel Boldt-Christmas wrote:

> Add a class OnVMError, built on the VMErrorCallback mechanism, as a convenient construction for creating an ad-hoc VMErrorCallback that automatically calls the provided invocable f if a VM crash occurs within its lifetime. Can be used to instrument a build for more detailed contextual information gathering. Especially useful when hunting down intermittent bugs, or issues only reproducible in environments where access to a debugger is not readily available. Example use:
> ```C++
> {
>   // Note the lambda is invoked after an error occurs within this thread,
>   // and during on_error's lifetime. If state prior to the crash is required,
>   // capture a copy of it first.
>   auto important_value = get_the_value();
>
>   OnVMError on_error([&](outputStream* st) {
>     // Dump the important bits.
>     st->print("Prior value: ");
>     important_value.print_on(st);
>     st->print("During crash: ");
>     get_the_value().print_on(st);
>     // Dump the whole state.
>     this->print_on(st);
>   });
>
>   // Sometimes doing a thing will crash the VM.
>   do_a_thing();
> }
> ```
>
> C++17 class template argument deduction finally makes these sorts of constructions ergonomic to use without the need for auto and helper construction methods.

This pull request has now been integrated.
Changeset: d570765e Author: Axel Boldt-Christmas URL: https://git.openjdk.org/jdk/commit/d570765e2720a11c88c806554df9b13587a041a2 Stats: 48 lines in 1 file changed: 48 ins; 0 del; 0 mod 8367149: Add convenient construction for creating ad-hoc VMErrorCallback Reviewed-by: ayang, stefank ------------- PR: https://git.openjdk.org/jdk/pull/27159 From syan at openjdk.org Mon Nov 10 06:54:02 2025 From: syan at openjdk.org (SendaoYan) Date: Mon, 10 Nov 2025 06:54:02 GMT Subject: RFR: 8370966: Create regression test for the hierarchical memory limit fix in JDK-8370572 In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 16:41:26 GMT, Severin Gehwolf wrote: > Please review this test-only enhancement. It creates a regression test for the Amazon ECS setup on cgroups v1 where the parent memory limit isn't visible inside the container and, thus, needs to rely on the cg v1 specific `hierarchical_memory_limit` token in `memory.stat`. The proposed test is cg v1 only and needs to be run as root. It's skipped otherwise. It's useful to have when working on refactorings like #27743 so as not to regress. > > The other changes are an effort to reduce code duplication in the test code where similar patterns have been used in other container tests. > > Testing (all on Linux x86_64): > - [x] CG version 2, run as root. Engine: docker. Test is skipped. > - [x] CG version 1, run as root. Engine: docker. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) > - [x] CG version 1, run as root. Engine: podman. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) > - [X] CG version 1, run as non-root. Test skipped. > - [x] GHA, though I don't think this is very useful for this change. > > Thoughts? test/hotspot/jtreg/containers/docker/TestMemoryInvisibleParent.java line 2: > 1: /* > 2: * Copyright (C) 2025, IBM Copyright (c) 2025 IBM Corporation. All rights reserved. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28201#discussion_r2508961551 From alanb at openjdk.org Mon Nov 10 07:13:02 2025 From: alanb at openjdk.org (Alan Bateman) Date: Mon, 10 Nov 2025 07:13:02 GMT Subject: RFR: 8371431: Warning message when turning off CompactStrings In-Reply-To: References: Message-ID: On Mon, 27 Oct 2025 01:26:18 GMT, Shaojin Wen wrote: > A warning message should be given before removing the CompactStrings off option. There is discussion on core-libs-dev about creating a JEP to document the story in a way that gets to the far reaches of the galaxy where -XX:-CompactStrings might be used. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27995#issuecomment-3509778753 From jkratochvil at openjdk.org Mon Nov 10 07:24:50 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Mon, 10 Nov 2025 07:24:50 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v13] In-Reply-To: References: Message-ID: > With clang-20 using --with-toolchain-type=clang resolveFieldEntry.cpp and resolveMethodEntry.cpp break the build with similar warnings like: > > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] > 49 | memset(this, 0, sizeof(*this)); > | ^ > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning > 49 | memset(this, 0, sizeof(*this)); > | ^ > | (void*) > > The patch follows the suggested fix. 
Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: Add Ioi Lam's comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26098/files - new: https://git.openjdk.org/jdk/pull/26098/files/131eb451..ef3673ca Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=11-12 Stats: 6 lines in 2 files changed: 6 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26098.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26098/head:pull/26098 PR: https://git.openjdk.org/jdk/pull/26098 From jkratochvil at openjdk.org Mon Nov 10 07:24:52 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Mon, 10 Nov 2025 07:24:52 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v11] In-Reply-To: <46Pj-XHgdp-dhA3MM2RuRWEzPjGqe3pyjYXT967axcM=.dffde177-a0ec-44d7-99bf-855341cd5206@github.com> References: <46Pj-XHgdp-dhA3MM2RuRWEzPjGqe3pyjYXT967axcM=.dffde177-a0ec-44d7-99bf-855341cd5206@github.com> Message-ID: <6uB1qDoC3a40J7Qmjs6PAtr47bFEX84B3ZbQKbk6oyE=.828fd16a-1150-4703-8ec3-907174a15231@github.com> On Mon, 10 Nov 2025 05:30:37 GMT, Ioi Lam wrote: >> I have removed it, but I disagree, as there?s no way to make such a foolproof assertion in C++. > > My objection to the original comments is they don't explain the purpose of the explicit paddings, or how to "verify no compiler paddings are present". I think it's better to add a comment like: > > > // The explicit paddings are necessary for generating deterministic CDS archives. They prevent > // the C++ compiler from potentially inserting random values in unused gaps. > > > So if someone changes this file and the CDS tests fails, the comment will give them some ideas about how to fix the problem. This is the best, thanks. 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2509041030

From jsikstro at openjdk.org Mon Nov 10 08:57:12 2025
From: jsikstro at openjdk.org (Joel Sikström)
Date: Mon, 10 Nov 2025 08:57:12 GMT
Subject: Integrated: 8370813: Deprecate AggressiveHeap
In-Reply-To: <_tMyItJZEwt4YJq9EYQuDBVIs_1l4jFPXhncYIHy2TE=.337902e6-ed5d-46e5-8344-d858407acbee@github.com>
References: <_tMyItJZEwt4YJq9EYQuDBVIs_1l4jFPXhncYIHy2TE=.337902e6-ed5d-46e5-8344-d858407acbee@github.com>
Message-ID: 

On Wed, 5 Nov 2025 09:24:51 GMT, Joel Sikström wrote:

> Hello,
>
> This RFE deprecates the `AggressiveHeap` flag in JDK 26. Please see the CSR for specific details on why this flag is being deprecated and workarounds for users interested in keeping similar behavior in the future.

This pull request has now been integrated.

Changeset: 2c378e26
Author:    Joel Sikström
URL:       https://git.openjdk.org/jdk/commit/2c378e26d7319b6b0e273d2409dd3f591c5f5f6b
Stats:     16 lines in 3 files changed: 8 ins; 6 del; 2 mod

8370813: Deprecate AggressiveHeap

Reviewed-by: ayang, shade

-------------

PR: https://git.openjdk.org/jdk/pull/28144

From jsikstro at openjdk.org Mon Nov 10 08:57:10 2025
From: jsikstro at openjdk.org (Joel Sikström)
Date: Mon, 10 Nov 2025 08:57:10 GMT
Subject: RFR: 8370813: Deprecate AggressiveHeap
In-Reply-To: 
References: <_tMyItJZEwt4YJq9EYQuDBVIs_1l4jFPXhncYIHy2TE=.337902e6-ed5d-46e5-8344-d858407acbee@github.com>
Message-ID: 

On Wed, 5 Nov 2025 10:32:04 GMT, Albert Mingkun Yang wrote:

>> Hello,
>>
>> This RFE deprecates the `AggressiveHeap` flag in JDK 26. Please see the CSR for specific details on why this flag is being deprecated and workarounds for users interested in keeping similar behavior in the future.
>
> Marked as reviewed by ayang (Reviewer).

Thank you for the reviews!
@albertnetymk @shipilev

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28144#issuecomment-3510263882

From jsikstro at openjdk.org Mon Nov 10 09:05:46 2025
From: jsikstro at openjdk.org (Joel Sikström)
Date: Mon, 10 Nov 2025 09:05:46 GMT
Subject: RFR: 8370843: Deprecate AlwaysActAsServerClassMachine and NeverActAsServerClassMachine [v2]
In-Reply-To: 
References: 
Message-ID: 

> Hello,
>
> This RFE deprecates the `AlwaysActAsServerClassMachine` and `NeverActAsServerClassMachine` flags in JDK 26. Please see the CSR for specific details on why these flags are being deprecated.

Joel Sikström has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits:

 - Merge branch 'master' into JDK-8370843_deprecate_always_never_actasserverclassmachine
 - 8370843: Deprecate AlwaysActAsServerClassMachine and NeverActAsServerClassMachine

-------------

Changes: https://git.openjdk.org/jdk/pull/28148/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28148&range=01
  Stats: 44 lines in 3 files changed: 22 ins; 20 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/28148.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28148/head:pull/28148

PR: https://git.openjdk.org/jdk/pull/28148

From phubner at openjdk.org Mon Nov 10 09:16:05 2025
From: phubner at openjdk.org (Paul Hübner)
Date: Mon, 10 Nov 2025 09:16:05 GMT
Subject: RFR: 8371216: oopDesc::print_value_on breaks if klass is garbage [v2]
In-Reply-To: 
References: 
Message-ID: 

On Fri, 7 Nov 2025 14:35:23 GMT, Paul Hübner wrote:

>> Hi all,
>>
>> The `oopDesc::print_value_on` function checks if an oop is a string, and if so just prints the raw string. To do this, it needs to read the `klass()`. If the `klass()` reads garbage, one of many assertion errors is likely triggered.
>>
>> For example, if G1's verification finds problematic oops, it will attempt to print them.
If these oops have garbage (incorrect or racy) klasses, this will cause an assertion error, fail fast, and VM crash. G1 never finishes printing, which may make debugging more difficult. The developer can/will be made aware in other ways if the `klass()` is garbage, for example by being told that it is not in the metaspace. >> >> We observed the above in Valhalla and already patched it there. >> >> Testing: tiers 1-5 on Linux (x64, AArch64), macOS (x64, AArch64), Windows (x64). > > Paul Hübner has updated the pull request incrementally with one additional commit since the last revision: > > Don't make a new function. Thanks again for the reviews, everyone. I'll need a sponsor for this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28190#issuecomment-3510352540 From duke at openjdk.org Mon Nov 10 09:16:05 2025 From: duke at openjdk.org (duke) Date: Mon, 10 Nov 2025 09:16:05 GMT Subject: RFR: 8371216: oopDesc::print_value_on breaks if klass is garbage [v2] In-Reply-To: References: Message-ID: <3jiX63LGisUWJq-NH0u_tMKucmDrcdj6Taa5CRd8IL4=.7def3c5d-7902-4f58-bc01-48f0c7862807@github.com> On Fri, 7 Nov 2025 14:35:23 GMT, Paul Hübner wrote: >> Hi all, >> >> The `oopDesc::print_value_on` function checks if an oop is a string, and if so just prints the raw string. To do this, it needs to read the `klass()`. If the `klass()` reads garbage, one of many assertion errors is likely triggered. >> >> For example, if G1's verification finds problematic oops, it will attempt to print them. If these oops have garbage (incorrect or racy) klasses, this will cause an assertion error, fail fast, and VM crash. G1 never finishes printing, which may make debugging more difficult. The developer can/will be made aware in other ways if the `klass()` is garbage, for example by being told that it is not in the metaspace. >> >> We observed the above in Valhalla and already patched it there. >> >> Testing: tiers 1-5 on Linux (x64, AArch64), macOS (x64, AArch64), Windows (x64).
> > Paul Hübner has updated the pull request incrementally with one additional commit since the last revision: > > Don't make a new function. @Arraying Your change (at version b5991b7f23a32a96bfc4a419beb28adb5c5e257e) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28190#issuecomment-3510354543 From shade at openjdk.org Mon Nov 10 09:26:06 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 10 Nov 2025 09:26:06 GMT Subject: RFR: 8370966: Create regression test for the hierarchical memory limit fix in JDK-8370572 In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 16:41:26 GMT, Severin Gehwolf wrote: > Please review this test-only enhancement. It creates a regression test for the Amazon ECS setup on cgroups v1 where the parent memory limit isn't visible inside the container and, thus, needs to rely on the cg v1 specific `hierarchical_memory_limit` token in `memory.stat`. The proposed test is cg v1 only and needs to be run as root. It's skipped otherwise. It's useful to have when working on refactorings like #27743 so as not to regress. > > The other changes are an effort to reduce code duplication in the test code where similar patterns have been used in other container tests. > > Testing (all on Linux x86_64): > - [x] CG version 2, run as root. Engine: docker. Test is skipped. > - [x] CG version 1, run as root. Engine: docker. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) > - [x] CG version 1, run as root. Engine: podman. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) > - [X] CG version 1, run as non-root. Test skipped. > - [x] GHA, though I don't think this is very useful for this change. > > Thoughts? Looks fine, a few nits.
test/hotspot/jtreg/containers/docker/TestMemoryInvisibleParent.java line 96: > 94: opts.addDockerOpts("--cgroup-parent=/" + cgroupParent); > 95: Common.run(opts) > 96: .shouldContain("Hierarchical Memory Limit is: " + expectedValue); Indenting is a bit off here. test/hotspot/jtreg/containers/docker/TestMemoryInvisibleParent.java line 107: > 105: Path sysFsMemory = Path.of("/", "sys", "fs", "cgroup", "memory"); > 106: Path cgroupParentPath = sysFsMemory.resolve(cgroupParent); > 107: ProcessBuilder pb = new ProcessBuilder("mkdir", "-p", cgroupParentPath.toString()); So I am guessing we are fine with leaving this cgroup behind, after the test is done? ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28201#pullrequestreview-3441857489 PR Review Comment: https://git.openjdk.org/jdk/pull/28201#discussion_r2509492642 PR Review Comment: https://git.openjdk.org/jdk/pull/28201#discussion_r2509498605 From phubner at openjdk.org Mon Nov 10 09:28:13 2025 From: phubner at openjdk.org (Paul Hübner) Date: Mon, 10 Nov 2025 09:28:13 GMT Subject: Integrated: 8371216: oopDesc::print_value_on breaks if klass is garbage In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 09:08:33 GMT, Paul Hübner wrote: > Hi all, > > The `oopDesc::print_value_on` function checks if an oop is a string, and if so just prints the raw string. To do this, it needs to read the `klass()`. If the `klass()` reads garbage, one of many assertion errors is likely triggered. > > For example, if G1's verification finds problematic oops, it will attempt to print them. If these oops have garbage (incorrect or racy) klasses, this will cause an assertion error, fail fast, and VM crash. G1 never finishes printing, which may make debugging more difficult. The developer can/will be made aware in other ways if the `klass()` is garbage, for example by being told that it is not in the metaspace.
> > We observed the above in Valhalla and already patched it there. > > Testing: tiers 1-5 on Linux (x64, AArch64), macOS (x64, AArch64), Windows (x64). This pull request has now been integrated. Changeset: f48ad21e Author: Paul Hübner Committer: Martin Doerr URL: https://git.openjdk.org/jdk/commit/f48ad21ecc288c280db3ffb2e098df12518e2a5a Stats: 4 lines in 1 file changed: 3 ins; 0 del; 1 mod 8371216: oopDesc::print_value_on breaks if klass is garbage Reviewed-by: coleenp, mdoerr ------------- PR: https://git.openjdk.org/jdk/pull/28190 From sgehwolf at openjdk.org Mon Nov 10 09:29:01 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 09:29:01 GMT Subject: RFR: 8370966: Create regression test for the hierarchical memory limit fix in JDK-8370572 In-Reply-To: References: Message-ID: <0aHFgeT-Ikjw_Ad_evc1OoWZ97gBC-nD-_8UdoXiyO4=.ff3d18a7-35d1-4828-b8bb-3cd9db962283@github.com> On Mon, 10 Nov 2025 09:23:36 GMT, Aleksey Shipilev wrote: >> Please review this test-only enhancement. It creates a regression test for the Amazon ECS setup on cgroups v1 where the parent memory limit isn't visible inside the container and, thus, needs to rely on the cg v1 specific `hierarchical_memory_limit` token in `memory.stat`. The proposed test is cg v1 only and needs to be run as root. It's skipped otherwise. It's useful to have when working on refactorings like #27743 so as not to regress. >> >> The other changes are an effort to reduce code duplication in the test code where similar patterns have been used in other container tests. >> >> Testing (all on Linux x86_64): >> - [x] CG version 2, run as root. Engine: docker. Test is skipped. >> - [x] CG version 1, run as root. Engine: docker. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) >> - [x] CG version 1, run as root. Engine: podman.
Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) >> - [X] CG version 1, run as non-root. Test skipped. >> - [x] GHA, though I don't think this is very useful for this change. >> >> Thoughts? > > test/hotspot/jtreg/containers/docker/TestMemoryInvisibleParent.java line 107: > >> 105: Path sysFsMemory = Path.of("/", "sys", "fs", "cgroup", "memory"); >> 106: Path cgroupParentPath = sysFsMemory.resolve(cgroupParent); >> 107: ProcessBuilder pb = new ProcessBuilder("mkdir", "-p", cgroupParentPath.toString()); > > So I am guessing we are fine with leaving this cgroup behind, after the test is done? Yes. Unfortunately it's not easy to remove that. However, the test itself resets it to "unlimited" by setting the limit to `-1`. So it shouldn't make a difference. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28201#discussion_r2509510367 From mdoerr at openjdk.org Mon Nov 10 09:34:18 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 10 Nov 2025 09:34:18 GMT Subject: RFR: 8371216: oopDesc::print_value_on breaks if klass is garbage [v2] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 09:13:15 GMT, Paul Hübner wrote: >> Paul Hübner has updated the pull request incrementally with one additional commit since the last revision: >> >> Don't make a new function. > > Thanks again for the reviews, everyone. I'll need a sponsor for this. @Arraying: You can enable GitHub Actions in your fork. You will get automatic builds on many platforms and some tests for future PRs.
------------- PR Comment: https://git.openjdk.org/jdk/pull/28190#issuecomment-3510430302 From ayang at openjdk.org Mon Nov 10 09:45:15 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 10 Nov 2025 09:45:15 GMT Subject: RFR: 8370843: Deprecate AlwaysActAsServerClassMachine and NeverActAsServerClassMachine [v2] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 09:05:46 GMT, Joel Sikström wrote: >> Hello, >> >> This RFE deprecates the `AlwaysActAsServerClassMachine` and `NeverActAsServerClassMachine` flags in JDK 26. Please see the CSR for specific details on why these flags are being deprecated. > > Joel Sikström has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: > > - Merge branch 'master' into JDK-8370843_deprecate_always_never_actasserverclassmachine > - 8370843: Deprecate AlwaysActAsServerClassMachine and NeverActAsServerClassMachine Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28148#pullrequestreview-3441968715 From jsikstro at openjdk.org Mon Nov 10 09:45:16 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Mon, 10 Nov 2025 09:45:16 GMT Subject: RFR: 8370843: Deprecate AlwaysActAsServerClassMachine and NeverActAsServerClassMachine [v2] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 09:38:28 GMT, Albert Mingkun Yang wrote: >> Joel Sikström has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: >> >> - Merge branch 'master' into JDK-8370843_deprecate_always_never_actasserverclassmachine >> - 8370843: Deprecate AlwaysActAsServerClassMachine and NeverActAsServerClassMachine > > Marked as reviewed by ayang (Reviewer). Thank you for the reviews!
@albertnetymk @vnkozlov ------------- PR Comment: https://git.openjdk.org/jdk/pull/28148#issuecomment-3510477707 From jsikstro at openjdk.org Mon Nov 10 09:45:16 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Mon, 10 Nov 2025 09:45:16 GMT Subject: Integrated: 8370843: Deprecate AlwaysActAsServerClassMachine and NeverActAsServerClassMachine In-Reply-To: References: Message-ID: <50wP2NX7o10fHbIP0L1Cj49wHI3O2pt1KAb6pliw6YU=.dec4bfc6-9547-49d6-86ba-a81e03a75f2e@github.com> On Wed, 5 Nov 2025 10:16:40 GMT, Joel Sikström wrote: > Hello, > > This RFE deprecates the `AlwaysActAsServerClassMachine` and `NeverActAsServerClassMachine` flags in JDK 26. Please see the CSR for specific details on why these flags are being deprecated. This pull request has now been integrated. Changeset: c0b82ff2 Author: Joel Sikström URL: https://git.openjdk.org/jdk/commit/c0b82ff2e5b696371de62e0f4fcbba61361fc6b2 Stats: 44 lines in 3 files changed: 22 ins; 20 del; 2 mod 8370843: Deprecate AlwaysActAsServerClassMachine and NeverActAsServerClassMachine Reviewed-by: ayang, kvn ------------- PR: https://git.openjdk.org/jdk/pull/28148 From snazarki at openjdk.org Mon Nov 10 09:48:08 2025 From: snazarki at openjdk.org (Sergey Nazarkin) Date: Mon, 10 Nov 2025 09:48:08 GMT Subject: RFR: 8363943: ARM32: Represent Registers as values In-Reply-To: References: Message-ID: <1e0DoYAYPXsbdJ2Y91T1wviv_vDQkusrymvXrQmFFlg=.4c01e673-924a-496f-9bd3-94e1955170ab@github.com> On Tue, 29 Jul 2025 06:46:56 GMT, Ivan wrote: > Migrate away from pointer-based representation of Register values. > > It improves compile-time checking by forbidding implicit conversions between integrals and pointers. > > [JDK-8363943](https://bugs.openjdk.org/browse/JDK-8363943) @bobvandette @shipilev Could you please review these changes? They are critical for the ARM32 port, which crashes if it is built with one of the latest versions of GCC. @bulasevich you might be interested in this too.
------------- PR Comment: https://git.openjdk.org/jdk/pull/26525#issuecomment-3510493435 PR Comment: https://git.openjdk.org/jdk/pull/26525#issuecomment-3510499550 From alanb at openjdk.org Mon Nov 10 10:06:31 2025 From: alanb at openjdk.org (Alan Bateman) Date: Mon, 10 Nov 2025 10:06:31 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v10] In-Reply-To: References: Message-ID: > Implementation changes for [JEP 500: Prepare to Make Final Mean Final](https://openjdk.org/jeps/500). > > Field.set (and Lookup.unreflectSetter) are changed to allow/warn/debug/deny when mutating a final instance field. JFR event recorded if final field mutated. Spec updates to Field.set, Field.setAccessible and Module.addOpens to align with the proposal in the JEP. > > HotSpot is updated to add support for the new command line options. To aid diagnosability, -Xcheck:jni reports a warning and -Xlog:jni=debug logs a message to help identify JNI code that mutates finals. For now, JNI code is allowed to set the "write-protected" fields System.in/out/err without a warning, we can re-visit once we change the System.setIn/setOut/setErr methods to not use JNI (I prefer to keep this separate to this PR because there is a small startup regression to address when changing System.setXXX). > > There are many new tests. A small number of existing tests are changed to run /othervm as reflectively opening a package isn't sufficient. Changing the tests to /othervm means that jtreg will launch the agent with the command line options to open the package. > > Testing: tier1-6 Alan Bateman has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 45 commits: - Merge branch 'master' into JDK-8353835 - Fix typo in test comment - Merge branch 'master' into JDK-8353835 - Merge branch 'master' into JDK-8353835 - Suppress warnings from some tests - Change -Xcheck:jni to be warning rather than fatal error - Merge branch 'master' into JDK-8353835 - Simplify filter - Merge branch 'master' into JDK-8353835 - Update Xcheck:jni description - ... and 35 more: https://git.openjdk.org/jdk/compare/066810c8...6671ae02 ------------- Changes: https://git.openjdk.org/jdk/pull/25115/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25115&range=09 Stats: 4852 lines in 70 files changed: 4667 ins; 54 del; 131 mod Patch: https://git.openjdk.org/jdk/pull/25115.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25115/head:pull/25115 PR: https://git.openjdk.org/jdk/pull/25115 From thartmann at openjdk.org Mon Nov 10 10:16:06 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 10 Nov 2025 10:16:06 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 In-Reply-To: References: Message-ID: <7LtvXdcfrqOlHXnsZlsA5Kvf2RJXVt1I8IoVYz7Uj90=.eda5d489-7fbf-4051-88be-fd3c183761f6@github.com> On Fri, 7 Nov 2025 12:00:23 GMT, Tobias Hartmann wrote: > Sure, I submitted testing. All testing passed. I leave it to the original reviewers to re-review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28192#issuecomment-3510638686 From vyazici at openjdk.org Mon Nov 10 10:18:10 2025 From: vyazici at openjdk.org (Volkan Yazici) Date: Mon, 10 Nov 2025 10:18:10 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v10] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 10:06:31 GMT, Alan Bateman wrote: >> Implementation changes for [JEP 500: Prepare to Make Final Mean Final](https://openjdk.org/jeps/500). >> >> Field.set (and Lookup.unreflectSetter) are changed to allow/warn/debug/deny when mutating a final instance field. 
JFR event recorded if final field mutated. Spec updates to Field.set, Field.setAccessible and Module.addOpens to align with the proposal in the JEP. >> >> HotSpot is updated to add support for the new command line options. To aid diagnosability, -Xcheck:jni reports a warning and -Xlog:jni=debug logs a message to help identify JNI code that mutates finals. For now, JNI code is allowed to set the "write-protected" fields System.in/out/err without a warning, we can re-visit once we change the System.setIn/setOut/setErr methods to not use JNI (I prefer to keep this separate to this PR because there is a small startup regression to address when changing System.setXXX). >> >> There are many new tests. A small number of existing tests are changed to run /othervm as reflectively opening a package isn't sufficient. Changing the tests to /othervm means that jtreg will launch the agent with the command line options to open the package. >> >> Testing: tier1-6 > > Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 45 commits: > > - Merge branch 'master' into JDK-8353835 > - Fix typo in test comment > - Merge branch 'master' into JDK-8353835 > - Merge branch 'master' into JDK-8353835 > - Suppress warnings from some tests > - Change -Xcheck:jni to be warning rather than fatal error > - Merge branch 'master' into JDK-8353835 > - Simplify filter > - Merge branch 'master' into JDK-8353835 > - Update Xcheck:jni description > - ... and 35 more: https://git.openjdk.org/jdk/compare/066810c8...6671ae02 Marked as reviewed by vyazici (Committer).
------------- PR Review: https://git.openjdk.org/jdk/pull/25115#pullrequestreview-3442211078 From egahlin at openjdk.org Mon Nov 10 10:36:04 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Mon, 10 Nov 2025 10:36:04 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v6] In-Reply-To: References: Message-ID: On Thu, 6 Nov 2025 01:59:41 GMT, Francesco Andreuzzi wrote: >> In this PR I introduce a new JFR event: `jdk.StringDeduplication` >> >> The new event is emitted every time a deduplication cycle happens. >> >> Passes tier1 and tier2 (fastdebug). > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > rename. start/end time This looks better, but I think activeElapsed can be removed since we now have it in duration. I'm not sure idle should be included, unless it is believed to be important. activeElapsed = 0.00124 ms processElapsed = 0.00100 ms idleElapsed = 0.000780 ms resizeTableElapsed = 0 s cleanupTableElapsed = 0 s An argument can be made that the phases should be separate events, similar to CompilerPhase and GCPausePhase, where you have a name for each phase (String Processing, Table Resize and Table Cleanup), but it may be over-engineering if we don't believe these phases will change in the future? The suffix "Elapsed" is not something we have used for describing a timespan. 
I wonder if the fields should be: processing tableResize tableCleanup ------------- PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3510740806 From mdoerr at openjdk.org Mon Nov 10 10:54:07 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 10 Nov 2025 10:54:07 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 11:07:40 GMT, Ruben wrote: > The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. > > The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. > > According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. I haven't checked the deopt handler size on x86. Otherwise, it still looks good to me. src/hotspot/cpu/x86/nativeInst_x86.hpp line 585: > 583: }; > 584: > 585: bool check() const { return short_at(0) == 0x1f0f && short_at(2) == 0x0084; } Maybe a comment would be nice. ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28192#pullrequestreview-3442431963 PR Review Comment: https://git.openjdk.org/jdk/pull/28192#discussion_r2509917846 From mdoerr at openjdk.org Mon Nov 10 11:09:44 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 10 Nov 2025 11:09:44 GMT Subject: RFR: 8370244: [PPC64] Several vector tests fail on Power8 Message-ID: This is a workaround for [JDK-8370803](https://bugs.openjdk.org/browse/JDK-8370803). Power8 uses `MaxVectorSize`=8 by default. All tests are passing with `EnableVectorSupport` disabled. 
------------- Commit messages: - 8370244: [PPC64] Several vector tests fail on Power8 Changes: https://git.openjdk.org/jdk/pull/28214/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28214&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8370244 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28214.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28214/head:pull/28214 PR: https://git.openjdk.org/jdk/pull/28214 From shade at openjdk.org Mon Nov 10 13:00:19 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 10 Nov 2025 13:00:19 GMT Subject: RFR: 8363943: ARM32: Represent Registers as values In-Reply-To: References: Message-ID: On Tue, 29 Jul 2025 06:46:56 GMT, Ivan wrote: > Migrate away from pointer-based representation of Register values. > > It improves compile-time checking by forbidding implicit conversions between integrals and pointers. > > [JDK-8363943](https://bugs.openjdk.org/browse/JDK-8363943) Good work, looks pretty reasonable. Questions/comments/nits: src/hotspot/cpu/arm/register_arm.hpp line 86: > 84: enum { > 85: number_of_registers = 16, > 86: max_slots_per_register = 1 << (LogBytesPerWord - LogBytesPerInt) // LogBytesPerWord depends on _LP64 ARM32 is only 32-bit, so we can skip any _LP64-based computations, and just do the literal constant. 
src/hotspot/cpu/arm/register_arm.hpp line 101: > 99: > 100: // testers > 101: bool is_valid() const {return 0 <= raw_encoding() && raw_encoding() < number_of_registers;} Suggestion: // accessors and testers int raw_encoding() const { return this - first(); } int encoding() const { assert(is_valid(), "invalid register"); return raw_encoding(); } bool is_valid() const { return 0 <= raw_encoding() && raw_encoding() < number_of_registers; } src/hotspot/cpu/arm/register_arm.hpp line 152: > 150: constexpr Register R9 = as_Register(9); > 151: constexpr Register R10 = as_Register(10); > 152: constexpr Register R11 = as_Register(11); Indent these like: constexpr Register R8 = as_Register( 8); constexpr Register R9 = as_Register( 9); constexpr Register R10 = as_Register(10); constexpr Register R11 = as_Register(11); src/hotspot/cpu/arm/register_arm.hpp line 187: > 185: enum { > 186: number_of_registers = NOT_COMPILER2(32) COMPILER2_PRESENT(64), > 187: max_slots_per_register = 1 Can you double-check it is really `1`? For GPRs, we have `max_slots_per_register` at effectively `2`. src/hotspot/cpu/arm/register_arm.hpp line 202: > 200: > 201: // testers > 202: bool is_valid() const {return 0 <= raw_encoding() && raw_encoding() < number_of_registers;} Suggestion: // accessors and testers int raw_encoding() const { return this - first(); } int encoding() const { assert(is_valid(), "invalid register"); return raw_encoding(); } bool is_valid() const { return 0 <= raw_encoding() && raw_encoding() < number_of_registers; } src/hotspot/cpu/arm/register_arm.hpp line 264: > 262: constexpr FloatRegister S4_reg = as_FloatRegister(4); > 263: constexpr FloatRegister S5_reg = as_FloatRegister(5); > 264: constexpr FloatRegister S6_reg = as_FloatRegister(6); Take a chance on renaming these `S${X}_reg` to just `S${X}`? I spot-checked their usages, and there are only a few places that need adjustments. 
src/hotspot/cpu/arm/register_arm.hpp line 265: > 263: constexpr FloatRegister S5_reg = as_FloatRegister(5); > 264: constexpr FloatRegister S6_reg = as_FloatRegister(6); > 265: constexpr FloatRegister S7 = as_FloatRegister(7); Also, indent this like: constexpr FloatRegister S8 = as_FloatRegister( 8); constexpr FloatRegister S9 = as_FloatRegister( 9); constexpr FloatRegister S10 = as_FloatRegister(10); constexpr FloatRegister S11 = as_FloatRegister(11); src/hotspot/cpu/arm/register_arm.hpp line 427: > 425: constexpr VFPSystemRegister FPSCR = as_VFPSystemRegister( 1); > 426: constexpr VFPSystemRegister MVFR0 = as_VFPSystemRegister(0x6); > 427: constexpr VFPSystemRegister MVFR1 = as_VFPSystemRegister(0x7); You can use `VFPSystemRegister` enum values as arguments here, correct? Like: constexpr VFPSystemRegister MVFR1 = as_VFPSystemRegister(VFPSystemRegister::MVFR1); src/hotspot/cpu/arm/vmreg_arm.hpp line 52: > 50: return (value() % Register::max_slots_per_register == 0); > 51: } else if (is_FloatRegister()) { > 52: return true; // Single slot I guess. But for safety, we can still do `% FloatRegister::max_slot_per_register == 0`, just in case we ever need to adjust it? 
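For readers following the review comments above, here is a minimal stand-alone sketch of what a value-based register type with `raw_encoding()`, `encoding()` and `is_valid()` might look like. It is not the code from the PR: the names `Reg`, `as_Reg` and `noreg` are hypothetical, and the register is modelled as a plain wrapped `int` rather than HotSpot's actual representation. The point it illustrates is the one made in the PR description, namely that an explicit, private constructor stops integers from converting implicitly into registers.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical, simplified stand-in for a value-based register type.
// The real HotSpot classes differ; this only illustrates the idea.
class Reg {
  int _encoding;

  // Private and explicit: an int never silently becomes a register.
  constexpr explicit Reg(int encoding) : _encoding(encoding) {}
  friend constexpr Reg as_Reg(int encoding);

 public:
  enum { number_of_registers = 16 };

  // Default-constructed value plays the role of "no register".
  constexpr Reg() : _encoding(-1) {}

  // accessors and testers
  constexpr int raw_encoding() const { return _encoding; }
  constexpr bool is_valid() const {
    return 0 <= _encoding && _encoding < number_of_registers;
  }
  int encoding() const {
    assert(is_valid() && "invalid register");
    return _encoding;
  }

  constexpr bool operator==(Reg other) const { return _encoding == other._encoding; }
  constexpr bool operator!=(Reg other) const { return _encoding != other._encoding; }
};

// The only way to build a register from a number is this named factory.
constexpr Reg as_Reg(int encoding) { return Reg(encoding); }

constexpr Reg noreg = Reg();
constexpr Reg R8  = as_Reg( 8);
constexpr Reg R9  = as_Reg( 9);
constexpr Reg R10 = as_Reg(10);
constexpr Reg R11 = as_Reg(11);

// Compile-time check of the property the PR description mentions.
static_assert(!std::is_convertible<int, Reg>::value,
              "integers must not convert implicitly to registers");
```

With this shape, a call such as `emit(R10)` type-checks while `emit(10)` does not compile, which is the kind of mistake a pointer-based representation can let through.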
------------- PR Review: https://git.openjdk.org/jdk/pull/26525#pullrequestreview-3442536328 PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2510005169 PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2510013454 PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2510042736 PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2510467475 PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2510025802 PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2510036921 PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2510039259 PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2510477885 PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2510485666 From sgehwolf at openjdk.org Mon Nov 10 13:10:09 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:10:09 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Mon, 27 Oct 2025 14:13:57 GMT, Thomas Fitzsimmons wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 627: > >> 625: * >> 626: * If quotas have not been specified, return the >> 627: * number of active processors in the system. > > This paragraph uses the "return" language that you adjusted in the next paragraph. It should probably also refer to the reference argument instead. Thanks, fixed. > src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 629: > >> 627: * number of active processors in the system. >> 628: * >> 629: * If quotas have been specified, the resulting number > > Tiny nit, but "the resulting number" => "the number", since you say "the result reference" on the next line. Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510514731 PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510515207 From sgehwolf at openjdk.org Mon Nov 10 13:13:05 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:13:05 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Fri, 24 Oct 2025 09:50:33 GMT, Thomas Stuefe wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 638: > >> 636: bool CgroupSubsystem::active_processor_count(int& value) { >> 637: int cpu_count; >> 638: int result = -1; > > Why not get rid of result and use `value` throughout like you did in the cached case? It's useful to do assertions on the value retrieved by `CgroupUtil::processor_count()` before the actual result is being changed. Using `value` has the issue of not knowing what the reference default value was set to. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510522176 From adinn at openjdk.org Mon Nov 10 13:19:08 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Mon, 10 Nov 2025 13:19:08 GMT Subject: RFR: 8347396: Efficient TypeFunc creations [v2] In-Reply-To: References: Message-ID: On Mon, 27 Oct 2025 05:11:47 GMT, Harshit470250 wrote: >> This PR makes changes to the GC TypeFunc creation similar to those done by [JDK-8330851](https://bugs.openjdk.org/browse/JDK-8330851), as suggested by [JDK-8347396](https://bugs.openjdk.org/browse/JDK-8347396). As discussed in [https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686,](https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686) I have put a guard on the Shenandoah GC specific part of the code. > > Harshit470250 has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge master > - update make_barrier_type > - Merge branch 'openjdk:master' into new_pr > - Merge branch 'openjdk:master' into new_pr > - My chages Looks good to me.
@Harshit470250 You need another reviewer before you can push this. Perhaps @dean-long can help -- he reviewed the earlier commit which led to this one being created. ------------- Marked as reviewed by adinn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27279#pullrequestreview-3443199481 PR Comment: https://git.openjdk.org/jdk/pull/27279#issuecomment-3511586105 From sgehwolf at openjdk.org Mon Nov 10 13:19:07 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:19:07 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Mon, 27 Oct 2025 14:19:19 GMT, Thomas Fitzsimmons wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 653: > >> 651: cpu_count = os::Linux::active_processor_count(); >> 652: if (!CgroupUtil::processor_count(contrl->controller(), cpu_count, result)) { >> 653: return false; > > `value` will be returned unchanged from its passed-in value here. I wonder if it would be safer to explicitly set it to `0` when returning `false`. Also, could `value` be given an unsigned type, like `uint64_t`? 
The general contract in those functions is that the result reference is unchanged when `false` is being returned. So this is intentional. > Also, could value be given an unsigned type, like uint64_t I've tried to keep the `int` based processor_count API as is. Not sure if we need an unsigned type here. We could if that's the consensus, but then it would make the patch even larger. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510540986 From sgehwolf at openjdk.org Mon Nov 10 13:24:04 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:24:04 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Mon, 27 Oct 2025 14:29:11 GMT, Thomas Fitzsimmons wrote: > I think quote value_unlimited here to hint that it is a constant defined elsewhere. OK. > Can the limit ever be 0, and if not, should there be a new assert for > 0 like for cpu_count? The limit could theoretically be `0`. I'd try to avoid an overzealous assert here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510555699 From sgehwolf at openjdk.org Mon Nov 10 13:30:18 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:30:18 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Fri, 24 Oct 2025 11:22:01 GMT, Thomas Stuefe wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupSubsystem_linux.hpp line 80: > >> 78: return false; \ >> 79: } \ >> 80: log_trace(os, container)(log_string " is: " UINT64_FORMAT, retval); \ > > Here and in other places: don't use raw UINT64_FORMAT; use `PHYS_MEM_TYPE_FORMAT` instead. This is intentional since the processor_count API doesn't use `physical_memory_size_type` (as it doesn't make sense in this context). See, for example, `CgroupV2CpuController::cpu_period()`. The common denominator is `uint64_t`. This is a bit awkward, but I don't know a better way to deal with this. The reading functions are shared, most of the API is used for memory value reading (but not exclusively, exceptions are `pid`, `cpu`). 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510577587 From adinn at openjdk.org Mon Nov 10 13:32:18 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Mon, 10 Nov 2025 13:32:18 GMT Subject: RFR: 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly [v2] In-Reply-To: <0Qe86N42BrG0cQ0dA1940SRqpp-0DD68UHYD3Gz3YAg=.d6bcb486-5c7b-4738-a74d-b34c9be2a70c@github.com> References: <0Qe86N42BrG0cQ0dA1940SRqpp-0DD68UHYD3Gz3YAg=.d6bcb486-5c7b-4738-a74d-b34c9be2a70c@github.com> Message-ID: <-Ehjjc2u5iw9WxmX0Bsf34IK6kGIm3xitYrHDetYG_U=.29e9f5f2-a9b3-44c5-9cb7-30feb4f4ccff@github.com> On Fri, 7 Nov 2025 16:22:39 GMT, Ashutosh Mehra wrote: >> The closure passed to `HashTable::iterate` in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` returns an incorrect value. If the search is successful, it should return false to terminate the iteration, but it is returning true. This patch fixes the return value of these closures. >> In addition, I noticed `CompactHashTable::iterate` goes through all entries unconditionally, which is not optimal for cases where we may want to terminate the iteration when some condition is met. This is the case in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` when it iterates over `_aot_adapter_handler_table`. This patch updates `CompactHashTable::iterate` to be the same as `HashTable::iterate` by using the return value of the closure to determine if the iteration should continue or abort. It also adds `CompactHashTable::iterate_all` to iterate over all the values unconditionally and the users of `CompactHashTable::iterate` are updated to use `CompactHashTable::iterate_all`.
> > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Add comments > > Signed-off-by: Ashutosh Mehra Looks good src/hotspot/share/classfile/compactHashtable.hpp line 297: > 295: } > 296: > 297: // Iterate through the values in the table, stopping when do_value() return false. Suggestion: // Iterate through the values in the table, stopping when do_value() return false. // Iterate through the values in the table, stopping when iter->do_value() returns false. ------------- Marked as reviewed by adinn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28197#pullrequestreview-3443265916 PR Review Comment: https://git.openjdk.org/jdk/pull/28197#discussion_r2510575719 From sgehwolf at openjdk.org Mon Nov 10 13:34:21 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:34:21 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Fri, 24 Oct 2025 12:03:26 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupSubsystem_linux.hpp line 90: > >> 88: if (!is_ok) { \ >> 89: log_trace(os, container)(log_string " failed: -2"); \ >> 90: return false; \ > > Do we need to keep the `-2` here? Or could we perhaps change to a better message? We don't need the `-2` here. This was an attempt to keep backwards compatible, but I guess we can change testing code as well (at least those that rely on those trace logs). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510588965 From sgehwolf at openjdk.org Mon Nov 10 13:42:15 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:42:15 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Fri, 24 Oct 2025 12:04:51 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupSubsystem_linux.hpp line 93: > >> 91: } \ >> 92: if (retval == value_unlimited) { \ >> 93: log_trace(os, container)(log_string " is: -1"); \ > > Same here, could perhaps do `log_trace(os, container)(log_string " is: unlimited")`instead. OK. This will likely need some test adjustment, but I'll do that instead of hard-coding those numbers. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510609021 From sgehwolf at openjdk.org Mon Nov 10 13:42:18 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:42:18 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Fri, 24 Oct 2025 11:23:05 GMT, Thomas Stuefe wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupSubsystem_linux.hpp line 105: > >> 103: is_ok = controller->read_string(filename, retval, buf_size); \ >> 104: if (!is_ok) { \ >> 105: log_trace(os, container)(log_string " failed: -2"); \ > > Why this change? Did the constant value change? Motivation was getting rid of the OSCONTAINER_ERROR constant. The only place where a negative number was still in use. I've just dropped the `: -2` suffix now. It's not very useful (other than in tests). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510615109 From tschatzl at openjdk.org Mon Nov 10 13:47:20 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 10 Nov 2025 13:47:20 GMT Subject: RFR: 8371321: Remove unused last arg of BarrierSetAssembler::arraycopy_epilogue In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 10:10:02 GMT, Albert Mingkun Yang wrote: > Removing effectively dead code. > > Test: tier1, GHA Looks good. Thanks for cleaning this up. ------------- Marked as reviewed by tschatzl (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28146#pullrequestreview-3443342641 From sgehwolf at openjdk.org Mon Nov 10 13:50:26 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:50:26 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: <5Ossha9mznuIOp64P8MfLZaLaubRFuaVH1jGQEu6Hb0=.82d5744d-3d56-4ac6-8b19-c9664717069f@github.com> On Mon, 27 Oct 2025 19:28:08 GMT, Thomas Fitzsimmons wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 167: > >> 165: /* memory_and_swap_limit_in_bytes >> 166: * >> 167: * Determine the memory and swap limit metric. Returns a positive limit value or > > "Returns" language should probably be updated here too. Thanks! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510646920 From sgehwolf at openjdk.org Mon Nov 10 13:50:29 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:50:29 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Tue, 28 Oct 2025 09:26:09 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 465: > >> 463: // negative value as a large unsiged int >> 464: if (!reader()->read_number("/cpu.cfs_quota_us", quota)) { >> 465: log_trace(os, container)("CPU Quota failed: -2"); > > Do we need to keep the `-2` here? Or could we perhaps change to a better message? I've dropped `: -2` suffix now. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510649413 From sgehwolf at openjdk.org Mon Nov 10 13:55:39 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:55:39 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Mon, 27 Oct 2025 19:48:58 GMT, Thomas Fitzsimmons wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 470: > >> 468: // cast to int since the read value might be negative >> 469: // and we want to avoid logging -1 as a large unsigned value. >> 470: int quota_int = static_cast<int>(quota); > > It seems like quota is either a positive number or disabled. I wonder if `result` can be treated as a `uint64_t`, and this log message special-cased to detect `-1` read from `/cpu.cfs_quota_us` as disabled. I guess the calling code would need another way to differentiate "disabled" from other values... maybe with `0`? Just a thought to maybe simplify the type logic here. Likewise for `period` and `shares`. Yes, there is opportunity to change the API. This patch was done to do a 1-to-1 translation of the previous version as much as possible.
So I've refrained from doing this in this patch as well. It kept the size of the patch a bit more manageable. Happy to file a follow-up RFE to do this in a separate patch. Thoughts? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510666794 From sgehwolf at openjdk.org Mon Nov 10 14:09:54 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 14:09:54 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Mon, 27 Oct 2025 11:32:34 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 61: > >> 59: * true if the result reference got updated >> 60: * false if there was an error >> 61: */ > > We set result to `-1` and return true on a no share setup here, but return `false` and don't on cgroup v1. The comment is contradicting. Good catch. Fixed the cgroup v1 code to match the old behaviour (set `-1` in the result reference and return `true` if we read the default value). I think this fixes the issue. 
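The contract discussed in this thread (return `true` and update the result reference on success, return `false` and leave it untouched on error, and report `-1` with `true` when only the default share value is read) can be sketched as follows. This is a minimal illustration and not the HotSpot code: `RawRead`, `kDefaultShares` and this `cpu_shares` signature are hypothetical, and the default of 1024 is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the outcome of reading an interface file.
struct RawRead {
  bool     ok;     // false models a read error
  uint64_t value;  // only meaningful when ok is true
};

// Assumed default share value meaning "no shares set up" (illustrative).
constexpr uint64_t kDefaultShares = 1024;

// Returns true and updates 'result' on success.
// Returns false and leaves 'result' untouched on a read error.
bool cpu_shares(const RawRead& raw, int& result) {
  if (!raw.ok) {
    return false;                 // error: caller's value is preserved
  }
  if (raw.value == kDefaultShares) {
    result = -1;                  // default value read: report -1, still a success
    return true;
  }
  result = static_cast<int>(raw.value);
  return true;
}
```

A caller can therefore initialise its variable to any value and rely on it being preserved whenever the function returns `false`.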
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510712320 From sgehwolf at openjdk.org Mon Nov 10 14:13:56 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 14:13:56 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Mon, 27 Oct 2025 11:43:34 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 178: > >> 176: bool is_ok = reader()->read_numerical_key_value("/cpu.stat", "usage_usec", value); >> 177: if (!is_ok) { >> 178: log_trace(os, container)("CPU Usage failed: -2"); > > Do we need to keep the `-2` here? Or could we perhaps change to a better message? Thanks. I've removed the `-2`. > src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 237: > >> 235: if (!reader()->read_number_handle_max("/memory.swap.max", swap_limit_val)) { >> 236: // Some container tests rely on this trace logging to happen. >> 237: log_trace(os, container)("Swap Limit failed: -2"); > > Do we need to keep the `-2` here? Or could we perhaps change to a better message? I've removed the `-2`. I.e. `Swap Limit failed` is the log message. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510722882 PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510726448 From dbriemann at openjdk.org Mon Nov 10 14:16:00 2025 From: dbriemann at openjdk.org (David Briemann) Date: Mon, 10 Nov 2025 14:16:00 GMT Subject: RFR: 8370244: [PPC64] Several vector tests fail on Power8 In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 11:01:33 GMT, Martin Doerr wrote: > This is a workaround for [JDK-8370803](https://bugs.openjdk.org/browse/JDK-8370803). Power8 uses `MaxVectorSize`=8 by default. All tests are passing with `EnableVectorSupport` disabled. Lgtm. Thanks! ------------- Marked as reviewed by dbriemann (Committer). PR Review: https://git.openjdk.org/jdk/pull/28214#pullrequestreview-3443471272 From fitzsim at openjdk.org Mon Nov 10 14:27:50 2025 From: fitzsim at openjdk.org (Thomas Fitzsimmons) Date: Mon, 10 Nov 2025 14:27:50 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: <4ECXYwsoJz7nYkDcPFo6R2y7L56KufBRH7ox-7_Proo=.adddc389-8462-4872-8bb8-00bbfa10e2ef@github.com> On Mon, 10 Nov 2025 13:53:11 GMT, Severin Gehwolf wrote: >> src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 470: >> >>> 468: // cast to int since the read value might be negative >>> 469: // and we want to avoid logging -1 as a large unsigned value. >>> 470: int quota_int = static_cast<int>(quota); >> >> It seems like quota is either a positive number or disabled. I wonder if `result` can be treated as a `uint64_t`, and this log message special-cased to detect `-1` read from `/cpu.cfs_quota_us` as disabled. I guess the calling code would need another way to differentiate "disabled" from other values... maybe with `0`?
Just a thought to maybe simplify the type logic here. Likewise for `period` and `shares`. > > Yes, there is opportunity to change the API. This patch was done to do a 1-to-1 translation of the previous version as much as possible. So I've refrained from doing this in this patch as well. It kept the size of the patch a bit more manageable. Happy to file a follow-up RFE to do this in a separate patch. Thoughts? Maybe the API as-is is clearer, because it matches the actual `/proc` values. Having thought about it more, it probably doesn't make sense to change the API just to make the implementation's type handling cleaner, so I'd say don't bother with the follow-up RFE. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510770596 From duke at openjdk.org Mon Nov 10 14:27:52 2025 From: duke at openjdk.org (Ruben) Date: Mon, 10 Nov 2025 14:27:52 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 In-Reply-To: <7LtvXdcfrqOlHXnsZlsA5Kvf2RJXVt1I8IoVYz7Uj90=.eda5d489-7fbf-4051-88be-fd3c183761f6@github.com> References: <7LtvXdcfrqOlHXnsZlsA5Kvf2RJXVt1I8IoVYz7Uj90=.eda5d489-7fbf-4051-88be-fd3c183761f6@github.com> Message-ID: On Mon, 10 Nov 2025 10:13:20 GMT, Tobias Hartmann wrote: >> Sure, I submitted testing. > >> Sure, I submitted testing. > > All testing passed. I leave it to the original reviewers to re-review. Thanks, @TobiHartmann, for the extra tests, and apologies for missing this in the original PR. @TheRealMDoerr, I will update the PR to add comments. I am exploring a possibility to add a unit test for this issue, but I have not identified a deterministic way to make the deopt handler stub end at a page boundary. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28192#issuecomment-3511994160 From ayang at openjdk.org Mon Nov 10 14:32:20 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 10 Nov 2025 14:32:20 GMT Subject: Integrated: 8371321: Remove unused last arg of BarrierSetAssembler::arraycopy_epilogue In-Reply-To: References: Message-ID: <3MvmZGPFXA7cqYhLs44MVY_P3BFyuUh4y5mlSWbGUxA=.d2d98f00-d991-43cd-8dd1-7d5d96c4031c@github.com> On Wed, 5 Nov 2025 10:10:02 GMT, Albert Mingkun Yang wrote: > Removing effectively dead code. > > Test: tier1, GHA This pull request has now been integrated. Changeset: 9d2fa8fe Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/9d2fa8fe22652cbf1c70b953247bd154b363b383 Stats: 38 lines in 16 files changed: 0 ins; 6 del; 32 mod 8371321: Remove unused last arg of BarrierSetAssembler::arraycopy_epilogue Reviewed-by: fandreuzzi, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/28146 From ayang at openjdk.org Mon Nov 10 14:29:56 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 10 Nov 2025 14:29:56 GMT Subject: RFR: 8371321: Remove unused last arg of BarrierSetAssembler::arraycopy_epilogue In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 10:10:02 GMT, Albert Mingkun Yang wrote: > Removing effectively dead code. > > Test: tier1, GHA Thanks for review. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28146#issuecomment-3512011561 From sgehwolf at openjdk.org Mon Nov 10 14:41:47 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 14:41:47 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Mon, 27 Oct 2025 13:49:03 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 379: > >> 377: * Calculate the maximum number of tasks available to the process. Set the >> 378: * value in the passed in 'value' reference. The value might be -1 when >> 379: * there is no limit. > > How can we get `-1`? Or do you mean `(uint64_t)-1`? This was meant to say `value_unlimited` if there is `max` in the `pids.max` interface file. Updated the comment and changed the code handling for `VM.info` to handle `value_unlimited`. 
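As a rough sketch of the `max` handling mentioned above: an interface file such as `pids.max` may hold either a number or the literal `max`, and the latter is mapped to an "unlimited" sentinel. The names `parse_limit` and `value_unlimited_sketch` are hypothetical, the sentinel value is an assumption (the real `value_unlimited` constant is defined in the container code), and the input token is assumed to be already trimmed.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <limits>

// Assumed sentinel for "no limit"; chosen here for illustration only.
constexpr uint64_t value_unlimited_sketch = std::numeric_limits<uint64_t>::max();

// Hypothetical parser for one trimmed token of an interface file.
// Returns true and sets 'result' on success; on failure 'result' is unchanged.
bool parse_limit(const char* text, uint64_t& result) {
  if (text == nullptr || *text == '\0') {
    return false;
  }
  if (std::strcmp(text, "max") == 0) {
    result = value_unlimited_sketch;   // "max" means no limit
    return true;
  }
  if (*text < '0' || *text > '9') {
    return false;                      // reject signs, whitespace and words
  }
  errno = 0;
  char* end = nullptr;
  unsigned long long parsed = std::strtoull(text, &end, 10);
  if (errno != 0 || *end != '\0') {
    return false;                      // overflow or trailing garbage
  }
  result = static_cast<uint64_t>(parsed);
  return true;
}
```

With this shape, code that prints the value can compare against the sentinel and log `unlimited` instead of a large number.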
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510823182 From asmehra at openjdk.org Mon Nov 10 14:53:36 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Mon, 10 Nov 2025 14:53:36 GMT Subject: RFR: 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly [v3] In-Reply-To: References: Message-ID: > The closure passed to `HashTable::iterate` in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` returns an incorrect value. If the search is successful, it should return false to terminate the iteration, but it is returning true. This patch fixes the return value of these closures. > In addition, I noticed `CompactHashTable::iterate` goes through all entries unconditionally, which is not optimal for cases where we may want to terminate the iteration when some condition is met. This is the case in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` when it iterates over `_aot_adapter_handler_table`. This patch updates `CompactHashTable::iterate` to be the same as `HashTable::iterate` by using the return value of the closure to determine if the iteration should continue or abort. It also adds `CompactHashTable::iterate_all` to iterate over all the values unconditionally and the users of `CompactHashTable::iterate` are updated to use `CompactHashTable::iterate_all`.
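The closure convention described above (return `false` to stop, `true` to continue) can be illustrated with a small sketch. `TinyTable` and `contains` below are hypothetical stand-ins written for this example; they are not HotSpot's `HashTable` or `CompactHashTable` and only mirror the `iterate` / `iterate_all` split.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature table used only to show the closure contract.
template <typename V>
class TinyTable {
  std::vector<V> _values;
 public:
  void add(const V& v) { _values.push_back(v); }

  // Visit values in order, stopping as soon as the closure returns false.
  template <typename Function>
  void iterate(Function function) const {
    for (const V& v : _values) {
      if (!function(v)) {
        return;  // closure asked to terminate the iteration
      }
    }
  }

  // Visit every value unconditionally; the closure's result is ignored.
  template <typename Function>
  void iterate_all(Function function) const {
    iterate([&](const V& v) { function(v); return true; });
  }
};

// Search in the style of a contains() check.
// 'visited' reports how many values were examined.
inline bool contains(const TinyTable<int>& table, int wanted, int& visited) {
  bool found = false;
  visited = 0;
  table.iterate([&](int v) {
    ++visited;
    if (v == wanted) {
      found = true;
      return false;  // found: stop iterating
    }
    return true;     // not found yet: keep going
  });
  return found;
}
```

Returning `true` from the found branch instead would still set `found`, but the loop would keep visiting the remaining entries, which is the inefficiency the patch addresses.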
Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: Update comments Signed-off-by: Ashutosh Mehra ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28197/files - new: https://git.openjdk.org/jdk/pull/28197/files/f46d3dae..94cab725 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28197&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28197&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28197.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28197/head:pull/28197 PR: https://git.openjdk.org/jdk/pull/28197 From asmehra at openjdk.org Mon Nov 10 14:53:39 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Mon, 10 Nov 2025 14:53:39 GMT Subject: RFR: 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly [v2] In-Reply-To: <-Ehjjc2u5iw9WxmX0Bsf34IK6kGIm3xitYrHDetYG_U=.29e9f5f2-a9b3-44c5-9cb7-30feb4f4ccff@github.com> References: <0Qe86N42BrG0cQ0dA1940SRqpp-0DD68UHYD3Gz3YAg=.d6bcb486-5c7b-4738-a74d-b34c9be2a70c@github.com> <-Ehjjc2u5iw9WxmX0Bsf34IK6kGIm3xitYrHDetYG_U=.29e9f5f2-a9b3-44c5-9cb7-30feb4f4ccff@github.com> Message-ID: On Mon, 10 Nov 2025 13:29:10 GMT, Andrew Dinn wrote: >> Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: >> >> Add comments >> >> Signed-off-by: Ashutosh Mehra > > Looks good @adinn thanks for the review. I will fix the comment as suggested. > src/hotspot/share/classfile/compactHashtable.hpp line 297: > >> 295: } >> 296: >> 297: // Iterate through the values in the table, stopping when do_value() return false. > > Suggestion: > > // Iterate through the values in the table, stopping when do_value() return false. > > // Iterate through the values in the table, stopping when iter->do_value() returns false. 
Done ------------- PR Comment: https://git.openjdk.org/jdk/pull/28197#issuecomment-3512165594 PR Review Comment: https://git.openjdk.org/jdk/pull/28197#discussion_r2510861828 From sgehwolf at openjdk.org Mon Nov 10 14:55:14 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 14:55:14 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Wed, 22 Oct 2025 13:09:36 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/os_linux.cpp line 220: > >> 218: if (OSContainer::is_containerized() && OSContainer::available_memory_in_bytes(avail_mem)) { >> 219: log_trace(os)("available container memory: " PHYS_MEM_TYPE_FORMAT, avail_mem); >> 220: value = avail_mem; > > Should be able to pass in `value` directly instead of using `avail_mem`. Sure, thanks! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510872717 From sgehwolf at openjdk.org Mon Nov 10 14:59:35 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 14:59:35 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Wed, 22 Oct 2025 13:11:49 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/os_linux.cpp line 261: > >> 259: if (OSContainer::is_containerized() && OSContainer::available_memory_in_bytes(free_mem)) { >> 260: log_trace(os)("free container memory: " PHYS_MEM_TYPE_FORMAT, free_mem); >> 261: value = free_mem; > > Should be able to pass in `value` directly instead of using `free_mem`. Fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510884240 From asmehra at openjdk.org Mon Nov 10 15:07:03 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Mon, 10 Nov 2025 15:07:03 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v13] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 07:24:50 GMT, Jan Kratochvil wrote: >> With clang-20 using --with-toolchain-type=clang resolvedFieldEntry.cpp and resolvedMethodEntry.cpp break the build with similar warnings like: >> >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> | (void*) >> >> The patch follows the suggested fix. > > Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: > > Add Ioi Lam's comment src/hotspot/share/oops/resolvedFieldEntry.cpp line 31: > 29: STATIC_ASSERT(std::is_trivially_copyable_v<ResolvedFieldEntry> == true); > 30: > 31: // Detect inadvertently introduced trailing padding. Another way to detect padding (trailing or internal) could be to compare sizeof(ResolvedFieldEntry) against the sum of the sizes of all the elements. Something like: `sizeof(ResolvedFieldEntry) == (sizeof(InstanceKlass*) + sizeof(int) + sizeof(u2) + ... + sizeof(_padding))` It looks cumbersome but I think it is easy enough to update if a new field is added because the static assert will fail immediately.
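The size-sum idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Entry`, `Klass` and the field names are hypothetical stand-ins, not the real `ResolvedFieldEntry` layout, and plain `static_assert` is used in place of HotSpot's `STATIC_ASSERT` macro.

```cpp
#include <cstdint>
#include <type_traits>

struct Klass;  // hypothetical opaque type; only a pointer to it is stored

// Hypothetical stand-in for an entry type that is cleared or copied bytewise.
// Fields are ordered from largest to smallest so no implicit padding is needed.
struct Entry {
  Klass*   holder;    // pointer-sized
  int32_t  offset;    // 4 bytes
  uint16_t index;     // 2 bytes
  uint8_t  flags;     // 1 byte
  uint8_t  padding;   // explicit padding byte, counted in the sum below
};

// Sum of the declared field sizes (assumption: this list is kept in sync by hand).
constexpr std::size_t entry_field_bytes() {
  return sizeof(Klass*) + sizeof(int32_t) + sizeof(uint16_t) +
         sizeof(uint8_t) + sizeof(uint8_t);
}

// True when the compiler inserted no padding, internal or trailing.
constexpr bool entry_has_no_padding() {
  return sizeof(Entry) == entry_field_bytes();
}

static_assert(std::is_trivially_copyable_v<Entry>,
              "Entry must be safe to memset/memcpy");
static_assert(entry_has_no_padding(),
              "Entry has implicit padding; add an explicit padding field "
              "or update entry_field_bytes()");
```

Adding a field to `Entry` without extending `entry_field_bytes()` changes `sizeof(Entry)` in most cases, so the second assertion fails at compile time, which is the behaviour the comment relies on. A field that happens to occupy bytes that were previously padding is one way such a check could still pass by accident.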
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2510917443 From sgehwolf at openjdk.org Mon Nov 10 15:21:13 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 15:21:13 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Wed, 22 Oct 2025 14:09:48 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/os_linux.cpp line 4863: > >> 4861: if (OSContainer::is_containerized() && OSContainer::active_processor_count(active_cpus)) { >> 4862: log_trace(os)("active_processor_count: determined by OSContainer: %d", >> 4863: active_cpus); > > When running containerized, we would now always fetch the os cpu count at least once. > > `CgroupSubsystem::active_processor_count`, which this calls down to has the cache to actively avoid getting the cpu count too frequently, and only gets the number of cpus with `os::Linux::active_processor_count` when the cache expires. > > I don't know if this is still an issue today, but since it's there I still think we should avoid getting the cpus if unnecessary. Thanks, fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510972683 From sgehwolf at openjdk.org Mon Nov 10 15:24:21 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 15:24:21 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Wed, 22 Oct 2025 12:21:10 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/share/runtime/os.cpp line 2215: > >> 2213: } >> 2214: value = mem_usage; >> 2215: return true; > > Can we collapse this and just set the `value` reference directly instead in the container functions? Something like: > > ```c++ > if (OSContainer::is_containerized()) { > return OSContainer::memory_usage_in_bytes(mem_usage); > } Sure. Thanks! 
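A minimal sketch of the collapsed form discussed above, assuming the convention from the PR description (functions return `bool` and write the result through a reference, leaving it untouched on error). `FakeContainer`, `mem_size_t`, `host_used_memory` and `used_memory` are hypothetical names for illustration only; they are not the HotSpot `OSContainer` or `os` APIs.

```cpp
#include <cstdint>

// Hypothetical stand-in for a memory size type such as physical_memory_size_type.
using mem_size_t = uint64_t;

// Hypothetical stand-in for the container layer.
struct FakeContainer {
  bool       containerized;  // are we running inside a container?
  bool       readable;       // can the (simulated) interface file be read?
  mem_size_t usage;          // value the interface file would report

  // Returns true and sets 'value' on success; returns false and leaves
  // 'value' untouched on error.
  bool memory_usage_in_bytes(mem_size_t& value) const {
    if (!readable) {
      return false;
    }
    value = usage;
    return true;
  }
};

// Hypothetical host-side query following the same bool + reference convention.
inline bool host_used_memory(mem_size_t& value) {
  value = 4096;  // fixed number, purely for illustration
  return true;
}

// Collapsed caller: the caller's reference is handed straight to the
// container function, with no temporary and no separate 'return true'.
inline bool used_memory(const FakeContainer& c, mem_size_t& value) {
  if (c.containerized) {
    return c.memory_usage_in_bytes(value);
  }
  return host_used_memory(value);
}
```

Because the callee only writes through the reference on success, passing the caller's `value` directly should behave the same as copying from a temporary, provided callers do not read `value` after a `false` return.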
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510986032 From kvn at openjdk.org Mon Nov 10 16:56:27 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 10 Nov 2025 16:56:27 GMT Subject: RFR: 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly [v3] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 14:53:36 GMT, Ashutosh Mehra wrote: >> The closure passed to `HashTable::iterate` in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` is returning incorrect value. If the search is successful, it should return false to terminate the iteration, but it is returning true. This patch fixes the return value of these closures. >> In addition, I noticed `CompactHashTable::iterate` goes through all entries unconditionally, which is not optimal for cases where we may want to terminate the iteration when some condition is met. This is the case in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` when it iterates over `_aot_adapter_handler_table`. This patch updates `CompactHashTable::iterate` to be the same as `HashTAble::iterate` by using return value of the closure to determine if the iteration should continue or abort. It also adds `CompactHashTable::iterate_all` to iterate all the values unconditionally and the users of `CompactHashTable::iterate` are updated to use `CompactHashTable::iterate_all`. > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Update comments > > Signed-off-by: Ashutosh Mehra My testing passed. ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28197#pullrequestreview-3444223711 From sgehwolf at openjdk.org Mon Nov 10 17:15:49 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:15:49 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v3] In-Reply-To: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: > Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. > > It gets rid of the trifold-use of return value: 1.) error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that make sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. > > All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. > > All usages of the API have been changed to use the revised API. There is no more usages for `OSCONTAINER_ERROR` (`-2) in HotSpot code. > > While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. 
`os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. > > Testing (looking good): > - [x] GHA > - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. > - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. > > Thoughts? Opinions? Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: - Extract OSContainer::available_swap_in_bytes() - Simplify os::used_memory() - Fix os::active_processor_count() - os::free_memory => use 'value' directly - os::available_memory() => use 'value' directly - Fix pids_max printing in VM.info - Better logging for -1 (cpu_shares) - Fix cg v1 cpu_shares to match old behaviour - More comment fixes. - Drop -1 (unlimited) and -2 (failed) constants Will likely need corresponding test changes - ... 
and 11 more: https://git.openjdk.org/jdk/compare/d5803aa7...08f1c185 ------------- Changes: https://git.openjdk.org/jdk/pull/27743/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27743&range=02 Stats: 1307 lines in 16 files changed: 514 ins; 106 del; 687 mod Patch: https://git.openjdk.org/jdk/pull/27743.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27743/head:pull/27743 PR: https://git.openjdk.org/jdk/pull/27743 From sgehwolf at openjdk.org Mon Nov 10 17:15:50 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:15:50 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: <4ECXYwsoJz7nYkDcPFo6R2y7L56KufBRH7ox-7_Proo=.adddc389-8462-4872-8bb8-00bbfa10e2ef@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> <4ECXYwsoJz7nYkDcPFo6R2y7L56KufBRH7ox-7_Proo=.adddc389-8462-4872-8bb8-00bbfa10e2ef@github.com> Message-ID: On Mon, 10 Nov 2025 14:24:48 GMT, Thomas Fitzsimmons wrote: >> Yes, there is opportunity to change the API. This patch was done to do a 1-to-1 translation of the previous version as much as possible. So I've refrained from doing this in this patch as well. It kept the size of the patch a bit more manageable. Happy to file a follow-up RFE to do this in a separate patch. Thoughts? > > Maybe the API as-is is clearer, because it matches the actual `/proc` values. Having thought about it more, it probably doesn't make sense to change the API just to make the implementation's type handling cleaner, so I'd say don't bother with the follow-up RFE. OK. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2511346360 From sgehwolf at openjdk.org Mon Nov 10 17:15:52 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:15:52 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v3] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Wed, 22 Oct 2025 13:44:56 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: >> >> - Extract OSContainer::available_swap_in_bytes() >> - Simplify os::used_memory() >> - Fix os::active_processor_count() >> - os::free_memory => use 'value' directly >> - os::available_memory() => use 'value' directly >> - Fix pids_max printing in VM.info >> - Better logging for -1 (cpu_shares) >> - Fix cg v1 cpu_shares to match old behaviour >> - More comment fixes. >> - Drop -1 (unlimited) and -2 (failed) constants >> >> Will likely need corresponding test changes >> - ... and 11 more: https://git.openjdk.org/jdk/compare/d5803aa7...08f1c185 > > src/hotspot/os/linux/os_linux.cpp line 348: > >> 346: return true; >> 347: } >> 348: } > > This whole function is getting a bit too long in my opinion. > Maybe everything inside the `if OSContainer::is_containerized() {}` could be moved into a new function `OSContainer::available_swap_in_bytes`, similar to the already existing `OSContainer::available_memory_in_bytes`. That way, we could abstract away all the `OSContainer` calls. > > The only consequence would be that the `log_trace` wouldn't work any more. I couldn't find any test that depends on the exact output, so it could perhaps be split up instead. I've moved this to a `OSContainer::available_swap_in_bytes` function. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2511339099 From sgehwolf at openjdk.org Mon Nov 10 17:15:53 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:15:53 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v3] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: <0sWmQikqXAA1s_F26YTx5TkMBXJoww2FWxkBxdJKZfg=.915e4d84-b8c0-48c4-961c-0ffb3f1c396a@github.com> On Mon, 10 Nov 2025 17:09:54 GMT, Severin Gehwolf wrote: >> src/hotspot/os/linux/os_linux.cpp line 348: >> >>> 346: return true; >>> 347: } >>> 348: } >> >> This whole function is getting a bit too long in my opinion. >> Maybe everything inside the `if OSContainer::is_containerized() {}` could be moved into a new function `OSContainer::available_swap_in_bytes`, similar to the already existing `OSContainer::available_memory_in_bytes`. That way, we could abstract away all the `OSContainer` calls. >> >> The only consequence would be that the `log_trace` wouldn't work any more. I couldn't find any test that depends on the exact output, so it could perhaps be split up instead. > > I've moved this to a `OSContainer::available_swap_in_bytes` function. 
Example trace log (if it fails) is: [0.672s][trace][os,container] OSContainer::available_swap_in_bytes: container_swap_limit=unlimited container_mem_limit=1073741824, host_free_swap: 8589844480 [0.672s][trace][os,container] os::free_swap_space: containerized value unavailable returning host value: 8589844480 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2511340779 From sgehwolf at openjdk.org Mon Nov 10 17:22:34 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:22:34 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v4] In-Reply-To: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: > Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. > > It gets rid of the trifold-use of return value: 1.) error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that make sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. > > All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). 
The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. > > All usages of the API have been changed to use the revised API. There is no more usages for `OSCONTAINER_ERROR` (`-2) in HotSpot code. > > While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. > > Testing (looking good): > - [x] GHA > - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. > - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. > > Thoughts? Opinions? Severin Gehwolf has updated the pull request incrementally with one additional commit since the last revision: One more comment fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27743/files - new: https://git.openjdk.org/jdk/pull/27743/files/08f1c185..46df71e1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27743&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27743&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/27743.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27743/head:pull/27743 PR: https://git.openjdk.org/jdk/pull/27743 From sgehwolf at openjdk.org Mon Nov 10 17:22:39 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:22:39 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> 
<0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Mon, 27 Oct 2025 11:36:39 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 115: > >> 113: * true if the result reference has been set >> 114: * false on error >> 115: */ > > The beginning part of the comment isn't updated to mention the `result` reference, unlike the other comments. Should be fixed in https://github.com/openjdk/jdk/pull/27743/commits/46df71e19458b1682d6b8a28ef5b3e9a8932be9e ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2511363227 From sgehwolf at openjdk.org Mon Nov 10 17:28:21 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:28:21 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v4] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: On Mon, 10 Nov 2025 17:22:34 GMT, Severin Gehwolf wrote: >> Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. >> >> It gets rid of the trifold-use of return value: 1.) 
error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that make sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. >> >> All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. >> >> All usages of the API have been changed to use the revised API. There is no more usages for `OSCONTAINER_ERROR` (`-2) in HotSpot code. >> >> While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. >> >> Testing (looking good): >> - [x] GHA >> - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. >> - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. >> >> Thoughts? Opinions? > > Severin Gehwolf has updated the pull request incrementally with one additional commit since the last revision: > > One more comment fix I've resolved the conflicts now and incorporated reviewers' feedback. Thanks for the reviews! 
It doesn't solve the larger issue of reference passing for the result value, though :-/ ------------- PR Comment: https://git.openjdk.org/jdk/pull/27743#issuecomment-3513007587 From sgehwolf at openjdk.org Mon Nov 10 17:43:03 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:43:03 GMT Subject: RFR: 8370966: Create regression test for the hierarchical memory limit fix in JDK-8370572 In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 16:41:26 GMT, Severin Gehwolf wrote: > Please review this test-only enhancement. It creates a regression test for the Amazon ECS setup on cgroups v1 where the parent memory limit isn't visible inside the container and, thus, needs to rely on the cg v1 specific `hierarchical_memory_limit` token in `memory.stat`. The proposed test is cg v1 only and needs to be run as root. It's skipped otherwise. It's useful to have when working on refactorings like #27743 so as not to regress. > > The other changes are an effort to reduce code duplication in the test code where similar patterns have been used in other container tests. > > Testing (all on Linux x86_64): > - [x] CG version 2, run as root. Engine: docker. Test is skipped. > - [x] CG version 1, run as root. Engine: docker. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) > - [x] CG version 1, run as root. Engine: podman. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) > - [X] CG version 1, run as non-root. Test skipped. > - [x] GHA, though I don't think this is very useful for this change. > > Thoughts? Thanks for the reviews. Updated the patch. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28201#issuecomment-3513094003 From sgehwolf at openjdk.org Mon Nov 10 17:43:02 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:43:02 GMT Subject: RFR: 8370966: Create regression test for the hierarchical memory limit fix in JDK-8370572 [v2] In-Reply-To: References: Message-ID: > Please review this test-only enhancement. It creates a regression test for the Amazon ECS setup on cgroups v1 where the parent memory limit isn't visible inside the container and, thus, needs to rely on the cg v1 specific `hierarchical_memory_limit` token in `memory.stat`. The proposed test is cg v1 only and needs to be run as root. It's skipped otherwise. It's useful to have when working on refactorings like #27743 so as not to regress. > > The other changes are an effort to reduce code duplication in the test code where similar patterns have been used in other container tests. > > Testing (all on Linux x86_64): > - [x] CG version 2, run as root. Engine: docker. Test is skipped. > - [x] CG version 1, run as root. Engine: docker. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) > - [x] CG version 1, run as root. Engine: podman. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) > - [X] CG version 1, run as non-root. Test skipped. > - [x] GHA, though I don't think this is very useful for this change. > > Thoughts? 
Severin Gehwolf has updated the pull request incrementally with one additional commit since the last revision: Indenting and copyright ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28201/files - new: https://git.openjdk.org/jdk/pull/28201/files/86f8e1b8..faeff2a1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28201&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28201&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28201.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28201/head:pull/28201 PR: https://git.openjdk.org/jdk/pull/28201 From sgehwolf at openjdk.org Mon Nov 10 17:43:05 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:43:05 GMT Subject: RFR: 8370966: Create regression test for the hierarchical memory limit fix in JDK-8370572 [v2] In-Reply-To: References: Message-ID: <1qZYZuODZnsxbV1e3S7_xemZyrGNBwtBuUns2K1hJrU=.c0282af9-4ed8-4b52-82f9-6b4f62b41ae8@github.com> On Mon, 10 Nov 2025 06:51:01 GMT, SendaoYan wrote: >> Severin Gehwolf has updated the pull request incrementally with one additional commit since the last revision: >> >> Indenting and copyright > > test/hotspot/jtreg/containers/docker/TestMemoryInvisibleParent.java line 2: > >> 1: /* >> 2: * Copyright (C) 2025, IBM > > Copyright (c) 2025 IBM Corporation. All rights reserved. Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28201#discussion_r2511417869 From shade at openjdk.org Mon Nov 10 17:45:17 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 10 Nov 2025 17:45:17 GMT Subject: RFR: 8370966: Create regression test for the hierarchical memory limit fix in JDK-8370572 [v2] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 17:43:02 GMT, Severin Gehwolf wrote: >> Please review this test-only enhancement. 
It creates a regression test for the Amazon ECS setup on cgroups v1 where the parent memory limit isn't visible inside the container and, thus, needs to rely on the cg v1 specific `hierarchical_memory_limit` token in `memory.stat`. The proposed test is cg v1 only and needs to be run as root. It's skipped otherwise. It's useful to have when working on refactorings like #27743 so as not to regress. >> >> The other changes are an effort to reduce code duplication in the test code where similar patterns have been used in other container tests. >> >> Testing (all on Linux x86_64): >> - [x] CG version 2, run as root. Engine: docker. Test is skipped. >> - [x] CG version 1, run as root. Engine: docker. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) >> - [x] CG version 1, run as root. Engine: podman. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) >> - [X] CG version 1, run as non-root. Test skipped. >> - [x] GHA, though I don't think this is very useful for this change. >> >> Thoughts? > > Severin Gehwolf has updated the pull request incrementally with one additional commit since the last revision: > > Indenting and copyright Still good. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28201#pullrequestreview-3444430112 From asmehra at openjdk.org Mon Nov 10 18:24:23 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Mon, 10 Nov 2025 18:24:23 GMT Subject: RFR: 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly [v2] In-Reply-To: References: <0Qe86N42BrG0cQ0dA1940SRqpp-0DD68UHYD3Gz3YAg=.d6bcb486-5c7b-4738-a74d-b34c9be2a70c@github.com> Message-ID: On Fri, 7 Nov 2025 22:35:15 GMT, Vladimir Kozlov wrote: >> @ashu-mehra, do you know what issue current code (before these changes) could cause? 
> >> @vnkozlov fyi - I also opened https://bugs.openjdk.org/browse/JDK-8371493 which is going to touch the same code as this patch. I didn't include the changes in this patch to make it easier to backport this patch if needed. > > Good. We usually don't backport enhancements but we can consider it since it's a simplification of the printing code which should not affect code execution. @vnkozlov thanks for the review and for testing it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28197#issuecomment-3513278045 From asmehra at openjdk.org Mon Nov 10 18:24:24 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Mon, 10 Nov 2025 18:24:24 GMT Subject: Integrated: 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly In-Reply-To: References: Message-ID: <6K6HTgQ6xen5g0a2MV3_dqUxn-IKvlSsZGhxK7i3pI0=.a643825d-fe9a-46fc-99cd-d6b335b137c7@github.com> On Fri, 7 Nov 2025 14:38:48 GMT, Ashutosh Mehra wrote: > The closure passed to `HashTable::iterate` in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` is returning an incorrect value. If the search is successful, it should return false to terminate the iteration, but it is returning true. This patch fixes the return value of these closures. > In addition, I noticed `CompactHashTable::iterate` goes through all entries unconditionally, which is not optimal for cases where we may want to terminate the iteration when some condition is met. This is the case in `AdapterHandlerLibrary::contains` and `AdapterHandlerLibrary::print_handler_on` when it iterates over `_aot_adapter_handler_table`. This patch updates `CompactHashTable::iterate` to be the same as `HashTable::iterate` by using the return value of the closure to determine if the iteration should continue or abort. It also adds `CompactHashTable::iterate_all` to iterate all the values unconditionally and the users of `CompactHashTable::iterate` are updated to use `CompactHashTable::iterate_all`.
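The closure contract described above (return `false` to stop, `true` to keep going) can be illustrated with a toy table. This is a hedged sketch only: `ToyTable` and `contains_value` are hypothetical and much simpler than HotSpot's `HashTable` and `CompactHashTable`; only the return-value convention is meant to match the description.

```cpp
#include <utility>
#include <vector>

// Toy stand-in for a hash table; entries are kept in a flat vector.
template <typename K, typename V>
class ToyTable {
  std::vector<std::pair<K, V>> _entries;

 public:
  void put(const K& k, const V& v) { _entries.emplace_back(k, v); }

  // Visits entries until the closure returns false.
  template <typename F>
  void iterate(F f) const {
    for (const auto& e : _entries) {
      if (!f(e.first, e.second)) {
        return;  // closure asked to stop
      }
    }
  }

  // Visits every entry unconditionally; the closure's result is not consulted.
  template <typename F>
  void iterate_all(F f) const {
    iterate([&](const K& k, const V& v) {
      f(k, v);
      return true;  // always continue
    });
  }
};

// Search helper: 'visited' reports how many entries the closure saw.
inline bool contains_value(const ToyTable<int, int>& table, int wanted,
                           int& visited) {
  bool found = false;
  visited = 0;
  table.iterate([&](const int&, const int& v) {
    ++visited;
    if (v == wanted) {
      found = true;
      return false;  // success: terminate the iteration
    }
    return true;     // not found yet: keep going
  });
  return found;
}
```

With the inverted return values the description mentions (returning `true` on a match), a search like `contains_value` would keep walking the table after finding its entry, and returning `false` on a mismatch would stop at the first entry.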
This pull request has now been integrated. Changeset: cc54d2c0 Author: Ashutosh Mehra URL: https://git.openjdk.org/jdk/commit/cc54d2c06b0e1f799c771d747cfb4059a8774e28 Stats: 75 lines in 7 files changed: 38 ins; 10 del; 27 mod 8371418: Methods in AdapterHandlerLibrary use HashtableBase iterate method incorrectly Reviewed-by: kvn, adinn ------------- PR: https://git.openjdk.org/jdk/pull/28197 From eastigeevich at openjdk.org Mon Nov 10 18:48:00 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 10 Nov 2025 18:48:00 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v10] In-Reply-To: References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: On Sat, 8 Nov 2025 00:05:38 GMT, Nityanand Rai wrote: >> Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of Java processes on macOS > > Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: > > remove unncessary included test/hotspot/gtest/testutils.cpp line 94: > 92: if (address <= (mach_vm_address_t)addr && > 93: (address + region_size) >= ((mach_vm_address_t)addr + size)) { > 94: // Check if the user_tag matches VM_MEMORY_JAVA There is no need for this comment because it just repeats what the `return` statement already says.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27868#discussion_r2511602937 From eastigeevich at openjdk.org Mon Nov 10 18:50:51 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 10 Nov 2025 18:50:51 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v10] In-Reply-To: References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: On Sat, 8 Nov 2025 00:05:38 GMT, Nityanand Rai wrote: >> Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs > > Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: > > remove unncessary included test/hotspot/gtest/testutils.hpp line 54: > 52: #ifdef __APPLE__ > 53: // Check if a memory region is tagged with VM_MEMORY_JAVA on macOS > 54: // This function is used by multiple test files to validate BSD memory tagging Maybe we don't "This function ..." comment line because it might be misleading. The function is used in one test only. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27868#discussion_r2511610847 From asmehra at openjdk.org Mon Nov 10 19:05:07 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Mon, 10 Nov 2025 19:05:07 GMT Subject: RFR: 8371493: Simplify search for AdapterHandlerEntry Message-ID: `AdapterHandlerEntry` stores a direct pointer to `AdapterBlob`. Therefore, when looking for a `AdapterHandlerEntry` corresponding to a `CodeBlob`, we can use direct comparison instead of using `CodeCache::find_blob`. It also replaces the call to `AdapterHandlerLibrary::contains` in `CodeBlob::dump_for_addr` with a more trivial check `is_adapter_blob`. 
------------- Commit messages: - 8371493: Simplify search for AdapterHandlerEntry Changes: https://git.openjdk.org/jdk/pull/28223/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28223&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371493 Stats: 35 lines in 3 files changed: 1 ins; 31 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/28223.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28223/head:pull/28223 PR: https://git.openjdk.org/jdk/pull/28223 From jkratochvil at openjdk.org Mon Nov 10 20:22:40 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Mon, 10 Nov 2025 20:22:40 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v13] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 15:04:42 GMT, Ashutosh Mehra wrote: >> Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: >> >> Add Ioi Lam's comment > > src/hotspot/share/oops/resolvedFieldEntry.cpp line 31: > >> 29: STATIC_ASSERT(std::is_trivially_copyable_v == true); >> 30: >> 31: // Detect inadvertently introduced trailing padding. > > Another way to detect padding (trailing or internal) could be to compare sizeof(ResolvedFieldEntry) against the sum up the size of all the elements. Something like: > `sizeofResolvedFieldEntry) == (sizeof(InstanceKlass*) + sizeof(int) + sizeof(u2) + ... + sizeof(_padding))` > It looks cumbersome but I think it is easy enough to update if a new field is added because the static assert will fail immediately. I can implement this plan if requested, but I do not intend to do so on my own. Thanks for the idea, but one could argue that some reformatting could still produce a false PASS. I would suggest that this is better left for [C++26 Reflection](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p2996r13.html). 
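The size-sum idea quoted above can be sketched as follows. This is only an illustration under stated assumptions: `ToyEntry`, `ToyPadded`, and their fields are hypothetical and do not reproduce the real `ResolvedFieldEntry` layout, and the sum in the assertion has to be kept in sync with the field list by hand, which is the maintenance cost raised in the reply.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical entry type; the fields are illustrative only.
struct ToyEntry {
  void*    holder;
  int32_t  offset;
  uint16_t index;
  uint8_t  tos;
  uint8_t  flags;
};

// Sum of the member sizes, maintained manually next to the struct.
constexpr std::size_t kToyEntryFieldBytes =
    sizeof(void*) + sizeof(int32_t) + sizeof(uint16_t) + 2 * sizeof(uint8_t);

// memset/memcpy on an object is only well-defined for trivially copyable types.
static_assert(std::is_trivially_copyable<ToyEntry>::value,
              "ToyEntry must be trivially copyable");

// If the object is larger than the sum of its members, the compiler
// inserted padding somewhere (internal or trailing).
static_assert(sizeof(ToyEntry) == kToyEntryFieldBytes,
              "ToyEntry contains unexpected padding");

// Counter-example: the int32_t member needs alignment, so padding follows 'tag'.
struct ToyPadded {
  uint8_t tag;
  int32_t value;
};
static_assert(sizeof(ToyPadded) > sizeof(uint8_t) + sizeof(int32_t),
              "expected padding in ToyPadded on this platform");
```

Adding a field to `ToyEntry` without updating the sum makes the second assertion fail at compile time, which is the behaviour the suggestion relies on.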
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2511858818 From liach at openjdk.org Mon Nov 10 20:27:39 2025 From: liach at openjdk.org (Chen Liang) Date: Mon, 10 Nov 2025 20:27:39 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v10] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 10:06:31 GMT, Alan Bateman wrote: >> Implementation changes for [JEP 500: Prepare to Make Final Mean Final](https://openjdk.org/jeps/500). >> >> Field.set (and Lookup.unreflectSetter) are changed to allow/warn/debug/deny when mutating a final instance field. JFR event recorded if final field mutated. Spec updates to Field.set, Field.setAccessible and Module.addOpens to align with the proposal in the JEP. >> >> HotSpot is updated to add support for the new command line options. To aid diagnosability, -Xcheck:jni reports a warning and -Xlog:jni=debug logs a message to help identity JNI code that mutates finals. For now, JNI code is allowed to set the "write-protected" fields System.in/out/err without a warning, we can re-visit once we change the System.setIn/setOut/setErr methods to not use JNI (I prefer to keep this separate to this PR because there is a small startup regression to address when changing System.setXXX). >> >> There are many new tests. A small number of existing tests are changed to run /othervm as reflectively opening a package isn't sufficient. Changing the tests to /othervm means that jtreg will launch the agent with the command line options to open the package. >> >> Testing: tier1-6 > > Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 45 commits: > > - Merge branch 'master' into JDK-8353835 > - Fix typo in test comment > - Merge branch 'master' into JDK-8353835 > - Merge branch 'master' into JDK-8353835 > - Suppress warnings from some tests > - Change -Xcheck:jni to be warning rather than fatal error > - Merge branch 'master' into JDK-8353835 > - Simplify filter > - Merge branch 'master' into JDK-8353835 > - Update Xcheck:jni description > - ... and 35 more: https://git.openjdk.org/jdk/compare/066810c8...6671ae02 I still wonder about the decision for JNI to call final Field.set with an unconditional export check instead of an unconditional open check - the open check is done for all Java code already. src/java.base/share/classes/java/lang/reflect/doc-files/MutationMethods.html line 56: > 54:
  • > 55: java.lang.reflect.Field.setDouble(Object, double)
  • > 56:
  • Nit: Since javadoc processes tags here, you could just use `{@link Field#set java.lang.reflect.Field.set(Object, Object)}` instead of full-fledged `<a>` tags. src/java.base/share/classes/java/lang/reflect/doc-files/MutationMethods.html line 66: > 64:

    In the reference implementation, a module can be granted the capability to mutate > 65: final instance fields of classes in packages that are open to the module using > 66: the command line option --enable-final-field-mutation=M1,M2, ... M} where Suggestion: the command line option --enable-final-field-mutation=M1,M2, ... Mn} where src/java.base/share/classes/java/lang/reflect/doc-files/MutationMethods.html line 72: > 70: illegal. > 71: > 72: The command line option --illegal-final-field-mutation controls how illegal Missing `<p>`? test/jdk/java/lang/reflect/Field/mutateFinals/cli/CommandLineTest.java line 234: > 232: @Test > 233: void testSetPropertyToAllow() throws Exception { > 234: test("setSystemPropertyToAllow+testFieldSetInt") I thought this was setting the property before the VM boot. Can we have another test that does something like: test("testFieldSetInt", "-Djdk.module.illegal.final.field.mutation=allow") Which I think is closer to what @vy asks for. ------------- PR Review: https://git.openjdk.org/jdk/pull/25115#pullrequestreview-3444803078 PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2511838513 PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2511840059 PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2511841318 PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2511708135 From liach at openjdk.org Mon Nov 10 20:27:41 2025 From: liach at openjdk.org (Chen Liang) Date: Mon, 10 Nov 2025 20:27:41 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v4] In-Reply-To: References: Message-ID: On Tue, 30 Sep 2025 08:12:25 GMT, Alan Bateman wrote: >> src/hotspot/share/runtime/arguments.cpp line 2281: >> >>> 2279: } >>> 2280: } else if (match_option(option, "--illegal-final-field-mutation=", &tail)) { >>> 2281: if (strcmp(tail, "allow") == 0 || strcmp(tail, "warn") == 0 || strcmp(tail, "debug") == 0 || strcmp(tail, "deny") == 0) { >> >> Is the `jdk.module.illegal.final.field.mutation` property intended as a public API? If so, where is it documented? > > System properties are used to "communicate" the value of options from the VM to the library code. All internal/undocumented. There is a test in mutateFinal/cli/CommandLineTests.java that checks that specifying the system property on the command line is not effective.
In my followup investigation for how InternalProperty/internal() flag really works, I noted it has been effectively broken since the recent updates - back in 2018, the referenced VM.saveAndRemoveProperties is gone, and now filtering is done by System.createProperties. We should probably address that in another RFE. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2511752805 From amenkov at openjdk.org Mon Nov 10 21:01:53 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Mon, 10 Nov 2025 21:01:53 GMT Subject: RFR: 8371083: FollowReferences reports non-class objects as JVMTI_HEAP_REFERENCE_SYSTEM_CLASS Message-ID: <6rHIf2ptpmqaSNTLtKNv6bZSQCENxBpxnk3i193hOHE=.dbc266c3-e9b5-4c27-ac5a-5d9ca399ea87@github.com> FollowReferences with null initial_object starts heap walking from "heap roots", which include system classes. All oops from ClassLoaderDataGraph are reported with JVMTI_HEAP_REFERENCE_SYSTEM_CLASS kind, but some of the objects are not classes. The fix updates FollowReferences to report non-class objects from ClassLoaderDataGraph as JVMTI_HEAP_REFERENCE_OTHER Testing: tier1..4,hs-tier5-svc ------------- Commit messages: - fix Changes: https://git.openjdk.org/jdk/pull/28224/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28224&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371083 Stats: 211 lines in 3 files changed: 208 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/28224.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28224/head:pull/28224 PR: https://git.openjdk.org/jdk/pull/28224 From eastigeevich at openjdk.org Mon Nov 10 22:04:32 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 10 Nov 2025 22:04:32 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v8] In-Reply-To: References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Message-ID: On Mon, 10 Nov 2025 22:00:35 GMT, Evgeny 
Astigeevich wrote: >> Mikhail Ablakatov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 18 commits: >> >> - Merge commit 'f6f87bb6759c86d941453a1776e8abfdffc48183' into 8359359 >> - the only trampoline in ArrayCopyStub is never shared >> - fixup: a shared trampoline must branch to a statically bound method >> - share static call trampolines generated by C1 as well >> - assert callee is nullptr for runtime calls >> - assert that call sites offsets aren't missing >> - cleanup: rephrase comments in macroAssembler_aarch64.hpp >> - Merge commit 'fd29677479797956e0d205b5ce6e7cb9ad407bd1' into 8359359 >> - Merge commit '41520998aa8808452ee384b213b2a77c7bad668d' >> - remove implementation-dependent logic from emit_shared_trampolines() >> - ... and 8 more: https://git.openjdk.org/jdk/compare/f6f87bb6...871903f4 > > test/hotspot/jtreg/compiler/sharedstubs/SharedRuntimeCallTrampolineTest.java line 87: > >> 85: >> 86: private static void checkOutput(OutputAnalyzer output) { >> 87: String testMethodStdout = getTestMethodStdout(output); > > Can you add a description what output format is expected? Adding an example will help a lot. The test expects runtime calls. What will result them to appear? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2512104896 From eastigeevich at openjdk.org Mon Nov 10 22:04:30 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 10 Nov 2025 22:04:30 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v8] In-Reply-To: References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Message-ID: On Thu, 6 Nov 2025 17:59:41 GMT, Mikhail Ablakatov wrote: >> Modify the C2 compiler to share trampoline stubs between static calls that resolve to the same callee method. 
Since the relocation target for all static calls is initially set to the static call resolver stub, the call's target alone cannot be used to distinguish between different static method calls. Instead, trampoline stubs should be shared based on the actual callee. >> >> The `SharedTrampolineTest.java` was designed to verify the sharing of trampolines among static calls. However, due to imprecise log analysis, the test currently passes even when trampolines are not shared. Additionally, comments within the test suggest ambiguity regarding whether it was intended to assess trampoline sharing for static calls or runtime calls. To address these issues and eliminate ambiguity, this patch renames and updates the existing test. Furthermore, a new test is introduced, using the existing one as a foundation, to accurately evaluate trampoline sharing for both static and runtime calls. >> >> This has passed tier1-3 and jcstress testing on AArch64. > > Mikhail Ablakatov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 18 commits: > > - Merge commit 'f6f87bb6759c86d941453a1776e8abfdffc48183' into 8359359 > - the only trampoline in ArrayCopyStub is never shared > - fixup: a shared trampoline must branch to a statically bound method > - share static call trampolines generated by C1 as well > - assert callee is nullptr for runtime calls > - assert that call sites offsets aren't missing > - cleanup: rephrase comments in macroAssembler_aarch64.hpp > - Merge commit 'fd29677479797956e0d205b5ce6e7cb9ad407bd1' into 8359359 > - Merge commit '41520998aa8808452ee384b213b2a77c7bad668d' > - remove implementation-dependent logic from emit_shared_trampolines() > - ... 
and 8 more: https://git.openjdk.org/jdk/compare/f6f87bb6...871903f4 test/hotspot/jtreg/compiler/sharedstubs/SharedRuntimeCallTrampolineTest.java line 87: > 85: > 86: private static void checkOutput(OutputAnalyzer output) { > 87: String testMethodStdout = getTestMethodStdout(output); Can you add a description what output format is expected? Adding an example will help a lot. test/hotspot/jtreg/compiler/sharedstubs/SharedRuntimeCallTrampolineTest.java line 107: > 105: .map(reloc -> new String(reloc.addr())) > 106: .collect(Collectors.toList()); > 107: if (trampolineAddrs.stream().distinct().count() >= trampolineAddrs.size()) { For better readability, could you please create a meaningful variable for `trampolineAddrs.stream().distinct().count()`? You can reuse it in the exception message as well. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2512100636 PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2512094448 From eastigeevich at openjdk.org Mon Nov 10 22:15:10 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 10 Nov 2025 22:15:10 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v8] In-Reply-To: References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Message-ID: <5rJnQSFjLI2FDH9KDsk07Mp0n4BiKe9XG0LMKgayAo8=.a80f10b3-4994-4b4c-bc03-a082d8eb650f@github.com> On Thu, 6 Nov 2025 17:59:41 GMT, Mikhail Ablakatov wrote: >> Modify the C2 compiler to share trampoline stubs between static calls that resolve to the same callee method. Since the relocation target for all static calls is initially set to the static call resolver stub, the call's target alone cannot be used to distinguish between different static method calls. Instead, trampoline stubs should be shared based on the actual callee. >> >> The `SharedTrampolineTest.java` was designed to verify the sharing of trampolines among static calls. 
However, due to imprecise log analysis, the test currently passes even when trampolines are not shared. Additionally, comments within the test suggest ambiguity regarding whether it was intended to assess trampoline sharing for static calls or runtime calls. To address these issues and eliminate ambiguity, this patch renames and updates the existing test. Furthermore, a new test is introduced, using the existing one as a foundation, to accurately evaluate trampoline sharing for both static and runtime calls. >> >> This has passed tier1-3 and jcstress testing on AArch64. > > Mikhail Ablakatov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 18 commits: > > - Merge commit 'f6f87bb6759c86d941453a1776e8abfdffc48183' into 8359359 > - the only trampoline in ArrayCopyStub is never shared > - fixup: a shared trampoline must branch to a statically bound method > - share static call trampolines generated by C1 as well > - assert callee is nullptr for runtime calls > - assert that call sites offsets aren't missing > - cleanup: rephrase comments in macroAssembler_aarch64.hpp > - Merge commit 'fd29677479797956e0d205b5ce6e7cb9ad407bd1' into 8359359 > - Merge commit '41520998aa8808452ee384b213b2a77c7bad668d' > - remove implementation-dependent logic from emit_shared_trampolines() > - ... 
and 8 more: https://git.openjdk.org/jdk/compare/f6f87bb6...871903f4 test/hotspot/jtreg/compiler/sharedstubs/SharedStaticCallTrampolineTest.java line 53: > 51: import jdk.test.lib.process.ProcessTools; > 52: > 53: public class SharedStaticCallTrampolineTest { Similar comments as to `SharedRuntimeCallTrampolineTest.java` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2512136551 From asmehra at openjdk.org Mon Nov 10 22:18:05 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Mon, 10 Nov 2025 22:18:05 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v13] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 20:20:02 GMT, Jan Kratochvil wrote: >> src/hotspot/share/oops/resolvedFieldEntry.cpp line 31: >> >>> 29: STATIC_ASSERT(std::is_trivially_copyable_v == true); >>> 30: >>> 31: // Detect inadvertently introduced trailing padding. >> >> Another way to detect padding (trailing or internal) could be to compare sizeof(ResolvedFieldEntry) against the sum up the size of all the elements. Something like: >> `sizeofResolvedFieldEntry) == (sizeof(InstanceKlass*) + sizeof(int) + sizeof(u2) + ... + sizeof(_padding))` >> It looks cumbersome but I think it is easy enough to update if a new field is added because the static assert will fail immediately. > > I can implement this plan if requested, but I do not intend to do so on my own. > Thanks for the idea, but one could argue that some reformatting could still produce a false PASS. I would suggest that this is better left for [C++26 Reflection](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p2996r13.html). Okay. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2512145283 From asmehra at openjdk.org Mon Nov 10 22:18:03 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Mon, 10 Nov 2025 22:18:03 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v13] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 07:24:50 GMT, Jan Kratochvil wrote: >> With clang-20 using --with-toolchain-type=clang resolveFieldEntry.cpp and resolveMethodEntry.cpp break the build with similar warnings like: >> >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> | (void*) >> >> The patch follows the suggested fix. > > Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: > > Add Ioi Lam's comment Marked as reviewed by asmehra (Committer). 
lgtm ------------- PR Review: https://git.openjdk.org/jdk/pull/26098#pullrequestreview-3445383353 PR Comment: https://git.openjdk.org/jdk/pull/26098#issuecomment-3514110147 From eastigeevich at openjdk.org Mon Nov 10 22:22:16 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 10 Nov 2025 22:22:16 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v8] In-Reply-To: References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Message-ID: <52JasB76bWD5S9vvAGsjitHHblK0jBqPNGnHr_x1lmM=.2940f5b7-ae97-4809-a4ea-3b4a64df961f@github.com> On Thu, 6 Nov 2025 17:59:41 GMT, Mikhail Ablakatov wrote: >> Modify the C2 compiler to share trampoline stubs between static calls that resolve to the same callee method. Since the relocation target for all static calls is initially set to the static call resolver stub, the call's target alone cannot be used to distinguish between different static method calls. Instead, trampoline stubs should be shared based on the actual callee. >> >> The `SharedTrampolineTest.java` was designed to verify the sharing of trampolines among static calls. However, due to imprecise log analysis, the test currently passes even when trampolines are not shared. Additionally, comments within the test suggest ambiguity regarding whether it was intended to assess trampoline sharing for static calls or runtime calls. To address these issues and eliminate ambiguity, this patch renames and updates the existing test. Furthermore, a new test is introduced, using the existing one as a foundation, to accurately evaluate trampoline sharing for both static and runtime calls. >> >> This has passed tier1-3 and jcstress testing on AArch64. > > Mikhail Ablakatov has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 18 commits: > > - Merge commit 'f6f87bb6759c86d941453a1776e8abfdffc48183' into 8359359 > - the only trampoline in ArrayCopyStub is never shared > - fixup: a shared trampoline must branch to a statically bound method > - share static call trampolines generated by C1 as well > - assert callee is nullptr for runtime calls > - assert that call sites offsets aren't missing > - cleanup: rephrase comments in macroAssembler_aarch64.hpp > - Merge commit 'fd29677479797956e0d205b5ce6e7cb9ad407bd1' into 8359359 > - Merge commit '41520998aa8808452ee384b213b2a77c7bad668d' > - remove implementation-dependent logic from emit_shared_trampolines() > - ... and 8 more: https://git.openjdk.org/jdk/compare/f6f87bb6...871903f4 test/hotspot/jtreg/compiler/sharedstubs/SharedStaticCallTrampolineTest.java line 121: > 119: .filter(addr -> Collections.frequency(trampolineAddrs, addr) == 1) > 120: .collect(Collectors.toList()); > 121: if (uniqueTrampolineAddrs.size() == 0) { Should we expect this to be 1? Possible values: 0, 1 or 3? 0 - incorrect mapping of a call site 1 - everything is correct 3 - sharing does not work ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2512160175 From kvn at openjdk.org Mon Nov 10 22:23:11 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 10 Nov 2025 22:23:11 GMT Subject: RFR: 8371493: Simplify search for AdapterHandlerEntry In-Reply-To: References: Message-ID: <-uS6VSlyW0d5pDYsspu2piw5ZycamOQ9rrsgWFhUzZ4=.37b4829c-3ad8-4627-918a-ae80266afa40@github.com> On Mon, 10 Nov 2025 18:54:52 GMT, Ashutosh Mehra wrote: > `AdapterHandlerEntry` stores a direct pointer to `AdapterBlob`. Therefore, when looking for a `AdapterHandlerEntry` corresponding to a `CodeBlob`, we can use direct comparison instead of using `CodeCache::find_blob`. > It also replaces the call to `AdapterHandlerLibrary::contains` in `CodeBlob::dump_for_addr` with a more trivial check `is_adapter_blob`. Good. 
------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28223#pullrequestreview-3445407286 From iklam at openjdk.org Mon Nov 10 22:27:07 2025 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 10 Nov 2025 22:27:07 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v13] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 07:24:50 GMT, Jan Kratochvil wrote: >> With clang-20 using --with-toolchain-type=clang resolveFieldEntry.cpp and resolveMethodEntry.cpp break the build with similar warnings like: >> >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> | (void*) >> >> The patch follows the suggested fix. > > Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: > > Add Ioi Lam's comment Marked as reviewed by iklam (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/26098#pullrequestreview-3445423200 From dhanalla at openjdk.org Mon Nov 10 22:30:12 2025 From: dhanalla at openjdk.org (Dhamoder Nalla) Date: Mon, 10 Nov 2025 22:30:12 GMT Subject: RFR: 8371161: [AArch64] Enable supported CPU features for the Qualcomm processor family Message-ID: <9fGHvYHbn1M-_s63cxFAsz4EGPOnpEuFFKDbu7lwaNQ=.b882f8f4-17cd-415a-8fef-857860979c46@github.com> This PR addresses the following: 1. Enables the UseSHA3Intrinsics flag for CPU_QUALCOM. Previously, this flag was only enabled for CPU_APPLE. Benchmark results show significant performance improvements on Qualcomm CPUs with this intrinsic. 2. 
Populates the _cpu type for Qualcomm CPUs and assigns the appropriate _variant value. Performance testing: The JMH test case MessageDigests.java is used to evaluate the performance improvements enabled by UseSHA3Intrinsics on Qualcomm CPUs.

| Benchmark | (digesterName) | (length) | (provider) | Mode | Cnt | Score (before change) | Error | Score (after change) | Error | Units | SHA3 Perf Improvement |
| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| MessageDigests.digest | SHA3-256 | 64 | DEFAULT | thrpt | 15 | 4363.650 | ±682.413 | 5687.798 | ±855.826 | ops/ms | 30.34% |
| MessageDigests.digest | SHA3-256 | 16384 | DEFAULT | thrpt | 15 | 45.794 | ±0.069 | 58.735 | ±0.077 | ops/ms | 28.26% |
| MessageDigests.digest | SHA3-512 | 64 | DEFAULT | thrpt | 15 | 4008.741 | ±703.879 | 5145.512 | ±866.479 | ops/ms | 28.36% |
| MessageDigests.digest | SHA3-512 | 16384 | DEFAULT | thrpt | 15 | 23.991 | ±0.032 | 30.294 | ±0.040 | ops/ms | 26.27% |
| MessageDigests.getAndDigest | SHA3-256 | 64 | DEFAULT | thrpt | 15 | 1995.297 | ±396.007 | 2021.385 | ±486.581 | ops/ms | 1.31% |
| MessageDigests.getAndDigest | SHA3-256 | 16384 | DEFAULT | thrpt | 15 | 45.994 | ±0.051 | 58.283 | ±0.095 | ops/ms | 26.72% |
| MessageDigests.getAndDigest | SHA3-512 | 64 | DEFAULT | thrpt | 15 | 1889.550 | ±355.058 | 2173.164 | ±437.968 | ops/ms | 15.01% |
| MessageDigests.getAndDigest | SHA3-512 | 16384 | DEFAULT | thrpt | 15 | 24.411 | ±0.143 | 30.187 | ±0.035 | ops/ms | 23.66% |
    ------------- Commit messages: - Enable supported CPU features for the Qualcomm processor family Changes: https://git.openjdk.org/jdk/pull/28166/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28166&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371161 Stats: 10 lines in 2 files changed: 3 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/28166.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28166/head:pull/28166 PR: https://git.openjdk.org/jdk/pull/28166 From psandoz at openjdk.org Tue Nov 11 00:02:04 2025 From: psandoz at openjdk.org (Paul Sandoz) Date: Tue, 11 Nov 2025 00:02:04 GMT Subject: RFR: 8370691: Add new HalffloatVector type and enable intrinsification of float16 vector operations In-Reply-To: References: Message-ID: On Mon, 27 Oct 2025 15:19:48 GMT, Jatin Bhateja wrote: > Add new HalffloatVector type and corresponding concrete vector classes in addition to existing primitive vector types, maintaining operation parity with the FloatVector type. > - Add necessary inline expander support. > - Enable intrinsification for a few vector operations, namely ADD/SUB/MUL/DIV/MAX/MIN/FMA. > - Use existing Float16 vector IR and backend support. > - Extended the existing VectorAPI JTREG test suite for the newly added HalffloatVector operations. > > The idea here is to first be at par with Float16 auto-vectorization support before intrinsifying new operations (conversions, reduction, etc). > > The following are the performance numbers for some of the selected HalfflotVector benchmarking kernels compared to equivalent Float16OperationsBenchmark kernels. > > {A2BA2D85-085A-489F-8DDD-0FCFB5986EA5} > > Initial RFP[1] was floated on the panama-dev mailing list. > > Kindly review the draft PR and share your feedback. > > Best Regards, > Jatin > > [1] https://mail.openjdk.org/pipermail/panama-dev/2025-August/021100.html Some quick comments. We should be consistent in the naming, and rename `Halfloat*` to `Float16*`. 
When you generate the fallback code for unary/binary etc can you push the carrier type and conversations into the uOp/bOp implementations so you don't have to explicitly operate on the carrier type and do the conversions as you do now e.g.,: v0.uOp(m, (i, a) -> float16ToShortBits(Float16.valueOf(-(shortBitsToFloat16(($type$)a).floatValue())))); The transition of intrinsic arguments from `vsp.elementType()` to `vsp.carrierType(), vsp.operType()` is a little unfortunate. Is this because HotSpot cannot directly refer to the `Float16` class from the incubating module? Requiring two arguments means they can get out of sync. Previously the class provided all the information needed, now arguably the type does. ------------- PR Review: https://git.openjdk.org/jdk/pull/28002#pullrequestreview-3445662107 From ysuenaga at openjdk.org Tue Nov 11 00:24:04 2025 From: ysuenaga at openjdk.org (Yasumasa Suenaga) Date: Tue, 11 Nov 2025 00:24:04 GMT Subject: RFR: 8371093: Assert "section header string table should be loaded" failed on debug VM In-Reply-To: References: Message-ID: <-oDd88Q7tk18y-gDNqcma6fgRN_5Kqhjt6XErYjfNfo=.922b1083-276f-45c5-8946-b9f3b3c1dc94@github.com> On Sun, 2 Nov 2025 06:27:50 GMT, Yasumasa Suenaga wrote: > When the crash happens in the function in vDSO on Linux, native call stacks in hs_err log wouldn't be generated as following. See [hs_err log on JBS](https://bugs.openjdk.org/secure/attachment/116796/hs_err_pid4018.log) for details. Reproducer is also attached on JBS ([Test.java](https://bugs.openjdk.org/secure/attachment/116797/Test.java)) > > > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [linux-vdso.so.1+0xe69] > [error occurred during error reporting (printing native stack (with source info)), id 0xe0000000, Internal Error (/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536)] > > Retrying call stack printing without source information... 
> > [error occurred during error reporting (retry printing native stack (no source info)), id 0xb, SIGSEGV (0xb) at pc=0x00007fba8075f791] > > > When I checked back trace on GDB, it failed at `assert`. > > #12 0x00007fba7e76bd00 in report_vm_error (file=file at entry=0x7fba7fed7b40 "/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp", > line=line at entry=536, error_msg=error_msg at entry=0x7fba80019575 "assert(false) failed", > detail_fmt=detail_fmt at entry=0x7fba7fed7bf0 "section header string table should be loaded") > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/debug.cpp:196 > #13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...) > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536 > #14 ElfFile::read_debug_info (this=this at entry=0x7fba782a1650, debug_info=debug_info at entry=0x7fba7dd05150) > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:407 > > > > (gdb) f 13 > #13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...) > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536 > 536 assert(false, "section header string table should be loaded"); > > > vDSO is not a regular ELF, so it should be skipped here. PING: Can I get Reviewers? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28102#issuecomment-3514453905 From darcy at openjdk.org Tue Nov 11 01:02:05 2025 From: darcy at openjdk.org (Joe Darcy) Date: Tue, 11 Nov 2025 01:02:05 GMT Subject: RFR: 8370691: Add new HalffloatVector type and enable intrinsification of float16 vector operations In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 23:58:57 GMT, Paul Sandoz wrote: > Some quick comments. > > We should be consistent in the naming, and rename `Halfloat*` to `Float16*`. 
> I concur, especially since there are multiple 16-bit floating-point formats in use including the IEEE 754 float16 as well as bfloat16. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28002#issuecomment-3514526479 From syan at openjdk.org Tue Nov 11 03:36:04 2025 From: syan at openjdk.org (SendaoYan) Date: Tue, 11 Nov 2025 03:36:04 GMT Subject: RFR: 8370966: Create regression test for the hierarchical memory limit fix in JDK-8370572 [v2] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 17:43:02 GMT, Severin Gehwolf wrote: >> Please review this test-only enhancement. It creates a regression test for the Amazon ECS setup on cgroups v1 where the parent memory limit isn't visible inside the container and, thus, needs to rely on the cg v1 specific `hierarchical_memory_limit` token in `memory.stat`. The proposed test is cg v1 only and needs to be run as root. It's skipped otherwise. It's useful to have when working on refactorings like #27743 so as not to regress. >> >> The other changes are an effort to reduce code duplication in the test code where similar patterns have been used in other container tests. >> >> Testing (all on Linux x86_64): >> - [x] CG version 2, run as root. Engine: docker. Test is skipped. >> - [x] CG version 1, run as root. Engine: docker. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) >> - [x] CG version 1, run as root. Engine: podman. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) >> - [X] CG version 1, run as non-root. Test skipped. >> - [x] GHA, though I don't think this is very useful for this change. >> >> Thoughts? > > Severin Gehwolf has updated the pull request incrementally with one additional commit since the last revision: > > Indenting and copyright The touch tests and new tests run passed on my local machine. ------------- Marked as reviewed by syan (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/28201#pullrequestreview-3446098295 From duke at openjdk.org Tue Nov 11 06:12:28 2025 From: duke at openjdk.org (Zihao Lin) Date: Tue, 11 Nov 2025 06:12:28 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v11] In-Reply-To: References: Message-ID: > This patch remove slice parameter from LoadNode::make > > I have done more work which remove slice paramater from StoreNode::make. > > Mention in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 > > Hi team, I am new, I'd appreciate any guidance. Thank a lot! Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: - Merge branch 'openjdk:master' into 8344116 - fix conflict - Merge master - remove C2AccessValuePtr - fix assert - add more assert - rid of access.addr().type() - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - ... and 3 more: https://git.openjdk.org/jdk/compare/76a1109d...42b17827 ------------- Changes: https://git.openjdk.org/jdk/pull/24258/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=10 Stats: 230 lines in 18 files changed: 33 ins; 55 del; 142 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From kbarrett at openjdk.org Tue Nov 11 06:26:03 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 11 Nov 2025 06:26:03 GMT Subject: RFR: 8367013: Add Atomic to package/replace idiom of volatile var plus AtomicAccess:: operations [v5] In-Reply-To: References: Message-ID: On Tue, 21 Oct 2025 14:58:30 GMT, Kim Barrett wrote: >> Please review this change that adds the type Atomic, to use as the type >> of a variable that is accessed (including writes) concurrently by multiple >> threads. 
This is intended to replace (most) uses of the current HotSpot idiom >> of declaring a variable volatile and accessing that variable using functions >> from the AtomicAccess class. >> https://github.com/openjdk/jdk/blame/528f93f8cb9f1fb9c19f31ab80c8a546f47beed2/doc/hotspot-style.md#L138-L147 >> >> This change replaces https://github.com/openjdk/jdk/pull/27462. Differences are >> >> * Substantially restructured `Atomic`, to be IDE friendly. It's >> operationally the same, with the same API, hence uses and gtests didn't need >> to change in that respect. Thanks to @stefank for raising this issue, and for >> some suggestions toward improvements. >> >> * Changed how fetch_then_set for atomic translated types is handled, to avoid >> having the function there at all if it isn't usable, rather than just removing >> it via SFINAE, leaving an empty overload set. >> >> * Added more gtests. >> >> Testing: mach5 tier1-6, GHA sanity tests > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > rename relaxed_store => store_relaxed More naming discussions have led to more name changes. 

| AtomicAccess        | old PR              | new PR              |
+=====================+=====================+=====================+
| store               | store_relaxed       | store_relaxed       |
| release_store       | release_store       | release_store       |
| release_store_fence | release_store_fence | release_store_fence |
|                     |                     |                     |
| load                | load_relaxed        | load_relaxed        |
| load_acquire        | load_acquire        | load_acquire        |
|                     |                     |                     |
| add                 | add_then_fetch      | add_then_fetch      |
| fetch_then_add      | fetch_then_add      | fetch_then_add      |
|                     |                     |                     |
| sub                 | sub_then_fetch      | sub_then_fetch      |
|                     | fetch_then_sub      | fetch_then_sub      |
|                     |                     |                     |
| inc                 | atomic_inc          | [1]                 | new PR change
| dec                 | atomic_dec          | [1]                 | new PR change
|                     |                     |                     |
| xchg                | fetch_then_set      | exchange            | new PR change
| cmpxchg             | cmpxchg             | compare_exchange    | new PR change
|                     |                     |                     |
| replace_if_null     | replace_if_null     | [2]                 | new PR change
|                     | clear_if_equal      | [2]                 | new PR change
|                     |                     |                     |
| fetch_then_and      | fetch_then_and      | fetch_then_and      |
| fetch_then_or       | fetch_then_or       | fetch_then_or       |
| fetch_then_xor      | fetch_then_xor      | fetch_then_xor      |
|                     |                     |                     |
| and_then_fetch      | and_then_fetch      | and_then_fetch      |
| or_then_fetch       | or_then_fetch       | or_then_fetch       |
| xor_then_fetch      | xor_then_fetch      | xor_then_fetch      |

[1] The `inc` and `dec` operations originally existed to provide better codegen on some platforms for some cases where the value isn't needed. That got dropped somewhere along the line. For example, `AtomicAccess::inc` just calls `AtomicAccess::add` with an addend of 1. The assumption is that if the result isn't used then the implementation (perhaps with help from the C++ compiler) can take care of appropriate optimization. They enabled the use of x86 locked inc/dec instructions, which doesn't happen with the current code. We could probably get better codegen for x86 by using compiler intrinsics, instead of using inline assembler.

[2] `AtomicAccess::replace_if_null` was added because of complaints about `cmpxchg` with a `NULL` (now `nullptr`) argument needing a cast, in order to deal with argument type deduction. I added `Atomic::clear_if_equal` because I've wanted `AtomicAccess::clear_if_equal` in several places, but never got around to adding it. But it turns out that not only don't we need casts in the Atomic layer, but there are a couple of ways to avoid the need for them in the AtomicAccess layer too. So these seem of questionable utility.

I've also removed the new `AtomicNextAccess` class and backed out its uses. This is to reduce the size and scope of this PR to simplify its review. My intent is to propose `AtomicNextAccess` in a followup. The purpose of `AtomicNextAccess` is to permit conversions of clients of `LockFreeStack` and `NonblockingQueue` to be done incrementally, rather than requiring them to all be done in one change. It's useful to know how that could be done, but doesn't have to be part of the initial `Atomic` change. Once all those clients have been converted, `AtomicNextAccess` can be removed.

Note that `NonblockingQueue` no longer has any clients. When I wrote `AtomicNextAccess` it was being used by the G1 post-barrier. But with the recent overhaul of the post-barrier, that use is gone. It was originally internal to G1, but the Google folks wanted to use it for something else (JDK-8236485: "Epoch synchronization protocol for G1 concurrent refinement", which is also rendered moot by the recent G1 post-barrier changes. But I think they had some other use in mind too?). So it got moved from gc/g1 to utilities, renamed, and improved in various ways. Other Google uses don't seem to have materialized though. But maybe it should just be deleted now?
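
To make the "new PR" column of the naming table above concrete, here is a minimal sketch built on `std::atomic`. It is only an illustration of what the names suggest: `AtomicSketch` is a hypothetical class, not HotSpot's `Atomic`, and the orderings chosen for each member are assumptions rather than anything taken from the PR.

```cpp
#include <atomic>

// Hypothetical illustration only: a small wrapper over std::atomic whose member
// names mirror the "new PR" column. This is NOT the HotSpot implementation.
template <typename T>
class AtomicSketch {
  std::atomic<T> _value;

public:
  explicit AtomicSketch(T v = T()) : _value(v) {}

  T load_relaxed() const { return _value.load(std::memory_order_relaxed); }
  T load_acquire() const { return _value.load(std::memory_order_acquire); }
  void store_relaxed(T v) { _value.store(v, std::memory_order_relaxed); }
  void release_store(T v) { _value.store(v, std::memory_order_release); }

  // "fetch_then_X" returns the value from before the update.
  T fetch_then_add(T v) { return _value.fetch_add(v); }
  // "X_then_fetch" returns the value after the update.
  T add_then_fetch(T v) { return _value.fetch_add(v) + v; }

  // Replaces the value and returns the previous one.
  T exchange(T v) { return _value.exchange(v); }

  // Assumed cmpxchg-style contract: returns the value observed before the
  // operation, whether or not the exchange took place.
  T compare_exchange(T compare_value, T new_value) {
    _value.compare_exchange_strong(compare_value, new_value);
    return compare_value;
  }
};
```

The `fetch_then_*` and `*_then_fetch` pairs differ only in whether the caller gets the old or the new value, which is the distinction the longer names are meant to make visible at the call site.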
------------- PR Comment: https://git.openjdk.org/jdk/pull/27539#issuecomment-3515172882 From kbarrett at openjdk.org Tue Nov 11 06:26:01 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 11 Nov 2025 06:26:01 GMT Subject: RFR: 8367013: Add Atomic to package/replace idiom of volatile var plus AtomicAccess:: operations [v6] In-Reply-To: References: Message-ID: <-MCsPbrh9d8AM2XMQnHNHoBdkCldjZcOPi0zdb-SIM8=.3e0fa6ff-29eb-4912-8039-faab6692ff0e@github.com> > Please review this change that adds the type Atomic, to use as the type > of a variable that is accessed (including writes) concurrently by multiple > threads. This is intended to replace (most) uses of the current HotSpot idiom > of declaring a variable volatile and accessing that variable using functions > from the AtomicAccess class. > https://github.com/openjdk/jdk/blame/528f93f8cb9f1fb9c19f31ab80c8a546f47beed2/doc/hotspot-style.md#L138-L147 > > This change replaces https://github.com/openjdk/jdk/pull/27462. Differences are > > * Substantially restructured `Atomic`, to be IDE friendly. It's > operationally the same, with the same API, hence uses and gtests didn't need > to change in that respect. Thanks to @stefank for raising this issue, and for > some suggestions toward improvements. > > * Changed how fetch_then_set for atomic translated types is handled, to avoid > having the function there at all if it isn't usable, rather than just removing > it via SFINAE, leaving an empty overload set. > > * Added more gtests. > > Testing: mach5 tier1-6, GHA sanity tests Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 20 additional commits since the last revision: - Merge branch 'master' into atomic-template-tag-select - remove AtomicNextAccess and uses - use type_traits wrapper in new code - Merge branch 'master' into atomic-template-tag-select - more naming updates - rename relaxed_store => store_relaxed - default construct translated atomic without SFINAE - Merge branch 'master' into atomic-template-tag-select - Merge branch 'master' into atomic-template-tag-select - add reference to gcc bug we're working around - ... and 10 more: https://git.openjdk.org/jdk/compare/e6d6859b...da58d0d2 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27539/files - new: https://git.openjdk.org/jdk/pull/27539/files/f7e2d950..da58d0d2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27539&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27539&range=04-05 Stats: 257095 lines in 2275 files changed: 164199 ins; 55224 del; 37672 mod Patch: https://git.openjdk.org/jdk/pull/27539.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27539/head:pull/27539 PR: https://git.openjdk.org/jdk/pull/27539 From kbarrett at openjdk.org Tue Nov 11 06:43:07 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 11 Nov 2025 06:43:07 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v13] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 07:24:50 GMT, Jan Kratochvil wrote: >> With clang-20 using --with-toolchain-type=clang resolveFieldEntry.cpp and resolveMethodEntry.cpp break the build with similar warnings like: >> >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning >> 
49 | memset(this, 0, sizeof(*this)); >> | ^ >> | (void*) >> >> The patch follows the suggested fix. > > Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: > > Add Ioi Lam's comment src/hotspot/share/oops/resolvedFieldEntry.cpp line 29: > 27: #include "oops/resolvedFieldEntry.hpp" > 28: > 29: STATIC_ASSERT(std::is_trivially_copyable_v == true); Style nit: `STATIC_ASSERT` shouldn't be used anymore. C++17 gives us 1-arg `static_assert`. Also, explicit comparison to `true` is weird. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2513009462 From jpai at openjdk.org Tue Nov 11 06:46:04 2025 From: jpai at openjdk.org (Jaikiran Pai) Date: Tue, 11 Nov 2025 06:46:04 GMT Subject: RFR: 8371048: ImageFileReader::open fails to check return value of osSupport::map_memory In-Reply-To: References: Message-ID: On Fri, 31 Oct 2025 14:00:38 GMT, Justin King wrote: > Check whether `osSupport::map_memory` actually succeeded in all compilation modes, instead of crashing shortly after in non-debug builds. Ideally we should fall back to just reading the entire file into memory manually or use seek+read, but this is good enough for now to avoid crashing. src/java.base/share/native/libjimage/imageFile.cpp line 334: > 332: // Memory map image (minimally the index.) > 333: _index_data = (u1*)osSupport::map_memory(_fd, _name, 0, (size_t)map_size()); > 334: if (_index_data == nullptr) { The rest of the code in the `libjimage` library uses `NULL`, including the return value in `osSupport::map_memory(...)`. So I think it would be better to use `NULL` here for consistency.
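
As a rough sketch of the kind of check under review here: the open routine tests the mapping result unconditionally and reports failure to its caller, rather than relying on a debug-only assert. The names `map_fn`, `ReaderSketch`, `map_ok` and `map_fail` are hypothetical stand-ins and not the libjimage code. The sketch spells the null constant `nullptr`; the `NULL` versus `nullptr` question raised above is a consistency matter for libjimage and is not addressed by it.

```cpp
#include <cstddef>

// Hypothetical mapping callback: returns a null pointer when mapping fails.
using map_fn = void* (*)(std::size_t size);

struct ReaderSketch {
  unsigned char* index_data = nullptr;

  // Returns false instead of continuing with an unusable mapping.
  bool open(map_fn map, std::size_t size) {
    index_data = static_cast<unsigned char*>(map(size));
    if (index_data == nullptr) {
      return false;  // checked in every build mode, not only under assert
    }
    return true;
  }
};

// Two stand-in mappers for exercising both paths.
static unsigned char backing[16];
static void* map_ok(std::size_t) { return backing; }
static void* map_fail(std::size_t) { return nullptr; }
```

With `map_fail` the caller sees `false` and can take whatever fallback it has, which is the behaviour the PR description asks for in non-debug builds.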
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28087#discussion_r2513032659 From jpai at openjdk.org Tue Nov 11 06:50:04 2025 From: jpai at openjdk.org (Jaikiran Pai) Date: Tue, 11 Nov 2025 06:50:04 GMT Subject: RFR: 8371048: ImageFileReader::open fails to check return value of osSupport::map_memory In-Reply-To: References: Message-ID: On Fri, 31 Oct 2025 14:00:38 GMT, Justin King wrote: > Check whether `osSupport::map_memory` actually succeeded in all compilation modes, instead of crashing shortly after in non-debug builds. Ideally we should fall back to just reading the entire file into memory manually or use seek+read, but this is good enough for now to avoid crashing. The change looks reasonable to me. This function gets called from `ClassLoader::lookup_vm_options()` and returning `false` from here appears to be handled correctly in the hotspot classfile code. It would be good to have Alan or Sundar review this change before integrating @AlanBateman @sundararajana. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28087#issuecomment-3515228335 From alanb at openjdk.org Tue Nov 11 08:36:12 2025 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 11 Nov 2025 08:36:12 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v10] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 20:11:52 GMT, Chen Liang wrote: >> Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 45 commits: >> >> - Merge branch 'master' into JDK-8353835 >> - Fix typo in test comment >> - Merge branch 'master' into JDK-8353835 >> - Merge branch 'master' into JDK-8353835 >> - Suppress warnings from some tests >> - Change -Xcheck:jni to be warning rather than fatal error >> - Merge branch 'master' into JDK-8353835 >> - Simplify filter >> - Merge branch 'master' into JDK-8353835 >> - Update Xcheck:jni description >> - ...
and 35 more: https://git.openjdk.org/jdk/compare/066810c8...6671ae02 > > src/java.base/share/classes/java/lang/reflect/doc-files/MutationMethods.html line 56: > >> 54:
  • >> 55: java.lang.reflect.Field.setDouble(Object, double)
  • >> 56:
  • > > Nit: Since javadoc process tags here, you could just use `{@link Field#set java.lang.reflect.Field.set(Object, Object)}` instead of full-fledged a tags. I wasn't aware that javadoc allowed this in .html docs-files, thanks for the tip. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2513298915 From alanb at openjdk.org Tue Nov 11 09:00:11 2025 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 11 Nov 2025 09:00:11 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v10] In-Reply-To: References: Message-ID: <1aQu6ywsFGh3TMN3XjBevjqmTFP7CdQRrbzQtuN-wfI=.814c0236-caa8-469a-9a42-dfafb56ebf64@github.com> On Mon, 10 Nov 2025 20:24:53 GMT, Chen Liang wrote: > I still wonder about the decision for JNI to call final Field.set with an unconditional export check instead of an unconditional open check - the open check is done for all Java code already. It's aligned with setAccessible. It's a bit of corner case, but if a JNI attached thread invokes setAccessible with no java frames on the stack, then it specified to only succeed if the API element is public and declared in a public class in an exported package. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25115#issuecomment-3515650487 From sgehwolf at openjdk.org Tue Nov 11 09:21:43 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Tue, 11 Nov 2025 09:21:43 GMT Subject: RFR: 8370966: Create regression test for the hierarchical memory limit fix in JDK-8370572 [v2] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 17:43:02 GMT, Severin Gehwolf wrote: >> Please review this test-only enhancement. It creates a regression test for the Amazon ECS setup on cgroups v1 where the parent memory limit isn't visible inside the container and, thus, needs to rely on the cg v1 specific `hierarchical_memory_limit` token in `memory.stat`. The proposed test is cg v1 only and needs to be run as root. It's skipped otherwise. 
It's useful to have when working on refactorings like #27743 so as not to regress. >> >> The other changes are an effort to reduce code duplication in the test code where similar patterns have been used in other container tests. >> >> Testing (all on Linux x86_64): >> - [x] CG version 2, run as root. Engine: docker. Test is skipped. >> - [x] CG version 1, run as root. Engine: docker. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) >> - [x] CG version 1, run as root. Engine: podman. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) >> - [X] CG version 1, run as non-root. Test skipped. >> - [x] GHA, though I don't think this is very useful for this change. >> >> Thoughts? > > Severin Gehwolf has updated the pull request incrementally with one additional commit since the last revision: > > Indenting and copyright Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/28201#issuecomment-3515752649 From sgehwolf at openjdk.org Tue Nov 11 09:21:45 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Tue, 11 Nov 2025 09:21:45 GMT Subject: Integrated: 8370966: Create regression test for the hierarchical memory limit fix in JDK-8370572 In-Reply-To: References: Message-ID: On Fri, 7 Nov 2025 16:41:26 GMT, Severin Gehwolf wrote: > Please review this test-only enhancement. It creates a regression test for the Amazon ECS setup on cgroups v1 where the parent memory limit isn't visible inside the container and, thus, needs to rely on the cg v1 specific `hierarchical_memory_limit` token in `memory.stat`. The proposed test is cg v1 only and needs to be run as root. It's skipped otherwise. It's useful to have when working on refactorings like #27743 so as not to regress. > > The other changes are an effort to reduce code duplication in the test code where similar patterns have been used in other container tests. 
> > Testing (all on Linux x86_64): > - [x] CG version 2, run as root. Engine: docker. Test is skipped. > - [x] CG version 1, run as root. Engine: docker. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) > - [x] CG version 1, run as root. Engine: podman. Test passes and fails without the product fix of [JDK-8370572](https://bugs.openjdk.org/browse/JDK-8370572) > - [X] CG version 1, run as non-root. Test skipped. > - [x] GHA, though I don't think this is very useful for this change. > > Thoughts? This pull request has now been integrated. Changeset: 29100320 Author: Severin Gehwolf URL: https://git.openjdk.org/jdk/commit/291003208c025ce4f9a94ba6093e207d0792bbb9 Stats: 191 lines in 8 files changed: 154 ins; 32 del; 5 mod 8370966: Create regression test for the hierarchical memory limit fix in JDK-8370572 Reviewed-by: shade, syan ------------- PR: https://git.openjdk.org/jdk/pull/28201 From duke at openjdk.org Tue Nov 11 09:39:09 2025 From: duke at openjdk.org (Harshit470250) Date: Tue, 11 Nov 2025 09:39:09 GMT Subject: RFR: 8347396: Efficient TypeFunc creations [v2] In-Reply-To: References: Message-ID: On Mon, 27 Oct 2025 05:11:47 GMT, Harshit470250 wrote: >> This PR do similar changes done by [JDK-8330851](https://bugs.openjdk.org/browse/JDK-8330851) on the GC TypeFunc creation as suggested by [JDK-8347396](https://bugs.openjdk.org/browse/JDK-8347396). As discussed in [https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686,](https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686) I have put guard on the shenandoah gc specific part of the code. > > Harshit470250 has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains five commits: > > - Merge master > - update make_barrier_type > - Merge branch 'openjdk:master' into new_pr > - Merge branch 'openjdk:master' into new_pr > - My chages @iwanowww Can you also take a look, as you have reviewed the previous related change. #21782 ------------- PR Comment: https://git.openjdk.org/jdk/pull/27279#issuecomment-3515880656 From adinn at openjdk.org Tue Nov 11 09:58:07 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Tue, 11 Nov 2025 09:58:07 GMT Subject: RFR: 8371493: Simplify search for AdapterHandlerEntry In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 18:54:52 GMT, Ashutosh Mehra wrote: > `AdapterHandlerEntry` stores a direct pointer to `AdapterBlob`. Therefore, when looking for a `AdapterHandlerEntry` corresponding to a `CodeBlob`, we can use direct comparison instead of using `CodeCache::find_blob`. > This patch also replaces the call to `AdapterHandlerLibrary::contains` in `CodeBlob::dump_for_addr` with a more trivial check `is_adapter_blob`. Looks good. ------------- Marked as reviewed by adinn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28223#pullrequestreview-3447205571 From azafari at openjdk.org Tue Nov 11 10:13:44 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 11 Nov 2025 10:13:44 GMT Subject: RFR: 8358957: [ubsan]: The assert in layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v7] In-Reply-To: References: Message-ID: > Avoid using loop and UB in left-shift operation as suggested by Kim's comment in the JBS-issue. 
> > Tests: > mach5 tiers 1-5 {macosx-aarch64, linux-x64, windows-x64} x {debug, product} Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: Windows warning bypassed ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27288/files - new: https://git.openjdk.org/jdk/pull/27288/files/f0d4dfbd..b00636bf Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27288&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27288&range=05-06 Stats: 5 lines in 1 file changed: 2 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/27288.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27288/head:pull/27288 PR: https://git.openjdk.org/jdk/pull/27288 From jsjolen at openjdk.org Tue Nov 11 10:52:18 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 11 Nov 2025 10:52:18 GMT Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer [v11] In-Reply-To: References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> Message-ID: On Thu, 30 Oct 2025 12:06:00 GMT, Afshin Zafari wrote: >> The issue happens when the HeapMinBaseAddress option gets 0 as input value. Since this option is used as an address, using 0 in pointer arithmetic is UB. >> The fix is using `uintptr_t` instead of `address`/`char*`, etc. In doing that, it is found that an overflow check does not work in all cases due to checking more conditions. That overflow check is changed too. We also need to check overflow after aligning addresses and sizes of memory regions in this context. Assertions are added to check these cases. >> >> Tests: >> linux-x64 tier1 > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > fix arguments.cpp for HeapMinBaseAddress type. Seems correct, LGTM ------------- Marked as reviewed by jsjolen (Reviewer).
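
The approach described in the quoted PR text (integer address arithmetic plus explicit overflow checks, including after alignment) can be sketched as follows. The helpers `add_checked` and `align_up_checked` are hypothetical names used for illustration and are not the functions in memoryReserver.cpp. The sketch assumes `alignment` is a non-zero power of two.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: adds offset to base as integers and reports overflow
// instead of wrapping. Arithmetic on uintptr_t is well defined even when base
// is 0, unlike applying a non-zero offset to a null pointer.
inline bool add_checked(std::uintptr_t base, std::uintptr_t offset,
                        std::uintptr_t* result) {
  if (offset > std::numeric_limits<std::uintptr_t>::max() - base) {
    return false;  // base + offset would not fit
  }
  *result = base + offset;
  return true;
}

// Hypothetical helper: rounds value up to a power-of-two alignment and
// reports overflow of the rounded result.
inline bool align_up_checked(std::uintptr_t value, std::uintptr_t alignment,
                             std::uintptr_t* result) {
  const std::uintptr_t mask = alignment - 1;
  if (value > std::numeric_limits<std::uintptr_t>::max() - mask) {
    return false;  // value + mask would not fit
  }
  *result = (value + mask) & ~mask;
  return true;
}
```

Converting the checked result back to a pointer only at the point where memory is actually reserved keeps the intermediate computation free of pointer arithmetic.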
PR Review: https://git.openjdk.org/jdk/pull/26955#pullrequestreview-3447447381 From stefank at openjdk.org Tue Nov 11 11:17:29 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 11 Nov 2025 11:17:29 GMT Subject: RFR: 8367013: Add Atomic to package/replace idiom of volatile var plus AtomicAccess:: operations [v6] In-Reply-To: <-MCsPbrh9d8AM2XMQnHNHoBdkCldjZcOPi0zdb-SIM8=.3e0fa6ff-29eb-4912-8039-faab6692ff0e@github.com> References: <-MCsPbrh9d8AM2XMQnHNHoBdkCldjZcOPi0zdb-SIM8=.3e0fa6ff-29eb-4912-8039-faab6692ff0e@github.com> Message-ID: On Tue, 11 Nov 2025 06:26:01 GMT, Kim Barrett wrote: >> Please review this change that adds the type Atomic, to use as the type >> of a variable that is accessed (including writes) concurrently by multiple >> threads. This is intended to replace (most) uses of the current HotSpot idiom >> of declaring a variable volatile and accessing that variable using functions >> from the AtomicAccess class. >> https://github.com/openjdk/jdk/blame/528f93f8cb9f1fb9c19f31ab80c8a546f47beed2/doc/hotspot-style.md#L138-L147 >> >> This change replaces https://github.com/openjdk/jdk/pull/27462. Differences are >> >> * Substantially restructured `Atomic`, to be IDE friendly. It's >> operationally the same, with the same API, hence uses and gtests didn't need >> to change in that respect. Thanks to @stefank for raising this issue, and for >> some suggestions toward improvements. >> >> * Changed how fetch_then_set for atomic translated types is handled, to avoid >> having the function there at all if it isn't usable, rather than just removing >> it via SFINAE, leaving an empty overload set. >> >> * Added more gtests. >> >> Testing: mach5 tier1-6, GHA sanity tests > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 20 additional commits since the last revision: > > - Merge branch 'master' into atomic-template-tag-select > - remove AtomicNextAccess and uses > - use type_traits wrapper in new code > - Merge branch 'master' into atomic-template-tag-select > - more naming updates > - rename relaxed_store => store_relaxed > - default construct translated atomic without SFINAE > - Merge branch 'master' into atomic-template-tag-select > - Merge branch 'master' into atomic-template-tag-select > - add reference to gcc bug we're working around > - ... and 10 more: https://git.openjdk.org/jdk/compare/8e31dc62...da58d0d2 I think this looks very nice and readable. I'm approving this. I'm also leaving a few comments below on things that I reacted to and that you might want to address. src/hotspot/share/runtime/atomic.hpp line 36: > 34: // Atomic is used to declare a variable of type T with atomic access. > 35: // > 36: // The following value types T are supported: It would be nice to explain how enums fit into all this. In offline discussions that question was raised. It would be nice to get that clarified in this comment. src/hotspot/share/runtime/atomic.hpp line 105: > 103: // element arithmetic. > 104: // > 105: // (4) An atomic translated type additionally provides the exchange It is a little odd that `exchange` and `compare_exchange` behave differently here. Is that only because of how `AtomicAccess` is currently implemented? src/hotspot/share/runtime/atomic.hpp line 282: > 280: template > 281: class AtomicImpl::SupportsExchange : public CommonCore { > 282: using Base = CommonCore; I don't see how this `using Base` aids in the readability. It is only used in the constructor, and there it just becomes an extra level of indirection. The same comment goes for the other `using Base` instances. 
src/hotspot/share/runtime/atomic.hpp line 300: > 298: > 299: // Guarding the AtomicAccess calls with constexpr checking of I produces > 300: // better compile-time error messages. It is unclear what `I` is meant to be a short name for. I don't think it is `integer` because we are also dealing with pointers. I don't think it can be `increment` because some functions pass the decrement amount. Maybe `V` (for value) could be more suitable here? But that could conflict with the contained _value in Atomic. In `check_i` you have this code and comment: } else if constexpr (std::is_signed_v) { static_assert(std::is_signed_v, "value is signed but offset is unsigned"); So maybe it could be `O` for offset? If `O` looks too much like `0` then maybe spell it out as `Offset`? Is there a collective name for "add value" and "sub value"? src/hotspot/share/runtime/atomic.hpp line 442: > 440: public: > 441: static constexpr bool value = std::is_pointer_v; > 442: }; This took a little bit of time to understand. I tried to figure out why it returns `char*` or `char`, but then I saw that it is only used in the `is_pointer_v` test. Could this have been using `void*` and `void` instead, or does this need to use a type that can be instantiated? ------------- Marked as reviewed by stefank (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/27539#pullrequestreview-3447468889 PR Review Comment: https://git.openjdk.org/jdk/pull/27539#discussion_r2513766104 PR Review Comment: https://git.openjdk.org/jdk/pull/27539#discussion_r2513774450 PR Review Comment: https://git.openjdk.org/jdk/pull/27539#discussion_r2513782743 PR Review Comment: https://git.openjdk.org/jdk/pull/27539#discussion_r2513808761 PR Review Comment: https://git.openjdk.org/jdk/pull/27539#discussion_r2513819494 From jsjolen at openjdk.org Tue Nov 11 11:37:19 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 11 Nov 2025 11:37:19 GMT Subject: RFR: 8367013: Add Atomic to package/replace idiom of volatile var plus AtomicAccess:: operations [v6] In-Reply-To: <-MCsPbrh9d8AM2XMQnHNHoBdkCldjZcOPi0zdb-SIM8=.3e0fa6ff-29eb-4912-8039-faab6692ff0e@github.com> References: <-MCsPbrh9d8AM2XMQnHNHoBdkCldjZcOPi0zdb-SIM8=.3e0fa6ff-29eb-4912-8039-faab6692ff0e@github.com> Message-ID: <3RMTU6vni7DQ91_94sZK3zlcN693NdxsF2ViuMcw478=.c90f9d4d-51db-40fa-a46f-0e58e813a6d0@github.com> On Tue, 11 Nov 2025 06:26:01 GMT, Kim Barrett wrote: >> Please review this change that adds the type Atomic, to use as the type >> of a variable that is accessed (including writes) concurrently by multiple >> threads. This is intended to replace (most) uses of the current HotSpot idiom >> of declaring a variable volatile and accessing that variable using functions >> from the AtomicAccess class. >> https://github.com/openjdk/jdk/blame/528f93f8cb9f1fb9c19f31ab80c8a546f47beed2/doc/hotspot-style.md#L138-L147 >> >> This change replaces https://github.com/openjdk/jdk/pull/27462. Differences are >> >> * Substantially restructured `Atomic`, to be IDE friendly. It's >> operationally the same, with the same API, hence uses and gtests didn't need >> to change in that respect. Thanks to @stefank for raising this issue, and for >> some suggestions toward improvements.
>> >> * Changed how fetch_then_set for atomic translated types is handled, to avoid >> having the function there at all if it isn't usable, rather than just removing >> it via SFINAE, leaving an empty overload set. >> >> * Added more gtests. >> >> Testing: mach5 tier1-6, GHA sanity tests > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 20 additional commits since the last revision: > > - Merge branch 'master' into atomic-template-tag-select > - remove AtomicNextAccess and uses > - use type_traits wrapper in new code > - Merge branch 'master' into atomic-template-tag-select > - more naming updates > - rename relaxed_store => store_relaxed > - default construct translated atomic without SFINAE > - Merge branch 'master' into atomic-template-tag-select > - Merge branch 'master' into atomic-template-tag-select > - add reference to gcc bug we're working around > - ... and 10 more: https://git.openjdk.org/jdk/compare/99f95a49...da58d0d2 I found a couple of bugs where we forgot to pass along the memory order. Other than that, this seems good! Great job! src/hotspot/share/runtime/atomic.hpp line 276: > 274: T compare_exchange(T compare_value, T new_value, > 275: atomic_memory_order order = memory_order_conservative) { > 276: return AtomicAccess::cmpxchg(value_ptr(), compare_value, new_value); Bug: This isn't providing the `order` to `cmpxchg` src/hotspot/share/runtime/atomic.hpp line 291: > 289: T exchange(T new_value, > 290: atomic_memory_order order = memory_order_conservative) { > 291: return AtomicAccess::xchg(this->value_ptr(), new_value); Bug: This isn't providing the `order` to `xchg` ------------- Marked as reviewed by jsjolen (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/27539#pullrequestreview-3447626292 PR Review Comment: https://git.openjdk.org/jdk/pull/27539#discussion_r2513896902 PR Review Comment: https://git.openjdk.org/jdk/pull/27539#discussion_r2513898530 From jsjolen at openjdk.org Tue Nov 11 12:06:27 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 11 Nov 2025 12:06:27 GMT Subject: RFR: 8371093: Assert "section header string table should be loaded" failed on debug VM In-Reply-To: References: Message-ID: On Sun, 2 Nov 2025 06:27:50 GMT, Yasumasa Suenaga wrote: > When the crash happens in the function in vDSO on Linux, native call stacks in hs_err log wouldn't be generated as following. See [hs_err log on JBS](https://bugs.openjdk.org/secure/attachment/116796/hs_err_pid4018.log) for details. Reproducer is also attached on JBS ([Test.java](https://bugs.openjdk.org/secure/attachment/116797/Test.java)) > > > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [linux-vdso.so.1+0xe69] > [error occurred during error reporting (printing native stack (with source info)), id 0xe0000000, Internal Error (/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536)] > > Retrying call stack printing without source information... > > [error occurred during error reporting (retry printing native stack (no source info)), id 0xb, SIGSEGV (0xb) at pc=0x00007fba8075f791] > > > When I checked back trace on GDB, it failed at `assert`. 
> > #12 0x00007fba7e76bd00 in report_vm_error (file=file at entry=0x7fba7fed7b40 "/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp", > line=line at entry=536, error_msg=error_msg at entry=0x7fba80019575 "assert(false) failed", > detail_fmt=detail_fmt at entry=0x7fba7fed7bf0 "section header string table should be loaded") > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/debug.cpp:196 > #13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...) > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536 > #14 ElfFile::read_debug_info (this=this at entry=0x7fba782a1650, debug_info=debug_info at entry=0x7fba7dd05150) > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:407 > > > > (gdb) f 13 > #13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...) > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536 > 536 assert(false, "section header string table should be loaded"); > > > vDSO is not a regular ELF, so it should be skipped here. Hi, The actual issue here is that the vDSO file doesn't exist, so can't be opened, right? Then, we should instead check in `get_elf_file` whether the creation of the `ElfFile` succeeded instead. This should be done anyway, the `vDSO` issue is just a symptom of another bug. I think that this new definition should be used instead, in `decoder_elf.cpp:104`. 
```c++
ElfFile* ElfDecoder::get_elf_file(const char* filepath) {
  ElfFile* file;

  // Reuse an already-opened file if there is one.
  file = _opened_elf_files;
  while (file != nullptr) {
    if (file->same_elf_file(filepath)) {
      return file;
    }
    file = file->next();
  }

  // Otherwise open it; allocation failure or a load error means no usable file.
  file = new (std::nothrow) ElfFile(filepath);
  if (file == nullptr) {
    return nullptr;
  } else if (file->get_status() != NullDecoder::no_error) {
    return nullptr;
  }

  // Prepend to the list of opened files.
  if (_opened_elf_files != nullptr) {
    file->set_next(_opened_elf_files);
  }
  _opened_elf_files = file;
  return file;
}
```

What do you think? ------------- PR Review: https://git.openjdk.org/jdk/pull/28102#pullrequestreview-3447747095 From epeter at openjdk.org Tue Nov 11 12:11:06 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 11 Nov 2025 12:11:06 GMT Subject: RFR: 8370691: Add new HalffloatVector type and enable intrinsification of float16 vector operations In-Reply-To: References: Message-ID: <_ryF0SNpSLahH4HkGqSnGKc_6d9P1fWrKYTS0jRPvtk=.ff2143aa-d3a5-4776-bdd0-95646dfd35e9@github.com> On Mon, 27 Oct 2025 15:19:48 GMT, Jatin Bhateja wrote: > Add new HalffloatVector type and corresponding concrete vector classes in addition to existing primitive vector types, maintaining operation parity with the FloatVector type. > - Add necessary inline expander support. > - Enable intrinsification for a few vector operations, namely ADD/SUB/MUL/DIV/MAX/MIN/FMA. > - Use existing Float16 vector IR and backend support. > - Extended the existing VectorAPI JTREG test suite for the newly added HalffloatVector operations. > > The idea here is to first be at par with Float16 auto-vectorization support before intrinsifying new operations (conversions, reduction, etc). > > The following are the performance numbers for some of the selected HalffloatVector benchmarking kernels compared to equivalent Float16OperationsBenchmark kernels. > > {A2BA2D85-085A-489F-8DDD-0FCFB5986EA5} > > Initial RFP[1] was floated on the panama-dev mailing list. > > Kindly review the draft PR and share your feedback. 
> > Best Regards, > Jatin > > [1] https://mail.openjdk.org/pipermail/panama-dev/2025-August/021100.html We already have a lot of things in the codebase now from previous issues that use `HF` everywhere, for example some node names, and the type. Should we maybe rename all of them to `F16`, or something else? Open question, not sure of the answer yet. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28002#issuecomment-3516579087 From egahlin at openjdk.org Tue Nov 11 12:40:21 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Tue, 11 Nov 2025 12:40:21 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v7] In-Reply-To: References: Message-ID: On Tue, 7 Oct 2025 07:59:25 GMT, Alan Bateman wrote: > The APIs are in Field and Lookup so having the API method as the top frame is useful. It would be possible to reduce the filter to `{ "java.lang.reflect.ReflectAccess", "java.lang.invoke.MethodHandles$Lookup::unreflectField" }` with determineStackTraceOffset returning 6 but it's too fiddly and requires knowing about two "faraway places" when doing any refactoring. Mutating final fields is the slow path so performance is not a concern. So I think the trade-off to keep it as maintainable as possible is okay. The test checks the top frame and also scans the StackFilter to ensure the class is visible and that any filter value with a method name at least names a method that is declared in the class. We shouldn't push it as high as 6, that's fragile, but the offer(...) method could be skipped immediately if the offset is bumped. Class filters avoid specifying individual methods, which are more likely to be refactored. I can see the argument for not having the user's method as the top frame. A user may get a quick hint (instead of looking at the line number) if they see something like setInt(...), but this doesn't work as well with tooling when you want to group stack traces by top frame, for example in a tree view. 
You typically want to see the application frame and then expand the nodes. If setInt, setFloat, setLong, etc. appear as the top nodes, users have to click and expand every setter, instead of seeing an aggregated list directly of packages, classes, or methods where finals are modified. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25115#issuecomment-3516700622 From jwaters at openjdk.org Tue Nov 11 13:01:23 2025 From: jwaters at openjdk.org (Julian Waters) Date: Tue, 11 Nov 2025 13:01:23 GMT Subject: RFR: 8342769: HotSpot Windows/gcc port is broken [v17] In-Reply-To: References: Message-ID: > Several areas in HotSpot are broken in the gcc port. These, with the exception of 1 rather big oversight within SharedRuntime::frem and SharedRuntime::drem, are all minor correctness issues within the code. These mostly can be fixed with simple changes to the code. Note that I am not sure whether the SharedRuntime::frem and SharedRuntime::drem fix is correct. It may be that they can be removed entirely Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 30 commits: - Merge branch 'master' into hotspot - CAST_FROM_FN_PTR in os_windows.cpp - Merge branch 'master' into hotspot - Merge branch 'openjdk:master' into hotspot - _WINDOWS && AARCH64 in sharedRuntime.hpp - AARCH64 in sharedRuntimeRem.cpp - Refactor sharedRuntime.cpp - CAST_FROM_FN_PTR in os_windows.cpp - Merge branch 'openjdk:master' into hotspot - fmod_winarm64 in sharedRuntime.cpp - ... 
and 20 more: https://git.openjdk.org/jdk/compare/29100320...b93febb3 ------------- Changes: https://git.openjdk.org/jdk/pull/21627/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=21627&range=16 Stats: 54 lines in 7 files changed: 23 ins; 7 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/21627.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/21627/head:pull/21627 PR: https://git.openjdk.org/jdk/pull/21627 From cnorrbin at openjdk.org Tue Nov 11 13:24:03 2025 From: cnorrbin at openjdk.org (Casper Norrbin) Date: Tue, 11 Nov 2025 13:24:03 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v4] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: On Mon, 10 Nov 2025 17:22:34 GMT, Severin Gehwolf wrote: >> Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. >> >> It gets rid of the trifold-use of return value: 1.) error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that make sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. >> >> All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). 
The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. >> >> All usages of the API have been changed to use the revised API. There is no more usages for `OSCONTAINER_ERROR` (`-2) in HotSpot code. >> >> While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. >> >> Testing (looking good): >> - [x] GHA >> - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. >> - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. >> >> Thoughts? Opinions? > > Severin Gehwolf has updated the pull request incrementally with one additional commit since the last revision: > > One more comment fix Looks good to me! ------------- Marked as reviewed by cnorrbin (Committer). PR Review: https://git.openjdk.org/jdk/pull/27743#pullrequestreview-3448050849 From rrich at openjdk.org Tue Nov 11 14:05:14 2025 From: rrich at openjdk.org (Richard Reingruber) Date: Tue, 11 Nov 2025 14:05:14 GMT Subject: RFR: 8370244: [PPC64] Several vector tests fail on Power8 In-Reply-To: References: Message-ID: <08l1ZXQVA0KbVqlT229SLr4S33gxynG8eEtR9gHORf8=.c1b4b47c-eb6a-47a7-9b7c-75d30c6cc01f@github.com> On Mon, 10 Nov 2025 11:01:33 GMT, Martin Doerr wrote: > This is a workaround for [JDK-8370803](https://bugs.openjdk.org/browse/JDK-8370803). Power8 uses `MaxVectorSize`=8 by default. All tests are passing with `EnableVectorSupport` disabled. Looks good to me. Cheers, Richard. ------------- Marked as reviewed by rrich (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28214#pullrequestreview-3448199490 From aph at openjdk.org Tue Nov 11 14:13:11 2025 From: aph at openjdk.org (Andrew Haley) Date: Tue, 11 Nov 2025 14:13:11 GMT Subject: RFR: 8371161: [AArch64] Enable supported CPU features for the Qualcomm processor family In-Reply-To: <9fGHvYHbn1M-_s63cxFAsz4EGPOnpEuFFKDbu7lwaNQ=.b882f8f4-17cd-415a-8fef-857860979c46@github.com> References: <9fGHvYHbn1M-_s63cxFAsz4EGPOnpEuFFKDbu7lwaNQ=.b882f8f4-17cd-415a-8fef-857860979c46@github.com> Message-ID: On Wed, 5 Nov 2025 21:56:50 GMT, Dhamoder Nalla wrote: > Enable UseSHA3Intrinsics on Qualcomm AArch64 CPUs and correctly identify Qualcomm processor model/variant on Windows ARM/aarch64 platforms. Previously the SHA-3 intrinsics were only enabled for Apple CPUs. This change allows Qualcomm systems to benefit from hardware-accelerated SHA-3 and improves CPU feature reporting by populating _cpu, _variant, and _revision for aarch64 platforms. > > Performance testing: > The JMH test case MessageDigests.java is used to evaluate the performance improvements enabled by UseSHA3Intrinsics on Qualcomm CPUs. >
> Benchmark | (digesterName) | (length) | (provider) | Mode | Cnt | Score before change | Error | Score after change | Error | Units | SHA3 Perf Improvement
> -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
> MessageDigests.digest | SHA3-256 | 64 | DEFAULT | thrpt | 15 | 4363.650 | ±682.413 | 5687.798 | ±855.826 | ops/ms | 30.34%
> MessageDigests.digest | SHA3-256 | 16384 | DEFAULT | thrpt | 15 | 45.794 | ±0.069 | 58.735 | ±0.077 | ops/ms | 28.26%
> MessageDigests.digest | SHA3-512 | 64 | DEFAULT | thrpt | 15 | 4008.741 | ±703.879 | 5145.512 | ±866.479 | ops/ms | 28.36%
> MessageDigests.digest | SHA3-512 | 16384 | DEFAULT | thrpt | 15 | 23.991 | ±0.032 | 30.294 | ±0.040 | ops/ms | 26.27%
> MessageDigests.getAndDigest | SHA3-256 | 64 | DEFAULT | thrpt | 15 | 1995.297 | ±396.007 | 2021.385 | ±486.581 | ops/ms | 1.31%
> MessageDigests.getAndDigest | SHA3-256 | 16384 | DEFAULT | thrpt | 15 | 45.994 | ±0.051 | 58.283 | ±0.095 | ops/ms | 26.72%
> MessageDigests.getAndDigest | SHA3-512 | 64 | DEFAULT | thrpt | 15 | 1889.550 | ±355.058 | 2173.164 | ±437.968 | ops/ms | 15.01%
> MessageDigests.getAndDigest | SHA3-512 | 16384 | DEFAULT | thrpt | 15 | 24.411 | ±0.143 | 30.187 | ±0.035 | ops/ms | 23.66%
Please describe the change in more detail. It doesn't appear to be Qualcomm-specific at all. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28166#issuecomment-3517106199 From mdoerr at openjdk.org Tue Nov 11 14:29:39 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 11 Nov 2025 14:29:39 GMT Subject: RFR: 8370244: [PPC64] Several vector tests fail on Power8 In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 11:01:33 GMT, Martin Doerr wrote: > This is a workaround for [JDK-8370803](https://bugs.openjdk.org/browse/JDK-8370803). Power8 uses `MaxVectorSize`=8 by default. All tests are passing with `EnableVectorSupport` disabled. Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/28214#issuecomment-3517171198 From mdoerr at openjdk.org Tue Nov 11 14:29:40 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 11 Nov 2025 14:29:40 GMT Subject: Integrated: 8370244: [PPC64] Several vector tests fail on Power8 In-Reply-To: References: Message-ID: <_1stEOYolnOPbZRYvcAgCNRzqQDS_JiL5q86D4cmSas=.be2b74c7-5343-4836-98f5-200a880addd8@github.com> On Mon, 10 Nov 2025 11:01:33 GMT, Martin Doerr wrote: > This is a workaround for [JDK-8370803](https://bugs.openjdk.org/browse/JDK-8370803). Power8 uses `MaxVectorSize`=8 by default. All tests are passing with `EnableVectorSupport` disabled. This pull request has now been integrated. 
Changeset: cbd77fc9 Author: Martin Doerr URL: https://git.openjdk.org/jdk/commit/cbd77fc9f3e6c8f1e996b30afe208c6a074cce3a Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod 8370244: [PPC64] Several vector tests fail on Power8 Reviewed-by: dbriemann, rrich ------------- PR: https://git.openjdk.org/jdk/pull/28214 From sgehwolf at openjdk.org Tue Nov 11 14:32:57 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Tue, 11 Nov 2025 14:32:57 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v5] In-Reply-To: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: > Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. > > It gets rid of the trifold-use of return value: 1.) error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that make sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. > > All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. 
> > All usages of the API have been changed to use the revised API. There is no more usages for `OSCONTAINER_ERROR` (`-2) in HotSpot code. > > While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. > > Testing (looking good): > - [x] GHA > - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. > - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. > > Thoughts? Opinions? Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 24 commits: - Add space in trace log - Merge branch 'master' into jdk-8365606-jlong-julong-refactor - One more comment fix - Extract OSContainer::available_swap_in_bytes() - Simplify os::used_memory() - Fix os::active_processor_count() - os::free_memory => use 'value' directly - os::available_memory() => use 'value' directly - Fix pids_max printing in VM.info - Better logging for -1 (cpu_shares) - ... 
and 14 more: https://git.openjdk.org/jdk/compare/29100320...0958b10f ------------- Changes: https://git.openjdk.org/jdk/pull/27743/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27743&range=04 Stats: 1308 lines in 16 files changed: 514 ins; 106 del; 688 mod Patch: https://git.openjdk.org/jdk/pull/27743.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27743/head:pull/27743 PR: https://git.openjdk.org/jdk/pull/27743 From alanb at openjdk.org Tue Nov 11 14:34:33 2025 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 11 Nov 2025 14:34:33 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v10] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 19:26:25 GMT, Chen Liang wrote: >> Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 45 commits: >> >> - Merge branch 'master' into JDK-8353835 >> - Fix typo in test comment >> - Merge branch 'master' into JDK-8353835 >> - Merge branch 'master' into JDK-8353835 >> - Suppress warnings from some tests >> - Change -Xcheck:jni to be warning rather than fatal error >> - Merge branch 'master' into JDK-8353835 >> - Simplify filter >> - Merge branch 'master' into JDK-8353835 >> - Update Xcheck:jni description >> - ... and 35 more: https://git.openjdk.org/jdk/compare/066810c8...6671ae02 > > test/jdk/java/lang/reflect/Field/mutateFinals/cli/CommandLineTest.java line 234: > >> 232: @Test >> 233: void testSetPropertyToAllow() throws Exception { >> 234: test("setSystemPropertyToAllow+testFieldSetInt") > > I thought this was setting the property before the VM boot. Can we have another test that does something like: > > test("testFieldSetInt", "-Djdk.module.illegal.final.field.mutation=allow") > > Which I think is closer to what @vy asks for. The test sets the internal property at runtime in the launched VM. You are right that another test could launch with the internal property set on the command line with -D. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2514460463 From sgehwolf at openjdk.org Tue Nov 11 14:43:10 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Tue, 11 Nov 2025 14:43:10 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v4] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: On Tue, 11 Nov 2025 13:21:38 GMT, Casper Norrbin wrote: > Looks good to me! Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27743#issuecomment-3517235877 From liach at openjdk.org Tue Nov 11 15:04:38 2025 From: liach at openjdk.org (Chen Liang) Date: Tue, 11 Nov 2025 15:04:38 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v10] In-Reply-To: <1aQu6ywsFGh3TMN3XjBevjqmTFP7CdQRrbzQtuN-wfI=.814c0236-caa8-469a-9a42-dfafb56ebf64@github.com> References: <1aQu6ywsFGh3TMN3XjBevjqmTFP7CdQRrbzQtuN-wfI=.814c0236-caa8-469a-9a42-dfafb56ebf64@github.com> Message-ID: <-nBSdu-3kaHjMyaZTD8epppqP4EjowO5kvo3eTakJdg=.bec77fef-edc2-48b1-b961-d646fc994810@github.com> On Tue, 11 Nov 2025 08:56:56 GMT, Alan Bateman wrote: > It's aligned with setAccessible. It's corner case of course but if a JNI attached thread invokes setAccessible with no java frames on the stack, then it is specified to only succeed if the API element is public and declared in a public class in an exported package. Consider setting the field `java.lang.constant.DirectMethodHandleDesc$Kind.refKind` (public final instance field in public class in exported, non-open package) in 3 ways: 1. `Field.setAccessible` + `set` in Java code: Now `set` fails without `--add-opens` (not open) 2. Performing the 2 Java calls in JNI: Completely permitted (exported) 3. 
jni_Set##Result##Field: Completely permitted, one warning message I find it a bit weird that 1 is inconsistent with 2, but given case 3, we have plenty of time to restrict 2 and 3 together in future releases. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25115#issuecomment-3517325736 From asmehra at openjdk.org Tue Nov 11 15:06:58 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Tue, 11 Nov 2025 15:06:58 GMT Subject: RFR: 8371493: Simplify search for AdapterHandlerEntry In-Reply-To: <-uS6VSlyW0d5pDYsspu2piw5ZycamOQ9rrsgWFhUzZ4=.37b4829c-3ad8-4627-918a-ae80266afa40@github.com> References: <-uS6VSlyW0d5pDYsspu2piw5ZycamOQ9rrsgWFhUzZ4=.37b4829c-3ad8-4627-918a-ae80266afa40@github.com> Message-ID: On Mon, 10 Nov 2025 22:20:37 GMT, Vladimir Kozlov wrote: >> `AdapterHandlerEntry` stores a direct pointer to `AdapterBlob`. Therefore, when looking for a `AdapterHandlerEntry` corresponding to a `CodeBlob`, we can use direct comparison instead of using `CodeCache::find_blob`. >> This patch also replaces the call to `AdapterHandlerLibrary::contains` in `CodeBlob::dump_for_addr` with a more trivial check `is_adapter_blob`. > > Good. @vnkozlov @adinn thanks for the reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28223#issuecomment-3517340317 From asmehra at openjdk.org Tue Nov 11 15:10:22 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Tue, 11 Nov 2025 15:10:22 GMT Subject: Integrated: 8371493: Simplify search for AdapterHandlerEntry In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 18:54:52 GMT, Ashutosh Mehra wrote: > `AdapterHandlerEntry` stores a direct pointer to `AdapterBlob`. Therefore, when looking for a `AdapterHandlerEntry` corresponding to a `CodeBlob`, we can use direct comparison instead of using `CodeCache::find_blob`. > This patch also replaces the call to `AdapterHandlerLibrary::contains` in `CodeBlob::dump_for_addr` with a more trivial check `is_adapter_blob`. This pull request has now been integrated. 
Changeset: bbeb6bf0 Author: Ashutosh Mehra URL: https://git.openjdk.org/jdk/commit/bbeb6bf0ac8952feaf8afc9c9b25a9a372c2c798 Stats: 35 lines in 3 files changed: 1 ins; 31 del; 3 mod 8371493: Simplify search for AdapterHandlerEntry Reviewed-by: kvn, adinn ------------- PR: https://git.openjdk.org/jdk/pull/28223 From ayang at openjdk.org Tue Nov 11 15:48:31 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 11 Nov 2025 15:48:31 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch Message-ID: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Trivial removing obsoleted code for unsupported arch. Test: tier1 ------------- Commit messages: - remove-tlab-reserve Changes: https://git.openjdk.org/jdk/pull/28240/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28240&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371643 Stats: 29 lines in 3 files changed: 0 ins; 28 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28240.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28240/head:pull/28240 PR: https://git.openjdk.org/jdk/pull/28240 From kvn at openjdk.org Tue Nov 11 16:20:20 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 11 Nov 2025 16:20:20 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch In-Reply-To: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: <8ikwqN309ZPAORjL2YvE1hgvChrTfhi3slz1r4XIK5E=.41d37c1b-ae3e-43f5-9fe0-43ae2294e57c@github.com> On Tue, 11 Nov 2025 15:42:21 GMT, Albert Mingkun Yang wrote: > Trivial removing obsoleted code for unsupported arch. > > Test: tier1 What happens if TLAB is at the end of heap page? Are you sure "prefetch" instructions on all OpenJDK platforms can touch unmapped memory? 
I see PPC can use `AllocatePrefetchStyle == 3`. Please ask all OpenJDK platforms supporters to test these changes. Note, when this code was introduced we did not have so many platforms. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28240#issuecomment-3517679023 PR Comment: https://git.openjdk.org/jdk/pull/28240#issuecomment-3517685947 From kvn at openjdk.org Tue Nov 11 16:25:16 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 11 Nov 2025 16:25:16 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch In-Reply-To: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: On Tue, 11 Nov 2025 15:42:21 GMT, Albert Mingkun Yang wrote: > Trivial removing obsoleted code for unsupported arch. > > Test: tier1 You missed code in HS agent: `src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/VM.java` ------------- PR Review: https://git.openjdk.org/jdk/pull/28240#pullrequestreview-3448887549 From psandoz at openjdk.org Tue Nov 11 16:34:04 2025 From: psandoz at openjdk.org (Paul Sandoz) Date: Tue, 11 Nov 2025 16:34:04 GMT Subject: RFR: 8370691: Add new HalffloatVector type and enable intrinsification of float16 vector operations In-Reply-To: <_ryF0SNpSLahH4HkGqSnGKc_6d9P1fWrKYTS0jRPvtk=.ff2143aa-d3a5-4776-bdd0-95646dfd35e9@github.com> References: <_ryF0SNpSLahH4HkGqSnGKc_6d9P1fWrKYTS0jRPvtk=.ff2143aa-d3a5-4776-bdd0-95646dfd35e9@github.com> Message-ID: On Tue, 11 Nov 2025 12:08:42 GMT, Emanuel Peter wrote: > We already have a lot of things in the codebase now from previous issues that use `HF` everywhere, for example some node names, and the type. Should we maybe rename all of them to `F16`, or something else? Open question, not sure of the answer yet. I was only referring to the Java code, esp. 
the new public classes so they align with the `Float16` element type. I do think it worthwhile to align so we are consistent across the platform. Revisiting the names in HotSpot, and their internal connection in Java, could be done in a separate PR? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28002#issuecomment-3517758143 From epeter at openjdk.org Tue Nov 11 16:34:05 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 11 Nov 2025 16:34:05 GMT Subject: RFR: 8370691: Add new HalffloatVector type and enable intrinsification of float16 vector operations In-Reply-To: References: <_ryF0SNpSLahH4HkGqSnGKc_6d9P1fWrKYTS0jRPvtk=.ff2143aa-d3a5-4776-bdd0-95646dfd35e9@github.com> Message-ID: On Tue, 11 Nov 2025 16:28:54 GMT, Paul Sandoz wrote: > Revisiting the names in HotSpot, and their internal connection in Java, could be done in a separate PR? Yes, exactly. Maybe even in a quick renaming PR before this issue. Would be quickly reviewed, and would allow us to see complete consistency going forward with this PR here. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28002#issuecomment-3517766354 From vklang at openjdk.org Tue Nov 11 17:33:29 2025 From: vklang at openjdk.org (Viktor Klang) Date: Tue, 11 Nov 2025 17:33:29 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v10] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 10:06:31 GMT, Alan Bateman wrote: >> Implementation changes for [JEP 500: Prepare to Make Final Mean Final](https://openjdk.org/jeps/500). >> >> Field.set (and Lookup.unreflectSetter) are changed to allow/warn/debug/deny when mutating a final instance field. JFR event recorded if final field mutated. Spec updates to Field.set, Field.setAccessible and Module.addOpens to align with the proposal in the JEP. >> >> HotSpot is updated to add support for the new command line options. 
To aid diagnosability, -Xcheck:jni reports a warning and -Xlog:jni=debug logs a message to help identity JNI code that mutates finals. For now, JNI code is allowed to set the "write-protected" fields System.in/out/err without a warning, we can re-visit once we change the System.setIn/setOut/setErr methods to not use JNI (I prefer to keep this separate to this PR because there is a small startup regression to address when changing System.setXXX). >> >> There are many new tests. A small number of existing tests are changed to run /othervm as reflectively opening a package isn't sufficient. Changing the tests to /othervm means that jtreg will launch the agent with the command line options to open the package. >> >> Testing: tier1-6 > > Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 45 commits: > > - Merge branch 'master' into JDK-8353835 > - Fix typo in test comment > - Merge branch 'master' into JDK-8353835 > - Merge branch 'master' into JDK-8353835 > - Suppress warnings from some tests > - Change -Xcheck:jni to be warning rather than fatal error > - Merge branch 'master' into JDK-8353835 > - Simplify filter > - Merge branch 'master' into JDK-8353835 > - Update Xcheck:jni description > - ... and 35 more: https://git.openjdk.org/jdk/compare/066810c8...6671ae02 src/java.base/share/classes/java/lang/reflect/doc-files/MutationMethods.html line 79: > 77: > 78: > 79: Many-body problem? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2515065507 From duke at openjdk.org Tue Nov 11 17:38:35 2025 From: duke at openjdk.org (Chad Rakoczy) Date: Tue, 11 Nov 2025 17:38:35 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC Message-ID:

[JDK-8371046](https://bugs.openjdk.org/browse/JDK-8371046)

This pull request fixes two crashes (see below) and adds `InvalidationReason::RELOCATED` to better describe why an nmethod is marked not entrant during relocation.

---

#### 1. Test Bug

It's possible for an `nmethod` to be unloaded without its `_state` being explicitly set to `not_entrant`. Checking only `is_in_use()` isn't sufficient, since the `nmethod` may already be in the process of unloading and therefore may not have a lock (as with ZGC, where `nmethods` are locked individually). The fix adds an additional `is_unloading()` check in WhiteBox before acquiring the lock.

This issue was reproducible fairly consistently (every few runs) by executing `compiler/whitebox/StressNMethodRelocation.java` with `-XX:+UseZGC -XX:ReservedCodeCacheSize=32m`.

After applying this patch, the original crash stopped occurring, though a more infrequent crash was still observed.

---

#### 2. Implementation Bug

`nmethod::relocate` works by copying the instructions of an `nmethod` and then adjusting the call sites to account for new PC-relative offsets. Previously, this fix-up happened *after* calling `post_init()`, which registers the `nmethod` and makes it visible to the GC. This introduced a race condition where the GC might attempt to resolve a call site before it had been fixed. The fix ensures that all call sites are patched **before** the `nmethod` is registered.

In testing, the crash previously occurred roughly 60 times in 5,000 runs (~1.2%). With this patch, no crashes were observed in the same number of runs.
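The ordering problem described under "Implementation Bug" can be sketched outside HotSpot. The following is only an illustration under stated assumptions, not the actual `nmethod::relocate` code: `CodeBlob`, `Registry`, `fix_call_site`, `relocate_publish_first` and `relocate_patch_first` are hypothetical names, and the "GC" is modelled as a callback that inspects a blob at the moment it is registered.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical stand-in for relocated code with a single PC-relative call.
struct CodeBlob {
  std::intptr_t base;         // address the code lives at
  std::intptr_t call_offset;  // displacement stored in the call instruction
  std::intptr_t call_target() const { return base + call_offset; }
};

// Hypothetical registry. The observer stands in for a concurrent GC that may
// look at a blob as soon as it has been registered.
struct Registry {
  std::function<void(const CodeBlob&)> observer;
  void register_blob(const CodeBlob& blob) {
    if (observer) observer(blob);  // worst case: inspected right at publication
  }
};

// Recompute the displacement so the copy still calls the original target.
inline void fix_call_site(CodeBlob& copy, const CodeBlob& original) {
  copy.call_offset = original.call_target() - copy.base;
  assert(copy.call_target() == original.call_target());
}

// Racy order: the copy becomes visible before its call site is patched.
inline CodeBlob relocate_publish_first(const CodeBlob& original,
                                       std::intptr_t new_base,
                                       Registry& registry) {
  CodeBlob copy{new_base, original.call_offset};
  registry.register_blob(copy);   // observer can see a stale displacement
  fix_call_site(copy, original);
  return copy;
}

// Safe order: patch first, publish last.
inline CodeBlob relocate_patch_first(const CodeBlob& original,
                                     std::intptr_t new_base,
                                     Registry& registry) {
  CodeBlob copy{new_base, original.call_offset};
  fix_call_site(copy, original);
  registry.register_blob(copy);   // observer only ever sees patched code
  return copy;
}
```

In this model, an observer attached to the registry sees a call target that points into the wrong place when `relocate_publish_first` is used, and the correct target with `relocate_patch_first`. The real race is timing dependent, which is presumably why it showed up in only about 1.2% of runs.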
------------- Commit messages: - Clear inline caches before calling post_init - Fix relocations before registering nmethod - Add is_unloading() check before aquiring ic lock Changes: https://git.openjdk.org/jdk/pull/28241/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28241&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371046 Stats: 53 lines in 6 files changed: 28 ins; 21 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/28241.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28241/head:pull/28241 PR: https://git.openjdk.org/jdk/pull/28241 From alanb at openjdk.org Tue Nov 11 17:48:43 2025 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 11 Nov 2025 17:48:43 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v11] In-Reply-To: References: Message-ID: > Implementation changes for [JEP 500: Prepare to Make Final Mean Final](https://openjdk.org/jeps/500). > > Field.set (and Lookup.unreflectSetter) are changed to allow/warn/debug/deny when mutating a final instance field. JFR event recorded if final field mutated. Spec updates to Field.set, Field.setAccessible and Module.addOpens to align with the proposal in the JEP. > > HotSpot is updated to add support for the new command line options. To aid diagnosability, -Xcheck:jni reports a warning and -Xlog:jni=debug logs a message to help identity JNI code that mutates finals. For now, JNI code is allowed to set the "write-protected" fields System.in/out/err without a warning, we can re-visit once we change the System.setIn/setOut/setErr methods to not use JNI (I prefer to keep this separate to this PR because there is a small startup regression to address when changing System.setXXX). > > There are many new tests. A small number of existing tests are changed to run /othervm as reflectively opening a package isn't sufficient. Changing the tests to /othervm means that jtreg will launch the agent with the command line options to open the package. 
> > Testing: tier1-6 Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 50 commits: - Remove dup end body tag - Change FinalFieldMutationEvent so that caller is top frame in stack trace - Merge branch 'master' into JDK-8353835 - Review feedback: Add tests for setting internal properties, improve links in Mutation methods page - Merge branch 'master' into JDK-8353835 - Merge branch 'master' into JDK-8353835 - Fix typo in test comment - Merge branch 'master' into JDK-8353835 - Merge branch 'master' into JDK-8353835 - Suppress warnings from some tests - ... and 40 more: https://git.openjdk.org/jdk/compare/2902436f...b22947c7 ------------- Changes: https://git.openjdk.org/jdk/pull/25115/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25115&range=10 Stats: 4882 lines in 71 files changed: 4697 ins; 54 del; 131 mod Patch: https://git.openjdk.org/jdk/pull/25115.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25115/head:pull/25115 PR: https://git.openjdk.org/jdk/pull/25115 From alanb at openjdk.org Tue Nov 11 17:48:44 2025 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 11 Nov 2025 17:48:44 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v7] In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 12:37:59 GMT, Erik Gahlin wrote: > I can see the argument for not having the user's method as the top frame. A user may get a quick hint (instead of looking at the line number) if they see something like setInt(...), but this doesn't work as well with tooling when you want to group stack traces by top frame, for example in a tree view. You typically want to see the application frame and then expand the nodes. If setInt, setFloat, setLong, etc. appear as the top nodes, users have to click and expand every setter, instead of seeing an aggregated list directly of packages, classes, or methods where finals are modified.
@egahlin and I discussed this and agreed to have the top-frame of the stack trace recorded with the event be the caller's method. This allows the stack filter include j.l.r.Field with listing method names. We might revisit this later to add further fields to the event to indicate an unreflect op and/or the field type. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25115#issuecomment-3518032863 From alanb at openjdk.org Tue Nov 11 17:48:45 2025 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 11 Nov 2025 17:48:45 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v10] In-Reply-To: <-nBSdu-3kaHjMyaZTD8epppqP4EjowO5kvo3eTakJdg=.bec77fef-edc2-48b1-b961-d646fc994810@github.com> References: <1aQu6ywsFGh3TMN3XjBevjqmTFP7CdQRrbzQtuN-wfI=.814c0236-caa8-469a-9a42-dfafb56ebf64@github.com> <-nBSdu-3kaHjMyaZTD8epppqP4EjowO5kvo3eTakJdg=.bec77fef-edc2-48b1-b961-d646fc994810@github.com> Message-ID: On Tue, 11 Nov 2025 15:01:14 GMT, Chen Liang wrote: > we have plenty of time to restrict 2 and 3 together in future releases. There isn't any proposal to change JNI. It has never done any access checking. The only change is to -Xcheck:jni warning and logging to catch JNI code that is mutating finals. Once we dial up to have mutating finals be denied by default then we might dial up -Xcheck:jni at the same time to make it fatal. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25115#issuecomment-3518048959 From alanb at openjdk.org Tue Nov 11 17:48:49 2025 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 11 Nov 2025 17:48:49 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v10] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 20:12:59 GMT, Chen Liang wrote: >> Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 45 commits: >> >> - Merge branch 'master' into JDK-8353835 >> - Fix typo in test comment >> - Merge branch 'master' into JDK-8353835 >> - Merge branch 'master' into JDK-8353835 >> - Suppress warnings from some tests >> - Change -Xcheck:jni to be warning rather than fatal error >> - Merge branch 'master' into JDK-8353835 >> - Simplify filter >> - Merge branch 'master' into JDK-8353835 >> - Update Xcheck:jni description >> - ... and 35 more: https://git.openjdk.org/jdk/compare/066810c8...6671ae02 > > src/java.base/share/classes/java/lang/reflect/doc-files/MutationMethods.html line 72: > >> 70: illegal. >> 71: >> 72: The command line option --illegal-final-field-mutation controls how illegal > > Missing `

    `? Just a blank line, it's all the one paragraph. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2515095873 From ayang at openjdk.org Tue Nov 11 17:50:43 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 11 Nov 2025 17:50:43 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch [v2] In-Reply-To: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: > Trivial removing obsoleted code for unsupported arch. > > Test: tier1 Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28240/files - new: https://git.openjdk.org/jdk/pull/28240/files/03330ac2..0e447848 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28240&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28240&range=00-01 Stats: 9 lines in 2 files changed: 0 ins; 8 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28240.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28240/head:pull/28240 PR: https://git.openjdk.org/jdk/pull/28240 From ayang at openjdk.org Tue Nov 11 17:50:44 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 11 Nov 2025 17:50:44 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch In-Reply-To: <8ikwqN309ZPAORjL2YvE1hgvChrTfhi3slz1r4XIK5E=.41d37c1b-ae3e-43f5-9fe0-43ae2294e57c@github.com> References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> <8ikwqN309ZPAORjL2YvE1hgvChrTfhi3slz1r4XIK5E=.41d37c1b-ae3e-43f5-9fe0-43ae2294e57c@github.com> Message-ID: On Tue, 11 Nov 2025 16:16:25 GMT, Vladimir Kozlov wrote: > What happens if TLAB is at the end of heap page? 
Are you sure "prefetch" instructions on all OpenJDK platforms can touch unmapped memory? I see PPC can use AllocatePrefetchStyle == 3.

In every `.ad` file, there is `// Must be safe to execute with invalid address (cannot fault)` for prefetching. From what I found online, none of the currently supported platforms fault when prefetching an invalid address.

------------- PR Comment: https://git.openjdk.org/jdk/pull/28240#issuecomment-3518048288 From dhanalla at openjdk.org Tue Nov 11 18:11:20 2025 From: dhanalla at openjdk.org (Dhamoder Nalla) Date: Tue, 11 Nov 2025 18:11:20 GMT Subject: RFR: 8371161: [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family [v2] In-Reply-To: <9fGHvYHbn1M-_s63cxFAsz4EGPOnpEuFFKDbu7lwaNQ=.b882f8f4-17cd-415a-8fef-857860979c46@github.com> References: <9fGHvYHbn1M-_s63cxFAsz4EGPOnpEuFFKDbu7lwaNQ=.b882f8f4-17cd-415a-8fef-857860979c46@github.com> Message-ID:

> This PR makes two targeted AArch64 updates specific to Qualcomm silicon:
>
> 1. Corrects the CPU family enum name typo from CPU_QUALCOM to CPU_QUALCOMM.
> 2. Enables UseSHA3Intrinsics for Qualcomm (CPU_QUALCOMM) in addition to Apple (CPU_APPLE), allowing Qualcomm-based systems to use hardware-optimized SHA-3 implementations.
>
> Performance testing:
> The JMH test case MessageDigests.java is used to evaluate the performance improvements enabled by UseSHA3Intrinsics on Qualcomm CPUs.
>

> Benchmark | (digesterName) | (length) | (provider) | Mode | Cnt | Score before change | Error | Score after change | Error | Units | SHA3 perf improvement
> -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
> MessageDigests.digest | SHA3-256 | 64 | DEFAULT | thrpt | 15 | 4363.650 | ±682.413 | 5687.798 | ±855.826 | ops/ms | 30.34%
> MessageDigests.digest | SHA3-256 | 16384 | DEFAULT | thrpt | 15 | 45.794 | ±0.069 | 58.735 | ±0.077 | ops/ms | 28.26%
> MessageDigests.digest | SHA3-512 | 64 | DEFAULT | thrpt | 15 | 4008.741 | ±703.879 | 5145.512 | ±866.479 | ops/ms | 28.36%
> MessageDigests.digest | SHA3-512 | 16384 | DEFAULT | thrpt | 15 | 23.991 | ±0.032 | 30.294 | ±0.040 | ops/ms | 26.27%
> MessageDigests.getAndDigest | SHA3-256 | 64 | DEFAULT | thrpt | 15 | 1995.297 | ±396.007 | 2021.385 | ±486.581 | ops/ms | 1.31%
> MessageDigests.getAndDigest | SHA3-256 | 16384 | DEFAULT | thrpt | 15 | 45.994 | ±0.051 | 58.283 | ±0.095 | ops/ms | 26.72%
> MessageDigests.getAndDigest | SHA3-512 | 64 | DEFAULT | thrpt | 15 | 1889.550 | ±355.058 | 2173.164 | ±437.968 | ops/ms | 15.01%
> MessageDigests.getAndDigest | SHA3-512 | 16384 | DEFAULT | thrpt | 15 | 24.411 | ±0.143 | 30.187 | ±0.035 | ops/ms | 23.66%
>
    > > > > > Dhamoder Nalla has updated the pull request incrementally with two additional commits since the last revision: - [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family - [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28166/files - new: https://git.openjdk.org/jdk/pull/28166/files/78c8d329..076f1d60 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28166&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28166&range=00-01 Stats: 9 lines in 3 files changed: 0 ins; 1 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/28166.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28166/head:pull/28166 PR: https://git.openjdk.org/jdk/pull/28166 From dhanalla at openjdk.org Tue Nov 11 18:11:21 2025 From: dhanalla at openjdk.org (Dhamoder Nalla) Date: Tue, 11 Nov 2025 18:11:21 GMT Subject: RFR: 8371161: [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family In-Reply-To: References: <9fGHvYHbn1M-_s63cxFAsz4EGPOnpEuFFKDbu7lwaNQ=.b882f8f4-17cd-415a-8fef-857860979c46@github.com> Message-ID: On Tue, 11 Nov 2025 14:10:21 GMT, Andrew Haley wrote: > Please describe the change in more detail. It doesn't appear to be Qualcomm-specific at all. Thanks @theRealAph for reviewing this PR. I've updated it with Qualcomm-specific changes and will create a separate PR for the other changes. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28166#issuecomment-3518163303 From kbarrett at openjdk.org Tue Nov 11 20:36:38 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 11 Nov 2025 20:36:38 GMT Subject: RFR: 8367013: Add Atomic to package/replace idiom of volatile var plus AtomicAccess:: operations [v7] In-Reply-To: References: Message-ID: <2loBZkzSWVjbUw6sUMMj86o-L6cSlEqO_bJI1qCOPxM=.505020dc-34f2-4586-8563-2b7330e6766a@github.com> > Please review this change that adds the type Atomic, to use as the type > of a variable that is accessed (including writes) concurrently by multiple > threads. This is intended to replace (most) uses of the current HotSpot idiom > of declaring a variable volatile and accessing that variable using functions > from the AtomicAccess class. > https://github.com/openjdk/jdk/blame/528f93f8cb9f1fb9c19f31ab80c8a546f47beed2/doc/hotspot-style.md#L138-L147 > > This change replaces https://github.com/openjdk/jdk/pull/27462. Differences are > > * Substantially restructured `Atomic`, to be IDE friendly. It's > operationally the same, with the same API, hence uses and gtests didn't need > to change in that respect. Thanks to @stefank for raising this issue, and for > some suggestions toward improvements. > > * Changed how fetch_then_set for atomic translated types is handled, to avoid > having the function there at all if it isn't usable, rather than just removing > it via SFINAE, leaving an empty overload set. > > * Added more gtests. 
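For readers less familiar with the idea of wrapping a concurrently accessed variable in a dedicated type, here is a rough sketch. It is not HotSpot's `Atomic` and does not use `AtomicAccess`: `SketchAtomic` is a hypothetical wrapper built on `std::atomic` purely for illustration. The member names `store_relaxed` and `compare_exchange` are borrowed from this thread; `load_relaxed` and the choice of `std::memory_order` for the ordering parameter are assumptions.

```cpp
#include <atomic>

// Hypothetical wrapper, NOT HotSpot's type. It only illustrates the idea that
// every access to a shared variable goes through an explicitly named operation
// instead of a plain read or write of a volatile field.
template <typename T>
class SketchAtomic {
  std::atomic<T> _value;

public:
  explicit SketchAtomic(T v = T()) : _value(v) {}
  SketchAtomic(const SketchAtomic&) = delete;
  SketchAtomic& operator=(const SketchAtomic&) = delete;

  T load_relaxed() const { return _value.load(std::memory_order_relaxed); }
  void store_relaxed(T v) { _value.store(v, std::memory_order_relaxed); }

  // Returns the value observed before the operation. The exchange took place
  // if and only if that value equals compare_value. Note that the ordering
  // argument is forwarded to the underlying operation rather than dropped.
  T compare_exchange(T compare_value, T new_value,
                     std::memory_order order = std::memory_order_seq_cst) {
    T expected = compare_value;
    _value.compare_exchange_strong(expected, new_value, order);
    return expected;
  }
};
```

With such a type, a caller that wants to claim a flag once could write `flag.compare_exchange(0, 1) == 0`, and a reader cannot accidentally bypass the atomic operations because the underlying value is private.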
> > Testing: mach5 tier1-6, GHA sanity tests Kim Barrett has updated the pull request incrementally with four additional commits since the last revision: - rename arithmetic operand type from I to Offset - use obviously different test types in HasExchange - remove single-use internal Base type aliases - fix missing order parameter usage ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27539/files - new: https://git.openjdk.org/jdk/pull/27539/files/da58d0d2..de91dbb7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27539&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27539&range=05-06 Stats: 40 lines in 1 file changed: 3 ins; 10 del; 27 mod Patch: https://git.openjdk.org/jdk/pull/27539.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27539/head:pull/27539 PR: https://git.openjdk.org/jdk/pull/27539 From kbarrett at openjdk.org Tue Nov 11 20:43:13 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 11 Nov 2025 20:43:13 GMT Subject: RFR: 8367013: Add Atomic to package/replace idiom of volatile var plus AtomicAccess:: operations [v6] In-Reply-To: References: <-MCsPbrh9d8AM2XMQnHNHoBdkCldjZcOPi0zdb-SIM8=.3e0fa6ff-29eb-4912-8039-faab6692ff0e@github.com> Message-ID: On Tue, 11 Nov 2025 10:54:57 GMT, Stefan Karlsson wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 20 additional commits since the last revision: >> >> - Merge branch 'master' into atomic-template-tag-select >> - remove AtomicNextAccess and uses >> - use type_traits wrapper in new code >> - Merge branch 'master' into atomic-template-tag-select >> - more naming updates >> - rename relaxed_store => store_relaxed >> - default construct translated atomic without SFINAE >> - Merge branch 'master' into atomic-template-tag-select >> - Merge branch 'master' into atomic-template-tag-select >> - add reference to gcc bug we're working around >> - ... and 10 more: https://git.openjdk.org/jdk/compare/8ab17707...da58d0d2 > > src/hotspot/share/runtime/atomic.hpp line 36: > >> 34: // Atomic is used to declare a variable of type T with atomic access. >> 35: // >> 36: // The following value types T are supported: > > It would be nice to explain how enums fit into all this. In offline discussions that question was raised. It would be nice to get that clarified in this comment. There's a PrimitiveConversions::Translate specialization for enums. I'm not sure what more should be done than the existing reference to Translate specializations. It's not appropriate to list here all the types for which there are such specializations. Something like "such as enum types" could be added, but that feels like treating enum as more special here than I think it is. > src/hotspot/share/runtime/atomic.hpp line 105: > >> 103: // element arithmetic. >> 104: // >> 105: // (4) An atomic translated type additionally provides the exchange > > It is a little odd that `exchange` and `compare_exchange` behave differently here. Is that only because of how `AtomicAccess` is currently implemented? As explained later in this big comment, `exchange` differs from `compare_exchage` because `AtomicAccess` doesn't currently implement `xchg` for 1-byte values. Both the description and the implementation here would be simpler if it did. 
(That includes eliminating the whole conditional `exchange` for atomic translated types.) And I know of one or two places where an exchange of a bool value might be clearer than the current cmpxchg code. (Though `fetch_then_set` seems clearer still, at least to me. :) ) So maybe someone will add that support, but not in this PR. > src/hotspot/share/runtime/atomic.hpp line 282: > >> 280: template >> 281: class AtomicImpl::SupportsExchange : public CommonCore { >> 282: using Base = CommonCore; > > I don't see how this `using Base` aids in the readability. It is only used in the constructor, and there it just becomes an extra level of indirection. > > The same comment goes for the other `using Base` instances. The `Base` types had more uses in earlier versions. Removed. > src/hotspot/share/runtime/atomic.hpp line 300: > >> 298: >> 299: // Guarding the AtomicAccess calls with constexpr checking of I produces >> 300: // better compile-time error messages. > > It is unclear what `I` is meant to be a short name for. I don't think it is `integer` because we are also dealing with pointers. I don't think it can be `increment` because some functions pass the decrement amount. > > Maybe `V` (for value) could be more suitable here? But that could conflict with the contained _value in Atomic. > > In `check_i` you have this code and comment: > > } else if constexpr (std::is_signed_v) { > static_assert(std::is_signed_v, "value is signed but offset is unsigned"); > > > So maybe it could be `O` for offset? If `O` looks too much like `0` then maybe spell it out as `Offset`? > > Is there a collective name for "add value" and "sub value"? `I` _is_ short for `Integer`. That argument must be integral, even when the atomic value type is a pointer. (Atomic ptrdiff isn't a sensible operation.) Though I somehow dropped the check that it is indeed integral! Fixed that. 
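A compile-time guard of the kind discussed here might look roughly like the sketch below. This is an assumption-laden illustration, not the code in `atomic.hpp`: `check_offset_sketch` and `add_offset` are hypothetical names, and only the signed/unsigned error message is taken from the snippet quoted above.

```cpp
#include <type_traits>

// Hypothetical guard: the offset must be integral, and a signed value type
// must not be combined with an unsigned offset type.
template <typename T, typename Offset>
constexpr void check_offset_sketch() {
  static_assert(std::is_integral_v<Offset>, "offset must be integral");
  if constexpr (std::is_pointer_v<T>) {
    // Pointer arithmetic: this sketch accepts any integral offset.
  } else if constexpr (std::is_signed_v<T>) {
    static_assert(std::is_signed_v<Offset>,
                  "value is signed but offset is unsigned");
  }
}

// Hypothetical operation using the guard before doing the arithmetic.
template <typename T, typename Offset>
T add_offset(T value, Offset offset) {
  check_offset_sketch<T, Offset>();
  return value + offset;
}
```

In this sketch, `add_offset(10L, -3)` and `add_offset(ptr, 2)` compile, while `add_offset(10, 3u)` is rejected with the signed/unsigned message and `add_offset(ptr, other_ptr)` is rejected because the offset is not integral. Doing the check up front is what gives the short, readable diagnostic instead of an error from deep inside the implementation.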
Addition and subtraction have "addend" and "subtrahend" respectively, but there doesn't seem to be a great generalization of that. A couple AIs suggested "operand", but that's pretty uninformative. I decided to go with "offset". That was already being used in the error messages. More importantly, `Offset` is much easier to distinguish from `T` than is `I`. Updated to use that nomenclature consistently, so, for example, `check_i` -> `check_offset_type`. > src/hotspot/share/runtime/atomic.hpp line 442: > >> 440: public: >> 441: static constexpr bool value = std::is_pointer_v; >> 442: }; > > This little bit of time to understand and I tried to figure out why return `char*` or `char`, but then I see that it is only used in `is_pointer_v` test. Could this have been using `void*` and `void` instead, or does this need to use a type that can be instantiated? This is a variation on the so-called "sizeof trick", but using a type predicate rather than a sizeof comparison. I can never remember the syntax for the size!=1 return type, and have to search for an example every time I use that technique. I recently realized that distinguishing based on types avoided the uncommon syntax, and `decltype` + `` makes that easy. The use of `char` (and `char*`) was just a holdover from the typical usage in examples of the "sizeof trick"; nothing special there. I changed it to use obviously distinct types for the two cases - `void*` and `int`. 
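The type-based variation on the "sizeof trick" described above can be sketched as follows. This is a minimal illustration rather than the detector in `atomic.hpp`: `HasExchangeSketch`, `probe`, `WithExchange` and `WithoutExchange` are hypothetical names, and the detected expression (`exchange(0)`) is an assumption chosen for the example.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical detector: does U have a member callable as exchange(0)?
template <typename U>
class HasExchangeSketch {
  // Preferred overload. It drops out of overload resolution (SFINAE) when
  // the expression in the default template argument is ill-formed.
  template <typename V,
            typename = decltype(std::declval<V&>().exchange(0))>
  static void* probe(int);

  // Fallback overload. The ellipsis makes it the worst possible match, so it
  // is chosen only when the preferred overload is not viable.
  template <typename V>
  static int probe(...);

public:
  // Distinguish the two cases by return type instead of by sizeof.
  static constexpr bool value = std::is_pointer_v<decltype(probe<U>(0))>;
};

struct WithExchange { int exchange(int v) { return v; } };
struct WithoutExchange {};

static_assert(HasExchangeSketch<WithExchange>::value, "exchange detected");
static_assert(!HasExchangeSketch<WithoutExchange>::value, "no exchange");

// Runtime spot check, mirroring the static_asserts above.
inline void check_detector() {
  assert(HasExchangeSketch<WithExchange>::value);
  assert(!HasExchangeSketch<WithoutExchange>::value);
}
```

Neither `probe` overload is ever defined, since they are only named inside `decltype`. Using `void*` and `int` as the two return types keeps the cases obviously distinct, which is the readability point made in the reply.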
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27539#discussion_r2515693803 PR Review Comment: https://git.openjdk.org/jdk/pull/27539#discussion_r2515695361 PR Review Comment: https://git.openjdk.org/jdk/pull/27539#discussion_r2515716347 PR Review Comment: https://git.openjdk.org/jdk/pull/27539#discussion_r2515717496 PR Review Comment: https://git.openjdk.org/jdk/pull/27539#discussion_r2515718233 From kbarrett at openjdk.org Tue Nov 11 20:43:15 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 11 Nov 2025 20:43:15 GMT Subject: RFR: 8367013: Add Atomic to package/replace idiom of volatile var plus AtomicAccess:: operations [v6] In-Reply-To: <3RMTU6vni7DQ91_94sZK3zlcN693NdxsF2ViuMcw478=.c90f9d4d-51db-40fa-a46f-0e58e813a6d0@github.com> References: <-MCsPbrh9d8AM2XMQnHNHoBdkCldjZcOPi0zdb-SIM8=.3e0fa6ff-29eb-4912-8039-faab6692ff0e@github.com> <3RMTU6vni7DQ91_94sZK3zlcN693NdxsF2ViuMcw478=.c90f9d4d-51db-40fa-a46f-0e58e813a6d0@github.com> Message-ID: On Tue, 11 Nov 2025 11:32:48 GMT, Johan Sjölen wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 20 additional commits since the last revision: >> >> - Merge branch 'master' into atomic-template-tag-select >> - remove AtomicNextAccess and uses >> - use type_traits wrapper in new code >> - Merge branch 'master' into atomic-template-tag-select >> - more naming updates >> - rename relaxed_store => store_relaxed >> - default construct translated atomic without SFINAE >> - Merge branch 'master' into atomic-template-tag-select >> - Merge branch 'master' into atomic-template-tag-select >> - add reference to gcc bug we're working around >> - ...
and 10 more: https://git.openjdk.org/jdk/compare/8ab17707...da58d0d2 > > src/hotspot/share/runtime/atomic.hpp line 276: > >> 274: T compare_exchange(T compare_value, T new_value, >> 275: atomic_memory_order order = memory_order_conservative) { >> 276: return AtomicAccess::cmpxchg(value_ptr(), compare_value, new_value); > > Bug: This isn't providing the `order` to `cmpxchg` Yikes! Thanks for spotting these. Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27539#discussion_r2515723808 From kbarrett at openjdk.org Tue Nov 11 21:08:14 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 11 Nov 2025 21:08:14 GMT Subject: RFR: 8342769: HotSpot Windows/gcc port is broken [v17] In-Reply-To: References: Message-ID: <4s1iWdgqtuzO_x_kKixSh2jyDwTNL8xhcNl5YkLV79I=.d176c971-f767-414d-a525-68bc5d66a758@github.com> On Tue, 11 Nov 2025 13:01:23 GMT, Julian Waters wrote: >> Several areas in HotSpot are broken in the gcc port. These, with the exception of 1 rather big oversight within SharedRuntime::frem and SharedRuntime::drem, are all minor correctness issues within the code. These mostly can be fixed with simple changes to the code. Note that I am not sure whether the SharedRuntime::frem and SharedRuntime::drem fix is correct. It may be that they can be removed entirely > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 30 commits: > > - Merge branch 'master' into hotspot > - CAST_FROM_FN_PTR in os_windows.cpp > - Merge branch 'master' into hotspot > - Merge branch 'openjdk:master' into hotspot > - _WINDOWS && AARCH64 in sharedRuntime.hpp > - AARCH64 in sharedRuntimeRem.cpp > - Refactor sharedRuntime.cpp > - CAST_FROM_FN_PTR in os_windows.cpp > - Merge branch 'openjdk:master' into hotspot > - fmod_winarm64 in sharedRuntime.cpp > - ... 
and 20 more: https://git.openjdk.org/jdk/compare/29100320...b93febb3 Some of these disparate changes are things I would approve of if they were offered separately. Others I think need more work. For example, for the sharedRuntime changes, is the fmod of infinity issue with Windows, or is it with Visual Studio. If the latter, the conditionalization of the workaround is wrong. Also, has a bug been filed with Microsoft for this? Or has it perhaps already been fixed in some version? Also, there are comments using "ARM64" but we use "AARCH64" in code. ------------- PR Comment: https://git.openjdk.org/jdk/pull/21627#issuecomment-3518725739 From eastigeevich at openjdk.org Tue Nov 11 21:41:34 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 11 Nov 2025 21:41:34 GMT Subject: RFR: 8371649: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation Message-ID: The instruction cache maintenance function internally handles any required barriers. This means we don't need any before calling it. This PR removes a redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation. ------------- Commit messages: - 8371649: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation Changes: https://git.openjdk.org/jdk/pull/28244/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28244&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371649 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28244.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28244/head:pull/28244 PR: https://git.openjdk.org/jdk/pull/28244 From duke at openjdk.org Tue Nov 11 22:09:24 2025 From: duke at openjdk.org (Ivan) Date: Tue, 11 Nov 2025 22:09:24 GMT Subject: RFR: 8363943: ARM32: Represent Registers as values [v2] In-Reply-To: References: Message-ID: > Migrate away from pointer-based representation of Register values. 
> > It improves compile-time checking by forbidding implicit conversions between integrals and pointers. > > [JDK-8363943](https://bugs.openjdk.org/browse/JDK-8363943) Ivan has updated the pull request incrementally with one additional commit since the last revision: Proposed review changes were applied ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26525/files - new: https://git.openjdk.org/jdk/pull/26525/files/f75b381f..962b01b6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26525&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26525&range=00-01 Stats: 46 lines in 2 files changed: 0 ins; 4 del; 42 mod Patch: https://git.openjdk.org/jdk/pull/26525.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26525/head:pull/26525 PR: https://git.openjdk.org/jdk/pull/26525 From duke at openjdk.org Tue Nov 11 22:37:04 2025 From: duke at openjdk.org (Ivan) Date: Tue, 11 Nov 2025 22:37:04 GMT Subject: RFR: 8363943: ARM32: Represent Registers as values [v2] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 11:10:42 GMT, Aleksey Shipilev wrote: >> Ivan has updated the pull request incrementally with one additional commit since the last revision: >> >> Proposed review changes were applied > > src/hotspot/cpu/arm/register_arm.hpp line 86: > >> 84: enum { >> 85: number_of_registers = 16, >> 86: max_slots_per_register = 1 << (LogBytesPerWord - LogBytesPerInt) // LogBytesPerWord depends on _LP64 > > ARM32 is only 32-bit, so we can skip any _LP64-based computations, and just do the literal constant. Yes, of course. One thing concerns me, according to the globalDefinitions.hpp, max_slots_per_register evaluates to 1, but in one of the other comments you mentioned that it is 2. Did I misunderstood the definitions, or there is a mistake? 
const int LogBytesPerInt = 2; #ifdef _LP64 constexpr int LogBytesPerWord = 3; #else constexpr int LogBytesPerWord = 2; #endif > src/hotspot/cpu/arm/register_arm.hpp line 101: > >> 99: >> 100: // testers >> 101: bool is_valid() const {return 0 <= raw_encoding() && raw_encoding() < number_of_registers;} > > Suggestion: > > // accessors and testers > int raw_encoding() const { return this - first(); } > int encoding() const { assert(is_valid(), "invalid register"); return raw_encoding(); } > bool is_valid() const { return 0 <= raw_encoding() && raw_encoding() < number_of_registers; } Applied in `962b01b602c0f42b95f8a3ad4f58d84b17db3c6f` commit > src/hotspot/cpu/arm/register_arm.hpp line 202: > >> 200: >> 201: // testers >> 202: bool is_valid() const {return 0 <= raw_encoding() && raw_encoding() < number_of_registers;} > > Suggestion: > > // accessors and testers > int raw_encoding() const { return this - first(); } > int encoding() const { assert(is_valid(), "invalid register"); return raw_encoding(); } > bool is_valid() const { return 0 <= raw_encoding() && raw_encoding() < number_of_registers; } Applied in `962b01b602c0f42b95f8a3ad4f58d84b17db3c6f` commit ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2516017292 PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2516021112 PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2516021904 From duke at openjdk.org Tue Nov 11 22:45:06 2025 From: duke at openjdk.org (Ivan) Date: Tue, 11 Nov 2025 22:45:06 GMT Subject: RFR: 8363943: ARM32: Represent Registers as values [v2] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 11:17:15 GMT, Aleksey Shipilev wrote: >> Ivan has updated the pull request incrementally with one additional commit since the last revision: >> >> Proposed review changes were applied > > src/hotspot/cpu/arm/register_arm.hpp line 152: > >> 150: constexpr Register R9 = as_Register(9); >> 151: constexpr Register R10 = 
as_Register(10);
>> 152: constexpr Register R11 = as_Register(11);
>
> Indent these like:
>
>
> constexpr Register R8  = as_Register( 8);
> constexpr Register R9  = as_Register( 9);
> constexpr Register R10 = as_Register(10);
> constexpr Register R11 = as_Register(11);

Applied in `962b01b602c0f42b95f8a3ad4f58d84b17db3c6f` commit

> src/hotspot/cpu/arm/register_arm.hpp line 264:
>
>> 262: constexpr FloatRegister S4_reg = as_FloatRegister(4);
>> 263: constexpr FloatRegister S5_reg = as_FloatRegister(5);
>> 264: constexpr FloatRegister S6_reg = as_FloatRegister(6);
>
> Take a chance on renaming these `S${X}_reg` to just `S${X}`? I spot-checked their usages, and there are only a few places that need adjustments.

At the top of these definitions there is a comment:

```
/*
 * S1-S6 are named with "_reg" suffix to avoid conflict with
 * constants defined in sharedRuntimeTrig.cpp
 */
```

And the definitions from `sharedRuntimeTrig.cpp` are still there:

```
static const double
S1 = -1.66666666666666324348e-01, /* 0xBFC55555, 0x55555549 */
S2 = 8.33333333332248946124e-03, /* 0x3F811111, 0x1110F8A6 */
S3 = -1.98412698298579493134e-04, /* 0xBF2A01A0, 0x19C161D5 */
S4 = 2.75573137070700676789e-06, /* 0x3EC71DE3, 0x57B1FE7D */
S5 = -2.50507602534068634195e-08, /* 0xBE5AE5E6, 0x8A2B9CEB */
S6 = 1.58969099521155010221e-10; /* 0x3DE5D93A, 0x5ACFD57C */
```

Should I ignore it and change `S${X}_reg` to `S${X}`? Or should I avoid the conflict some other way?
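For readers following along, the clash described above can be sketched in isolation. This is only an illustration under assumptions: `FloatRegister` and `as_FloatRegister` here are hypothetical stand-ins, not the real HotSpot definitions, and both names are placed in one translation unit on purpose.

```cpp
// Hypothetical stand-in for the FloatRegister value type (not HotSpot's).
struct FloatRegister { int encoding; };
constexpr FloatRegister as_FloatRegister(int e) { return FloatRegister{e}; }

// Stand-in for the register header: the "_reg" suffix keeps the name distinct.
constexpr FloatRegister S1_reg = as_FloatRegister(1);

// Stand-in for the math source file: a file-scope constant named S1.
static const double S1 = -1.66666666666666324348e-01;

// Also declaring `constexpr FloatRegister S1 = as_FloatRegister(1);` at this
// scope would redeclare S1 with a different type and would not compile.

int register_encoding() { return S1_reg.encoding; }
double trig_coefficient() { return S1; }
```

Whether the two names actually meet in one translation unit in the real build depends on which headers the math file ends up including, which this sketch does not try to establish.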
> src/hotspot/cpu/arm/register_arm.hpp line 265: > >> 263: constexpr FloatRegister S5_reg = as_FloatRegister(5); >> 264: constexpr FloatRegister S6_reg = as_FloatRegister(6); >> 265: constexpr FloatRegister S7 = as_FloatRegister(7); > > Also, indent this like: > > > constexpr FloatRegister S8 = as_FloatRegister( 8); > constexpr FloatRegister S9 = as_FloatRegister( 9); > constexpr FloatRegister S10 = as_FloatRegister(10); > constexpr FloatRegister S11 = as_FloatRegister(11); Applied in `962b01b602c0f42b95f8a3ad4f58d84b17db3c6f` commit > src/hotspot/cpu/arm/register_arm.hpp line 427: > >> 425: constexpr VFPSystemRegister FPSCR = as_VFPSystemRegister( 1); >> 426: constexpr VFPSystemRegister MVFR0 = as_VFPSystemRegister(0x6); >> 427: constexpr VFPSystemRegister MVFR1 = as_VFPSystemRegister(0x7); > > You can use `VFPSystemRegister` enum values as arguments here, correct? Like: > > > constexpr VFPSystemRegister MVFR1 = as_VFPSystemRegister(VFPSystemRegister::MVFR1); Yes, that looks much more clear, applied in `962b01b602c0f42b95f8a3ad4f58d84b17db3c6f` commit > src/hotspot/cpu/arm/vmreg_arm.hpp line 52: > >> 50: return (value() % Register::max_slots_per_register == 0); >> 51: } else if (is_FloatRegister()) { >> 52: return true; // Single slot > > I guess. But for safety, we can still do `% FloatRegister::max_slot_per_register == 0`, just in case we ever need to adjust it? 
Sure, sounds reasonable, applied in `962b01b602c0f42b95f8a3ad4f58d84b17db3c6f` commit

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2516040914
PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2516039154
PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2516039986
PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2516048503
PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2516045497

From duke at openjdk.org Tue Nov 11 22:53:07 2025
From: duke at openjdk.org (Ivan)
Date: Tue, 11 Nov 2025 22:53:07 GMT
Subject: RFR: 8363943: ARM32: Represent Registers as values [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 10 Nov 2025 12:52:33 GMT, Aleksey Shipilev wrote:

>> Ivan has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Proposed review changes were applied
>
> src/hotspot/cpu/arm/register_arm.hpp line 187:
>
>> 185: enum {
>> 186: number_of_registers = NOT_COMPILER2(32) COMPILER2_PRESENT(64),
>> 187: max_slots_per_register = 1
>
> Can you double-check it is really `1`? For GPRs, we have `max_slots_per_register` at effectively `2`.

The change is consistent with the previous version. Previously the value was calculated in `src/hotspot/cpu/arm/register_arm.hpp` inside the ConcreteRegisterImpl class like this:

    #ifdef COMPILER2
    log_bytes_per_fpr = 2, // quad vectors
    #else
    log_bytes_per_fpr = 2, // double vectors
    #endif
    ...
    log_vmregs_per_fpr = log_bytes_per_fpr - LogBytesPerInt,
    ...
    vmregs_per_fpr = 1 << log_vmregs_per_fpr,

`LogBytesPerInt` in `globalDefinitions.hpp` is 2, so `vmregs_per_fpr` always evaluated to 1.
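The arithmetic can be checked at compile time with a small standalone sketch. The constant values are copied from the snippets quoted in this thread; the helper `slots_per_register` is a hypothetical name introduced only for illustration.

```cpp
// Values as quoted in this thread (stand-ins, not the HotSpot headers).
constexpr int LogBytesPerInt    = 2;
constexpr int log_bytes_per_fpr = 2;

// The old FPR computation: 1 << (2 - 2) == 1.
constexpr int log_vmregs_per_fpr = log_bytes_per_fpr - LogBytesPerInt;
constexpr int vmregs_per_fpr     = 1 << log_vmregs_per_fpr;
static_assert(vmregs_per_fpr == 1, "one VMReg slot per FPR");

// The GPR formula from the reviewed line, parameterized on LogBytesPerWord
// (hypothetical helper, for illustration only).
constexpr int slots_per_register(int log_bytes_per_word) {
  return 1 << (log_bytes_per_word - LogBytesPerInt);
}
static_assert(slots_per_register(2) == 1, "LogBytesPerWord == 2 (no _LP64)");
static_assert(slots_per_register(3) == 2, "LogBytesPerWord == 3 (_LP64)");
```

So with the quoted definitions the formula gives 1 when `_LP64` is not defined and 2 when it is.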
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2516070032 From jwaters at openjdk.org Wed Nov 12 00:50:13 2025 From: jwaters at openjdk.org (Julian Waters) Date: Wed, 12 Nov 2025 00:50:13 GMT Subject: RFR: 8342769: HotSpot Windows/gcc port is broken [v17] In-Reply-To: <4s1iWdgqtuzO_x_kKixSh2jyDwTNL8xhcNl5YkLV79I=.d176c971-f767-414d-a525-68bc5d66a758@github.com> References: <4s1iWdgqtuzO_x_kKixSh2jyDwTNL8xhcNl5YkLV79I=.d176c971-f767-414d-a525-68bc5d66a758@github.com> Message-ID: On Tue, 11 Nov 2025 21:05:42 GMT, Kim Barrett wrote: > Some of these disparate changes are things I would approve of if they were offered separately. Others I think need more work. > > For example, for the sharedRuntime changes, is the fmod of infinity issue with Windows, or is it with Visual Studio. If the latter, the conditionalization of the workaround is wrong. Also, has a bug been filed with Microsoft for this? Or has it perhaps already been fixed in some version? > > Also, there are comments using "ARM64" but we use "AARCH64" in code. @swesonga filed a bug for this, not sure if this is what you're referring to? https://developercommunity.visualstudio.com/t/fmod-incorrectly-returns-NaN-on-certain-/10793176 I think this might be an issue with Visual Studio rather than Windows itself. Admittedly I really don't like all the conditionals either, and am hoping for it to be fixed in VS soon so I can delete the workaround entirely, but there hasn't been any news on whether this has been fixed yet or not. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/21627#issuecomment-3519355244 From kbarrett at openjdk.org Wed Nov 12 02:36:06 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 12 Nov 2025 02:36:06 GMT Subject: RFR: 8342769: HotSpot Windows/gcc port is broken [v17] In-Reply-To: References: <4s1iWdgqtuzO_x_kKixSh2jyDwTNL8xhcNl5YkLV79I=.d176c971-f767-414d-a525-68bc5d66a758@github.com> Message-ID: On Wed, 12 Nov 2025 00:47:46 GMT, Julian Waters wrote: > > Some of these disparate changes are things I would approve of if they were offered separately. Others I think need more work. > > For example, for the sharedRuntime changes, is the fmod of infinity issue with Windows, or is it with Visual Studio. If the latter, the conditionalization of the workaround is wrong. Also, has a bug been filed with Microsoft for this? Or has it perhaps already been fixed in some version? > > Also, there are comments using "ARM64" but we use "AARCH64" in code. > > @swesonga filed a bug for this, not sure if this is what you're referring to? https://developercommunity.visualstudio.com/t/fmod-incorrectly-returns-NaN-on-certain-/10793176 Yes, that's what I was looking for. Thanks for the pointer. > I think this might be an issue with Visual Studio rather than Windows itself. Admittedly I really don't like all the conditionals either, and am hoping for it to be fixed in VS soon so I can delete the workaround entirely, but there hasn't been any news on whether this has been fixed yet or not. Does gcc on Windows have the same problem? I'm guessing not. This seems almost certainly to be a VS problem, either with the generated code or (more likely) with a runtime support function. Of course, with that in mind we can't remove the workaround until we've moved the minimum supported version forward past the bug fix. That might be a while. 
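To make the conditionalization point concrete, here is a minimal sketch of a workaround keyed on the compiler rather than on the operating system. It is not the code from the PR. It assumes the affected case is a finite dividend with an infinite divisor (the thread does not spell this out), and `fmod_checked` is a hypothetical name. `_MSC_VER` and `__clang__` are the usual predefined compiler macros.

```cpp
#include <cmath>
#include <limits>

// Hypothetical wrapper: the workaround is compiled in only for the Visual
// Studio compiler, so a gcc build on Windows would not pick it up.
inline double fmod_checked(double x, double y) {
#if defined(_MSC_VER) && !defined(__clang__)
  // The C standard specifies fmod(x, +/-inf) == x for finite x.
  if (std::isfinite(x) && std::isinf(y)) {
    return x;
  }
#endif
  return std::fmod(x, y);
}
```

Keying on an OS macro such as `_WIN32` instead would also apply the workaround to other compilers targeting Windows, which is the concern raised above.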
------------- PR Comment: https://git.openjdk.org/jdk/pull/21627#issuecomment-3519606600 From kbarrett at openjdk.org Wed Nov 12 04:13:03 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 12 Nov 2025 04:13:03 GMT Subject: RFR: 8358957: [ubsan]: The assert in layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v7] In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 10:13:44 GMT, Afshin Zafari wrote: >> Avoid using loop and UB in left-shift operation as suggested by Kim's comment in the JBS-issue. >> >> Tests: >> mach5 tiers 1-5 {macosx-aarch64, linux-x64, windows-x64} x {debug, product} > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > Windows warning bypassed src/hotspot/share/oops/klass.hpp line 515: > 513: > 514: // VS warns (C4146) about unary minus of unsigned. > 515: PRAGMA_DISABLE_MSVC_WARNING(4146) It seems that some time ago (JDK-8254072) 4146 was disabled for JVM MSVC build: https://github.com/openjdk/jdk/blame/8531fa146be1da5e96c0f23091882a27c67d7893/make/hotspot/lib/CompileJvm.gmk#L117 So the warning isn't a concern after all. Sorry for the false alarm and misleading guidance. This was also not the correct way to introduce the warning suppression had it been needed, but that's moot. 
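As background for the warning discussed here: unary minus on an unsigned value is well-defined in C++ (the result is reduced modulo 2^N), which is what makes the loop-free "lowest set bit" idiom usable without undefined behavior. The sketch below is a generic illustration, not the patch under review; the function names are hypothetical.

```cpp
#include <cstdint>

// Isolate the lowest set bit of v. For unsigned v, -v is 2^32 - v, so
// v & -v keeps exactly the lowest 1 bit (and yields 0 when v == 0).
constexpr uint32_t lowest_set_bit(uint32_t v) {
  return v & -v;
}

// Lowest bit position (as a mask) at which a and b differ, or 0 if equal.
constexpr uint32_t lowest_differing_bit(uint32_t a, uint32_t b) {
  return lowest_set_bit(a ^ b);
}

static_assert(lowest_set_bit(0x28u) == 0x08u, "0b101000 -> 0b001000");
static_assert(lowest_set_bit(0u) == 0u, "no bits set");
static_assert(lowest_set_bit(0x80000000u) == 0x80000000u, "top bit only");
static_assert(lowest_differing_bit(0x0Cu, 0x08u) == 0x04u, "differ at bit 2");
```

This is the kind of expression that MSVC flags with C4146 when that warning is enabled.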
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27288#discussion_r2516670225 From jsjolen at openjdk.org Wed Nov 12 05:55:03 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 12 Nov 2025 05:55:03 GMT Subject: RFR: 8367013: Add Atomic to package/replace idiom of volatile var plus AtomicAccess:: operations [v7] In-Reply-To: <2loBZkzSWVjbUw6sUMMj86o-L6cSlEqO_bJI1qCOPxM=.505020dc-34f2-4586-8563-2b7330e6766a@github.com> References: <2loBZkzSWVjbUw6sUMMj86o-L6cSlEqO_bJI1qCOPxM=.505020dc-34f2-4586-8563-2b7330e6766a@github.com> Message-ID: On Tue, 11 Nov 2025 20:36:38 GMT, Kim Barrett wrote: >> Please review this change that adds the type Atomic, to use as the type >> of a variable that is accessed (including writes) concurrently by multiple >> threads. This is intended to replace (most) uses of the current HotSpot idiom >> of declaring a variable volatile and accessing that variable using functions >> from the AtomicAccess class. >> https://github.com/openjdk/jdk/blame/528f93f8cb9f1fb9c19f31ab80c8a546f47beed2/doc/hotspot-style.md#L138-L147 >> >> This change replaces https://github.com/openjdk/jdk/pull/27462. Differences are >> >> * Substantially restructured `Atomic`, to be IDE friendly. It's >> operationally the same, with the same API, hence uses and gtests didn't need >> to change in that respect. Thanks to @stefank for raising this issue, and for >> some suggestions toward improvements. >> >> * Changed how fetch_then_set for atomic translated types is handled, to avoid >> having the function there at all if it isn't usable, rather than just removing >> it via SFINAE, leaving an empty overload set. >> >> * Added more gtests. 
>>
>> Testing: mach5 tier1-6, GHA sanity tests
>
> Kim Barrett has updated the pull request incrementally with four additional commits since the last revision:
>
> - rename arithmetic operand type from I to Offset
> - use obviously different test types in HasExchange
> - remove single-use internal Base type aliases
> - fix missing order parameter usage

Marked as reviewed by jsjolen (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/27539#pullrequestreview-3451651518

From kbarrett at openjdk.org Wed Nov 12 06:03:47 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Wed, 12 Nov 2025 06:03:47 GMT
Subject: RFR: 8369187: Add wrapper for <new> that forbids use of global allocation and deallocation functions
Message-ID:

8369187: Add wrapper for <new> that forbids use of global allocation and deallocation functions

Please review this change that adds `cppstdlib/new.hpp` as a wrapper for including `<new>`. All existing inclusions of `<new>` are changed to include the new wrapper.

In addition to including `<new>`, this wrapper also provides deprecation declarations to prevent the use of some facilities by HotSpot code.

However, those deprecations need to be conditionalized to not apply to gtests, so this change also adds a macro definition provided by the build system for use in detecting that a header is being included by a gtest.
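The general mechanism can be sketched as follows. This is an illustration under assumptions, not the contents of the wrapper header: `legacy_alloc`, `preferred_alloc` and the macro `HYPOTHETICAL_GTEST_BUILD` are invented names. It relies only on the standard rule that an entity may be redeclared later with `[[deprecated]]`.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical facility that a code base wants to stop using.
void* legacy_alloc(std::size_t size) { return std::malloc(size); }

// Hypothetical replacement that callers are expected to use instead.
void* preferred_alloc(std::size_t size) { return std::malloc(size); }

// HYPOTHETICAL_GTEST_BUILD stands in for a build-system-provided macro that
// identifies test code; when it is defined the deprecation is skipped.
#ifndef HYPOTHETICAL_GTEST_BUILD
// Redeclaring with [[deprecated]] makes later uses produce a diagnostic,
// which a warnings-as-errors build turns into a hard failure.
[[deprecated("use preferred_alloc instead")]]
void* legacy_alloc(std::size_t size);
#endif
```

Code compiled with the macro defined can keep calling `legacy_alloc` without a diagnostic, while everything else is steered to `preferred_alloc`.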
Testing: mach5 tier1 ------------- Commit messages: - add wrapper for Changes: https://git.openjdk.org/jdk/pull/28250/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28250&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8369187 Stats: 165 lines in 15 files changed: 143 ins; 22 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28250.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28250/head:pull/28250 PR: https://git.openjdk.org/jdk/pull/28250 From kbarrett at openjdk.org Wed Nov 12 06:21:13 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 12 Nov 2025 06:21:13 GMT Subject: RFR: 8370333: hotspot-unit-tests.md specifies wrong directory structure for tests Message-ID: Please review this change to the HotSpot unit test documentation, fixing the path where native tests are located. ------------- Commit messages: - fix path for native test location Changes: https://git.openjdk.org/jdk/pull/28251/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28251&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8370333 Stats: 8 lines in 2 files changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/28251.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28251/head:pull/28251 PR: https://git.openjdk.org/jdk/pull/28251 From aboldtch at openjdk.org Wed Nov 12 06:31:06 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 12 Nov 2025 06:31:06 GMT Subject: RFR: 8367013: Add Atomic to package/replace idiom of volatile var plus AtomicAccess:: operations [v7] In-Reply-To: <2loBZkzSWVjbUw6sUMMj86o-L6cSlEqO_bJI1qCOPxM=.505020dc-34f2-4586-8563-2b7330e6766a@github.com> References: <2loBZkzSWVjbUw6sUMMj86o-L6cSlEqO_bJI1qCOPxM=.505020dc-34f2-4586-8563-2b7330e6766a@github.com> Message-ID: On Tue, 11 Nov 2025 20:36:38 GMT, Kim Barrett wrote: >> Please review this change that adds the type Atomic, to use as the type >> of a variable that is accessed (including writes) concurrently by multiple >> threads. 
This is intended to replace (most) uses of the current HotSpot idiom >> of declaring a variable volatile and accessing that variable using functions >> from the AtomicAccess class. >> https://github.com/openjdk/jdk/blame/528f93f8cb9f1fb9c19f31ab80c8a546f47beed2/doc/hotspot-style.md#L138-L147 >> >> This change replaces https://github.com/openjdk/jdk/pull/27462. Differences are >> >> * Substantially restructured `Atomic`, to be IDE friendly. It's >> operationally the same, with the same API, hence uses and gtests didn't need >> to change in that respect. Thanks to @stefank for raising this issue, and for >> some suggestions toward improvements. >> >> * Changed how fetch_then_set for atomic translated types is handled, to avoid >> having the function there at all if it isn't usable, rather than just removing >> it via SFINAE, leaving an empty overload set. >> >> * Added more gtests. >> >> Testing: mach5 tier1-6, GHA sanity tests > > Kim Barrett has updated the pull request incrementally with four additional commits since the last revision: > > - rename arithmetic operand type from I to Offset > - use obviously different test types in HasExchange > - remove single-use internal Base type aliases > - fix missing order parameter usage Looks good. The gtest still uses a mix of the x86 instruction and the new `compare_exchange` / `exchange` nomenclature. Feel free to unify this, or leave it as is (can always fix this later). ------------- Marked as reviewed by aboldtch (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/27539#pullrequestreview-3451797476 From stefank at openjdk.org Wed Nov 12 07:24:05 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 12 Nov 2025 07:24:05 GMT Subject: RFR: 8367013: Add Atomic to package/replace idiom of volatile var plus AtomicAccess:: operations [v6] In-Reply-To: References: <-MCsPbrh9d8AM2XMQnHNHoBdkCldjZcOPi0zdb-SIM8=.3e0fa6ff-29eb-4912-8039-faab6692ff0e@github.com> Message-ID: On Tue, 11 Nov 2025 20:32:35 GMT, Kim Barrett wrote: >> src/hotspot/share/runtime/atomic.hpp line 36: >> >>> 34: // Atomic is used to declare a variable of type T with atomic access. >>> 35: // >>> 36: // The following value types T are supported: >> >> It would be nice to explain how enums fit into all this. In offline discussions that question was raised. It would be nice to get that clarified in this comment. > > There's a PrimitiveConversions::Translate specialization for enums. I'm not > sure what more should be done than the existing reference to Translate > specializations. It's not appropriate to list here all the types for which > there are such specializations. Something like "such as enum types" could be > added, but that feels like treating enum as more special here than I think it is. People know what enums are but there's only a few people that know what PrimitiveConversions::Translate is. Hence my comment. Anyways, I know what PrimitiveConversions::Translate is and I've given my feedback that the comments doesn't help the readers that don't understand this understand how enums fit in. I'll leave it to others react if they don't understand. 
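For readers unfamiliar with the translation idea being discussed, here is a rough sketch of how an enum can ride on an atomic of its underlying integer type. It is not HotSpot's `PrimitiveConversions::Translate` or the `Atomic` type from the PR: `EnumTranslate` and `AtomicEnum` are hypothetical names, and `std::atomic` is used purely as a stand-in so that the example is self-contained.

```cpp
#include <atomic>
#include <type_traits>

// Hypothetical translation trait: maps an enum to its underlying integer
// ("decay") and back ("recover").
template <typename E>
struct EnumTranslate {
  static_assert(std::is_enum<E>::value, "E must be an enum type");
  using Value   = E;
  using Decayed = std::underlying_type_t<E>;
  static Decayed decay(Value v)     { return static_cast<Decayed>(v); }
  static Value   recover(Decayed d) { return static_cast<Value>(d); }
};

// Hypothetical atomic holder that stores the decayed representation.
template <typename E>
class AtomicEnum {
  using Tr = EnumTranslate<E>;
  std::atomic<typename Tr::Decayed> _bits;
public:
  explicit AtomicEnum(E initial) : _bits(Tr::decay(initial)) {}
  E load() const  { return Tr::recover(_bits.load()); }
  void store(E v) { _bits.store(Tr::decay(v)); }
  // Returns the previous value, like an atomic exchange.
  E exchange(E v) { return Tr::recover(_bits.exchange(Tr::decay(v))); }
};

enum class State : int { Idle, Running, Done };
```

The point of the sketch is only that the enum never needs its own atomic implementation; it borrows the one for its integer representation.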
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27539#discussion_r2517192118 From stefank at openjdk.org Wed Nov 12 07:31:13 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 12 Nov 2025 07:31:13 GMT Subject: RFR: 8367013: Add Atomic to package/replace idiom of volatile var plus AtomicAccess:: operations [v7] In-Reply-To: <2loBZkzSWVjbUw6sUMMj86o-L6cSlEqO_bJI1qCOPxM=.505020dc-34f2-4586-8563-2b7330e6766a@github.com> References: <2loBZkzSWVjbUw6sUMMj86o-L6cSlEqO_bJI1qCOPxM=.505020dc-34f2-4586-8563-2b7330e6766a@github.com> Message-ID: On Tue, 11 Nov 2025 20:36:38 GMT, Kim Barrett wrote: >> Please review this change that adds the type Atomic, to use as the type >> of a variable that is accessed (including writes) concurrently by multiple >> threads. This is intended to replace (most) uses of the current HotSpot idiom >> of declaring a variable volatile and accessing that variable using functions >> from the AtomicAccess class. >> https://github.com/openjdk/jdk/blame/528f93f8cb9f1fb9c19f31ab80c8a546f47beed2/doc/hotspot-style.md#L138-L147 >> >> This change replaces https://github.com/openjdk/jdk/pull/27462. Differences are >> >> * Substantially restructured `Atomic`, to be IDE friendly. It's >> operationally the same, with the same API, hence uses and gtests didn't need >> to change in that respect. Thanks to @stefank for raising this issue, and for >> some suggestions toward improvements. >> >> * Changed how fetch_then_set for atomic translated types is handled, to avoid >> having the function there at all if it isn't usable, rather than just removing >> it via SFINAE, leaving an empty overload set. >> >> * Added more gtests. 
>> >> Testing: mach5 tier1-6, GHA sanity tests > > Kim Barrett has updated the pull request incrementally with four additional commits since the last revision: > > - rename arithmetic operand type from I to Offset > - use obviously different test types in HasExchange > - remove single-use internal Base type aliases > - fix missing order parameter usage Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/27539#pullrequestreview-3451997981 From stefank at openjdk.org Wed Nov 12 07:31:14 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 12 Nov 2025 07:31:14 GMT Subject: RFR: 8367013: Add Atomic to package/replace idiom of volatile var plus AtomicAccess:: operations [v6] In-Reply-To: References: <-MCsPbrh9d8AM2XMQnHNHoBdkCldjZcOPi0zdb-SIM8=.3e0fa6ff-29eb-4912-8039-faab6692ff0e@github.com> Message-ID: On Tue, 11 Nov 2025 20:39:13 GMT, Kim Barrett wrote: > That argument must be integral, even when the atomic value type is a pointer. Yeah. Then `I` was fine. I'm also OK with `Offset`. Choose what you think is most appropriate and I'll Review OK it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27539#discussion_r2517201809 From stefank at openjdk.org Wed Nov 12 07:43:06 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 12 Nov 2025 07:43:06 GMT Subject: RFR: 8369187: Add wrapper for that forbids use of global allocation and deallocation functions In-Reply-To: References: Message-ID: <6SO4sjrZNkftwWmo-9j7fPvB5K7e24kpCqt5UzSWWqQ=.bbb0560f-2e54-4f68-a67f-10fa35bf0afb@github.com> On Wed, 12 Nov 2025 05:49:27 GMT, Kim Barrett wrote: > 8369187: Add wrapper for that forbids use of global allocation and deallocation functions > > Please review this change that adds `cppstdlib/new.hpp` as a wrapper for > including ``. All existing inclusions of `` are changed to include > the new wrapper. 
> > In additional to including ``, this wrapper also provides deprecation > declarations to prevent the use of some facilities by HotSpot code. > > However, those deprecations need to be conditionalized to not apply to gtests, > so this change also adds a macro definition provided by the build system for > use in detecting that a header is being included by a gtest. > > Testing: mach5 tier1 src/hotspot/share/cppstdlib/new.hpp line 79: > 77: // Visual Studio => error C2370: '...': redefinition; different storage class > 78: #ifndef TARGET_COMPILER_visCPP > 79: [[deprecated]] extern const size_t hardware_destructive_interference_size; At cppreference this is declared as: inline constexpr size_t hardware_destructive_interference_size Is that why you're getting the Visual Studio error? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28250#discussion_r2517243907 From stefank at openjdk.org Wed Nov 12 07:44:01 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 12 Nov 2025 07:44:01 GMT Subject: RFR: 8370333: hotspot-unit-tests.md specifies wrong directory structure for tests In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 06:14:47 GMT, Kim Barrett wrote: > Please review this change to the HotSpot unit test documentation, fixing the > path where native tests are located. Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28251#pullrequestreview-3452048463 From ysuenaga at openjdk.org Wed Nov 12 08:00:30 2025 From: ysuenaga at openjdk.org (Yasumasa Suenaga) Date: Wed, 12 Nov 2025 08:00:30 GMT Subject: RFR: 8371093: Assert "section header string table should be loaded" failed on debug VM [v2] In-Reply-To: References: Message-ID: > When the crash happens in the function in vDSO on Linux, native call stacks in hs_err log wouldn't be generated as following. See [hs_err log on JBS](https://bugs.openjdk.org/secure/attachment/116796/hs_err_pid4018.log) for details. 
Reproducer is also attached on JBS ([Test.java](https://bugs.openjdk.org/secure/attachment/116797/Test.java)) > > > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [linux-vdso.so.1+0xe69] > [error occurred during error reporting (printing native stack (with source info)), id 0xe0000000, Internal Error (/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536)] > > Retrying call stack printing without source information... > > [error occurred during error reporting (retry printing native stack (no source info)), id 0xb, SIGSEGV (0xb) at pc=0x00007fba8075f791] > > > When I checked back trace on GDB, it failed at `assert`. > > #12 0x00007fba7e76bd00 in report_vm_error (file=file at entry=0x7fba7fed7b40 "/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp", > line=line at entry=536, error_msg=error_msg at entry=0x7fba80019575 "assert(false) failed", > detail_fmt=detail_fmt at entry=0x7fba7fed7bf0 "section header string table should be loaded") > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/debug.cpp:196 > #13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...) > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536 > #14 ElfFile::read_debug_info (this=this at entry=0x7fba782a1650, debug_info=debug_info at entry=0x7fba7dd05150) > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:407 > > > > (gdb) f 13 > #13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...) > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536 > 536 assert(false, "section header string table should be loaded"); > > > vDSO is not a regular ELF, so it should be skipped here. 
Yasumasa Suenaga has updated the pull request incrementally with two additional commits since the last revision: - Undo unnecessary change - Check the result of opening file ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28102/files - new: https://git.openjdk.org/jdk/pull/28102/files/afa88a0a..678d57ee Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28102&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28102&range=00-01 Stats: 12 lines in 1 file changed: 5 ins; 7 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28102.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28102/head:pull/28102 PR: https://git.openjdk.org/jdk/pull/28102 From jbhateja at openjdk.org Wed Nov 12 08:03:04 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 12 Nov 2025 08:03:04 GMT Subject: RFR: 8370691: Add new HalffloatVector type and enable intrinsification of float16 vector operations In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 00:59:25 GMT, Joe Darcy wrote: > > Some quick comments. > > We should be consistent in the naming, and rename `Halfloat*` to `Float16*`. > > I concur, especially since there are multiple 16-bit floating-point formats in use including the IEEE 754 float16 as well as bfloat16. There are nomenclature issues that I am facing. Currently, all the Float16 concrete classes use the Halffloat prefix i.e., Halffloat64Vector, Halffloat128Vector; converting these to Float16 looks a little confusing, i.e., Float1664Vector, Float16128Vector, etc Kindly suggest a better name to represent these classes. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28002#issuecomment-3520534564 From jbhateja at openjdk.org Wed Nov 12 08:03:02 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 12 Nov 2025 08:03:02 GMT Subject: RFR: 8370691: Add new HalffloatVector type and enable intrinsification of float16 vector operations In-Reply-To: References: <_ryF0SNpSLahH4HkGqSnGKc_6d9P1fWrKYTS0jRPvtk=.ff2143aa-d3a5-4776-bdd0-95646dfd35e9@github.com> Message-ID: On Tue, 11 Nov 2025 16:28:54 GMT, Paul Sandoz wrote: >> We already have a lot of things in the codebase now from previous issues that use `HF` everywhere, for example some node names, and the type. Should we maybe rename all of them to `F16`, or something else? Open question, not sure of the answer yet. > >> We already have a lot of things in the codebase now from previous issues that use `HF` everywhere, for example some node names, and the type. Should we maybe rename all of them to `F16`, or something else? Open question, not sure of the answer yet. > > I was only referring to the Java code, esp. the new public classes so they align with the `Float16` element type. I do think it worthwhile to align so we are consistent across the platform. Revisiting the names in HotSpot, and their internal connection in Java, could be done in a separate PR? Hi @PaulSandoz , Thanks for your comments. Please find below my responses. > When you generate the fallback code for unary/binary etc can you push the carrier type and conversations into the uOp/bOp implementations so you don't have to explicitly operate on the carrier type and do the conversions as you do now e.g.,: > > ``` > v0.uOp(m, (i, a) -> float16ToShortBits(Float16.valueOf(-(shortBitsToFloat16(($type$)a).floatValue())))); > ``` Currently, uOp and uOpTemplates are part of the scaffolding logic and are sacrosanct; they are shared by various abstracted vector classes, and their semantics are defined by the lambda expression. 
I agree that explicit conversion in lambdas looks verbose, but moving them to uOpTemplate may fracture the lambda expression such that part of its semantics, i.e,. conversions, will seep into uOpTemplate, while what will appear at the surface will be the expression operating over primitive float values; this may become very confusing. > > The transition of intrinsic arguments from `vsp.elementType()` to `vsp.carrierType(), vsp.operType()` is a little unfortunate. Is this because HotSpot cannot directly refer to the `Float16` class from the incubating module? Yes, the idea here was to clearly differentiate b/w elemType and carrierType and avoid passing Float16.class as an argument to intrinsic entry points. Unlike the VectorSupport class, Float16 is part of the incubating module and cannot be directly exposed to VM, i.e., we cannot create a vmSymbol for it during initialization. This would have made all the lane type checks in-line expand name-based rather than efficient symbol lookup. > Requiring two arguments means they can get out of sync. Previously the class provided all the information needed, now > arguably the type does. Yes, from the compiler standpoint point all we care about is the carrier type, which determines the vector lane size. This is augmented with operation kind (PRIM / FP16) to differentiate a short vector lane from a float16 vector lane. Apart from this, we need to pass the VectorBox type to wrap the vector IR. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/28002#issuecomment-3520530639

From ysuenaga at openjdk.org Wed Nov 12 08:05:06 2025
From: ysuenaga at openjdk.org (Yasumasa Suenaga)
Date: Wed, 12 Nov 2025 08:05:06 GMT
Subject: RFR: 8371093: Assert "section header string table should be loaded" failed on debug VM [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 11 Nov 2025 12:03:01 GMT, Johan Sjölen wrote:

>> Yasumasa Suenaga has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Undo unnecessary change
>> - Check the result of opening file
>
> Hi,
>
> The actual issue here is that the vDSO file doesn't exist, so can't be opened, right? Then, we should instead check in `get_elf_file` whether the creation of the `ElfFile` succeeded instead. This should be done anyway, the `vDSO` issue is just a symptom of another bug.
>
> I think that this new definition should be used instead, in `decoder_elf.cpp:104`.
>
> ```c++
> ElfFile* ElfDecoder::get_elf_file(const char* filepath) {
>   ElfFile* file;
>
>   file = _opened_elf_files;
>   while (file != nullptr) {
>     if (file->same_elf_file(filepath)) {
>       return file;
>     }
>     file = file->next();
>   }
>
>   file = new (std::nothrow) ElfFile(filepath);
>   if (file == nullptr) {
>     return nullptr;
>   } else if (file->get_status() != NullDecoder::no_error) {
>     return nullptr;
>   }
>
>
>   if (_opened_elf_files != nullptr) {
>     file->set_next(_opened_elf_files);
>   }
>   _opened_elf_files = file;
>
>   return file;
> }
> ```
>
> What do you think?

@jdksjolen

> The actual issue here is that the vDSO file doesn't exist, so can't be opened, right? Then, we should instead check in get_elf_file whether the creation of the ElfFile succeeded instead. This should be done anyway, the vDSO issue is just a symptom of another bug.

Yes, you are right. I added a check whether the library could be opened. It works for vDSO.
I believe vDSO is the only library that does not exist on the file system, but problems opening ELF files could also be caused by a system failure (a broken filesystem, memory exhaustion, or any other unexpected failure). This change covers such file-related problems as well.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28102#issuecomment-3520542439

From alanb at openjdk.org Wed Nov 12 08:21:12 2025
From: alanb at openjdk.org (Alan Bateman)
Date: Wed, 12 Nov 2025 08:21:12 GMT
Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v10]
In-Reply-To: <-nBSdu-3kaHjMyaZTD8epppqP4EjowO5kvo3eTakJdg=.bec77fef-edc2-48b1-b961-d646fc994810@github.com>
References: <1aQu6ywsFGh3TMN3XjBevjqmTFP7CdQRrbzQtuN-wfI=.814c0236-caa8-469a-9a42-dfafb56ebf64@github.com> <-nBSdu-3kaHjMyaZTD8epppqP4EjowO5kvo3eTakJdg=.bec77fef-edc2-48b1-b961-d646fc994810@github.com>
Message-ID:

On Tue, 11 Nov 2025 15:01:14 GMT, Chen Liang wrote:

> > It's aligned with setAccessible. It's corner case of course but if a JNI attached thread invokes setAccessible with no java frames on the stack, then it is specified to only succeed if the API element is public and declared in a public class in an exported package.
>
> Consider setting the field `java.lang.constant.DirectMethodHandleDesc$Kind.refKind` (public final instance field in public class in exported, non-open package) in 3 ways:
>
> 1. `Field.setAccessible` + `set` in Java code: Now `set` fails without `--add-opens` (not open)

There is a spec issue here. Field.set on final instance fields should align with setAccessible. So assuming setAccessible has succeeded and final field mutation is enabled for the caller module, Field.set should be specified to succeed when the field is public and its declaring class is public and in a package that its module exports "statically" to the caller module.
Right now, we specify that the package must be statically open to the caller which is more than setAccessible requires for this case. I agree this may be surprising. I've drafted spec (and implementation) changes to align them but I want to check with Alex and Ron as it doing it now would require changing a line in the JEP too. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25115#issuecomment-3520598996 From epeter at openjdk.org Wed Nov 12 08:33:29 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 12 Nov 2025 08:33:29 GMT Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v3] In-Reply-To: References: Message-ID: On Thu, 2 Oct 2025 09:08:06 GMT, Roland Westrelin wrote: >> This is a variant of 8332827. In 8332827, an array access becomes >> dependent on a range check `CastII` for another array access. When, >> after loop opts are over, that RC `CastII` was removed, the array >> access could float and an out of bound access happened. With the fix >> for 8332827, RC `CastII`s are no longer removed. >> >> With this one what happens is that some transformations applied after >> loop opts are over widen the type of the RC `CastII`. As a result, the >> type of the RC `CastII` is no longer narrower than that of its input, >> the `CastII` is removed and the dependency is lost. >> >> There are 2 transformations that cause this to happen: >> >> - after loop opts are over, the type of the `CastII` nodes are widen >> so nodes that have the same inputs but a slightly different type can >> common. >> >> - When pushing a `CastII` through an `Add`, if of the type both inputs >> of the `Add`s are non constant, then we end up widening the type >> (the resulting `Add` has a type that's wider than that of the >> initial `CastII`). >> >> There are already 3 types of `Cast` nodes depending on the >> optimizations that are allowed. 
Either the `Cast` is floating >> (`depends_only_test()` returns `true`) or pinned. Either the `Cast` >> can be removed if it no longer narrows the type of its input or >> not. We already have variants of the `CastII`: >> >> - if the Cast can float and be removed when it doesn't narrow the type >> of its input. >> >> - if the Cast is pinned and be removed when it doesn't narrow the type >> of its input. >> >> - if the Cast is pinned and can't be removed when it doesn't narrow >> the type of its input. >> >> What we need here, I think, is the 4th combination: >> >> - if the Cast can float and can't be removed when it doesn't narrow >> the type of its input. >> >> Anyway, things are becoming confusing with all these different >> variants named in ways that don't always help figure out what >> constraints one of them operate under. So I refactored this and that's >> the biggest part of this change. The fix consists in marking `Cast` >> nodes when their type is widen in a way that prevents them from being >> optimized out. >> >> Tobias ran performance testing with a slightly different version of >> this change and there was no regression. > > Roland Westrelin has updated the pull request incrementally with three additional commits since the last revision: > > - review > - infinite loop in gvn fix > - renaming @rwestrel Sorry I dropped the review on this one for a long time :/ I left quite a few comments. But on the whole I'm really happy with the direction you are taking. It's getting much clearer. I would still see some more clear explanations/comments. That way, we can make our previously implicit assumptions even more explicit :) src/hotspot/share/opto/castnode.cpp line 47: > 45: Node* ConstraintCastNode::Identity(PhaseGVN* phase) { > 46: if (!_dependency.narrows_type()) { > 47: return this; Can you please add a code comment? 
I don't understand it right away :/ src/hotspot/share/opto/castnode.cpp line 153: > 151: if (!_dependency.narrows_type()) { > 152: return nullptr; > 153: } Interesting, we already check that at at least some of the use sites. If it turns out we already do it at all use sites, why not just assert? (maybe not possible or desirable, just an idea) A comment here would also be great. src/hotspot/share/opto/castnode.cpp line 277: > 275: > 276: CastIINode* CastIINode::pin_array_access_node() const { > 277: assert(depends_only_on_test(), "already pinned"); Would this not be more readable? Suggestion: assert(is_dependency_floating(), "already pinned"); src/hotspot/share/opto/castnode.cpp line 588: > 586: > 587: // If both inputs are not constant then, with the Cast pushed through the Add/Sub, the cast gets less precised types, > 588: // and the resulting Add/Sub's type is wider than that of the Cast before pushing. I find this long sentence a bit complicated to read. Can you reformulate and maybe break it into smaller sentences? It would also be good to explicitly say why that may require changing the dependency constraint. src/hotspot/share/opto/castnode.cpp line 615: > 613: // Widening the type of the Cast (to allow some commoning) causes the Cast to change how it can be optimized (if > 614: // type of its input is narrower than the Cast's type, we can't remove it to not loose the dependency). > 615: return make_with(in(1), wide_t, _dependency.widen_type_dependency()); Suggestion: return make_with(in(1), wide_t, _dependency.with_non_narrowing()); This may be clearer here, since non-narrowing prevents folding the cast away if the input is narrower. I like the code comment you already have though :) src/hotspot/share/opto/castnode.cpp line 625: > 623: if (!phase->C->post_loop_opts_phase()) { > 624: return this_type; > 625: } Honestly, I would prefer to see this "delay to post loop opts" to be done outside of `widen_type`. It would just make more sense there. 
What do you think? src/hotspot/share/opto/castnode.hpp line 46: > 44: // 1- and 2- are not always applied depending on what constraint are applied to the Cast: there are cases where 1- > 45: // and 2- apply, where neither 1- nor 2- apply and where one or the other apply. This class abstract away these > 46: // details. Can you spell it out a little more? Right now it feels a little bit like an "exercise for the reader". For each optimization, what is required of the constraints? I think that would help the reader. Equally: you could name why those constraints are required in the first place. Or is there some other place we could link to that already has those explanations? src/hotspot/share/opto/castnode.hpp line 53: > 51: _narrows_type(narrows_type), > 52: _desc(desc) { > 53: } Could you make the constructor private, and only expose the 4 static fields? That way, nobody comes to the strange idea to construct one of these themselves ;) src/hotspot/share/opto/castnode.hpp line 62: > 60: bool narrows_type() const { > 61: return _narrows_type; > 62: } Nits about naming: I would prefer `is_` for boolean queries. Otherwise, if I look at the names `floating` and `pinned_dependency`, I don't immediately know which one converts to a floating/non-floating, and which one is a boolean query. Maybe `pinned_dependency` should be renamed to `with_pinned_dependency`. src/hotspot/share/opto/castnode.hpp line 65: > 63: void dump_on(outputStream *st) const { > 64: st->print("%s", _desc); > 65: } Suggestion: bool narrows_type() const { return _narrows_type; } void dump_on(outputStream *st) const { st->print("%s", _desc); } Newline for consistency with surrounding code. src/hotspot/share/opto/castnode.hpp line 92: > 90: const bool _floating; // Does this Cast depends on its control input or is it pinned? > 91: const bool _narrows_type; // Does this Cast narrows the type i.e. if input type is narrower can it be removed? 
> 92: const char* _desc; I thought the hotspot convention was to usually put the fields first, at the top of the class? src/hotspot/share/opto/castnode.hpp line 104: > 102: // NonFloatingNarrowingDependency is used when an array access is no longer dependent on a single range check (range > 103: // check smearing for instance) > 104: // FloatingNonNarrowingDependency is used after loop opts when Cast nodes' types are widen so Casts that only differ Suggestion: // FloatingNonNarrowingDependency is used after loop opts when Cast nodes' types are widened so Casts that only differ src/hotspot/share/opto/castnode.hpp line 110: > 108: static const DependencyType FloatingNonNarrowingDependency; > 109: static const DependencyType NonFloatingNarrowingDependency; > 110: static const DependencyType NonFloatingNonNarrowingDependency; Why not put the example at each definition? Would prevent repeating the names :) It would be good if we could have this section earlier up, so the code comments of the `DependencyType` class and this form a unit. At least link them. `NonFloatingNonNarrowingDependency` example: can you spell out the why? What could go wrong otherwise? Would the node float back into the loop maybe? What's wrong with that? `NonFloatingNarrowingDependency` more detail would be helpful. I would like to know why non floating, and why narrowing? Because that's what these examples are for, right? `FloatingNonNarrowingDependency` ah, maybe that answers one of my questions further up somewhere. If we don't have narrowing, then we should not fold away the cast because of the type, right? I think if we spell out which optimizations require which constraints, that could help a lot here. src/hotspot/share/opto/castnode.hpp line 122: > 120: ShouldNotReachHere(); > 121: return nullptr; > 122: } This always smells like a messed up class hierarchy, when I see default methods with "not implemented". But maybe we can't do much better, and I've done similar things recently.
A short code comment could be helpful though. Suggestion: virtual ConstraintCastNode* make_with(Node* parent, const TypeInteger* type, const DependencyType& dependency) const { ShouldNotReachHere(); // Only implemented for CastII and CastLL return nullptr; } src/hotspot/share/opto/castnode.hpp line 146: > 144: virtual uint ideal_reg() const = 0; > 145: bool carry_dependency() const { return !_dependency.cmp(FloatingNarrowingDependency); } > 146: virtual bool depends_only_on_test() const { return _dependency.floating(); } Why not rename it to `is_dependency_floating`? That may be more helpful at the use site. test/hotspot/jtreg/compiler/c2/irTests/TestPushAddThruCast.java line 95: > 93: j += Objects.checkIndex(i - 1, length); > 94: return j; > 95: } Why not add an additional IR rule that checks that there are more casts before they get commoned? Just for completenes ;) ------------- Changes requested by epeter (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24575#pullrequestreview-3451986831 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517197209 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517271796 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517301300 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517315011 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517336133 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517344615 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517236142 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517203781 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517366170 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517205971 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517200829 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517251068 PR Review 
Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517260839 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517355725 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517299467 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517370224 From epeter at openjdk.org Wed Nov 12 08:33:29 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 12 Nov 2025 08:33:29 GMT Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v3] In-Reply-To: References: Message-ID: <2RJF9zYoCEnq2riltw2AoWpBYa7T2F7eXEQRTIQJT_w=.f9001c12-2fe9-4432-9aba-d4f0eb59e5dd@github.com> On Wed, 12 Nov 2025 07:24:01 GMT, Emanuel Peter wrote: >> Roland Westrelin has updated the pull request incrementally with three additional commits since the last revision: >> >> - review >> - infinite loop in gvn fix >> - renaming > > src/hotspot/share/opto/castnode.cpp line 47: > >> 45: Node* ConstraintCastNode::Identity(PhaseGVN* phase) { >> 46: if (!_dependency.narrows_type()) { >> 47: return this; > > Can you please add a code comment? I don't understand it right away :/ Maybe I'm slowly starting to understand... but a code comment would still help a lot here. We are trying to find a dominating cast that has the same or narrower type, and replace with that one. We are only allowed to do that if we have a narrowing cast, because ... > src/hotspot/share/opto/castnode.cpp line 277: > >> 275: >> 276: CastIINode* CastIINode::pin_array_access_node() const { >> 277: assert(depends_only_on_test(), "already pinned"); > > Would this not be more readable? > > Suggestion: > > assert(is_dependency_floating(), "already pinned"); Because it seems we are talking about floating vs pinned here. Adding yet another concept of "depending only on test" would require further explanation / definition. 
> src/hotspot/share/opto/castnode.cpp line 588: > >> 586: >> 587: // If both inputs are not constant then, with the Cast pushed through the Add/Sub, the cast gets less precised types, >> 588: // and the resulting Add/Sub's type is wider than that of the Cast before pushing. > > I find this long sentence a bit complicated to read. Can you reformulate and maybe break it into smaller sentences? > It would also be good to explicitly say why that may require changing the dependency constraint. I wonder if you renamed `widen_type_dependency` to `with_non_narrowing`, and explained that this now prevents folding away the cast if input types are narrower, etc... that would maybe be more straight forward? I suppose your approach was to just "notify" the dependency that we have widened the type, and then the dependency manages what the implications are. But I find that approach a bit less straight forward, because we are not talking about widening the exact same cast, but a cast that has been pushed through an add/sub. Maybe you can manage to make a coherent argument though, up to you. > src/hotspot/share/opto/castnode.cpp line 625: > >> 623: if (!phase->C->post_loop_opts_phase()) { >> 624: return this_type; >> 625: } > > Honestly, I would prefer to see this "delay to post loop opts" to be done outside of `widen_type`. It would just make more sense there. What do you think? But maybe that is a refactoring for a separate RFE, and then not really worth it. > src/hotspot/share/opto/castnode.hpp line 53: > >> 51: _narrows_type(narrows_type), >> 52: _desc(desc) { >> 53: } > > Could you make the constructor private, and only expose the 4 static fields? That way, nobody comes to the strange idea to construct one of these themselves ;) That would probably require moving the 4 static fields into this class here. Example: `ConstraintCastNode::DependencyType::FloatingNarrowing` Just an idea. Maybe you have a different solution. But a private constructor would be great for sure. 
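A minimal sketch of the private-constructor idea above, assuming the four constants move into the class as nested statics. The short constant names, `is_floating()`, `is_narrowing()` and `desc()` are illustrative assumptions; `with_pinned_dependency()` and `with_non_narrowing()` follow the renames suggested in this review, not the current patch.

```cpp
#include <cassert>  // only needed by callers that check the transitions

// Hypothetical sketch, not the code in the PR: a DependencyType whose only
// instances are four class-scope constants.
class DependencyType {
private:
  const bool _floating;      // may the Cast float, or is it pinned?
  const bool _narrows_type;  // may the Cast be removed once it no longer narrows?
  const char* const _desc;

  // Private: only the static definitions below can construct instances.
  DependencyType(bool floating, bool narrows_type, const char* desc)
    : _floating(floating), _narrows_type(narrows_type), _desc(desc) {}

public:
  static const DependencyType FloatingNarrowing;
  static const DependencyType FloatingNonNarrowing;
  static const DependencyType NonFloatingNarrowing;
  static const DependencyType NonFloatingNonNarrowing;

  // No copies: every use refers to one of the four constants above.
  DependencyType(const DependencyType&) = delete;
  DependencyType& operator=(const DependencyType&) = delete;

  // Boolean queries carry an is_ prefix.
  bool is_floating() const { return _floating; }
  bool is_narrowing() const { return _narrows_type; }
  const char* desc() const { return _desc; }

  // Conversions carry a with_ prefix and hand back another constant.
  const DependencyType& with_pinned_dependency() const {
    return _narrows_type ? NonFloatingNarrowing : NonFloatingNonNarrowing;
  }
  const DependencyType& with_non_narrowing() const {
    return _floating ? FloatingNonNarrowing : NonFloatingNonNarrowing;
  }
};

// Definitions are in class scope, so they may use the private constructor.
const DependencyType DependencyType::FloatingNarrowing(true, true, "floating narrowing");
const DependencyType DependencyType::FloatingNonNarrowing(true, false, "floating non-narrowing");
const DependencyType DependencyType::NonFloatingNarrowing(false, true, "non-floating narrowing");
const DependencyType DependencyType::NonFloatingNonNarrowing(false, false, "non-floating non-narrowing");
```

With this shape, an attempt such as `DependencyType d(true, false, "x");` outside the class does not compile, and a Cast node would hold a `const DependencyType&` to one of the four constants.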
> src/hotspot/share/opto/castnode.hpp line 146: > >> 144: virtual uint ideal_reg() const = 0; >> 145: bool carry_dependency() const { return !_dependency.cmp(FloatingNarrowingDependency); } >> 146: virtual bool depends_only_on_test() const { return _dependency.floating(); } > > Why not rename it to `is_dependency_floating`? That may be more helpful at the use site. Otherwise you have to give an explanation/code comment about the concept "depending on test", and define it in terms of floating / non-floating. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517268181 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517304372 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517331973 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517345703 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517217941 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517358981 From epeter at openjdk.org Wed Nov 12 08:33:31 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 12 Nov 2025 08:33:31 GMT Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v3] In-Reply-To: <2RJF9zYoCEnq2riltw2AoWpBYa7T2F7eXEQRTIQJT_w=.f9001c12-2fe9-4432-9aba-d4f0eb59e5dd@github.com> References: <2RJF9zYoCEnq2riltw2AoWpBYa7T2F7eXEQRTIQJT_w=.f9001c12-2fe9-4432-9aba-d4f0eb59e5dd@github.com> Message-ID: On Wed, 12 Nov 2025 08:19:21 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/castnode.cpp line 625: >> >>> 623: if (!phase->C->post_loop_opts_phase()) { >>> 624: return this_type; >>> 625: } >> >> Honestly, I would prefer to see this "delay to post loop opts" to be done outside of `widen_type`. It would just make more sense there. What do you think? > > But maybe that is a refactoring for a separate RFE, and then not really worth it. 
But conceptually, we want to say: if we are in post loop opts, then widen the types. Now it looks like we want to widen always ... but then we check for post loop opts inside the method and bail out anyway. Not very transparent. Another idea: rename the method to `widen_type_in_post_loop_opts`. Totally up to you though. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2517350982 From kbarrett at openjdk.org Wed Nov 12 08:39:04 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 12 Nov 2025 08:39:04 GMT Subject: RFR: 8369187: Add wrapper for that forbids use of global allocation and deallocation functions In-Reply-To: <6SO4sjrZNkftwWmo-9j7fPvB5K7e24kpCqt5UzSWWqQ=.bbb0560f-2e54-4f68-a67f-10fa35bf0afb@github.com> References: <6SO4sjrZNkftwWmo-9j7fPvB5K7e24kpCqt5UzSWWqQ=.bbb0560f-2e54-4f68-a67f-10fa35bf0afb@github.com> Message-ID: On Wed, 12 Nov 2025 07:38:53 GMT, Stefan Karlsson wrote: >> 8369187: Add wrapper for that forbids use of global allocation and deallocation functions >> >> Please review this change that adds `cppstdlib/new.hpp` as a wrapper for >> including ``. All existing inclusions of `` are changed to include >> the new wrapper. >> >> In additional to including ``, this wrapper also provides deprecation >> declarations to prevent the use of some facilities by HotSpot code. >> >> However, those deprecations need to be conditionalized to not apply to gtests, >> so this change also adds a macro definition provided by the build system for >> use in detecting that a header is being included by a gtest. 
>> >> Testing: mach5 tier1 > > src/hotspot/share/cppstdlib/new.hpp line 79: > >> 77: // Visual Studio => error C2370: '...': redefinition; different storage class >> 78: #ifndef TARGET_COMPILER_visCPP >> 79: [[deprecated]] extern const size_t hardware_destructive_interference_size; > > At cppreference this is declared as: > > inline constexpr size_t hardware_destructive_interference_size > > > Is that why you're getting the Visual Studio error? It can't be redeclared with the `[[deprecated]]` attribute using that form. `constexpr` requires an initializer, and what should the value be? And all `inline` declarations need to be "exactly the same" (which has a technical meaning somewhere that talks about equivalent token sequences). Removing `extern`, adding `inline`, or both leads to gcc to (quite correctly, I think) rejecting it as a redefinition. I think the form being used here does have the same storage class. I think both forms declare a variable with namespace scope and external linkage; C++17 6.5. And both gcc and clang accept it. I _think_ it's an MSVC bug of being overly restrictive, rather than both gcc and clang being overly permissive. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28250#discussion_r2517393464 From azafari at openjdk.org Wed Nov 12 09:18:48 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Wed, 12 Nov 2025 09:18:48 GMT Subject: RFR: 8358957: [ubsan]: The assert in layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v8] In-Reply-To: References: Message-ID: > Avoid using loop and UB in left-shift operation as suggested by Kim's comment in the JBS-issue. 
> > Tests: > mach5 tiers 1-5 {macosx-aarch64, linux-x64, windows-x64} x {debug, product} Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: disabling the msvc warning remoeved ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27288/files - new: https://git.openjdk.org/jdk/pull/27288/files/b00636bf..59b20a20 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27288&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27288&range=06-07 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/27288.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27288/head:pull/27288 PR: https://git.openjdk.org/jdk/pull/27288 From azafari at openjdk.org Wed Nov 12 09:18:50 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Wed, 12 Nov 2025 09:18:50 GMT Subject: RFR: 8358957: [ubsan]: The assert in layout_helper_boolean_diffbit() in klass.hpp needs UB to fail [v7] In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 04:10:28 GMT, Kim Barrett wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> Windows warning bypassed > > src/hotspot/share/oops/klass.hpp line 515: > >> 513: >> 514: // VS warns (C4146) about unary minus of unsigned. >> 515: PRAGMA_DISABLE_MSVC_WARNING(4146) > > It seems that some time ago (JDK-8254072) 4146 was disabled for JVM MSVC build: > https://github.com/openjdk/jdk/blame/8531fa146be1da5e96c0f23091882a27c67d7893/make/hotspot/lib/CompileJvm.gmk#L117 > > So the warning isn't a concern after all. Sorry for the false alarm and > misleading guidance. > > This was also not the correct way to introduce the warning suppression had it > been needed, but that's moot. Thanks for the comment. You're right, I used the function scope of the pragma instead of the instruction scope. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27288#discussion_r2517522264 From eastigeevich at openjdk.org Wed Nov 12 09:38:03 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Wed, 12 Nov 2025 09:38:03 GMT Subject: RFR: 8371649: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 21:35:42 GMT, Evgeny Astigeevich wrote: > The instruction cache maintenance function internally handles any required barriers. > This means we don't need any barriers before calling it. > This PR removes a redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation. Hi Andrew(@theRealAph), Can you please take a look? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28244#issuecomment-3520985271 From cnorrbin at openjdk.org Wed Nov 12 09:50:37 2025 From: cnorrbin at openjdk.org (Casper Norrbin) Date: Wed, 12 Nov 2025 09:50:37 GMT Subject: RFR: 8367319: Add os interfaces to get machine and container values separately [v3] In-Reply-To: References: Message-ID: <90BsIFGnC7wfP7mO7kOcDArByL17pNbTokjZiTs_7qQ=.e67dbb82-faf4-4364-9301-67e1e2344eb0@github.com> > Hi everyone, > > The current `os::` layer on Linux hides whether the JVM is running inside a container or not. When running inside a container, we replace machine values with container values where applicable, without telling the user of these methods. For most use cases, this is fine, users only care about the returned value. But for other use cases, where the value originated is important. Two examples: > > - A user might need the physical cpu count of the machine, but `os::active_processor_count()` only returns the limited container value, which also represents something slightly different. > - A user might want the container memory limit and the physical RAM size, but `os::physical_memory()` only gives one number. 
> > To solve this, every function that mixed container/machine values now has two explicit variants, prefixed with `machine_` and `container_`. These use the bool return + out-parameter interface, with the container functions only working on Linux. The original methods remain and continue to return the same mixed values. > > In addition, container-specific accessors for the memory soft limit and the memory throttle limit have been added, as these values matter when running in a containerized environment. > > `OSContainer::active_processor_count()` has also been changed to return `double` instead of `int`. The previous implementation rounded the quota/period ratio up to produce an integer for `os::active_processor_count()`. Now, when the value is requested directly from the new container API it makes more sense to preserve this fraction rather than rounding it up. We can thus keep the exact value for those that want it, then round it up to keep the same behavior in `os::active_processor_count()`. > > Testing: > - Oracle tiers 1-5 > - Container tests on cgroup v1 and v2 hosts. Casper Norrbin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains four additional commits since the last revision: - Move methods to Machine/Container inner classes + clarifying documentation - Merge branch 'master' into separate-container-machine-values - Fixed print type - separate-machine-container-functions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27646/files - new: https://git.openjdk.org/jdk/pull/27646/files/e59ff7c4..2cc54357 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27646&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27646&range=01-02 Stats: 283171 lines in 2871 files changed: 181285 ins; 61539 del; 40347 mod Patch: https://git.openjdk.org/jdk/pull/27646.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27646/head:pull/27646 PR: https://git.openjdk.org/jdk/pull/27646 From cnorrbin at openjdk.org Wed Nov 12 09:58:12 2025 From: cnorrbin at openjdk.org (Casper Norrbin) Date: Wed, 12 Nov 2025 09:58:12 GMT Subject: RFR: 8367319: Add os interfaces to get machine and container values separately [v3] In-Reply-To: <90BsIFGnC7wfP7mO7kOcDArByL17pNbTokjZiTs_7qQ=.e67dbb82-faf4-4364-9301-67e1e2344eb0@github.com> References: <90BsIFGnC7wfP7mO7kOcDArByL17pNbTokjZiTs_7qQ=.e67dbb82-faf4-4364-9301-67e1e2344eb0@github.com> Message-ID: On Wed, 12 Nov 2025 09:50:37 GMT, Casper Norrbin wrote: >> Hi everyone, >> >> The current `os::` layer on Linux hides whether the JVM is running inside a container or not. When running inside a container, we replace machine values with container values where applicable, without telling the user of these methods. For most use cases, this is fine, users only care about the returned value. But for other use cases, where the value originated is important. Two examples: >> >> - A user might need the physical cpu count of the machine, but `os::active_processor_count()` only returns the limited container value, which also represents something slightly different. 
>> - A user might want the container memory limit and the physical RAM size, but `os::physical_memory()` only gives one number. >> >> To solve this, every function that mixed container/machine values now has two explicit variants, prefixed with `machine_` and `container_`. These use the bool return + out-parameter interface, with the container functions only working on Linux. The original methods remain and continue to return the same mixed values. >> >> In addition, container-specific accessors for the memory soft limit and the memory throttle limit have been added, as these values matter when running in a containerized environment. >> >> `OSContainer::active_processor_count()` has also been changed to return `double` instead of `int`. The previous implementation rounded the quota/period ratio up to produce an integer for `os::active_processor_count()`. Now, when the value is requested directly from the new container API it makes more sense to preserve this fraction rather than rounding it up. We can thus keep the exact value for those that want it, then round it up to keep the same behavior in `os::active_processor_count()`. >> >> Testing: >> - Oracle tiers 1-5 >> - Container tests on cgroup v1 and v2 hosts. > > Casper Norrbin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Move methods to Machine/Container inner classes + clarifying documentation > - Merge branch 'master' into separate-container-machine-values > - Fixed print type > - separate-machine-container-functions I want to thank everyone for the great input and thoughtful discussion. I really appreciate the depth of feedback. I've moved ahead with inner `os::Machine` and `os::Container` classes.
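As a rough illustration of the bool return + out-parameter shape and the round-up behaviour described in the PR text quoted above, here is a standalone sketch. Everything in it is hypothetical: `CgroupCpuLimit`, `container_processor_count()` and `mixed_processor_count()` are made-up names for this example and are not the functions added by the patch.

```cpp
#include <cassert>  // for callers that check the results
#include <cmath>

// Hypothetical stand-in for values that would be read from cgroup files.
struct CgroupCpuLimit {
  bool in_container;  // whether a container limit applies at all
  long quota_us;      // CPU time allowed per period; <= 0 means no limit
  long period_us;     // length of one period
};

// Container view: returns false when there is no container value to report,
// otherwise stores the exact (possibly fractional) limit in 'out'.
bool container_processor_count(const CgroupCpuLimit& cg, double& out) {
  if (!cg.in_container || cg.quota_us <= 0 || cg.period_us <= 0) {
    return false;
  }
  out = static_cast<double>(cg.quota_us) / static_cast<double>(cg.period_us);
  return true;
}

// Mixed view: the container limit rounded up when one exists, otherwise the
// machine value passed in by the caller.
int mixed_processor_count(const CgroupCpuLimit& cg, int machine_cpus) {
  double limit = 0.0;
  if (container_processor_count(cg, limit)) {
    return static_cast<int>(std::ceil(limit));
  }
  return machine_cpus;
}
```

In this sketch, a quota of 150000 with a period of 100000 is reported as 1.5 by the container accessor and as 2 by the mixed one, while a caller outside a container simply gets the machine count back.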
This keeps the `os::` methods unchanged for anyone who doesn't care about the source, but makes it much clearer how to get either the OS/system view or the container's limits when you need that distinction. Of course, there's no perfect split here. Plenty of ambiguity remains when it comes to virtualization, affinity masks, and what we call "the machine". I have added comments and documentation that try to spell out exactly what these classes represent, what each method reports, what's affected by containers and what isn't, and the best context for using each one. For the moment, this only targets cgroup-based container environments, but I'm open to revisiting things if other environments gain traction in the future. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27646#issuecomment-3521064517 From jsjolen at openjdk.org Wed Nov 12 10:00:03 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 12 Nov 2025 10:00:03 GMT Subject: RFR: 8371093: Assert "section header string table should be loaded" failed on debug VM [v2] In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 08:00:30 GMT, Yasumasa Suenaga wrote: >> When the crash happens in the function in vDSO on Linux, native call stacks in hs_err log wouldn't be generated as following. See [hs_err log on JBS](https://bugs.openjdk.org/secure/attachment/116796/hs_err_pid4018.log) for details. Reproducer is also attached on JBS ([Test.java](https://bugs.openjdk.org/secure/attachment/116797/Test.java)) >> >> >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [linux-vdso.so.1+0xe69] >> [error occurred during error reporting (printing native stack (with source info)), id 0xe0000000, Internal Error (/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536)] >> >> Retrying call stack printing without source information...
>> >> [error occurred during error reporting (retry printing native stack (no source info)), id 0xb, SIGSEGV (0xb) at pc=0x00007fba8075f791] >> >> >> When I checked back trace on GDB, it failed at `assert`. >> >> #12 0x00007fba7e76bd00 in report_vm_error (file=file at entry=0x7fba7fed7b40 "/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp", >> line=line at entry=536, error_msg=error_msg at entry=0x7fba80019575 "assert(false) failed", >> detail_fmt=detail_fmt at entry=0x7fba7fed7bf0 "section header string table should be loaded") >> at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/debug.cpp:196 >> #13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...) >> at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536 >> #14 ElfFile::read_debug_info (this=this at entry=0x7fba782a1650, debug_info=debug_info at entry=0x7fba7dd05150) >> at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:407 >> >> >> >> (gdb) f 13 >> #13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...) >> at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536 >> 536 assert(false, "section header string table should be loaded"); >> >> >> vDSO is not a regular ELF, so it should be skipped here. > > Yasumasa Suenaga has updated the pull request incrementally with two additional commits since the last revision: > > - Undo unnecessary change > - Check the result of opening file LGTM! Thanks ------------- Marked as reviewed by jsjolen (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28102#pullrequestreview-3452582087 From phubner at openjdk.org Wed Nov 12 10:03:07 2025 From: phubner at openjdk.org (Paul =?UTF-8?B?SMO8Ym5lcg==?=) Date: Wed, 12 Nov 2025 10:03:07 GMT Subject: RFR: 8371093: Assert "section header string table should be loaded" failed on debug VM [v2] In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 08:00:30 GMT, Yasumasa Suenaga wrote: >> When the crash happens in the function in vDSO on Linux, native call stacks in hs_err log wouldn't be generated as following. See [hs_err log on JBS](https://bugs.openjdk.org/secure/attachment/116796/hs_err_pid4018.log) for details. Reproducer is also attached on JBS ([Test.java](https://bugs.openjdk.org/secure/attachment/116797/Test.java)) >> >> >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> C [linux-vdso.so.1+0xe69] >> [error occurred during error reporting (printing native stack (with source info)), id 0xe0000000, Internal Error (/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536)] >> >> Retrying call stack printing without source information... >> >> [error occurred during error reporting (retry printing native stack (no source info)), id 0xb, SIGSEGV (0xb) at pc=0x00007fba8075f791] >> >> >> When I checked back trace on GDB, it failed at `assert`. >> >> #12 0x00007fba7e76bd00 in report_vm_error (file=file at entry=0x7fba7fed7b40 "/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp", >> line=line at entry=536, error_msg=error_msg at entry=0x7fba80019575 "assert(false) failed", >> detail_fmt=detail_fmt at entry=0x7fba7fed7bf0 "section header string table should be loaded") >> at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/debug.cpp:196 >> #13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...) 
>> at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536 >> #14 ElfFile::read_debug_info (this=this at entry=0x7fba782a1650, debug_info=debug_info at entry=0x7fba7dd05150) >> at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:407 >> >> >> >> (gdb) f 13 >> #13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...) >> at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536 >> 536 assert(false, "section header string table should be loaded"); >> >> >> vDSO is not a regular ELF, so it should be skipped here. > > Yasumasa Suenaga has updated the pull request incrementally with two additional commits since the last revision: > > - Undo unnecessary change > - Check the result of opening file Thanks for looking into this! ------------- Marked as reviewed by phubner (Author). PR Review: https://git.openjdk.org/jdk/pull/28102#pullrequestreview-3452598612 From aph at openjdk.org Wed Nov 12 10:36:03 2025 From: aph at openjdk.org (Andrew Haley) Date: Wed, 12 Nov 2025 10:36:03 GMT Subject: RFR: 8371649: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 21:35:42 GMT, Evgeny Astigeevich wrote: > The instruction cache maintenance function internally handles any required barriers. > This means we don't need any barriers before calling it. > This PR removes a redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation. Marked as reviewed by aph (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/28244#pullrequestreview-3452753678 From qamai at openjdk.org Wed Nov 12 10:37:11 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Wed, 12 Nov 2025 10:37:11 GMT Subject: RFR: 8367341: C2: apply KnownBits and unsigned bounds to And / Or operations [v5] In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 10:09:07 GMT, Quan Anh Mai wrote: >> Hi, >> >> This PR improves the implementation of `AndNode/OrNode/XorNode::Value` by taking advantages of the additional information in `TypeInt`. The implementation is pretty straightforward. A clever trick is that by analyzing the negative and positive ranges of a `TypeInt` separately, we have better info for the leading bits. I also implement gtest unit tests to verify the correctness and monotonicity of the inference functions. >> >> Please take a look and leave your reviews, thanks a lot. > > Quan Anh Mai has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: > > - Merge branch 'master' into andorxor > - Add assertion for the helper in CTPComparator > > Co-authored-by: Emanuel Peter > - remove std::hash > - remove unordered_map, add some comments for all_instances_size > - Emanuel's reviews > - Improve Value inferences of And, Or, Xor and implement gtest for general Value inferences May I have a second review, please? 
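As a rough illustration of the kind of inference the quoted description talks about, here is a minimal, self-contained sketch of known-bits transfer functions for And, Or and Xor, plus a helper that derives known leading zeros from an unsigned upper bound. This is not the PR's code: `KnownBits`, `known_and`, `known_or`, `known_xor`, `from_unsigned_max` and `sound_on_4_bits` are hypothetical names chosen for this example, and the real implementation works on HotSpot's `TypeInt`. The separate treatment of negative and positive ranges mentioned in the description is not modelled here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the known-bits part of a type:
// a bit set in `zeros` is known to be 0, a bit set in `ones` is known to be 1.
struct KnownBits {
  uint32_t zeros;
  uint32_t ones;
};

// Does the concrete value v agree with everything kb claims to know?
inline bool contains(KnownBits kb, uint32_t v) {
  return (v & kb.zeros) == 0 && (v & kb.ones) == kb.ones;
}

// x & y: a result bit is 0 if it is 0 in either input, 1 only if 1 in both.
inline KnownBits known_and(KnownBits a, KnownBits b) {
  return {a.zeros | b.zeros, a.ones & b.ones};
}

// x | y: a result bit is 1 if it is 1 in either input, 0 only if 0 in both.
inline KnownBits known_or(KnownBits a, KnownBits b) {
  return {a.zeros & b.zeros, a.ones | b.ones};
}

// x ^ y: a result bit is known only when it is known in both inputs.
inline KnownBits known_xor(KnownBits a, KnownBits b) {
  return {(a.zeros & b.zeros) | (a.ones & b.ones),
          (a.zeros & b.ones) | (a.ones & b.zeros)};
}

// For an unsigned value in [0, uhi], every bit above the highest set bit
// of uhi must be 0. Smear the highest set bit downwards to build the mask.
inline KnownBits from_unsigned_max(uint32_t uhi) {
  uint32_t mask = uhi;
  mask |= mask >> 1;
  mask |= mask >> 2;
  mask |= mask >> 4;
  mask |= mask >> 8;
  mask |= mask >> 16;
  return {~mask, 0u};
}

// Exhaustive soundness check on the low 4 bits: for every pair of inputs and
// every pair of concrete values they admit, the inferred result must admit
// the concrete result.
inline bool sound_on_4_bits() {
  for (uint32_t az = 0; az < 16; az++)
  for (uint32_t ao = 0; ao < 16; ao++) {
    if (az & ao) continue;  // a bit cannot be known 0 and known 1
    for (uint32_t bz = 0; bz < 16; bz++)
    for (uint32_t bo = 0; bo < 16; bo++) {
      if (bz & bo) continue;
      KnownBits a{az, ao}, b{bz, bo};
      for (uint32_t x = 0; x < 16; x++)
      for (uint32_t y = 0; y < 16; y++) {
        if (!contains(a, x) || !contains(b, y)) continue;
        if (!contains(known_and(a, b), x & y)) return false;
        if (!contains(known_or(a, b), x | y)) return false;
        if (!contains(known_xor(a, b), x ^ y)) return false;
      }
    }
  }
  return true;
}
```

The exhaustive check mirrors, in miniature, the idea of testing correctness of the inference functions over a small domain; a monotonicity check would be a separate loop and is left out of this sketch.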
------------- PR Comment: https://git.openjdk.org/jdk/pull/27618#issuecomment-3521239764 From eosterlund at openjdk.org Wed Nov 12 10:49:10 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Wed, 12 Nov 2025 10:49:10 GMT Subject: RFR: 8367319: Add os interfaces to get machine and container values separately [v3] In-Reply-To: <90BsIFGnC7wfP7mO7kOcDArByL17pNbTokjZiTs_7qQ=.e67dbb82-faf4-4364-9301-67e1e2344eb0@github.com> References: <90BsIFGnC7wfP7mO7kOcDArByL17pNbTokjZiTs_7qQ=.e67dbb82-faf4-4364-9301-67e1e2344eb0@github.com> Message-ID: On Wed, 12 Nov 2025 09:50:37 GMT, Casper Norrbin wrote: >> Hi everyone, >> >> The current `os::` layer on Linux hides whether the JVM is running inside a container or not. When running inside a container, we replace machine values with container values where applicable, without telling the user of these methods. For most use cases, this is fine: users only care about the returned value. But for other use cases, where the value originated is important. Two examples: >> >> - A user might need the physical CPU count of the machine, but `os::active_processor_count()` only returns the limited container value, which also represents something slightly different. >> - A user might want the container memory limit and the physical RAM size, but `os::physical_memory()` only gives one number. >> >> To solve this, every function that mixed container/machine values now has two explicit variants, prefixed with `machine_` and `container_`. These use the bool return + out-parameter interface, with the container functions only working on Linux. The original methods remain and continue to return the same mixed values. >> >> In addition, container-specific accessors for the memory soft limit and the memory throttle limit have been added, as these values matter when running in a containerized environment. >> >> `OSContainer::active_processor_count()` has also been changed to return `double` instead of `int`.
The previous implementation rounded the quota/period ratio up to produce an integer for `os::active_processor_count()`. Now, when the value is requested directly from the new container API it makes more sense to preserve this fraction rather than rounding it up. We can thus keep the exact value for those that want it, then round it up to keep the same behavior in `os::active_processor_count()`. >> >> Testing: >> - Oracle tiers 1-5 >> - Container tests on cgroup v1 and v2 hosts. > > Casper Norrbin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Move methods to Machine/Container inner classes + clarifying documentation > - Merge branch 'master' into separate-container-machine-values > - Fixed print type > - separate-machine-container-functions Marked as reviewed by eosterlund (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/27646#pullrequestreview-3452815985 From ayang at openjdk.org Wed Nov 12 11:04:21 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 12 Nov 2025 11:04:21 GMT Subject: RFR: 8370333: hotspot-unit-tests.md specifies wrong directory structure for tests In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 06:14:47 GMT, Kim Barrett wrote: > Please review this change to the HotSpot unit test documentation, fixing the > path where native tests are located. Marked as reviewed by ayang (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/28251#pullrequestreview-3452874281 From duke at openjdk.org Wed Nov 12 11:09:45 2025 From: duke at openjdk.org (Samuel Chee) Date: Wed, 12 Nov 2025 11:09:45 GMT Subject: RFR: 8365147: AArch64: Replace DMB + LD + DMB with LDAR for C1 volatile field loads [v3] In-Reply-To: References: Message-ID: > Replaces the DMB ISH + LD + DMB ISHLD sequence with LDAR for volatile field loads - for example, AtomicLong::get. > > This is valid, as originally the DMBs were necessary due to the case described here - https://bugs.openjdk.org/browse/JDK-8179954. As in the rare case where the LD can be reordered with an LDAR or STLR from the C2 implementation for stores and loads, these DMBs are required. > However, acquire/release operations use a sequentially consistent model which does not allow reordering between them. Hence, the LD can be replaced with an LDAR to disallow reordering with a STLR/LDAR and the first DMB can be removed. > > The LDAR has acquire semantics, so it's impossible for memory accesses after to be reordered before; the DMB ISHLD is not required. Therefore, a singular LDAR is sufficient. Samuel Chee has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: - Address review comments. Refine. Change-Id: I9cc0308300548c1892d39791e00b41ef13c95e63 - Merge from the main branch - Address review comments Change-Id: Ica13be8094ac0f057066042ef0a5ec5927b98dfd - Refine code generation for mem2reg_volatile The patch is contributed by @theRealAph. Change-Id: I7ab1854dd238cdce72a4ab218b5b4ee84ad39586 - 8365147: AArch64: Replace DMB + LD + DMB with LDAR for C1 volatile loads Replaces the DMB ISH + LD + DMB ISHLD sequence with LDAR for volatile field loads - for example, AtomicLong::get. This is valid, as originally the DMBs were necessary due to the case described here - https://bugs.openjdk.org/browse/JDK-8179954. 
As in the rare case where the LD can be reordered with an LDAR or STLR from the C2 implementation for stores and loads, these DMBs are required. However, acquire/release operations use a sequentially consistent model which does not allow reordering between them. Hence, the LD can be replaced with an LDAR to disallow reordering with a STLR/LDAR and the first DMB can be removed. The LDAR has acquire semantics, so it's impossible for memory accesses after to be reordered before; the DMB ISHLD is not required. Therefore, a singular LDAR is sufficient. This excludes floats and doubles, as they do not have equivalent load-acquire instructions. Change-Id: Ia93607f8bb20c2d974fe6b2e586dd3239bb2728c ------------- Changes: https://git.openjdk.org/jdk/pull/26748/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26748&range=02 Stats: 111 lines in 11 files changed: 81 ins; 11 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/26748.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26748/head:pull/26748 PR: https://git.openjdk.org/jdk/pull/26748 From qamai at openjdk.org Wed Nov 12 11:16:51 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Wed, 12 Nov 2025 11:16:51 GMT Subject: RFR: 8367341: C2: apply KnownBits and unsigned bounds to And / Or operations [v6] In-Reply-To: References: Message-ID: > Hi, > > This PR improves the implementation of `AndNode/OrNode/XorNode::Value` by taking advantages of the additional information in `TypeInt`. The implementation is pretty straightforward. A clever trick is that by analyzing the negative and positive ranges of a `TypeInt` separately, we have better info for the leading bits. I also implement gtest unit tests to verify the correctness and monotonicity of the inference functions. > > Please take a look and leave your reviews, thanks a lot. Quan Anh Mai has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains seven commits: - Merge branch 'master' into andorxor - Merge branch 'master' into andorxor - Add assertion for the helper in CTPComparator Co-authored-by: Emanuel Peter - remove std::hash - remove unordered_map, add some comments for all_instances_size - Emanuel's reviews - Improve Value inferences of And, Or, Xor and implement gtest for general Value inferences ------------- Changes: https://git.openjdk.org/jdk/pull/27618/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27618&range=05 Stats: 964 lines in 9 files changed: 630 ins; 313 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/27618.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27618/head:pull/27618 PR: https://git.openjdk.org/jdk/pull/27618 From eastigeevich at openjdk.org Wed Nov 12 11:25:05 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Wed, 12 Nov 2025 11:25:05 GMT Subject: RFR: 8371649: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation In-Reply-To: References: Message-ID: <7CsmmJSr0mHGDgPJXBm3pRHbCBpXyeCtHTKJ1-s3dyI=.6329cb7f-dbb0-4006-939b-2484e7196ac5@github.com> On Wed, 12 Nov 2025 10:33:10 GMT, Andrew Haley wrote: >> The instruction cache maintenance function internally handles any required barriers. >> This means we don't need any barriers before calling it. >> This PR removes a redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation. > > Marked as reviewed by aph (Reviewer). 
Thank you, @theRealAph JFYI, where it was found: https://bugs.openjdk.org/browse/JDK-8370947 ------------- PR Comment: https://git.openjdk.org/jdk/pull/28244#issuecomment-3521440299 From aph at openjdk.org Wed Nov 12 11:30:14 2025 From: aph at openjdk.org (Andrew Haley) Date: Wed, 12 Nov 2025 11:30:14 GMT Subject: RFR: 8365147: AArch64: Replace DMB + LD + DMB with LDAR for C1 volatile field loads [v3] In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 11:09:45 GMT, Samuel Chee wrote: >> Replaces the DMB ISH + LD + DMB ISHLD sequence with LDAR for volatile field loads - for example, AtomicLong::get. >> >> This is valid, as originally the DMBs were necessary due to the case described here - https://bugs.openjdk.org/browse/JDK-8179954. As in the rare case where the LD can be reordered with an LDAR or STLR from the C2 implementation for stores and loads, these DMBs are required. >> However, acquire/release operations use a sequentially consistent model which does not allow reordering between them. Hence, the LD can be replaced with an LDAR to disallow reordering with a STLR/LDAR and the first DMB can be removed. >> >> The LDAR has acquire semantics, so it's impossible for memory accesses after to be reordered before; the DMB ISHLD is not required. Therefore, a singular LDAR is sufficient. > > Samuel Chee has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Address review comments. Refine. > > Change-Id: I9cc0308300548c1892d39791e00b41ef13c95e63 > - Merge from the main branch > - Address review comments > > Change-Id: Ica13be8094ac0f057066042ef0a5ec5927b98dfd > - Refine code generation for mem2reg_volatile > > The patch is contributed by @theRealAph. 
> > Change-Id: I7ab1854dd238cdce72a4ab218b5b4ee84ad39586 > - 8365147: AArch64: Replace DMB + LD + DMB with LDAR for C1 volatile loads > > Replaces the DMB ISH + LD + DMB ISHLD sequence with LDAR > for volatile field loads - for example, AtomicLong::get. > > This is valid, as originally the DMBs were necessary due to > the case described here - https://bugs.openjdk.org/browse/JDK-8179954. > As in the rare case where the LD can be reordered with an LDAR > or STLR from the C2 implementation for stores and loads, these > DMBs are required. > However, acquire/release operations use a sequentially consistent model > which does not allow reordering between them. Hence, the LD can be > replaced with an LDAR to disallow reordering with a STLR/LDAR > and the first DMB can be removed. > > The LDAR has acquire semantics, so it's impossible for > memory accesses after to be reordered before; the DMB ISHLD is > not required. Therefore, a singular LDAR is sufficient. > > This excludes floats and doubles, as they do not have > equivalent load-acquire instructions. > > Change-Id: Ia93607f8bb20c2d974fe6b2e586dd3239bb2728c src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp line 948: > 946: } > 947: > 948: void LIR_Assembler::load_generic(LIR_Address *from_addr, LIR_Opr dest, Suggestion: void LIR_Assembler::load_relaxed(LIR_Address *from_addr, LIR_Opr dest, Standard terminology. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26748#discussion_r2517938298 From duke at openjdk.org Wed Nov 12 11:33:10 2025 From: duke at openjdk.org (Ruben) Date: Wed, 12 Nov 2025 11:33:10 GMT Subject: RFR: 8365147: AArch64: Replace DMB + LD + DMB with LDAR for C1 volatile field loads [v3] In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 11:09:45 GMT, Samuel Chee wrote: >> Replaces the DMB ISH + LD + DMB ISHLD sequence with LDAR for volatile field loads - for example, AtomicLong::get. 
>> >> This is valid, as originally the DMBs were necessary due to the case described here - https://bugs.openjdk.org/browse/JDK-8179954. As in the rare case where the LD can be reordered with an LDAR or STLR from the C2 implementation for stores and loads, these DMBs are required. >> However, acquire/release operations use a sequentially consistent model which does not allow reordering between them. Hence, the LD can be replaced with an LDAR to disallow reordering with a STLR/LDAR and the first DMB can be removed. >> >> The LDAR has acquire semantics, so it's impossible for memory accesses after to be reordered before; the DMB ISHLD is not required. Therefore, a singular LDAR is sufficient. > > Samuel Chee has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Address review comments. Refine. > > Change-Id: I9cc0308300548c1892d39791e00b41ef13c95e63 > - Merge from the main branch > - Address review comments > > Change-Id: Ica13be8094ac0f057066042ef0a5ec5927b98dfd > - Refine code generation for mem2reg_volatile > > The patch is contributed by @theRealAph. > > Change-Id: I7ab1854dd238cdce72a4ab218b5b4ee84ad39586 > - 8365147: AArch64: Replace DMB + LD + DMB with LDAR for C1 volatile loads > > Replaces the DMB ISH + LD + DMB ISHLD sequence with LDAR > for volatile field loads - for example, AtomicLong::get. > > This is valid, as originally the DMBs were necessary due to > the case described here - https://bugs.openjdk.org/browse/JDK-8179954. > As in the rare case where the LD can be reordered with an LDAR > or STLR from the C2 implementation for stores and loads, these > DMBs are required. > However, acquire/release operations use a sequentially consistent model > which does not allow reordering between them. Hence, the LD can be > replaced with an LDAR to disallow reordering with a STLR/LDAR > and the first DMB can be removed. 
> > The LDAR has acquire semantics, so it's impossible for > memory accesses after to be reordered before; the DMB ISHLD is > not required. Therefore, a singular LDAR is sufficient. > > This excludes floats and doubles, as they do not have > equivalent load-acquire instructions. > > Change-Id: Ia93607f8bb20c2d974fe6b2e586dd3239bb2728c I've run `java -jar jcstress.jar` (revision 1d143cbd430f4cca63a8f0c8c1fad3aabc065421) for this PR combined with the https://github.com/openjdk/jdk/pull/26000 - with `-UseLSE` and `+UseLSE` with these outcomes respectively: - ``` Failed tests: No matches. Error tests: No matches. All remaining tests: 4945 matching test results. ``` - ``` Failed tests: No matches. Error tests: No matches. All remaining tests: 4955 matching test results. ``` ------------- PR Comment: https://git.openjdk.org/jdk/pull/26748#issuecomment-3521467297 From vklang at openjdk.org Wed Nov 12 12:54:34 2025 From: vklang at openjdk.org (Viktor Klang) Date: Wed, 12 Nov 2025 12:54:34 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v11] In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 17:48:43 GMT, Alan Bateman wrote: >> Implementation changes for [JEP 500: Prepare to Make Final Mean Final](https://openjdk.org/jeps/500). >> >> Field.set (and Lookup.unreflectSetter) are changed to allow/warn/debug/deny when mutating a final instance field. JFR event recorded if final field mutated. Spec updates to Field.set, Field.setAccessible and Module.addOpens to align with the proposal in the JEP. >> >> HotSpot is updated to add support for the new command line options. To aid diagnosability, -Xcheck:jni reports a warning and -Xlog:jni=debug logs a message to help identity JNI code that mutates finals. 
For now, JNI code is allowed to set the "write-protected" fields System.in/out/err without a warning, we can re-visit once we change the System.setIn/setOut/setErr methods to not use JNI (I prefer to keep this separate to this PR because there is a small startup regression to address when changing System.setXXX). >> >> There are many new tests. A small number of existing tests are changed to run /othervm as reflectively opening a package isn't sufficient. Changing the tests to /othervm means that jtreg will launch the agent with the command line options to open the package. >> >> Testing: tier1-6 > > Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 50 commits: > > - Remove dup end body tag > - Change FinalFieldMutationEvent so that caller is top frame in stack trace > - Merge branch 'master' into JDK-8353835 > - Review feedback: Add tests for setting internal properties, improve links in Mutation methods page > - Merge branch 'master' into JDK-8353835 > - Merge branch 'master' into JDK-8353835 > - Fix typo in test comment > - Merge branch 'master' into JDK-8353835 > - Merge branch 'master' into JDK-8353835 > - Suppress warnings from some tests > - ... and 40 more: https://git.openjdk.org/jdk/compare/2902436f...b22947c7 src/java.base/share/classes/java/lang/reflect/Field.java line 1543: > 1541: * the given possibly-null caller. > 1542: */ > 1543: private String finalFieldMutationWarning(Class caller, boolean unreflect) { It may make sense to have this method return a StringBuilder instance (and use it internally before returning it) as that would cut down on extra String-instance creation. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2518192839 From stefank at openjdk.org Wed Nov 12 13:16:19 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 12 Nov 2025 13:16:19 GMT Subject: RFR: 8369187: Add wrapper for <new> that forbids use of global allocation and deallocation functions In-Reply-To: References: <6SO4sjrZNkftwWmo-9j7fPvB5K7e24kpCqt5UzSWWqQ=.bbb0560f-2e54-4f68-a67f-10fa35bf0afb@github.com> Message-ID: On Wed, 12 Nov 2025 08:36:32 GMT, Kim Barrett wrote: >> src/hotspot/share/cppstdlib/new.hpp line 79: >> >>> 77: // Visual Studio => error C2370: '...': redefinition; different storage class >>> 78: #ifndef TARGET_COMPILER_visCPP >>> 79: [[deprecated]] extern const size_t hardware_destructive_interference_size; >> >> At cppreference this is declared as: >> >> inline constexpr size_t hardware_destructive_interference_size >> >> >> Is that why you're getting the Visual Studio error? > > It can't be redeclared with the `[[deprecated]]` attribute using that form. > `constexpr` requires an initializer, and what should the value be? And all > `inline` declarations need to be "exactly the same" (which has a technical > meaning somewhere that talks about equivalent token sequences). > > Removing `extern`, adding `inline`, or both leads gcc to (quite correctly, > I think) reject it as a redefinition. > > I think the form being used here does have the same storage class. I think > both forms declare a variable with namespace scope and external linkage; C++17 > 6.5. And both gcc and clang accept it. I _think_ it's an MSVC bug of being > overly restrictive, rather than both gcc and clang being overly permissive. I'm still not convinced that the above does what you intended to do here, but the result seems to be the same so I guess that's fine.
If I compile a small test:

#include <stdio.h>

[[deprecated]] inline constexpr int my_deprecated = 1;

int main() {
  printf("my_deprecated: %d\n", my_deprecated);
  return 0;
}

I get:

$ g++ -std=c++17 -Wall -Wextra -pedantic h.cpp
h.cpp:6:33: warning: 'my_deprecated' is deprecated [-Wdeprecated-declarations]
    6 |   printf("my_deprecated: %d\n", my_deprecated);
      |                                 ^
h.cpp:3:3: note: 'my_deprecated' has been explicitly marked deprecated here
    3 | [[deprecated]] inline constexpr int my_deprecated = 1;
      |   ^
1 warning generated.

But if I try to compile something similar to the above code I don't get the deprecated warning:

#include <new>
#include <stdio.h>

namespace std {
[[deprecated]] extern const size_t hardware_destructive_interference_size;
};

int main() {
  printf("x = %zu\n", std::hardware_destructive_interference_size);
  return 0;
}

Now I get:

$ g++ -std=c++17 -Wall -Wextra -pedantic g.cpp
g.cpp:9:28: error: reference to 'hardware_destructive_interference_size' is ambiguous
    9 |   printf("x = %zu\n", std::hardware_destructive_interference_size);
      |                       ~~~~~^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/__new/interference_size.h:25:25: note: candidate found by name lookup is 'std::__1::hardware_destructive_interference_size'
   25 | inline constexpr size_t hardware_destructive_interference_size = __GCC_DESTRUCTIVE_SIZE;
      |                         ^
g.cpp:5:36: note: candidate found by name lookup is 'std::hardware_destructive_interference_size'
    5 | [[deprecated]] extern const size_t hardware_destructive_interference_size;
      |                                    ^
1 error generated.

The end result is that we get an error about an ambiguous lookup and not a warning that this has been deprecated.
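One plausible reading of the ambiguity error above is that the library declares the real variable inside an inline namespace (`std::__1` in the diagnostic), so a declaration written directly in `std` introduces a second, distinct variable instead of redeclaring the first. The sketch below reproduces that shape with a made-up library; `lib`, `v1`, `interference_size` and `real_value` are hypothetical names for illustration only, and the variable is a plain `constexpr` rather than `inline constexpr` to keep the example independent of the language level.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical library laid out like the one in the diagnostic: the real
// variable lives in an inline (versioning) namespace nested inside `lib`.
namespace lib {
inline namespace v1 {
constexpr std::size_t interference_size = 64;
}  // namespace v1
}  // namespace lib

// An attempted "redeclaration" written directly in `lib`. Because the
// original lives in lib::v1, this declares a new variable lib::interference_size
// and does not attach the attribute to lib::v1::interference_size.
namespace lib {
[[deprecated]] extern const std::size_t interference_size;
}  // namespace lib

// Naming the variable through the inline namespace is still unambiguous.
constexpr std::size_t real_value() { return lib::v1::interference_size; }

// By contrast, the following would be rejected, because lookup of
// lib::interference_size finds both lib::interference_size and
// lib::v1::interference_size:
//
//   std::size_t bad() { return lib::interference_size; }
```

Under this reading, whether the deprecation attribute ever takes effect depends on whether the standard library in use places the variable directly in `std` or in an inline namespace, which would explain why the outcome differs between toolchains.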
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28250#discussion_r2518263890 From aph at openjdk.org Wed Nov 12 14:06:28 2025 From: aph at openjdk.org (Andrew Haley) Date: Wed, 12 Nov 2025 14:06:28 GMT Subject: RFR: 8371649: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 10:33:10 GMT, Andrew Haley wrote: >> The instruction cache maintenance function internally handles any required barriers. >> This means we don't need any barriers before calling it. >> This PR removes a redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation. > > Marked as reviewed by aph (Reviewer). > Thank you, @theRealAph JFYI, where it was found: https://bugs.openjdk.org/browse/JDK-8370947 That's fascinating, thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28244#issuecomment-3522109025 From aph at openjdk.org Wed Nov 12 14:22:36 2025 From: aph at openjdk.org (Andrew Haley) Date: Wed, 12 Nov 2025 14:22:36 GMT Subject: RFR: 8371161: [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family [v2] In-Reply-To: References: <9fGHvYHbn1M-_s63cxFAsz4EGPOnpEuFFKDbu7lwaNQ=.b882f8f4-17cd-415a-8fef-857860979c46@github.com> Message-ID: On Tue, 11 Nov 2025 18:11:20 GMT, Dhamoder Nalla wrote: >> This PR makes two targeted AArch64 updates specific to Qualcomm silicon: >> >> 1. Corrects the CPU family enum name typo from CPU_QUALCOM to CPU_QUALCOMM. >> 2. Enables UseSHA3Intrinsics for Qualcomm (CPU_QUALCOMM) in addition to Apple (CPU_APPLE), allowing Qualcomm-based systems to use hardware-optimized SHA-3 implementations. >> >> Performance testing: >> The JMH test case MessageDigests.java is used to evaluate the performance improvements enabled by UseSHA3Intrinsics on Qualcomm CPUs. >>
>> Benchmark | (digesterName) | (length) | (provider) | Mode | Cnt | Score before change | Error | Score after change | Error | Units | SHA3 perf improvement
>> -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
>> MessageDigests.digest | SHA3-256 | 64 | DEFAULT | thrpt | 15 | 4363.650 | ±682.413 | 5687.798 | ±855.826 | ops/ms | 30.34%
>> MessageDigests.digest | SHA3-256 | 16384 | DEFAULT | thrpt | 15 | 45.794 | ±0.069 | 58.735 | ±0.077 | ops/ms | 28.26%
>> MessageDigests.digest | SHA3-512 | 64 | DEFAULT | thrpt | 15 | 4008.741 | ±703.879 | 5145.512 | ±866.479 | ops/ms | 28.36%
>> MessageDigests.digest | SHA3-512 | 16384 | DEFAULT | thrpt | 15 | 23.991 | ±0.032 | 30.294 | ±0.040 | ops/ms | 26.27%
>> MessageDigests.getAndDigest | SHA3-256 | 64 | DEFAULT | thrpt | 15 | 1995.297 | ±396.007 | 2021.385 | ±486.581 | ops/ms | 1.31%
>> MessageDigests.getAndDigest | SHA3-256 | 16384 | DEFAULT | thrpt | 15 | 45.994 | ±0.051 | 58.283 | ±0.095 | ops/ms | 26.72%
>> MessageDigests.getAndDigest | SHA3-512 | 64 | DEFAULT | thrpt | 15 | 1889.550 | ±355.058 | 2173.164 | ±437.968 | ops/ms | 15.01%
>> MessageDigests.getAndDigest | SHA3-512 | 16384 | DEFAULT | thrpt | 15 | 24.411 | ±0.143 | 30.187 | ±0.035 | ops/ms | 23.66%
>>
    >> >> >> >> >> > > Dhamoder Nalla has updated the pull request incrementally with two additional commits since the last revision: > > - [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family > - [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family Marked as reviewed by aph (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28166#pullrequestreview-3453693100 From vklang at openjdk.org Wed Nov 12 14:35:54 2025 From: vklang at openjdk.org (Viktor Klang) Date: Wed, 12 Nov 2025 14:35:54 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v11] In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 17:48:43 GMT, Alan Bateman wrote: >> Implementation changes for [JEP 500: Prepare to Make Final Mean Final](https://openjdk.org/jeps/500). >> >> Field.set (and Lookup.unreflectSetter) are changed to allow/warn/debug/deny when mutating a final instance field. JFR event recorded if final field mutated. Spec updates to Field.set, Field.setAccessible and Module.addOpens to align with the proposal in the JEP. >> >> HotSpot is updated to add support for the new command line options. To aid diagnosability, -Xcheck:jni reports a warning and -Xlog:jni=debug logs a message to help identity JNI code that mutates finals. For now, JNI code is allowed to set the "write-protected" fields System.in/out/err without a warning, we can re-visit once we change the System.setIn/setOut/setErr methods to not use JNI (I prefer to keep this separate to this PR because there is a small startup regression to address when changing System.setXXX). >> >> There are many new tests. A small number of existing tests are changed to run /othervm as reflectively opening a package isn't sufficient. Changing the tests to /othervm means that jtreg will launch the agent with the command line options to open the package. 
>> >> Testing: tier1-6 > > Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 50 commits: > > - Remove dup end body tag > - Change FinalFieldMutationEvent so that caller is top frame in stack trace > - Merge branch 'master' into JDK-8353835 > - Review feedback: Add tests for setting internal properties, improve links in Mutation methods page > - Merge branch 'master' into JDK-8353835 > - Merge branch 'master' into JDK-8353835 > - Fix typo in test comment > - Merge branch 'master' into JDK-8353835 > - Merge branch 'master' into JDK-8353835 > - Suppress warnings from some tests > - ... and 40 more: https://git.openjdk.org/jdk/compare/2902436f...b22947c7 src/java.base/share/man/java.md line 482: > 480: > 481: - `allow`: This mode allows illegal final field mutation in all modules, > 482: without any warings. Suggestion: without any warnings. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2518551065 From shade at openjdk.org Wed Nov 12 15:07:30 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 12 Nov 2025 15:07:30 GMT Subject: RFR: 8371709: Add CTW to hotspot_compiler testing Message-ID: CTW tests are for compiler testing, so it makes sense to run them as part of hotspot_compiler group. There are no external dependencies for CTW that processes JDK-s own modules, so we can add that. 
------------- Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/28268/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28268&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371709 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28268.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28268/head:pull/28268 PR: https://git.openjdk.org/jdk/pull/28268 From duke at openjdk.org Wed Nov 12 15:23:49 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 12 Nov 2025 15:23:49 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease Message-ID: Hi, please consider the following changes: In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. Additionally, `SpinSingleSection` class is added, which allows to execute a payload code inside of a functor by only one thread. Tested in tiers 1 - 5. ------------- Commit messages: - Merge remote-tracking branch 'origin/master' into JDK-8366671-refactor-spin-acquire-spin-release - 8366671: Fixed whitespaces. - 8366671: Fixed whitespaces. - 8366671: Fixed whitespaces. 
- 8366671: Refactor SpinAcquire/SpinRelease into SpinCriticalSection Changes: https://git.openjdk.org/jdk/pull/28264/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8366671 Stats: 380 lines in 12 files changed: 199 ins; 116 del; 65 mod Patch: https://git.openjdk.org/jdk/pull/28264.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28264/head:pull/28264 PR: https://git.openjdk.org/jdk/pull/28264 From kvn at openjdk.org Wed Nov 12 15:24:30 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 12 Nov 2025 15:24:30 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch [v2] In-Reply-To: References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: <_iaflU1ISP1m0TQKu8cUvnyQ6NQ92xs4tI1Mqs6H3E8=.9e55e39f-b852-4604-acb4-ec9f3dedd014@github.com> On Tue, 11 Nov 2025 17:50:43 GMT, Albert Mingkun Yang wrote: >> Trivial removing obsoleted code for unsupported arch. >> >> Test: tier1 > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review I ran our testing (tier1-5) and there are no new failures. So I approve this change. But you need second review. Preferable from other platforms supporters (RISC-V, PPC64, s390) ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28240#pullrequestreview-3454014494

From haosun at openjdk.org Wed Nov 12 16:10:27 2025
From: haosun at openjdk.org (Hao Sun)
Date: Wed, 12 Nov 2025 16:10:27 GMT
Subject: RFR: 8371161: [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family [v2]
In-Reply-To:
References: <9fGHvYHbn1M-_s63cxFAsz4EGPOnpEuFFKDbu7lwaNQ=.b882f8f4-17cd-415a-8fef-857860979c46@github.com>
Message-ID:

On Tue, 11 Nov 2025 18:11:20 GMT, Dhamoder Nalla wrote:

>> This PR makes two targeted AArch64 updates specific to Qualcomm silicon:
>>
>> 1. Corrects the CPU family enum name typo from CPU_QUALCOM to CPU_QUALCOMM.
>> 2. Enables UseSHA3Intrinsics for Qualcomm (CPU_QUALCOMM) in addition to Apple (CPU_APPLE), allowing Qualcomm-based systems to use hardware-optimized SHA-3 implementations.
>>
>> Performance testing:
>> The JMH test case MessageDigests.java is used to evaluate the performance improvements enabled by UseSHA3Intrinsics on Qualcomm CPUs.
>>
>> Benchmark | (digesterName) | (length) | (provider) | Mode | Cnt | Score before change | Error | Score after change | Error | Units | SHA3 perf improvement
>> -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
>> MessageDigests.digest | SHA3-256 | 64 | DEFAULT | thrpt | 15 | 4363.650 | ±682.413 | 5687.798 | ±855.826 | ops/ms | 30.34%
>> MessageDigests.digest | SHA3-256 | 16384 | DEFAULT | thrpt | 15 | 45.794 | ±0.069 | 58.735 | ±0.077 | ops/ms | 28.26%
>> MessageDigests.digest | SHA3-512 | 64 | DEFAULT | thrpt | 15 | 4008.741 | ±703.879 | 5145.512 | ±866.479 | ops/ms | 28.36%
>> MessageDigests.digest | SHA3-512 | 16384 | DEFAULT | thrpt | 15 | 23.991 | ±0.032 | 30.294 | ±0.040 | ops/ms | 26.27%
>> MessageDigests.getAndDigest | SHA3-256 | 64 | DEFAULT | thrpt | 15 | 1995.297 | ±396.007 | 2021.385 | ±486.581 | ops/ms | 1.31%
>> MessageDigests.getAndDigest | SHA3-256 | 16384 | DEFAULT | thrpt | 15 | 45.994 | ±0.051 | 58.283 | ±0.095 | ops/ms | 26.72%
>> MessageDigests.getAndDigest | SHA3-512 | 64 | DEFAULT | thrpt | 15 | 1889.550 | ±355.058 | 2173.164 | ±437.968 | ops/ms | 15.01%
>> MessageDigests.getAndDigest | SHA3-512 | 16384 | DEFAULT | thrpt | 15 | 24.411 | ±0.143 | 30.187 | ±0.035 | ops/ms | 23.66%
>>
    >> >> >> >> >> > > Dhamoder Nalla has updated the pull request incrementally with two additional commits since the last revision: > > - [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family > - [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family LGTM. And I thought we may need update `test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHA3IntrinsicsOptionOnSupportedCPU.java` as well because this intrinsic is enabled on Qualcomm silicon with this patch. However, I don't think there is an easy way to select Qualcomm silicon for this test case. ------------- Marked as reviewed by haosun (Committer). PR Review: https://git.openjdk.org/jdk/pull/28166#pullrequestreview-3454256539 From eastigeevich at openjdk.org Wed Nov 12 16:55:40 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Wed, 12 Nov 2025 16:55:40 GMT Subject: RFR: 8371649: ZGC: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 21:35:42 GMT, Evgeny Astigeevich wrote: > The instruction cache maintenance function internally handles any required barriers. > This means we don't need any barriers before calling it. > This PR removes a redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation. Hi Erik (@fisk), Could you also please take a look, just in case the fence was intentionally put there? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28244#issuecomment-3522921463 From dhanalla at openjdk.org Wed Nov 12 17:50:15 2025 From: dhanalla at openjdk.org (Dhamoder Nalla) Date: Wed, 12 Nov 2025 17:50:15 GMT Subject: RFR: 8371161: [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family [v2] In-Reply-To: References: <9fGHvYHbn1M-_s63cxFAsz4EGPOnpEuFFKDbu7lwaNQ=.b882f8f4-17cd-415a-8fef-857860979c46@github.com> Message-ID: <46H-NumX52iA27HL4AaYTi1FSnF2dI6gN8qXmjs0XZ4=.0ec27488-afc7-424d-9610-3a79b54ac93a@github.com> On Wed, 12 Nov 2025 16:07:54 GMT, Hao Sun wrote: > LGTM. And I thought we may need update `test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHA3IntrinsicsOptionOnSupportedCPU.java` as well because this intrinsic is enabled on Qualcomm silicon with this patch. However, I don't think there is an easy way to select Qualcomm silicon for this test case. Thanks @shqking. Yes, currently there's no straightforward way to conditionally run this test based on CPU vendor type. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28166#issuecomment-3523145634 From shade at openjdk.org Wed Nov 12 18:39:39 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 12 Nov 2025 18:39:39 GMT Subject: RFR: 8363943: ARM32: Represent Registers as values [v2] In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 22:50:44 GMT, Ivan wrote: >> src/hotspot/cpu/arm/register_arm.hpp line 187: >> >>> 185: enum { >>> 186: number_of_registers = NOT_COMPILER2(32) COMPILER2_PRESENT(64), >>> 187: max_slots_per_register = 1 >> >> Can you double-check it is really `1`? For GPRs, we have `max_slots_per_register` at effectively `2`. > > The change is consistent with previous version. Previously the value was calculated in `src/hotspot/cpu/arm/register_arm.hpp` inside ConcreteRegisterImpl class like this: > > #ifdef COMPILER2 > log_bytes_per_fpr = 2, // quad vectors > #else > log_bytes_per_fpr = 2, // double vectors > #endif > ... 
> log_vmregs_per_fpr = log_bytes_per_fpr - LogBytesPerInt, > ... > vmregs_per_fpr = 1 << log_vmregs_per_fpr, > > `LogBytesPerInt` in `globalDefinitions.hpp` is 2, so the `vmregs_per_fpr` always evaluated to 1. OK, thanks. >> src/hotspot/cpu/arm/register_arm.hpp line 264: >> >>> 262: constexpr FloatRegister S4_reg = as_FloatRegister(4); >>> 263: constexpr FloatRegister S5_reg = as_FloatRegister(5); >>> 264: constexpr FloatRegister S6_reg = as_FloatRegister(6); >> >> Take a chance on renaming these `S${X}_reg` to just `S${X}`? I spot-checked their usages, and there are only a few places that need adjustments. > > At the top of these definitions there is a comment > > /* > * S1-S6 are named with "_reg" suffix to avoid conflict with > * constants defined in sharedRuntimeTrig.cpp > */ > ``` > And the definitions from `sharedRuntimeTrig.cpp` are still there > ``` > static const double > S1 = -1.66666666666666324348e-01, /* 0xBFC55555, 0x55555549 */ > S2 = 8.33333333332248946124e-03, /* 0x3F811111, 0x1110F8A6 */ > S3 = -1.98412698298579493134e-04, /* 0xBF2A01A0, 0x19C161D5 */ > S4 = 2.75573137070700676789e-06, /* 0x3EC71DE3, 0x57B1FE7D */ > S5 = -2.50507602534068634195e-08, /* 0xBE5AE5E6, 0x8A2B9CEB */ > S6 = 1.58969099521155010221e-10; /* 0x3DE5D93A, 0x5ACFD57C */ > ``` > > Should I ignore it and change `S${X}_reg` to `S${X}`? Or I shall avoid the conflict some other way? Ooooof. Looks weird that the _common_ register code has to yield to `sharedRuntimeTrig.cpp`! But yeah, if it is inconvenient to do here, file a separate cleanup for it. 
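The `max_slots_per_register` arithmetic quoted earlier in this message can be checked in isolation. The following is only a minimal sketch, assuming the two constant values cited in the thread (`log_bytes_per_fpr = 2` and `LogBytesPerInt = 2`); it is not the contents of `register_arm.hpp`, and the helper `slots_per_fpr` is a hypothetical name introduced purely for illustration.

```cpp
#include <cassert>

// Values as cited in the review thread; treated here as assumptions.
constexpr int LogBytesPerInt = 2;     // cited from globalDefinitions.hpp
constexpr int log_bytes_per_fpr = 2;  // same value in both COMPILER2 branches

// The derivation quoted from the old ConcreteRegisterImpl code.
constexpr int log_vmregs_per_fpr = log_bytes_per_fpr - LogBytesPerInt;
constexpr int vmregs_per_fpr = 1 << log_vmregs_per_fpr;

// With the values above the shift amount is zero, so the result is 1.
static_assert(log_vmregs_per_fpr == 0, "shift amount is zero");
static_assert(vmregs_per_fpr == 1, "one VM register slot per FPR");

// Hypothetical helper (not a HotSpot function) exposing the value at run time.
int slots_per_fpr() {
  assert(log_vmregs_per_fpr >= 0);  // a negative shift would be undefined
  return vmregs_per_fpr;
}
```

Under these assumptions the old computation and the new literal `max_slots_per_register = 1` agree, which matches the explanation given in the reply.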
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2519388128
PR Review Comment: https://git.openjdk.org/jdk/pull/26525#discussion_r2519386467

From shade at openjdk.org Wed Nov 12 18:46:16 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 12 Nov 2025 18:46:16 GMT
Subject: RFR: 8363943: ARM32: Represent Registers as values [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 11 Nov 2025 22:09:24 GMT, Ivan wrote:

>> Migrate away from pointer-based representation of Register values.
>>
>> It improves compile-time checking by forbidding implicit conversions between integrals and pointers.
>>
>> [JDK-8363943](https://bugs.openjdk.org/browse/JDK-8363943)
>
> Ivan has updated the pull request incrementally with one additional commit since the last revision:
>
> Proposed review changes were applied

Looks fine to me, but ultimately @bulasevich should make the call. What kind of testing did you do?

-------------

Marked as reviewed by shade (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/26525#pullrequestreview-3454934124

From duke at openjdk.org Wed Nov 12 18:59:45 2025
From: duke at openjdk.org (duke)
Date: Wed, 12 Nov 2025 18:59:45 GMT
Subject: RFR: 8371161: [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family [v2]
In-Reply-To:
References: <9fGHvYHbn1M-_s63cxFAsz4EGPOnpEuFFKDbu7lwaNQ=.b882f8f4-17cd-415a-8fef-857860979c46@github.com>
Message-ID:

On Tue, 11 Nov 2025 18:11:20 GMT, Dhamoder Nalla wrote:

>> This PR makes two targeted AArch64 updates specific to Qualcomm silicon:
>>
>> 1. Corrects the CPU family enum name typo from CPU_QUALCOM to CPU_QUALCOMM.
>> 2. Enables UseSHA3Intrinsics for Qualcomm (CPU_QUALCOMM) in addition to Apple (CPU_APPLE), allowing Qualcomm-based systems to use hardware-optimized SHA-3 implementations.
>>
>> Performance testing:
>> The JMH test case MessageDigests.java is used to evaluate the performance improvements enabled by UseSHA3Intrinsics on Qualcomm CPUs.
>>
>> Benchmark | (digesterName) | (length) | (provider) | Mode | Cnt | Score before change | Error | Score after change | Error | Units | SHA3 perf improvement
>> -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
>> MessageDigests.digest | SHA3-256 | 64 | DEFAULT | thrpt | 15 | 4363.650 | ±682.413 | 5687.798 | ±855.826 | ops/ms | 30.34%
>> MessageDigests.digest | SHA3-256 | 16384 | DEFAULT | thrpt | 15 | 45.794 | ±0.069 | 58.735 | ±0.077 | ops/ms | 28.26%
>> MessageDigests.digest | SHA3-512 | 64 | DEFAULT | thrpt | 15 | 4008.741 | ±703.879 | 5145.512 | ±866.479 | ops/ms | 28.36%
>> MessageDigests.digest | SHA3-512 | 16384 | DEFAULT | thrpt | 15 | 23.991 | ±0.032 | 30.294 | ±0.040 | ops/ms | 26.27%
>> MessageDigests.getAndDigest | SHA3-256 | 64 | DEFAULT | thrpt | 15 | 1995.297 | ±396.007 | 2021.385 | ±486.581 | ops/ms | 1.31%
>> MessageDigests.getAndDigest | SHA3-256 | 16384 | DEFAULT | thrpt | 15 | 45.994 | ±0.051 | 58.283 | ±0.095 | ops/ms | 26.72%
>> MessageDigests.getAndDigest | SHA3-512 | 64 | DEFAULT | thrpt | 15 | 1889.550 | ±355.058 | 2173.164 | ±437.968 | ops/ms | 15.01%
>> MessageDigests.getAndDigest | SHA3-512 | 16384 | DEFAULT | thrpt | 15 | 24.411 | ±0.143 | 30.187 | ±0.035 | ops/ms | 23.66%
>>
    >> >> >> >> >> > > Dhamoder Nalla has updated the pull request incrementally with two additional commits since the last revision: > > - [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family > - [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family @dhanalla Your change (at version 076f1d608fd49c43fa827189987478865a6f97da) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28166#issuecomment-3523451085 From psandoz at openjdk.org Wed Nov 12 19:51:18 2025 From: psandoz at openjdk.org (Paul Sandoz) Date: Wed, 12 Nov 2025 19:51:18 GMT Subject: RFR: 8370691: Add new Float16Vector type and enable intrinsification of vector operations supported by auto-vectorizer In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 07:59:38 GMT, Jatin Bhateja wrote: > > > Some quick comments. > > > We should be consistent in the naming, and rename `Halfloat*` to `Float16*`. > > > > > > I concur, especially since there are multiple 16-bit floating-point formats in use including the IEEE 754 float16 as well as bfloat16. > > There are nomenclature issues that I am facing. Currently, all the Float16 concrete classes use the Halffloat prefix i.e., Halffloat64Vector, Halffloat128Vector; converting these to Float16 looks a little confusing, i.e., Float1664Vector, Float16128Vector, etc Kindly suggest a better name to represent these classes. Maybe we move the shape to the end e.g., `Float16Vector128`, `IntVector128`, `IntVectorMax`? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28002#issuecomment-3523631727 From psandoz at openjdk.org Wed Nov 12 20:13:49 2025 From: psandoz at openjdk.org (Paul Sandoz) Date: Wed, 12 Nov 2025 20:13:49 GMT Subject: RFR: 8370691: Add new Float16Vector type and enable intrinsification of vector operations supported by auto-vectorizer In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 19:48:54 GMT, Paul Sandoz wrote: >>> > Some quick comments. >>> > We should be consistent in the naming, and rename `Halfloat*` to `Float16*`. >>> >>> I concur, especially since there are multiple 16-bit floating-point formats in use including the IEEE 754 float16 as well as bfloat16. >> >> There are nomenclature issues that I am facing. Currently, all the Float16 concrete classes use the Halffloat prefix i.e., Halffloat64Vector, Halffloat128Vector; converting these to Float16 looks a little confusing, i.e., Float1664Vector, Float16128Vector, etc Kindly suggest a better name to represent these classes. > >> > > Some quick comments. >> > > We should be consistent in the naming, and rename `Halfloat*` to `Float16*`. >> > >> > >> > I concur, especially since there are multiple 16-bit floating-point formats in use including the IEEE 754 float16 as well as bfloat16. >> >> There are nomenclature issues that I am facing. Currently, all the Float16 concrete classes use the Halffloat prefix i.e., Halffloat64Vector, Halffloat128Vector; converting these to Float16 looks a little confusing, i.e., Float1664Vector, Float16128Vector, etc Kindly suggest a better name to represent these classes. > > Maybe we move the shape to the end e.g., `Float16Vector128`, `IntVector128`, `IntVectorMax`? > Hi @PaulSandoz , Thanks for your comments. Please find below my responses. 
> > > When you generate the fallback code for unary/binary etc can you push the carrier type and conversations into the uOp/bOp implementations so you don't have to explicitly operate on the carrier type and do the conversions as you do now e.g.,: > > ``` > > v0.uOp(m, (i, a) -> float16ToShortBits(Float16.valueOf(-(shortBitsToFloat16(($type$)a).floatValue())))); > > ``` > > Currently, uOp and uOpTemplates are part of the scaffolding logic and are sacrosanct; they are shared by various abstracted vector classes, and their semantics are defined by the lambda expression. I agree that explicit conversion in lambdas looks verbose, but moving them to uOpTemplate may fracture the lambda expression such that part of its semantics, i.e,. conversions, will seep into uOpTemplate, while what will appear at the surface will be the expression operating over primitive float values; this may become very confusing. Since the uOpTemplate etc are per element vector type it seems straightforward to adjust the template to perform the conversion before and after the function application, or add a default method to FUnOp etc that operates on the carrier value and performs the conversions and the template calls that default method. Later we will eventually be able to declare Float16![] and it should all collapse away. > > > Requiring two arguments means they can get out of sync. Previously the class provided all the information needed, now > > arguably the type does. > > Yes, from the compiler standpoint point all we care about is the carrier type, which determines the vector lane size. This is augmented with operation kind (PRIM / FP16) to differentiate a short vector lane from a float16 vector lane. Apart from this, we need to pass the VectorBox type to wrap the vector IR. The basic type codes are declared and shared across Java and HotSpot - it's used in `LaneType`. Can we pass a single argument that is the basic type instead of two arguments. 
HotSpot should know from the basic type what the carrier class and also what the operation type without it being explicitly told, since presumably it knew the inverse - the basic type from the element class. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28002#issuecomment-3523722566 From eosterlund at openjdk.org Wed Nov 12 22:25:04 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Wed, 12 Nov 2025 22:25:04 GMT Subject: RFR: 8371649: ZGC: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 16:52:20 GMT, Evgeny Astigeevich wrote: > Hi Erik (@fisk), > > Could you also please take a look, just in case the fence was intentionally put there? The way I look at it, the fence was there for hardware that is unsophisticated enough to require manual cache flushing instead of having cache coherency that understands instruction edits, and at the same time has unsophisticated enough fences that are not speculated across such that the buffered store hits the cache before invalidating the cache, and not after, which would be awkward. It is certainly possible that in practice the cache invalidation facilities also do the right level of fencing. So this is mostly just defensive programming. If I flip the question around - how confident do you feel on a scale from 1 to 10 that the cache invalidation mechanism guarantees across all implementations, that the preceding store is flushed out to the caches before the cache is flushed? This is an area of the code where I don't want to take chances and slip unless we feel a high level of confidence. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28244#issuecomment-3524155679 From sspitsyn at openjdk.org Wed Nov 12 23:36:02 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 12 Nov 2025 23:36:02 GMT Subject: RFR: 8371083: FollowReferences reports non-class objects as JVMTI_HEAP_REFERENCE_SYSTEM_CLASS In-Reply-To: <6rHIf2ptpmqaSNTLtKNv6bZSQCENxBpxnk3i193hOHE=.dbc266c3-e9b5-4c27-ac5a-5d9ca399ea87@github.com> References: <6rHIf2ptpmqaSNTLtKNv6bZSQCENxBpxnk3i193hOHE=.dbc266c3-e9b5-4c27-ac5a-5d9ca399ea87@github.com> Message-ID: On Mon, 10 Nov 2025 20:54:56 GMT, Alex Menkov wrote: > FollowReferences with null initial_object starts heap walking from "heap roots", which include system classes. > All oops from ClassLoaderDataGraph are reported with JVMTI_HEAP_REFERENCE_SYSTEM_CLASS kind, but some of the objects are not classes. > The fix updates FollowReferences to report non-class objects from ClassLoaderDataGraph as JVMTI_HEAP_REFERENCE_OTHER > > Testing: tier1..4,hs-tier5-svc The fix looks good. I've posted a couple of nits. src/hotspot/share/prims/jvmtiTagMap.cpp line 2193: > 2191: }; > 2192: > 2193: // A supporting closure used to process ClassLoaderData roots Nit: Need dot at the end of comment. test/hotspot/jtreg/serviceability/jvmti/FollowReferences/KindSystemClass/libKindSystemClass.cpp line 79: > 77: > 78: for (int i = 0; i < class_counter; i++) { > 79: tags[i] = i+1; Nit: Need spaces around `+` sign. test/hotspot/jtreg/serviceability/jvmti/FollowReferences/KindSystemClass/libKindSystemClass.cpp line 101: > 99: } > 100: > 101: Nit: Unneeded extra empty line. ------------- Marked as reviewed by sspitsyn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28224#pullrequestreview-3455948919 PR Review Comment: https://git.openjdk.org/jdk/pull/28224#discussion_r2520165317 PR Review Comment: https://git.openjdk.org/jdk/pull/28224#discussion_r2520125928 PR Review Comment: https://git.openjdk.org/jdk/pull/28224#discussion_r2520132469 From rriggs at openjdk.org Wed Nov 12 23:47:02 2025 From: rriggs at openjdk.org (Roger Riggs) Date: Wed, 12 Nov 2025 23:47:02 GMT Subject: RFR: 8371048: ImageFileReader::open fails to check return value of osSupport::map_memory In-Reply-To: References: Message-ID: On Fri, 31 Oct 2025 14:00:38 GMT, Justin King wrote: > Check whether `osSupport::map_memory` actually succeeded in all compliation modes, instead of crashing shortly after in non-debug builds. Ideally we should fall back to just reading the entire file into memory manually or use seek+read, but this is good enough for now to avoid crashing. src/java.base/share/native/libjimage/imageFile.cpp line 335: > 333: _index_data = (u1*)osSupport::map_memory(_fd, _name, 0, (size_t)map_size()); > 334: if (_index_data == nullptr) { > 335: return false; Indentation in this file is 4 spaces. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28087#discussion_r2520210679 From dlong at openjdk.org Thu Nov 13 01:22:13 2025 From: dlong at openjdk.org (Dean Long) Date: Thu, 13 Nov 2025 01:22:13 GMT Subject: RFR: 8347396: Efficient TypeFunc creations [v2] In-Reply-To: References: Message-ID: On Mon, 27 Oct 2025 05:11:47 GMT, Harshit470250 wrote: >> This PR do similar changes done by [JDK-8330851](https://bugs.openjdk.org/browse/JDK-8330851) on the GC TypeFunc creation as suggested by [JDK-8347396](https://bugs.openjdk.org/browse/JDK-8347396). As discussed in [https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686,](https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686) I have put guard on the shenandoah gc specific part of the code. 
> > Harshit470250 has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge master > - update make_barrier_type > - Merge branch 'openjdk:master' into new_pr > - Merge branch 'openjdk:master' into new_pr > - My chages It looks good but I need to run testing. ------------- PR Review: https://git.openjdk.org/jdk/pull/27279#pullrequestreview-3456522254 From ysuenaga at openjdk.org Thu Nov 13 04:32:15 2025 From: ysuenaga at openjdk.org (Yasumasa Suenaga) Date: Thu, 13 Nov 2025 04:32:15 GMT Subject: Integrated: 8371093: Assert "section header string table should be loaded" failed on debug VM In-Reply-To: References: Message-ID: <23bhO2TY9lB27WoGDvXinsr8-N2DjjLZRpsuHdXLbZo=.34156c62-6d67-46a5-a99e-3361e38340aa@github.com> On Sun, 2 Nov 2025 06:27:50 GMT, Yasumasa Suenaga wrote: > When the crash happens in the function in vDSO on Linux, native call stacks in hs_err log wouldn't be generated as following. See [hs_err log on JBS](https://bugs.openjdk.org/secure/attachment/116796/hs_err_pid4018.log) for details. Reproducer is also attached on JBS ([Test.java](https://bugs.openjdk.org/secure/attachment/116797/Test.java)) > > > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > C [linux-vdso.so.1+0xe69] > [error occurred during error reporting (printing native stack (with source info)), id 0xe0000000, Internal Error (/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536)] > > Retrying call stack printing without source information... > > [error occurred during error reporting (retry printing native stack (no source info)), id 0xb, SIGSEGV (0xb) at pc=0x00007fba8075f791] > > > When I checked back trace on GDB, it failed at `assert`. 
> > #12 0x00007fba7e76bd00 in report_vm_error (file=file at entry=0x7fba7fed7b40 "/home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp", > line=line at entry=536, error_msg=error_msg at entry=0x7fba80019575 "assert(false) failed", > detail_fmt=detail_fmt at entry=0x7fba7fed7bf0 "section header string table should be loaded") > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/debug.cpp:196 > #13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...) > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536 > #14 ElfFile::read_debug_info (this=this at entry=0x7fba782a1650, debug_info=debug_info at entry=0x7fba7dd05150) > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:407 > > > > (gdb) f 13 > #13 0x00007fba7e886eb3 in ElfFile::read_section_header (this=0x7fba782a1650, name=0x7fba800367d4 ".gnu_debuglink", hdr=...) > at /home/yasuenag/github-forked/jdk/src/hotspot/share/utilities/elfFile.cpp:536 > 536 assert(false, "section header string table should be loaded"); > > > vDSO is not a regular ELF, so it should be skipped here. This pull request has now been integrated. 
Changeset: b6ba1ac9
Author: Yasumasa Suenaga
URL: https://git.openjdk.org/jdk/commit/b6ba1ac9aa800e01e2235c2b8737ad4670b0a655
Stats: 5 lines in 1 file changed: 5 ins; 0 del; 0 mod

8371093: Assert "section header string table should be loaded" failed on debug VM

Reviewed-by: phubner, jsjolen

-------------

PR: https://git.openjdk.org/jdk/pull/28102

From kbarrett at openjdk.org Thu Nov 13 05:58:25 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Thu, 13 Nov 2025 05:58:25 GMT
Subject: RFR: 8369187: Add wrapper for <new> that forbids use of global allocation and deallocation functions [v2]
In-Reply-To:
References:
Message-ID:

> 8369187: Add wrapper for <new> that forbids use of global allocation and deallocation functions
>
> Please review this change that adds `cppstdlib/new.hpp` as a wrapper for
> including `<new>`. All existing inclusions of `<new>` are changed to include
> the new wrapper.
>
> In addition to including `<new>`, this wrapper also provides deprecation
> declarations to prevent the use of some facilities by HotSpot code.
>
> However, those deprecations need to be conditionalized to not apply to gtests,
> so this change also adds a macro definition provided by the build system for
> use in detecting that a header is being included by a gtest.
>
> Testing: mach5 tier1

Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains three additional commits since the last revision:

 - Merge branch 'master' into wrap-stdlib-new
 - further conditionalize deprecation of hardare interference sizes
 - add wrapper for <new>

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/28250/files
 - new: https://git.openjdk.org/jdk/pull/28250/files/103f7c2b..11c088e1

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28250&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28250&range=00-01

Stats: 6550 lines in 64 files changed: 2791 ins; 3375 del; 384 mod
Patch: https://git.openjdk.org/jdk/pull/28250.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/28250/head:pull/28250

PR: https://git.openjdk.org/jdk/pull/28250

From kbarrett at openjdk.org Thu Nov 13 05:58:25 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Thu, 13 Nov 2025 05:58:25 GMT
Subject: RFR: 8369187: Add wrapper for <new> that forbids use of global allocation and deallocation functions [v2]
In-Reply-To:
References: <6SO4sjrZNkftwWmo-9j7fPvB5K7e24kpCqt5UzSWWqQ=.bbb0560f-2e54-4f68-a67f-10fa35bf0afb@github.com>
Message-ID:

On Wed, 12 Nov 2025 13:13:04 GMT, Stefan Karlsson wrote:

>> It can't be redeclared with the `[[deprecated]]` attribute using that form.
>> `constexpr` requires an initializer, and what should the value be? And all
>> `inline` declarations need to be "exactly the same" (which has a technical
>> meaning somewhere that talks about equivalent token sequences).
>>
>> Removing `extern`, adding `inline`, or both leads to gcc to (quite correctly,
>> I think) rejecting it as a redefinition.
>>
>> I think the form being used here does have the same storage class. I think
>> both forms declare a variable with namespace scope and external linkage; C++17
>> 6.5. And both gcc and clang accept it. I _think_ it's an MSVC bug of being
>> overly restrictive, rather than both gcc and clang being overly permissive.
>
> I'm still not convinced that the above does what you intended to do here, but the result seems to be the same so I guess that's fine.
>
> If I compile a small test:
>
> #include
>
> [[deprecated]] inline constexpr int my_deprecated = 1;
>
> int main() {
>   printf("my_deprecated: %d\n", my_deprecated);
>   return 0;
> }
>
> I get:
>
> $ g++ -std=c++17 -Wall -Wextra -pedantic h.cpp
> h.cpp:6:33: warning: 'my_deprecated' is deprecated [-Wdeprecated-declarations]
>     6 |   printf("my_deprecated: %d\n", my_deprecated);
>       |                                 ^
> h.cpp:3:3: note: 'my_deprecated' has been explicitly marked deprecated here
>     3 | [[deprecated]] inline constexpr int my_deprecated = 1;
>       |   ^
> 1 warning generated.
>
>
> But if I try to compile something similar to the above code I don't get the deprecated warning:
>
> #include
> #include
>
> namespace std {
> [[deprecated]] extern const size_t hardware_destructive_interference_size;
> };
>
> int main() {
>   printf("x = %zu\n", std::hardware_destructive_interference_size);
>   return 0;
> }
>
> Now I get:
>
> $ g++ -std=c++17 -Wall -Wextra -pedantic g.cpp
> g.cpp:9:28: error: reference to 'hardware_destructive_interference_size' is ambiguous
>     9 |   printf("x = %zu\n", std::hardware_destructive_interference_size);
>       |                       ~~~~~^
> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/__new/interference_size.h:25:25: note: candidate found by name lookup is 'std::__1::hardware_destructive_interference_size'
>    25 | inline constexpr size_t hardware_destructive_interference_size = __GCC_DESTRUCTIVE_SIZE;
>       |                         ^
> g.cpp:5:36: note: candidate found by name lookup is 'std::hardware_destructive_interference_size'
>     5 | [[deprecated]] extern const size_t hardware_destructive_interference_size;
>       |                                    ^
> 1 error generated.
>
> where the end result is that we get a warning about an ambiguous lookup and not a warning that this has been deprecated.

Getting the desired behavior seems to require a sufficiently recent compiler.
Changed to only redeclare these variables deprecated accordingly. It's kind of messy. We could just skip these deprecations, though having put in the work to find good versions... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28250#discussion_r2521668787 From alanb at openjdk.org Thu Nov 13 07:09:03 2025 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 13 Nov 2025 07:09:03 GMT Subject: RFR: 8371048: ImageFileReader::open fails to check return value of osSupport::map_memory In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 06:43:42 GMT, Jaikiran Pai wrote: >> Check whether `osSupport::map_memory` actually succeeded in all compilation modes, instead of crashing shortly after in non-debug builds. Ideally we should fall back to just reading the entire file into memory manually or use seek+read, but this is good enough for now to avoid crashing. > > src/java.base/share/native/libjimage/imageFile.cpp line 334: > >> 332: // Memory map image (minimally the index.) >> 333: _index_data = (u1*)osSupport::map_memory(_fd, _name, 0, (size_t)map_size()); >> 334: if (_index_data == nullptr) { > > The rest of the code in the `libjimage` library uses `NULL`, including the return value in `osSupport::map_memory(...)`. So I think it would be better to use `NULL` here for consistency. I agree, having a mix is annoying. In any case, changing the assert to have it fail looks right. The jimage file is opened during startup, so it is less likely, but still possible, that this mmap fails.
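A minimal sketch of the pattern under review, assuming a POSIX system: check the mapping result in every build mode and report failure to the caller instead of relying on an assert that disappears in release builds. `map_readonly` and `SketchReader` are hypothetical stand-ins, not the actual `osSupport::map_memory` or `ImageFileReader` code.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>

// Hypothetical mapping helper: like the wrapper discussed in the thread, it
// reports failure as a null pointer rather than exposing MAP_FAILED.
static void* map_readonly(int fd, size_t size) {
  void* addr = mmap(nullptr, size, PROT_READ, MAP_SHARED, fd, 0);
  return (addr == MAP_FAILED) ? nullptr : addr;
}

// Hypothetical reader that keeps a mapped view of an index.
struct SketchReader {
  const unsigned char* index_data = nullptr;
  size_t mapped_size = 0;

  // Returns false when the mapping cannot be established. The check is an
  // ordinary branch, so it is present in debug and release builds alike.
  bool open(int fd, size_t size) {
    index_data = static_cast<const unsigned char*>(map_readonly(fd, size));
    if (index_data == nullptr) {
      return false;
    }
    mapped_size = size;
    return true;
  }

  // Releases the mapping if one was established.
  void close() {
    if (index_data != nullptr) {
      munmap(const_cast<unsigned char*>(index_data), mapped_size);
      index_data = nullptr;
      mapped_size = 0;
    }
  }
};
```

With an invalid descriptor such as `-1`, `open` returns false and leaves `index_data` null, so the caller decides how to proceed rather than dereferencing a bad pointer later.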
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28087#discussion_r2521919635 From duke at openjdk.org Thu Nov 13 07:58:14 2025 From: duke at openjdk.org (Ruben) Date: Thu, 13 Nov 2025 07:58:14 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 In-Reply-To: References: Message-ID: <-h6G9ajUWQwDRcUMOtyI_YCUCkXz3pzRggJk_UaxM-0=.a8c772aa-2f09-48c0-9cfb-17e624393eb0@github.com> On Fri, 7 Nov 2025 11:07:40 GMT, Ruben wrote: > The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. > > The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. > > According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. I am planning to update this PR today to include the comments. However, I still have not identified a way to ensure the deopt handler stub ends at a page boundary in a unit test. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28192#issuecomment-3526178620 From kbarrett at openjdk.org Thu Nov 13 08:36:17 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 13 Nov 2025 08:36:17 GMT Subject: RFR: 8367013: Add Atomic to package/replace idiom of volatile var plus AtomicAccess:: operations [v7] In-Reply-To: References: <2loBZkzSWVjbUw6sUMMj86o-L6cSlEqO_bJI1qCOPxM=.505020dc-34f2-4586-8563-2b7330e6766a@github.com> Message-ID: On Wed, 12 Nov 2025 06:27:59 GMT, Axel Boldt-Christmas wrote: > Looks good. > > The gtest still uses a mix of the x86 instruction and the new `compare_exchange` / `exchange` nomenclature. Feel free to unify this, or leave it as is (can always fix this later). 
I'm going to leave that for later, so I don't need to get y'all to re-review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27539#issuecomment-3526437172 From kbarrett at openjdk.org Thu Nov 13 08:36:25 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 13 Nov 2025 08:36:25 GMT Subject: RFR: 8370333: hotspot-unit-tests.md specifies wrong directory structure for tests In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 07:41:28 GMT, Stefan Karlsson wrote: >> Please review this change to the HotSpot unit test documentation, fixing the >> path where native tests are located. > > Marked as reviewed by stefank (Reviewer). Thanks for reviews @stefank and @albertnetymk ------------- PR Comment: https://git.openjdk.org/jdk/pull/28251#issuecomment-3526423827 From kbarrett at openjdk.org Thu Nov 13 08:36:26 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 13 Nov 2025 08:36:26 GMT Subject: Integrated: 8370333: hotspot-unit-tests.md specifies wrong directory structure for tests In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 06:14:47 GMT, Kim Barrett wrote: > Please review this change to the HotSpot unit test documentation, fixing the > path where native tests are located. This pull request has now been integrated. 
Changeset: 795ec5c1 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/795ec5c1e90309bc008acb28cfe0ce039dabcb8f Stats: 8 lines in 2 files changed: 0 ins; 0 del; 8 mod 8370333: hotspot-unit-tests.md specifies wrong directory structure for tests Reviewed-by: stefank, ayang ------------- PR: https://git.openjdk.org/jdk/pull/28251 From kbarrett at openjdk.org Thu Nov 13 08:39:15 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 13 Nov 2025 08:39:15 GMT Subject: RFR: 8367013: Add Atomic to package/replace idiom of volatile var plus AtomicAccess:: operations In-Reply-To: References: Message-ID: <8nO1s891ixoJF8cRVds9mY6fcmgHCjjlNZfE-U4rSjU=.dfef45c3-0bb7-47cd-b6da-b7d599034e81@github.com> On Fri, 3 Oct 2025 11:48:46 GMT, Stefan Karlsson wrote: >> I agree the similarity between `relaxed_store` and `release_store` isn't >> ideal. I think unqualified `load` and `store` is worse. >> >>> Atomic r-m-w operations (pretty much everything except base load/store) were >>> required to have full bi-directional fence semantics from "day one". That >>> doesn't make sense for base load/store so they are "relaxed" by default. >> >> That's only true in the HotSpot API. It's not true for C++ or the C >> equivalent, or in most published papers, where the default for "atomic" loads >> and stores is sequentially consistent. So we're weird that way. We don't even >> directly have those operations (though they can probably be more or less >> simulated using RMW operations). >> >>> I personally hate the term "relaxed" as it suggests to me a removal of >>> "normal" ordering, when really it means the absence of a stronger ordering. >> >> I agree that "relaxed" doesn't seem like the greatest of terms. But I think >> it's the generally accepted term in this area. At least, it's what's used for >> C/C++ and friends. (But maybe I'm wrong about it being generally >> accepted? Looking through some papers, I'm not always finding that >> nomenclature.) 
>> >> So it might be a mistake to invent our own terminology. But for the sake of >> discussion I'll throw out one idea: "unordered". >> >> (I think "atomic" has the right technical meaning (indivisible, i.e. no >> tearing, but doesn't imply ordering), but is probably too confusing. It would >> also require something different for the proposed `atomic_inc()` and >> `atomic_dec()` member functions.) >> >> How do we resolve this? I feel like this whole PR is stuck here. > >> How do we resolve this? I feel like this whole PR is stuck here. > > Just a reminder that I wrote: >> The following isn't a strong request, but maybe more of a stated preference and an inquiry what other HotSpot devs think about the proposed names > > I really would have liked to hear from other Reviewers what they think. So, I've been waiting for others to chime in. If I'm really alone in my lack of enthusiasm for the names `relaxed_store` / `load_relaxed` then so be it. > > If you are going to skip the names Atomic::load/store will you also update the names AtomicAccess::load/store in a follow-up RFE? Thanks for reviews @stefank, @xmas92, and @jdksjolen ------------- PR Comment: https://git.openjdk.org/jdk/pull/27539#issuecomment-3526453895 From kbarrett at openjdk.org Thu Nov 13 08:47:46 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 13 Nov 2025 08:47:46 GMT Subject: Integrated: 8367013: Add Atomic to package/replace idiom of volatile var plus AtomicAccess:: operations In-Reply-To: References: Message-ID: On Sun, 28 Sep 2025 11:10:41 GMT, Kim Barrett wrote: > Please review this change that adds the type Atomic, to use as the type > of a variable that is accessed (including writes) concurrently by multiple > threads. This is intended to replace (most) uses of the current HotSpot idiom > of declaring a variable volatile and accessing that variable using functions > from the AtomicAccess class.
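A rough sketch of the general idea in the paragraph above: wrap the variable in a type whose member functions name the ordering explicitly, instead of declaring it `volatile` and calling free functions on it. This is built on `std::atomic` purely for illustration; `SketchAtomic` is a hypothetical class and is not the HotSpot type, whose implementation and ordering guarantees differ. The member names follow the ones mentioned in the naming discussion.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical illustration only; not the HotSpot API.
template <typename T>
class SketchAtomic {
  std::atomic<T> _value;

public:
  explicit SketchAtomic(T v = T()) : _value(v) {}

  // The wrapped variable is shared state, so copying is disallowed.
  SketchAtomic(const SketchAtomic&) = delete;
  SketchAtomic& operator=(const SketchAtomic&) = delete;

  // Indivisible accesses with no additional ordering.
  T load_relaxed() const { return _value.load(std::memory_order_relaxed); }
  void relaxed_store(T v) { _value.store(v, std::memory_order_relaxed); }

  // Accesses that also order surrounding memory operations.
  T load_acquire() const { return _value.load(std::memory_order_acquire); }
  void release_store(T v) { _value.store(v, std::memory_order_release); }

  // Stores `desired` if the current value equals `expected`. Returns the
  // value observed before the operation, whether or not the store happened.
  // Sequential consistency is an assumption made for this sketch.
  T compare_exchange(T expected, T desired) {
    _value.compare_exchange_strong(expected, desired,
                                   std::memory_order_seq_cst);
    return expected;
  }
};
```

Every access site spells out its ordering, which is the readability argument behind the naming debate: a bare `load` or `store` would leave the reader guessing which ordering applies.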
> https://github.com/openjdk/jdk/blame/528f93f8cb9f1fb9c19f31ab80c8a546f47beed2/doc/hotspot-style.md#L138-L147 > > This change replaces https://github.com/openjdk/jdk/pull/27462. Differences are > > * Substantially restructured `Atomic`, to be IDE friendly. It's > operationally the same, with the same API, hence uses and gtests didn't need > to change in that respect. Thanks to @stefank for raising this issue, and for > some suggestions toward improvements. > > * Changed how fetch_then_set for atomic translated types is handled, to avoid > having the function there at all if it isn't usable, rather than just removing > it via SFINAE, leaving an empty overload set. > > * Added more gtests. > > Testing: mach5 tier1-6, GHA sanity tests This pull request has now been integrated. Changeset: 10220ed0 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/10220ed06ea452083693406113107484fce40275 Stats: 1236 lines in 7 files changed: 1197 ins; 0 del; 39 mod 8367013: Add Atomic to package/replace idiom of volatile var plus AtomicAccess:: operations Reviewed-by: stefank, aboldtch, jsjolen ------------- PR: https://git.openjdk.org/jdk/pull/27539 From jbhateja at openjdk.org Thu Nov 13 09:31:03 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Thu, 13 Nov 2025 09:31:03 GMT Subject: RFR: 8370691: Add new Float16Vector type and enable intrinsification of vector operations supported by auto-vectorizer In-Reply-To: References: Message-ID: <8hStIcvp252Ik7raxZL5BvFKKkXTflorjyOD9Cyakvc=.c5d1b302-5c49-46b1-91ba-2feda2e6a746@github.com> On Wed, 12 Nov 2025 20:11:06 GMT, Paul Sandoz wrote: > The basic type codes are declared and shared across Java and HotSpot - it's used in `LaneType`. Can we pass a single argument that is the basic type instead of two arguments. HotSpot should know from the basic type what the carrier class and also what the operation type without it being explicitly told, since presumably it knew the inverse - the basic type from the element class. 
Hi @PaulSandoz, T_HALFFLOAT used in LaneType is mainly used for differentiation of various cache keys used by conversion operation lookups. In principle, we can extend VM to acknowledge this new custom basic type on the lines of T_METADATA / T_ADDRESS; its scope for now will be restricted to VectorSupport. We can gradually expose this to C2 type, such that TypeVect for all Float16 VectorIR uses T_HALFFLOAT as its basic type; currently, we use T_SHORT as the lane type. Let me know if this looks reasonable ------------- PR Comment: https://git.openjdk.org/jdk/pull/28002#issuecomment-3526715585 From jbhateja at openjdk.org Thu Nov 13 09:31:04 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Thu, 13 Nov 2025 09:31:04 GMT Subject: RFR: 8370691: Add new Float16Vector type and enable intrinsification of vector operations supported by auto-vectorizer In-Reply-To: References: Message-ID: <15AReOBUAseO-BiCWHW7N-OSOcknDc0Box3c90cXRZU=.5d7341db-94ea-4cdf-b3cd-fabe414dd88d@github.com> On Wed, 12 Nov 2025 19:48:54 GMT, Paul Sandoz wrote: > > > > Some quick comments. > > > > We should be consistent in the naming, and rename `Halfloat*` to `Float16*`. > > > > > > > > > I concur, especially since there are multiple 16-bit floating-point formats in use including the IEEE 754 float16 as well as bfloat16. > > > > > > There are nomenclature issues that I am facing. Currently, all the Float16 concrete classes use the Halffloat prefix i.e., Halffloat64Vector, Halffloat128Vector; converting these to Float16 looks a little confusing, i.e., Float1664Vector, Float16128Vector, etc Kindly suggest a better name to represent these classes. > > Maybe we move the shape to the end e.g., `Float16Vector128`, `IntVector128`, `IntVectorMax`? This looks good, since all these are concrete vector classes not exposed to users. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28002#issuecomment-3526723445 From mdoerr at openjdk.org Thu Nov 13 09:34:05 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 13 Nov 2025 09:34:05 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch [v2] In-Reply-To: References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: On Tue, 11 Nov 2025 17:50:43 GMT, Albert Mingkun Yang wrote: >> Trivial removing obsoleted code for unsupported arch. >> >> Test: tier1 > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review We have seen crashes on many platforms (including x64) while running `make run-test TEST=runtime/cds/appcds/aotClassLinking/LambdaInExcludedClass.java JTREG="VM_OPTIONS=-XX:+UseCompactObjectHeaders"`: SIGSEGV (0xb) at pc=0x00007f2f95a61e7a, pid=18554, tid=18557 V [libjvm.so+0x15bfe7a] MemAllocator::finish(HeapWordImpl**) const+0xca (klass.inline.hpp:72) V [libjvm.so+0x15c029f] ObjAllocator::initialize(HeapWordImpl**) const+0x2f (memAllocator.cpp:391) V [libjvm.so+0xb0630b] CollectedHeap::fill_with_object(HeapWordImpl**, unsigned long, bool)+0x27b (collectedHeap.cpp:491) V [libjvm.so+0x1c7a0bb] ThreadLocalAllocBuffer::retire(ThreadLocalAllocStats*)+0x11b (threadLocalAllocBuffer.cpp:118) V [libjvm.so+0x15c0b14] MemAllocator::mem_allocate_inside_tlab_slow(MemAllocator::Allocation&) const+0x84 (memAllocator.cpp:286) V [libjvm.so+0x15c13ab] MemAllocator::mem_allocate(MemAllocator::Allocation&) const+0xbb (memAllocator.cpp:340) V [libjvm.so+0x15c14f9] MemAllocator::allocate() const+0xa9 (memAllocator.cpp:353) V [libjvm.so+0x1cc052e] TypeArrayKlass::allocate_common(int, bool, JavaThread*)+0x13e (collectedHeap.inline.hpp:41) V [libjvm.so+0x16fbc98] oopFactory::new_typeArray(BasicType, int, JavaThread*)+0x38 (typeArrayKlass.hpp:51) V [libjvm.so+0x106b0f3] 
java_lang_Class::restore_archived_mirror(Klass*, Handle, Handle, Handle, JavaThread*)+0x413 (javaClasses.cpp:1246) V [libjvm.so+0x14100bc] Klass::restore_unshareable_info(ClassLoaderData*, Handle, JavaThread*)+0x66c (klass.cpp:903) V [libjvm.so+0xfe2cb1] InstanceKlass::restore_unshareable_info(ClassLoaderData*, Handle, PackageEntry*, JavaThread*)+0x81 (instanceKlass.cpp:2823) V [libjvm.so+0x1c0f5ad] SystemDictionary::preload_class(Handle, InstanceKlass*, JavaThread*)+0x1ed (systemDictionary.cpp:1198) V [libjvm.so+0x676e83] AOTLinkedClassBulkLoader::preload_classes_in_table(Array*, char const*, Handle, JavaThread*)+0x1a3 (aotLinkedClassBulkLoader.cpp:103) V [libjvm.so+0x679af5] AOTLinkedClassBulkLoader::preload_classes_impl(JavaThread*)+0x165 (aotLinkedClassBulkLoader.cpp:76) V [libjvm.so+0x67c371] AOTLinkedClassBulkLoader::preload_classes(JavaThread*)+0x11 (aotLinkedClassBulkLoader.cpp:61) V [libjvm.so+0x1d5bf30] vmClasses::resolve_all(JavaThread*)+0x3e0 (vmClasses.cpp:126) V [libjvm.so+0x1c0f28c] SystemDictionary::initialize(JavaThread*)+0x10c (systemDictionary.cpp:1623) V [libjvm.so+0x1cc74ca] Universe::genesis(JavaThread*)+0xfa (universe.cpp:451) V [libjvm.so+0x1ccbbf5] universe2_init()+0x35 (universe.cpp:1119) V [libjvm.so+0xfd5709] init_globals2()+0x9 (init.cpp:173) V [libjvm.so+0x1c926b1] Threads::create_vm(JavaVMInitArgs*, bool*)+0x3a1 (threads.cpp:622) V [libjvm.so+0x118b634] JNI_CreateJavaVM+0x54 (jni.cpp:3591) C [libjli.so+0x3d7f] JavaMain+0x8f (java.c:1506) C [libjli.so+0x7ad9] ThreadJavaMain+0x9 (java_md.c:646) ------------- PR Comment: https://git.openjdk.org/jdk/pull/28240#issuecomment-3526748535 From dlong at openjdk.org Thu Nov 13 10:05:29 2025 From: dlong at openjdk.org (Dean Long) Date: Thu, 13 Nov 2025 10:05:29 GMT Subject: RFR: 8347396: Efficient TypeFunc creations [v2] In-Reply-To: References: Message-ID: On Mon, 27 Oct 2025 05:11:47 GMT, Harshit470250 wrote: >> This PR do similar changes done by 
[JDK-8330851](https://bugs.openjdk.org/browse/JDK-8330851) on the GC TypeFunc creation as suggested by [JDK-8347396](https://bugs.openjdk.org/browse/JDK-8347396). As discussed in [https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686](https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686), I have put a guard on the Shenandoah GC specific part of the code. > > Harshit470250 has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge master > - update make_barrier_type > - Merge branch 'openjdk:master' into new_pr > - Merge branch 'openjdk:master' into new_pr > - My chages Testing passed. ------------- Marked as reviewed by dlong (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27279#pullrequestreview-3458860722 From aph at openjdk.org Thu Nov 13 10:12:05 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 13 Nov 2025 10:12:05 GMT Subject: RFR: 8371649: ZGC: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation In-Reply-To: References: Message-ID: <9xH_awrfgZ2Gxmc_G37aiOquAfDvPGs80KDvqxMs1z8=.f80e5655-3416-4c39-8bd1-5b0cc6fe5f64@github.com> On Wed, 12 Nov 2025 22:22:29 GMT, Erik Österlund wrote: > The way I look at it, the fence was there for hardware that is unsophisticated enough to require manual cache flushing instead of having cache coherency that understands instruction edits, and at the same time has unsophisticated enough fences that are not speculated across such that the buffered store hits the cache before invalidating the cache, and not after, which would be awkward. Understood. But there are two caches, and `OrderAccess::fence` does not affect icache. So `OrderAccess::fence` cannot do anything to help. In order to make sure the buffered store hits the icache we need `DSB; ISB`, which `OrderAccess::fence` does.
On the other hand, the cost of `OrderAccess::fence` is small in comparison with `ICache::invalidate_word`, so there's a question about why we're bothering to remove it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28244#issuecomment-3526953448 From ayang at openjdk.org Thu Nov 13 11:38:20 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 13 Nov 2025 11:38:20 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch [v2] In-Reply-To: References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: On Thu, 13 Nov 2025 09:31:53 GMT, Martin Doerr wrote: > ... make run-test TEST=runtime/cds/appcds/aotClassLinking/LambdaInExcludedClass.java JTREG="VM_OPTIONS=-XX:+UseCompactObjectHeaders" I suspect the crash is caused by a preexisting issue that is exposed by this patch. In `vmClasses::resolve_all`: #if INCLUDE_CDS if (CDSConfig::is_using_aot_linked_classes()) { AOTLinkedClassBulkLoader::preload_classes(THREAD); } #endif // Preload commonly used klasses vmClassID scan = vmClassID::FIRST; // first do Object, then String, Class resolve_through(VM_CLASS_ID(Object_klass), scan, CHECK); CollectedHeap::set_filler_object_klass(vmClasses::Object_klass()); The filler-klass is not initialized when `preload_classes` is invoked, but `preload_classes` uses heap allocation, which may require filler-obj. @iklam What do you think? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28240#issuecomment-3527395697 From duke at openjdk.org Thu Nov 13 11:50:52 2025 From: duke at openjdk.org (David Beaumont) Date: Thu, 13 Nov 2025 11:50:52 GMT Subject: RFR: 8371048: ImageFileReader::open fails to check return value of osSupport::map_memory In-Reply-To: References: Message-ID: On Fri, 31 Oct 2025 14:00:38 GMT, Justin King wrote: > Check whether `osSupport::map_memory` actually succeeded in all compilation modes, instead of crashing shortly after in non-debug builds.
Ideally we should fall back to just reading the entire file into memory manually or use seek+read, but this is good enough for now to avoid crashing. Returning false here isn't an immediate hard-fail (unlike the assertion), so there is now a situation where `mmap()` failing will allow the VM startup code to continue running for longer than before. In particular, having this return false means that `ClassLoader::lookup_vm_options()` returns false, and skips the parsing of options from the jimage file. I'm not 100% clear if it would later fail, or just attempt to run in "exploded" mode, possibly yielding an unexpected/undefined state. There are two methods in `ClassPathImageEntry` related to this: `ClassPathImageEntry::jimage_non_null()` which will assert that the image was opened. `ClassPathImageEntry::jimage()` which will not. Depending on who calls these, and in what order, the JVM startup might now reach code it wouldn't have before when/if `mmap()` failed. However this is no different to other ways in which the jimage open code can return false (esp. just not having the file there) so I don't think this slight change in possible behaviour incurs any more risk of getting into an odd state than was already present. ------------- PR Review: https://git.openjdk.org/jdk/pull/28087#pullrequestreview-3459404369 From eastigeevich at openjdk.org Thu Nov 13 11:59:55 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Thu, 13 Nov 2025 11:59:55 GMT Subject: RFR: 8371649: ZGC: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 22:22:29 GMT, Erik Österlund wrote: >> Hi Erik (@fisk), >> Could you also please take a look, just in case the fence was intentionally put there? > >> Hi Erik (@fisk), >> >> Could you also please take a look, just in case the fence was intentionally put there?
> > The way I look at it, the fence was there for hardware that is unsophisticated enough to require manual cache flushing instead of having cache coherency that understands instruction edits, and at the same time has unsophisticated enough fences that are not speculated across such that the buffered store hits the cache before invalidating the cache, and not after, which would be awkward. > > It is certainly possible that in practice the cache invalidation facilities also do the right level of fencing. So this is mostly just defensive programming. > > If I flip the question around - how confident do you feel on a scale from 1 to 10 that the cache invalidation mechanism guarantees across all implementations, that the preceding store is flushed out to the caches before the cache is flushed? This is an area of the code where I don't want to take chances and slip unless we feel a high level of confidence. @fisk: > The way I look at it, the fence was there for hardware that is unsophisticated enough to require manual cache flushing instead of having cache coherency that understands instruction edits, and at the same time has unsophisticated enough fences that are not speculated across such that the buffered store hits the cache before invalidating the cache, and not after, which would be awkward. Such hardware would violate Arm ARM: - B2.3 Ordering requirements defined by the formal concurrency model: Same-Cache-Line-ordered-before. - B2.7.4.2 Synchronization and coherency issues between data and instruction accesses. Yes, it might be a bug in hardware. Neoverse-N1 errata 1542419 is a good example. In such a case, this bug should be handled inside `ICache`, if possible. If not possible, there should be something explaining why it's not in `ICache`. > It is certainly possible that in practice the cache invalidation facilities also do the right level of fencing. So this is mostly just defensive programming. 
The issue is that we don't have this fence in other places where `ICache::invalidate` is used. A reasonable question would be: why don't we use it there? > If I flip the question around - how confident do you feel on a scale from 1 to 10 that the cache invalidation mechanism guarantees across all implementations, that the preceding store is flushed out to the caches before the cache is flushed? 10, if we assume a correct AArch64 implementation and a correct `ICache::invalidate`. @theRealAph > On the other hand, the cost of OrderAccess::fence is small in comparison with `ICache::invalidate_word`, so there's a question about why we're bothering to remove it. This change is not about performance. It's about logical inconsistency: not using this everywhere, absence of history and contradiction of the Arm ARM. Also, an assumption of a needed fence complicates a fix of [JDK-8370947](https://bugs.openjdk.org/browse/JDK-8370947). See `ICacheInvalidationContext::fence` in Alex's solution: https://github.com/openjdk/jdk/compare/master...xmas92:jdk:deferred_icache_invalidation ------------- PR Comment: https://git.openjdk.org/jdk/pull/28244#issuecomment-3527471862 From alanb at openjdk.org Thu Nov 13 12:19:46 2025 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 13 Nov 2025 12:19:46 GMT Subject: RFR: 8371048: ImageFileReader::open fails to check return value of osSupport::map_memory In-Reply-To: References: Message-ID: On Fri, 31 Oct 2025 14:00:38 GMT, Justin King wrote: > Check whether `osSupport::map_memory` actually succeeded in all compilation modes, instead of crashing shortly after in non-debug builds. Ideally we should fall back to just reading the entire file into memory manually or use seek+read, but this is good enough for now to avoid crashing. In an images build, the VM has to fail during startup if the jimage file cannot be opened. Maybe @jcking could paste in the output from both release and fastdebug builds.
It may be that it trips an assert earlier with debug builds. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28087#issuecomment-3527546988 From alanb at openjdk.org Thu Nov 13 13:31:52 2025 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 13 Nov 2025 13:31:52 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v11] In-Reply-To: References: Message-ID: <1QetBwPtcolO6_qPQSSRdSofa79BD4UNn9Ri68DGryo=.3dc36aee-0e70-4f88-a5e9-0ac0de3ed121@github.com> On Wed, 12 Nov 2025 12:51:37 GMT, Viktor Klang wrote: >> Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 50 commits: >> >> - Remove dup end body tag >> - Change FinalFieldMutationEvent so that caller is top frame in stack trace >> - Merge branch 'master' into JDK-8353835 >> - Review feedback: Add tests for setting internal properties, improve links in Mutation methods page >> - Merge branch 'master' into JDK-8353835 >> - Merge branch 'master' into JDK-8353835 >> - Fix typo in test comment >> - Merge branch 'master' into JDK-8353835 >> - Merge branch 'master' into JDK-8353835 >> - Suppress warnings from some tests >> - ... and 40 more: https://git.openjdk.org/jdk/compare/2902436f...b22947c7 > > src/java.base/share/classes/java/lang/reflect/Field.java line 1543: > >> 1541: * the given possibly-null caller. >> 1542: */ >> 1543: private String finalFieldMutationWarning(Class caller, boolean unreflect) { > > It may make sense to have this method return a StringBuilder instance (and use it internally before returning it) as that would cut down on extra String-instance creation. We could but this is the slow path that prints the warning on the first final field mutation. So I think I'll keep it as simple as possible. > src/java.base/share/man/java.md line 482: > >> 480: >> 481: - `allow`: This mode allows illegal final field mutation in all modules, >> 482: without any warings. 
> > Suggestion: > > without any warnings. Well spotted, this text was copied down from `--illegal-native-access` so I'll fix the typo in both places. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2523476042 PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2523469262 From duke at openjdk.org Thu Nov 13 14:03:09 2025 From: duke at openjdk.org (David Beaumont) Date: Thu, 13 Nov 2025 14:03:09 GMT Subject: RFR: 8371048: ImageFileReader::open fails to check return value of osSupport::map_memory In-Reply-To: References: Message-ID: <30va--sx1H0YORpDccz1-rcaBpHX8znPZxuuMOhVc88=.758667f5-a4e4-4404-a539-053c95350cf3@github.com> On Fri, 31 Oct 2025 14:00:38 GMT, Justin King wrote: > Check whether `osSupport::map_memory` actually succeeded in all compilation modes, instead of crashing shortly after in non-debug builds. Ideally we should fall back to just reading the entire file into memory manually or use seek+read, but this is good enough for now to avoid crashing. If it must fail if the jimage file cannot be opened, that should mean it must fail when the file is not present, unless the existence of the file is what defines "an images build". If "is an images build" is meant to be a synonym for "the jimage file exists on disk", then this change might create a case where one observer decides "this is an images build" (by looking for the file) but the JVM behaves as if it's an "exploded" build and loads classes from elsewhere (though equally, this might not be possible based on other code paths). The jimage file existing on disk is something that the JVM build/installation controls, but whether mmap works is something the underlying system controls. This change conflates the two things in terms of subsequent behaviour, and I'm not 100% sure that's what we want.
------------- PR Comment: https://git.openjdk.org/jdk/pull/28087#issuecomment-3527950696 From tschatzl at openjdk.org Thu Nov 13 14:17:00 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 13 Nov 2025 14:17:00 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch [v2] In-Reply-To: References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: On Tue, 11 Nov 2025 17:50:43 GMT, Albert Mingkun Yang wrote: >> Trivial removing obsoleted code for unsupported arch. >> >> Test: tier1 > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review Change looks good, but these AOT-related crashes should be fixed first. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28240#pullrequestreview-3460006822 From kevinw at openjdk.org Thu Nov 13 14:29:16 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Thu, 13 Nov 2025 14:29:16 GMT Subject: RFR: 8368527: JMX: Add an MXBeans method to query GC CPU time [v10] In-Reply-To: References: Message-ID: On Wed, 29 Oct 2025 18:08:31 GMT, Jonas Norlinder wrote: >> Hi all, >> >> This PR augments the CPU time sampling measurement capabilities that a user can perform from Java code with the addition of `MemoryMXBean.getGcCpuTime()`. With this patch it will be possible for a user to measure process and GC CPU time during critical section or iterations in benchmarks to name a few. This new method complements the existing `OperatingSystemMXBean.getProcessCpuTime()` for a refined understanding. >> >> `CollectedHeap::gc_threads_do` may operate on terminated GC threads during shutdown, but thanks to JDK-8366865 by @walulyai we can piggyback on the new `Universe::is_shutting_down`. 
I have implemented a stress-test `test/jdk/java/lang/management/MemoryMXBean/GetGcCpuTime.java` that may identify reading CPU time of terminated threads. Synchronizing on `Universe::is_shutting_down` and `Heap_lock` resolves this problem. >> >> FWIW; To my understanding we don't want to add a `Universe::is_shutting_down` check in gc_threads_do as this may introduce a performance penalty that is unacceptable, therefore we must be careful about the few places where external users call upon gc_threads_do and may race with a terminating VM. >> >> Tested: test/jdk/java/lang/management/MemoryMXBean/GetGcCpuTime.java, jdk/javax/management/mxbean hotspot/jtreg/vmTestbase/nsk/monitoring on Linux x64, Linux aarch64, Windows x64, macOS x64 and macOS aarch64 with release and fastdebug. > > Jonas Norlinder has updated the pull request incrementally with one additional commit since the last revision: > > Fix phohensee review comments Looks good. Just a question on whether we really need to specify the GCs in the tests. I was hoping we can not do this, it seems like more to maintain that should be automatic as long as we do test batches with the different collectors. Do we need available GCs as specified in the tests get run on every test, or can we afford to let that be decided by the framework? (There might be some tests where we do want to make sure they run with all GCs regardless of what jtreg command runs, but I would hope that is a small set.) ------------- Marked as reviewed by kevinw (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/27537#pullrequestreview-3460067101 From duke at openjdk.org Thu Nov 13 14:30:42 2025 From: duke at openjdk.org (Ivan) Date: Thu, 13 Nov 2025 14:30:42 GMT Subject: RFR: 8363943: ARM32: Represent Registers as values [v2] In-Reply-To: References: Message-ID: <9i_KM_P6pgJgezDQVfEsdn9RI9-v5Ot-Ud0yg0z8ADA=.11e06d36-b0d6-49d1-a78d-489e539997ac@github.com> On Wed, 12 Nov 2025 18:43:20 GMT, Aleksey Shipilev wrote: > What kind of testing did you do? I ran tier1/tier2 tests and checked whether the issues related to linked bugs ([JDK-8330612](https://bugs.openjdk.org/browse/JDK-8330612), [JDK-8347071](https://bugs.openjdk.org/browse/JDK-8347071)) disappeared. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26525#issuecomment-3528079629 From alanb at openjdk.org Thu Nov 13 14:34:28 2025 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 13 Nov 2025 14:34:28 GMT Subject: RFR: 8371048: ImageFileReader::open fails to check return value of osSupport::map_memory In-Reply-To: References: Message-ID: On Fri, 31 Oct 2025 14:00:38 GMT, Justin King wrote: > Check whether `osSupport::map_memory` actually succeeded in all compilation modes, instead of crashing shortly after in non-debug builds. Ideally we should fall back to just reading the entire file into memory manually or use seek+read, but this is good enough for now to avoid crashing. ClassLoader::setup_bootstrap_search_path_impl, and first use of ModuleFinder.ofSystem in early startup, attempt to stat lib/modules to determine if this is an images build. An images build is what jlink produces and always has a lib/modules file. When developers download a JDK it is an images build. In the JDK build itself there is an immediate "exploded" build but that isn't really used after the JDK is fully built. It would be good to exercise this code to see how VM startup behaves when JIMAGE_Open returns NULL. It's possible that the fastdebug build will trip on an assert early.
For the exercise then it would be good to check both release and fastdebug builds. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28087#issuecomment-3528098276 From aph at openjdk.org Thu Nov 13 15:13:35 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 13 Nov 2025 15:13:35 GMT Subject: RFR: 8371649: ZGC: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation In-Reply-To: <9xH_awrfgZ2Gxmc_G37aiOquAfDvPGs80KDvqxMs1z8=.f80e5655-3416-4c39-8bd1-5b0cc6fe5f64@github.com> References: <9xH_awrfgZ2Gxmc_G37aiOquAfDvPGs80KDvqxMs1z8=.f80e5655-3416-4c39-8bd1-5b0cc6fe5f64@github.com> Message-ID: <5K-KvC57lCxzD-IGsaYR1MiSCjcBXUgJx2L8piup04o=.b250339d-4a3d-4e5a-a1b0-fcf0584e5e7a@github.com> On Thu, 13 Nov 2025 10:09:42 GMT, Andrew Haley wrote: >>> Hi Erik (@fisk), >>> >>> Could you also please take a look, just in case the fence was intentionally put there? >> >> The way I look at it, the fence was there for hardware that is unsophisticated enough to require manual cache flushing instead of having cache coherency that understands instruction edits, and at the same time has unsophisticated enough fences that are not speculated across such that the buffered store hits the cache before invalidating the cache, and not after, which would be awkward. >> >> It is certainly possible that in practice the cache invalidation facilities also do the right level of fencing. So this is mostly just defensive programming. >> >> If I flip the question around - how confident do you feel on a scale from 1 to 10 that the cache invalidation mechanism guarantees across all implementations, that the preceding store is flushed out to the caches before the cache is flushed? This is an area of the code where I don't want to take chances and slip unless we feel a high level of confidence. 
> >> The way I look at it, the fence was there for hardware that is unsophisticated enough to require manual cache flushing instead of having cache coherency that understands instruction edits, and at the same time has unsophisticated enough fences that are not speculated across such that the buffered store hits the cache before invalidating the cache, and not after, which would be awkward. > > [edited] > > Understood. But there are two caches, and `OrderAccess::fence` does not affect icache. So `OrderAccess::fence` cannot do anything to help. in order to make sure the buffered store hits the icache we need `DSB; ISB`, which `ICache::invalidate` does. > > On the other hand, the cost of `OrderAccess::fence` is small in comparison with `ICache::invalidate_word`, so there's a question about why we're bothering to remove it. > @theRealAph > > > On the other hand, the cost of OrderAccess::fence is small in comparison with `ICache::invalidate_word`, so there's a question about why we're bothering to remove it. > > This change is not about performance. It's about logical inconsistency: not using this everywhere, absence of history and contradiction to Arm ARM. I see. So there is little or no performance benefit, but we're doing this for reasons of formal consistency. > Also, an assumption of a needed fence complicates a fix of [JDK-8370947](https://bugs.openjdk.org/browse/JDK-8370947). See `ICacheInvalidationContext::fence` in Alex's solution: [master...xmas92:jdk:deferred_icache_invalidation](https://github.com/openjdk/jdk/compare/master...xmas92:jdk:deferred_icache_invalidation) I guess so, but there is no assumption of a needed fence. A question is whether some future Arm system with fully-coherent i- and d-caches might want to supply a weaker version of `ICache::invalidate_word`. But even if it did, it would at the very least have to be a `DMB`, so there isn't an issue. 
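To illustrate the store-then-invalidate ordering discussed above, here is a minimal sketch. It is not HotSpot's `ICache` code: `patch_instruction_word` is a hypothetical helper invented for this illustration, and it assumes a GCC or Clang toolchain, whose `__builtin___clear_cache` builtin is expected to perform the required cache maintenance and barriers on AArch64 without a separate fence in front of it.

```cpp
#include <cstdint>

// Hypothetical helper, not a HotSpot function: store a new 32-bit
// instruction word, then make the change visible to instruction fetch.
inline void patch_instruction_word(uint32_t* insn_addr, uint32_t new_insn) {
  *insn_addr = new_insn;  // ordinary store through the data side

  char* begin = reinterpret_cast<char*>(insn_addr);
  // GCC/Clang builtin: flushes the instruction cache for [begin, end).
  // No explicit fence is issued before it in this sketch.
  __builtin___clear_cache(begin, begin + sizeof(uint32_t));
}
```

As in the thread, the sketch assumes that no other thread executes the word while it is being patched.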
------------- PR Comment: https://git.openjdk.org/jdk/pull/28244#issuecomment-3528240337 From aph at openjdk.org Thu Nov 13 15:13:36 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 13 Nov 2025 15:13:36 GMT Subject: RFR: 8371649: ZGC: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation In-Reply-To: References: Message-ID: <3jXlz4RL0lU0eh-q_rpkh_79yErfSrwK19AZTD5HGrY=.1bcbee05-4887-4d89-a1e9-b279cc76d0ed@github.com> On Wed, 12 Nov 2025 22:22:29 GMT, Erik Österlund wrote: >> Hi Erik (@fisk), >> Could you also please take a look, just in case the fence was intentionally put there? > >> Hi Erik (@fisk), >> >> Could you also please take a look, just in case the fence was intentionally put there? > > The way I look at it, the fence was there for hardware that is unsophisticated enough to require manual cache flushing instead of having cache coherency that understands instruction edits, and at the same time has unsophisticated enough fences that are not speculated across such that the buffered store hits the cache before invalidating the cache, and not after, which would be awkward. > > It is certainly possible that in practice the cache invalidation facilities also do the right level of fencing. So this is mostly just defensive programming. > > If I flip the question around - how confident do you feel on a scale from 1 to 10 that the cache invalidation mechanism guarantees across all implementations, that the preceding store is flushed out to the caches before the cache is flushed? This is an area of the code where I don't want to take chances and slip unless we feel a high level of confidence. @fisk, I'm assuming that no other thread is executing the target instructions while we're patching them.
------------- PR Comment: https://git.openjdk.org/jdk/pull/28244#issuecomment-3528258542 From lmesnik at openjdk.org Thu Nov 13 16:33:03 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 13 Nov 2025 16:33:03 GMT Subject: RFR: 8371083: FollowReferences reports non-class objects as JVMTI_HEAP_REFERENCE_SYSTEM_CLASS In-Reply-To: <6rHIf2ptpmqaSNTLtKNv6bZSQCENxBpxnk3i193hOHE=.dbc266c3-e9b5-4c27-ac5a-5d9ca399ea87@github.com> References: <6rHIf2ptpmqaSNTLtKNv6bZSQCENxBpxnk3i193hOHE=.dbc266c3-e9b5-4c27-ac5a-5d9ca399ea87@github.com> Message-ID: On Mon, 10 Nov 2025 20:54:56 GMT, Alex Menkov wrote: > FollowReferences with null initial_object starts heap walking from "heap roots", which include system classes. > All oops from ClassLoaderDataGraph are reported with JVMTI_HEAP_REFERENCE_SYSTEM_CLASS kind, but some of the objects are not classes. > The fix updates FollowReferences to report non-class objects from ClassLoaderDataGraph as JVMTI_HEAP_REFERENCE_OTHER > > Testing: tier1..4,hs-tier5-svc Changes requested by lmesnik (Reviewer). test/hotspot/jtreg/serviceability/jvmti/FollowReferences/KindSystemClass/libKindSystemClass.cpp line 43: > 41: switch (reference_kind) { > 42: case JVMTI_HEAP_REFERENCE_SYSTEM_CLASS: > 43: *tag_ptr = ++class_counter; The callback is executed on VMThread, so counters should be atomic or protected by monitors. 
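For illustration, the atomic-counter variant suggested in this comment could be sketched as below. This is not the test's actual code: `RefKind`, `tag_reference` and `other_counter` are hypothetical stand-ins for the JVMTI reference kind, the heap callback and a second counter, and `std::atomic` is simply one way to make the increment safe if callbacks could ever run concurrently.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for jvmtiHeapReferenceKind.
enum RefKind { SYSTEM_CLASS, OTHER };

// Counters shared by all callback invocations.
static std::atomic<int> class_counter{0};
static std::atomic<int> other_counter{0};

// Hypothetical stand-in for the heap reference callback: tags each
// system-class reference with a unique, increasing number.
void tag_reference(RefKind kind, int64_t* tag_ptr) {
  switch (kind) {
    case SYSTEM_CLASS:
      // fetch_add returns the old value, so +1 mirrors ++class_counter.
      *tag_ptr = class_counter.fetch_add(1, std::memory_order_relaxed) + 1;
      break;
    default:
      other_counter.fetch_add(1, std::memory_order_relaxed);
      break;
  }
}
```

Whether this is needed depends on whether the callback can actually be invoked from more than one thread at a time; with a single writer and reads that happen only after the walk completes, a plain `int` behaves the same.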
------------- PR Review: https://git.openjdk.org/jdk/pull/28224#pullrequestreview-3460651314 PR Review Comment: https://git.openjdk.org/jdk/pull/28224#discussion_r2524138908 From iklam at openjdk.org Thu Nov 13 17:04:26 2025 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 13 Nov 2025 17:04:26 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch [v2] In-Reply-To: References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: <9o8TjNxCbwshwMuqpS-CyyhwNciw1WQ6w1_ijy39DEc=.fe11d68c-f925-4c07-9c46-42c9093a5448@github.com> On Thu, 13 Nov 2025 11:35:04 GMT, Albert Mingkun Yang wrote: > The filler-klass is not initialized when `preload_classes` is invoked, but `preload_classes` uses heap allocation, which may require filler-obj. > > @iklam What do you think? I am working on a fix now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28240#issuecomment-3528798525 From macarte at openjdk.org Thu Nov 13 17:52:32 2025 From: macarte at openjdk.org (Mat Carter) Date: Thu, 13 Nov 2025 17:52:32 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v5] In-Reply-To: References: Message-ID: > Add jdk.management.AOTCacheMXBean. The interface provides a single action that, when called, will cause any hosted JVM currently recording AOT information to stop recording. Existing functionality is preserved: when stopped the JVM will create the required artifacts based on the execution mode. Conveniently as the application running on the JVM has not stopped (as was previously the only way to stop recording), the application will resume execution after the artifacts have been generated. > > The interface will return TRUE if a recording was successfully stopped, in all other cases (not recording etc.)
will return FALSE > > It follows that invoking the action on a JVM that is recording, twice in succession, should (barring internal errors) produce the following two responses: > > TRUE > FALSE > > Passes tier1 on linux (x64) and windows (x64) Mat Carter has updated the pull request incrementally with one additional commit since the last revision: Adding test to validate using DiagnosticCommand MBean to invoke AOT.end_recording ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28010/files - new: https://git.openjdk.org/jdk/pull/28010/files/d48a200f..bff7cb74 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28010&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28010&range=03-04 Stats: 129 lines in 1 file changed: 129 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28010.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28010/head:pull/28010 PR: https://git.openjdk.org/jdk/pull/28010 From macarte at openjdk.org Thu Nov 13 18:04:04 2025 From: macarte at openjdk.org (Mat Carter) Date: Thu, 13 Nov 2025 18:04:04 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v6] In-Reply-To: References: Message-ID: > Add jdk.management.AOTCacheMXBean. The interface provides a single action that, when called, will cause any hosted JVM currently recording AOT information to stop recording. Existing functionality is preserved: when stopped the JVM will create the required artifacts based on the execution mode. Conveniently as the application running on the JVM has not stopped (as was previously the only way to stop recording), the application will resume execution after the artifacts have been generated. > > The interface will return TRUE if a recording was successfully stopped, in all other cases (not recording etc.)
will return FALSE > > It follows that invoking the action on a JVM that is recording, twice in succession, should (barring internal errors) produce the following two responses: > > TRUE > FALSE > > Passes tier1 on linux (x64) and windows (x64) Mat Carter has updated the pull request incrementally with one additional commit since the last revision: Revert "Adding test to validate using DiagnosticCommand MBean to invoke AOT.end_recording" Commit was intended for parent branch (that this branch is based on) This reverts commit bff7cb7408554232c13a57bba10b67a9fd19b811. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28010/files - new: https://git.openjdk.org/jdk/pull/28010/files/bff7cb74..6a100586 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28010&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28010&range=04-05 Stats: 129 lines in 1 file changed: 0 ins; 129 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28010.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28010/head:pull/28010 PR: https://git.openjdk.org/jdk/pull/28010 From macarte at openjdk.org Thu Nov 13 18:58:53 2025 From: macarte at openjdk.org (Mat Carter) Date: Thu, 13 Nov 2025 18:58:53 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v7] In-Reply-To: References: Message-ID: > Add jdk.management.AOTCacheMXBean. The interface provides a single action that, when called, will cause any hosted JVM currently recording AOT information to stop recording. Existing functionality is preserved: when stopped the JVM will create the required artifacts based on the execution mode. Conveniently as the application running on the JVM has not stopped (as was previously the only way to stop recording), the application will resume execution after the artifacts have been generated. > > The interface will return TRUE if a recording was successfully stopped, in all other cases (not recording etc.)
will return FALSE > > It follows that invoking the action on a JVM that is recording, twice in succession, should (barring internal errors) produce the following two responses: > > TRUE > FALSE > > Passes tier1 on linux (x64) and windows (x64) Mat Carter has updated the pull request incrementally with one additional commit since the last revision: Incorporate changes from the CSR ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28010/files - new: https://git.openjdk.org/jdk/pull/28010/files/6a100586..b97a799f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28010&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28010&range=05-06 Stats: 67 lines in 1 file changed: 4 ins; 13 del; 50 mod Patch: https://git.openjdk.org/jdk/pull/28010.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28010/head:pull/28010 PR: https://git.openjdk.org/jdk/pull/28010 From amenkov at openjdk.org Thu Nov 13 19:21:19 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Thu, 13 Nov 2025 19:21:19 GMT Subject: RFR: 8371083: FollowReferences reports non-class objects as JVMTI_HEAP_REFERENCE_SYSTEM_CLASS In-Reply-To: References: <6rHIf2ptpmqaSNTLtKNv6bZSQCENxBpxnk3i193hOHE=.dbc266c3-e9b5-4c27-ac5a-5d9ca399ea87@github.com> Message-ID: On Thu, 13 Nov 2025 16:30:28 GMT, Leonid Mesnik wrote: >> FollowReferences with null initial_object starts heap walking from "heap roots", which include system classes. >> All oops from ClassLoaderDataGraph are reported with JVMTI_HEAP_REFERENCE_SYSTEM_CLASS kind, but some of the objects are not classes.
>> The fix updates FollowReferences to report non-class objects from ClassLoaderDataGraph as JVMTI_HEAP_REFERENCE_OTHER >> >> Testing: tier1..4,hs-tier5-svc > > test/hotspot/jtreg/serviceability/jvmti/FollowReferences/KindSystemClass/libKindSystemClass.cpp line 43: > >> 41: switch (reference_kind) { >> 42: case JVMTI_HEAP_REFERENCE_SYSTEM_CLASS: >> 43: *tag_ptr = ++class_counter; > > The callback is executed on VMThread, so counters should be atomic or protected by monitors. Not sure I follow. There is no concurrent access to the variables. FollowReferences is executed at a safepoint (so the counters are updated by a single thread). The values are read after FollowReferences returns (i.e. after the safepoint). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28224#discussion_r2524652595 From eosterlund at openjdk.org Thu Nov 13 19:39:04 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 13 Nov 2025 19:39:04 GMT Subject: RFR: 8371649: ZGC: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 22:22:29 GMT, Erik Österlund wrote: >> Hi Erik (@fisk), >> Could you also please take a look, just in case the fence was intentionally put there? > >> Hi Erik (@fisk), >> >> Could you also please take a look, just in case the fence was intentionally put there? > > The way I look at it, the fence was there for hardware that is unsophisticated enough to require manual cache flushing instead of having cache coherency that understands instruction edits, and at the same time has unsophisticated enough fences that are not speculated across such that the buffered store hits the cache before invalidating the cache, and not after, which would be awkward. > > It is certainly possible that in practice the cache invalidation facilities also do the right level of fencing. So this is mostly just defensive programming.
> > If I flip the question around - how confident do you feel on a scale from 1 to 10 that the cache invalidation mechanism guarantees across all implementations, that the preceding store is flushed out to the caches before the cache is flushed? This is an area of the code where I don't want to take chances and slip unless we feel a high level of confidence. > @fisk, I'm assuming that no other thread is executing the target instructions while we're patching them. Indeed; no concurrent thread is executing the instructions being modified. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28244#issuecomment-3529410356 From vpaprotski at openjdk.org Thu Nov 13 19:40:08 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Thu, 13 Nov 2025 19:40:08 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 16:38:49 GMT, Volodymyr Paprotski wrote: > - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline > - `SignatureBench.MLDSA` is 1.2x-2.2x faster > - Note: there are no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) > - AVX512 intrinsic improvements are 1.24x-1.5x faster than the current version > - `SignatureBench.MLDSA` is up to 5% faster, never slower > > Note on intrinsic: > - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill.
> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 > > Tests and benchmarks: > - Added a fuzz test to ensure Java and intrinsic produces exactly same result > - Added benchmark to measure the performance of intrinsic itself > > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" @ferakocz @ascarpino when you can spare some time, would appreciate a review (would like to get this into 26 if possible..) ------------- PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3529414018 From amenkov at openjdk.org Thu Nov 13 19:42:00 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Thu, 13 Nov 2025 19:42:00 GMT Subject: RFR: 8371083: FollowReferences reports non-class objects as JVMTI_HEAP_REFERENCE_SYSTEM_CLASS [v2] In-Reply-To: <6rHIf2ptpmqaSNTLtKNv6bZSQCENxBpxnk3i193hOHE=.dbc266c3-e9b5-4c27-ac5a-5d9ca399ea87@github.com> References: <6rHIf2ptpmqaSNTLtKNv6bZSQCENxBpxnk3i193hOHE=.dbc266c3-e9b5-4c27-ac5a-5d9ca399ea87@github.com> Message-ID: > FollowReferences with null initial_object starts heap walking from "heap roots", which include system classes. > All oops from ClassLoaderDataGraph are reported with JVMTI_HEAP_REFERENCE_SYSTEM_CLASS kind, but some of the objects are not classes. 
> The fix updates FollowReferences to report non-class objects from ClassLoaderDataGraph as JVMTI_HEAP_REFERENCE_OTHER > > Testing: tier1..4,hs-tier5-svc Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: feedback ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28224/files - new: https://git.openjdk.org/jdk/pull/28224/files/1ee62064..efa0b538 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28224&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28224&range=00-01 Stats: 4 lines in 2 files changed: 0 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28224.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28224/head:pull/28224 PR: https://git.openjdk.org/jdk/pull/28224 From amenkov at openjdk.org Thu Nov 13 19:46:33 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Thu, 13 Nov 2025 19:46:33 GMT Subject: RFR: 8371083: FollowReferences reports non-class objects as JVMTI_HEAP_REFERENCE_SYSTEM_CLASS [v2] In-Reply-To: References: <6rHIf2ptpmqaSNTLtKNv6bZSQCENxBpxnk3i193hOHE=.dbc266c3-e9b5-4c27-ac5a-5d9ca399ea87@github.com> Message-ID: <7wURnzxhUcvMAvFBs1-ADLcdjSxsAs53lfH-M-6eHKA=.74ffc3ca-c15f-4bdc-a640-4fe50869c36c@github.com> On Wed, 12 Nov 2025 23:32:29 GMT, Serguei Spitsyn wrote: >> Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: >> >> feedback > > src/hotspot/share/prims/jvmtiTagMap.cpp line 2193: > >> 2191: }; >> 2192: >> 2193: // A supporting closure used to process ClassLoaderData roots > > Nit: Need dot at the end of comment. Fixed > test/hotspot/jtreg/serviceability/jvmti/FollowReferences/KindSystemClass/libKindSystemClass.cpp line 79: > >> 77: >> 78: for (int i = 0; i < class_counter; i++) { >> 79: tags[i] = i+1; > > Nit: Need spaces around `+` sign. Fixed. 
> test/hotspot/jtreg/serviceability/jvmti/FollowReferences/KindSystemClass/libKindSystemClass.cpp line 101: > >> 99: } >> 100: >> 101: > > Nit: Unneeded extra empty line. removed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28224#discussion_r2524717090 PR Review Comment: https://git.openjdk.org/jdk/pull/28224#discussion_r2524716711 PR Review Comment: https://git.openjdk.org/jdk/pull/28224#discussion_r2524717427 From psandoz at openjdk.org Thu Nov 13 19:51:03 2025 From: psandoz at openjdk.org (Paul Sandoz) Date: Thu, 13 Nov 2025 19:51:03 GMT Subject: RFR: 8370691: Add new Float16Vector type and enable intrinsification of vector operations supported by auto-vectorizer In-Reply-To: <8hStIcvp252Ik7raxZL5BvFKKkXTflorjyOD9Cyakvc=.c5d1b302-5c49-46b1-91ba-2feda2e6a746@github.com> References: <8hStIcvp252Ik7raxZL5BvFKKkXTflorjyOD9Cyakvc=.c5d1b302-5c49-46b1-91ba-2feda2e6a746@github.com> Message-ID: On Thu, 13 Nov 2025 09:25:34 GMT, Jatin Bhateja wrote: > > The basic type codes are declared and shared across Java and HotSpot - it's used in `LaneType`. Can we pass a single argument that is the basic type instead of two arguments. HotSpot should know from the basic type what the carrier class and also what the operation type without it being explicitly told, since presumably it knew the inverse - the basic type from the element class. > > Hi @PaulSandoz, T_HALFFLOAT used in LaneType is mainly used for differentiation of various cache keys used by conversion operation lookups. In principle, we can extend VM to acknowledge this new custom basic type on the lines of T_METADATA / T_ADDRESS; its scope for now will be restricted to VectorSupport. We can gradually expose this to C2 type, such that TypeVect for all Float16 VectorIR uses T_HALFFLOAT as its basic type; currently, we use T_SHORT as the lane type. Let me know if this looks reasonable I am proposing something simpler, really as a temporary step until `Float16` becomes part of the `java.base` module. 
IIUC from the basic type we can reliably determine what the two arguments we currently passing are e.g., T_HALFFLOAT = { short.class, VECTOR_TYPE_FP16 }. So we don't need to pass two arguments, we can just pass one, the intrinsic can lookup the class and operation type kind. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28002#issuecomment-3529452461 From lmesnik at openjdk.org Thu Nov 13 19:53:12 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 13 Nov 2025 19:53:12 GMT Subject: RFR: 8371083: FollowReferences reports non-class objects as JVMTI_HEAP_REFERENCE_SYSTEM_CLASS [v2] In-Reply-To: References: <6rHIf2ptpmqaSNTLtKNv6bZSQCENxBpxnk3i193hOHE=.dbc266c3-e9b5-4c27-ac5a-5d9ca399ea87@github.com> Message-ID: <_yYrCkPQbFXAsI5lfc4ErWuwr585DFf_5h2alB5nF5E=.2beeda93-1e9e-4694-9188-302e72c52f05@github.com> On Thu, 13 Nov 2025 19:42:00 GMT, Alex Menkov wrote: >> FollowReferences with null initial_object starts heap walking from "heap roots", which include system classes. >> All oops from ClassLoaderDataGraph are reported with JVMTI_HEAP_REFERENCE_SYSTEM_CLASS kind, but some of the objects are not classes. >> The fix updates FollowReferences to report non-class objects from ClassLoaderDataGraph as JVMTI_HEAP_REFERENCE_OTHER >> >> Testing: tier1..4,hs-tier5-svc > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > feedback Marked as reviewed by lmesnik (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/28224#pullrequestreview-3461421197 From lmesnik at openjdk.org Thu Nov 13 19:53:14 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 13 Nov 2025 19:53:14 GMT Subject: RFR: 8371083: FollowReferences reports non-class objects as JVMTI_HEAP_REFERENCE_SYSTEM_CLASS [v2] In-Reply-To: References: <6rHIf2ptpmqaSNTLtKNv6bZSQCENxBpxnk3i193hOHE=.dbc266c3-e9b5-4c27-ac5a-5d9ca399ea87@github.com> Message-ID: On Thu, 13 Nov 2025 19:18:38 GMT, Alex Menkov wrote: >> test/hotspot/jtreg/serviceability/jvmti/FollowReferences/KindSystemClass/libKindSystemClass.cpp line 43: >> >>> 41: switch (reference_kind) { >>> 42: case JVMTI_HEAP_REFERENCE_SYSTEM_CLASS: >>> 43: *tag_ptr = ++class_counter; >> >> The callback is executed on VMThread, so counters should be atomic or protected by monitors. > > Not sure I follow. There is no concurrent access to the variables. > FollowReferences is executed at safepoint (so the counters are updated by single thread). The values are read after FollowReference returns (i.e. after the safepoint) Thanks for the explanation. Then there is no need to synchronize. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28224#discussion_r2524733585 From macarte at openjdk.org Thu Nov 13 19:55:24 2025 From: macarte at openjdk.org (Mat Carter) Date: Thu, 13 Nov 2025 19:55:24 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v8] In-Reply-To: References: Message-ID: > Add jdk.management.AOTCacheMXBean. The interface provides a single action that, when called, will cause any hosted JVM currently recording AOT information to stop recording. Existing functionality is preserved: when stopped the JVM will create the required artifacts based on the execution mode. Conveniently as the application running on the JVM has not stopped (as was previously the only way to stop recording), the application will resume execution after the artifacts have been generated.
> > The interface will return TRUE if a recording was successfully stopped, in all other cases (not recording etc.) will return FALSE > > It follows that invoking the action on a JVM that is recording, twice in succession, should (barring internal errors) produce the following two responses: > > TRUE > FALSE > > Passes tier1 on linux (x64) and windows (x64) Mat Carter has updated the pull request incrementally with one additional commit since the last revision: Remove single whitespace ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28010/files - new: https://git.openjdk.org/jdk/pull/28010/files/b97a799f..f4a4af61 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28010&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28010&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28010.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28010/head:pull/28010 PR: https://git.openjdk.org/jdk/pull/28010 From mli at openjdk.org Thu Nov 13 21:42:18 2025 From: mli at openjdk.org (Hamlin Li) Date: Thu, 13 Nov 2025 21:42:18 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization Message-ID: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> Hi, This PR adds CMoveF/D on RISC-V, which enables vectorization of statements like `op_1 bop op_2 ? res_f_d_1 : res_f_d_2` in a loop. This PR is also preparation for further vectorization in https://github.com/openjdk/jdk/pull/28231. Previously this was https://github.com/openjdk/jdk/pull/25341, but at that time C2 SLP had an issue with unsigned comparison, which is now fixed, so it's good to continue the work. # Test ## Jtreg in progress...
## Performance

Column name meanings:
* p: with patch
* p+v: with patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on
* m: without patch
* m+v: without patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on

#### Average improvement

NOTE: With only this PR, there is a performance benefit in the cases of `CMoveF+CmpF`, `CMoveD+CmpD`, `CMoveF+CmpI`, `CMoveD+CmpL`. The data below is based on fully implementing the vectorization of `CMoveI/L/F/D+CmpI/L/F/D`, which will be achieved by https://github.com/openjdk/jdk/pull/28231. For details, check the performance data in https://github.com/openjdk/jdk/pull/25341 on riscv.

Opt (m/p) | Opt (m+v/p+v) | Opt (p/p+v) | Opt (m/p+v)
-- | -- | -- | --
1.022782609 | 2.198717391 | 2.162673913 | 2.199

-------------

Commit messages:
 - remove unused test code
 - revert unrelated test change
 - revert unrelated vmaskcmp change
 - typo
 - typo
 - disable vectorization of CMoveFD by removing share code change
 - remove Zicond code for CMoveFD
 - clean stop
 - Merge branch 'vectorize-CMove-Bool' into vectorize-CMove-Bool-riscv-CMoveF-D
 - Merge branch 'master' into vectorize-CMove-Bool
 - ...
and 38 more: https://git.openjdk.org/jdk/compare/405d5f7a...ec0d8cc4 Changes: https://git.openjdk.org/jdk/pull/28309/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28309&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8357551 Stats: 4581 lines in 13 files changed: 4446 ins; 50 del; 85 mod Patch: https://git.openjdk.org/jdk/pull/28309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28309/head:pull/28309 PR: https://git.openjdk.org/jdk/pull/28309 From coleenp at openjdk.org Thu Nov 13 22:14:07 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 13 Nov 2025 22:14:07 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 11:17:22 GMT, Anton Artemov wrote: > Hi, > > please consider the following changes: > > In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. > > Additionally, `SpinSingleSection` class is added, which allows to execute a payload code inside of a functor by only one thread. > > Tested in tiers 1 - 5. I have some initial comments and I haven't really figured out why we want the SpinSingleSection version vs. the CAS that is there now. The CAS makes a lot more sense to me. I like this refactoring! src/hotspot/share/runtime/objectMonitor.hpp line 379: > 377: SetObjectStrongFunctor(OopHandle* object_strong, WeakHandle const* object); > 378: void operator()(); > 379: }; Does this need to be in the header file? src/hotspot/share/runtime/park.cpp line 95: > 93: { > 94: SpinCriticalSection scs(&ListLock); > 95: { Do we need these extra {} here and at line 98? 
src/hotspot/share/utilities/spinCriticalSection.cpp line 33:

> 31: // short-duration critical sections where we're concerned
> 32: // about native mutex_t or HotSpot Mutex:: latency.
> 33: void SpinCriticalSectionHelper::SpinAcquire(volatile int* adr) {

Now we can follow the coding style by calling this method spin_acquire.

src/hotspot/share/utilities/spinCriticalSection.cpp line 40:

> 38: // Slow-path : We've encountered contention -- Spin/Yield/Block strategy.
> 39: int ctr = 0;
> 40: int Yields = 0;

Lower case y.

src/hotspot/share/utilities/spinCriticalSection.hpp line 37:

> 35: static void SpinAcquire(volatile int* Lock);
> 36: static void SpinRelease(volatile int* Lock);
> 37: static bool TrySpinAcquire(volatile int* Lock);

All these names can now follow the coding style (snake case).

src/hotspot/share/utilities/spinCriticalSection.hpp line 56:

> 54: // A short section which is to be executed by only one thread.
> 55: // The payload code is to be put into an object inherited from the Functor class.
> 56: class SpinSingleSection {

Could this be a template instead, so that the code can pass a lambda to it rather than this Functor type?

test/hotspot/gtest/jfr/test_adaptiveSampler.cpp line 43:

> 41: #include "runtime/atomicAccess.hpp"
> 42: #include "utilities/globalDefinitions.hpp"
> 43: #include "utilities/spinCriticalSection.hpp"

Why this include?
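To make the snake_case and template-plus-lambda suggestions above concrete, here is a minimal standalone sketch. It is not the PR's code: `SpinLockSketch`, `SpinCriticalSectionSketch` and `spin_single_section` are hypothetical names, `std::atomic` stands in for HotSpot's own atomics so that the sketch builds outside the VM, and the "single section" semantics (run the payload only if a single try-acquire succeeds, then release) are an assumption based on the comment quoted at line 54.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the PR's lock word and helper functions.
// The PR operates on a `volatile int*`; std::atomic is used here only
// so the sketch is self-contained.
class SpinLockSketch {
  std::atomic<int> _lock{0};

 public:
  // One CAS attempt: returns true if this thread took the lock.
  bool try_spin_acquire() {
    int expected = 0;
    return _lock.compare_exchange_strong(expected, 1, std::memory_order_acquire);
  }

  // Retry until the lock is taken, yielding between attempts. The real
  // slow path uses a Spin/Yield/Block strategy; this is simplified.
  void spin_acquire() {
    while (!try_spin_acquire()) {
      std::this_thread::yield();
    }
  }

  void spin_release() {
    assert(_lock.load(std::memory_order_relaxed) == 1 && "lock must be held");
    _lock.store(0, std::memory_order_release);
  }
};

// RAII guard: acquires in the constructor, releases in the destructor.
class SpinCriticalSectionSketch {
  SpinLockSketch& _lock;

 public:
  explicit SpinCriticalSectionSketch(SpinLockSketch& lock) : _lock(lock) {
    _lock.spin_acquire();
  }
  ~SpinCriticalSectionSketch() { _lock.spin_release(); }

  SpinCriticalSectionSketch(const SpinCriticalSectionSketch&) = delete;
  SpinCriticalSectionSketch& operator=(const SpinCriticalSectionSketch&) = delete;
};

// Template variant of the "single section" idea: any callable is accepted,
// so callers can pass a lambda instead of deriving from a Functor class.
// Returns true if this thread won the lock and ran the payload.
template <typename F>
bool spin_single_section(SpinLockSketch& lock, F&& payload) {
  if (!lock.try_spin_acquire()) {
    return false;  // another thread is inside the section; skip the payload
  }
  payload();
  lock.spin_release();
  return true;
}
```

A call site would then read `spin_single_section(lock, [&] { /* payload */ });`, with no separate functor type to declare in a header.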
------------- PR Review: https://git.openjdk.org/jdk/pull/28264#pullrequestreview-3461859596 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2525093915 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2525068401 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2525070046 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2525070847 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2525076533 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2525084252 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2525074123 From duke at openjdk.org Thu Nov 13 23:49:28 2025 From: duke at openjdk.org (Ruben) Date: Thu, 13 Nov 2025 23:49:28 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 [v2] In-Reply-To: References: Message-ID: > The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. > > The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. > > According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. 
Ruben has updated the pull request incrementally with one additional commit since the last revision: Add an assertion to detect out of bounds access in post-call NOP checks ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28192/files - new: https://git.openjdk.org/jdk/pull/28192/files/7bb43523..20cc58a3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28192&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28192&range=00-01 Stats: 71 lines in 18 files changed: 65 ins; 2 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/28192.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28192/head:pull/28192 PR: https://git.openjdk.org/jdk/pull/28192 From duke at openjdk.org Thu Nov 13 23:49:29 2025 From: duke at openjdk.org (Ruben) Date: Thu, 13 Nov 2025 23:49:29 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 [v2] In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 10:50:13 GMT, Martin Doerr wrote: >> Ruben has updated the pull request incrementally with one additional commit since the last revision: >> >> Add an assertion to detect out of bounds access in post-call NOP checks > > src/hotspot/cpu/x86/nativeInst_x86.hpp line 585: > >> 583: }; >> 584: >> 585: bool check() const { return short_at(0) == 0x1f0f && short_at(2) == 0x0084; } > > Maybe a comment would be nice. Thanks for the suggestion. Would the comments added at https://github.com/openjdk/jdk/pull/28192/commits/20cc58a3649db0650da054809f64e0c4416d616f be suitable? 
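For readers decoding the quoted `check()`: read as little-endian 16-bit values, `0x1f0f` followed by `0x0084` corresponds to the byte sequence `0F 1F 84 00`, which is how the 8-byte x86 NOP form with a 32-bit displacement begins. The sketch below restates the comparison over a plain byte buffer. `looks_like_post_call_nop` and its length guard are hypothetical; the guard only illustrates the kind of out-of-bounds concern the new assertion addresses and is not the PR's implementation.

```cpp
#include <cstddef>
#include <cstdint>

// Little-endian 16-bit read at a byte offset, mirroring what short_at()
// yields on x86. Assembled from bytes so the result is host-independent.
static uint16_t short_at(const uint8_t* insn, size_t offset) {
  return static_cast<uint16_t>(insn[offset] | (insn[offset + 1] << 8));
}

// Hypothetical free-function version of the quoted check(). The explicit
// length guard is an assumption made for this sketch.
bool looks_like_post_call_nop(const uint8_t* insn, size_t len) {
  if (len < 4) {
    return false;  // not enough bytes to compare; never read past the end
  }
  // Bytes 0F 1F, then 84 00.
  return short_at(insn, 0) == 0x1f0f && short_at(insn, 2) == 0x0084;
}
```

The shorter 5-byte NOP (`0F 1F 44 00 00`) shares the first two bytes but differs in the third, so it does not match this check.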
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28192#discussion_r2525276626 From duke at openjdk.org Fri Nov 14 03:27:54 2025 From: duke at openjdk.org (Nityanand Rai) Date: Fri, 14 Nov 2025 03:27:54 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v11] In-Reply-To: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: > Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: address feedback ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27868/files - new: https://git.openjdk.org/jdk/pull/27868/files/83f63ce5..1583a684 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=09-10 Stats: 23 lines in 4 files changed: 14 ins; 4 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/27868.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27868/head:pull/27868 PR: https://git.openjdk.org/jdk/pull/27868 From duke at openjdk.org Fri Nov 14 03:36:45 2025 From: duke at openjdk.org (Nityanand Rai) Date: Fri, 14 Nov 2025 03:36:45 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v12] In-Reply-To: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: > Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: address feedback ------------- Changes: - all: 
https://git.openjdk.org/jdk/pull/27868/files - new: https://git.openjdk.org/jdk/pull/27868/files/1583a684..70c5c644 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=10-11 Stats: 3 lines in 1 file changed: 0 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/27868.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27868/head:pull/27868 PR: https://git.openjdk.org/jdk/pull/27868 From duke at openjdk.org Fri Nov 14 03:49:48 2025 From: duke at openjdk.org (Nityanand Rai) Date: Fri, 14 Nov 2025 03:49:48 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v13] In-Reply-To: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: > Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: more cleanup ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27868/files - new: https://git.openjdk.org/jdk/pull/27868/files/70c5c644..e09e98b6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=11-12 Stats: 5 lines in 3 files changed: 0 ins; 5 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/27868.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27868/head:pull/27868 PR: https://git.openjdk.org/jdk/pull/27868 From vlivanov at openjdk.org Fri Nov 14 04:14:11 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Fri, 14 Nov 2025 04:14:11 GMT Subject: RFR: 8347396: Efficient TypeFunc creations [v2] In-Reply-To: References: Message-ID: <30fb4VWWtF8PPVn1ZTwIMZpmwt7ZB9jR2pHzSaj-e7s=.ed610e8b-0bb3-48a6-baf7-bcce09d5f274@github.com> On Mon, 27 Oct 2025 
05:11:47 GMT, Harshit470250 wrote: >> This PR do similar changes done by [JDK-8330851](https://bugs.openjdk.org/browse/JDK-8330851) on the GC TypeFunc creation as suggested by [JDK-8347396](https://bugs.openjdk.org/browse/JDK-8347396). As discussed in [https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686,](https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686) I have put guard on the shenandoah gc specific part of the code. > > Harshit470250 has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge master > - update make_barrier_type > - Merge branch 'openjdk:master' into new_pr > - Merge branch 'openjdk:master' into new_pr > - My chages It would be clearer if ShenandoahGC-specific names explicitly refer to Shenandoah GC (`OptoRuntime::_shenandoah_load_reference_barrier_Type`, `make_shenandoah_load_reference_barrier_Type() `, `shenandoah_load_reference_barrier_Type()`). Otherwise, looks good. 
------------- PR Review: https://git.openjdk.org/jdk/pull/27279#pullrequestreview-3462688619 From duke at openjdk.org Fri Nov 14 04:35:30 2025 From: duke at openjdk.org (Nityanand Rai) Date: Fri, 14 Nov 2025 04:35:30 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v14] In-Reply-To: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: > Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: more simplifications ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27868/files - new: https://git.openjdk.org/jdk/pull/27868/files/e09e98b6..99023e9b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=12-13 Stats: 6 lines in 2 files changed: 0 ins; 5 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/27868.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27868/head:pull/27868 PR: https://git.openjdk.org/jdk/pull/27868 From duke at openjdk.org Fri Nov 14 04:38:43 2025 From: duke at openjdk.org (Nityanand Rai) Date: Fri, 14 Nov 2025 04:38:43 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v15] In-Reply-To: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: > Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: cleanup ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27868/files 
- new: https://git.openjdk.org/jdk/pull/27868/files/99023e9b..572da29f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=14 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27868&range=13-14 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/27868.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27868/head:pull/27868 PR: https://git.openjdk.org/jdk/pull/27868 From iklam at openjdk.org Fri Nov 14 04:48:16 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 14 Nov 2025 04:48:16 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch [v2] In-Reply-To: <9o8TjNxCbwshwMuqpS-CyyhwNciw1WQ6w1_ijy39DEc=.fe11d68c-f925-4c07-9c46-42c9093a5448@github.com> References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> <9o8TjNxCbwshwMuqpS-CyyhwNciw1WQ6w1_ijy39DEc=.fe11d68c-f925-4c07-9c46-42c9093a5448@github.com> Message-ID: On Thu, 13 Nov 2025 17:01:45 GMT, Ioi Lam wrote: > > The filler-klass is not initialized when `preload_classes` is invoked, but `preload_classes` use heap-allocation, which may require filler-obj. > > @iklam What do you think? > > I am working on a fix now. The fix is quite simple. See https://github.com/openjdk/jdk/pull/28315 ------------- PR Comment: https://git.openjdk.org/jdk/pull/28240#issuecomment-3530816677 From kbarrett at openjdk.org Fri Nov 14 05:46:08 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 14 Nov 2025 05:46:08 GMT Subject: RFR: 8347396: Efficient TypeFunc creations [v2] In-Reply-To: References: Message-ID: On Mon, 27 Oct 2025 05:11:47 GMT, Harshit470250 wrote: >> This PR do similar changes done by [JDK-8330851](https://bugs.openjdk.org/browse/JDK-8330851) on the GC TypeFunc creation as suggested by [JDK-8347396](https://bugs.openjdk.org/browse/JDK-8347396). 
>> As discussed in [https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686,](https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686) I have put guard on the shenandoah gc specific part of the code.
>
> Harshit470250 has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits:
>
> - Merge master
> - update make_barrier_type
> - Merge branch 'openjdk:master' into new_pr
> - Merge branch 'openjdk:master' into new_pr
> - My chages

> I have put guard on the shenandoah gc specific part of the code.

It seems weird to me that a big pile of shenandoah-specific code is being moved into this otherwise GC-agnostic place.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27279#issuecomment-3530924386

From stefank at openjdk.org Fri Nov 14 06:44:14 2025
From: stefank at openjdk.org (Stefan Karlsson)
Date: Fri, 14 Nov 2025 06:44:14 GMT
Subject: RFR: 8369187: Add wrapper for <new> that forbids use of global allocation and deallocation functions [v2]
In-Reply-To:
References:
Message-ID:

On Thu, 13 Nov 2025 05:58:25 GMT, Kim Barrett wrote:

>> 8369187: Add wrapper for <new> that forbids use of global allocation and deallocation functions
>>
>> Please review this change that adds `cppstdlib/new.hpp` as a wrapper for
>> including `<new>`. All existing inclusions of `<new>` are changed to include
>> the new wrapper.
>>
>> In addition to including `<new>`, this wrapper also provides deprecation
>> declarations to prevent the use of some facilities by HotSpot code.
>>
>> However, those deprecations need to be conditionalized to not apply to gtests,
>> so this change also adds a macro definition provided by the build system for
>> use in detecting that a header is being included by a gtest.
>>
>> Testing: mach5 tier1
>
> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
> The pull request contains three additional commits since the last revision:
>
> - Merge branch 'master' into wrap-stdlib-new
> - further conditionalize deprecation of hardare interference sizes
> - add wrapper for <new>

Marked as reviewed by stefank (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/28250#pullrequestreview-3463053286

From shade at openjdk.org Fri Nov 14 07:35:06 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Fri, 14 Nov 2025 07:35:06 GMT
Subject: RFR: 8371709: Add CTW to hotspot_compiler testing
In-Reply-To:
References:
Message-ID:

On Wed, 12 Nov 2025 14:59:41 GMT, Aleksey Shipilev wrote:

> CTW tests are for compiler testing, so it makes sense to run them as part of hotspot_compiler group. There are no external dependencies for CTW that processes JDK-s own modules, so we can add that.

Any comments? I think this is the right thing to do, given we catch fire in CTW testing every so often. @TobiHartmann, @eme64?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28268#issuecomment-3531316284

From wojciech.kudla at hsbc.com Fri Nov 14 08:28:35 2025
From: wojciech.kudla at hsbc.com (Wojciech KUDLA)
Date: Fri, 14 Nov 2025 08:28:35 +0000
Subject: Potential OSR-related performance issue in jdk21
In-Reply-To:
References:
Message-ID:

Hi,

(Somehow this message didn't make it through to hotspot-compiler-dev, so trying here)

We've observed a strange performance issue within our latency-sensitive application when running on jdk21 (tested with 21.0.3 and 21.0.8). The impacted threads are pinned to their respective isolated cores, so they are not expected to experience any disruptions, with the exception of a regular hrtick, but that executes in user context and is always a sub-microsecond thing. First, we noticed that these threads started showing voluntary context switching.
Since we avoid any non-vdso syscalls this was a strong indicator of some unintended locking going on and so we decided to capture user- and kernel-space stack traces for one such thread with a bit of eBPF. It looks like we go into this death loop of OSRs (the only type of compilation activity known to me to halt execution on the impacted thread) and this happens tens of times per second. Here's the stack traces: ustack: __lll_unlock_wake+26 CompileBroker::compile_method(methodHandle const&, int, int, methodHandle const&, int, CompileTask::CompileReason, JavaThread*)+85 CompilationPolicy::compile(methodHandle const&, int, CompLevel, JavaThread*)+456 CompilationPolicy::event(methodHandle const&, methodHandle const&, int, int, CompLevel, CompiledMethod*, JavaThread*)+553 InterpreterRuntime::frequency_counter_overflow_inner(JavaThread*, unsigned char*)+331 InterpreterRuntime::frequency_counter_overflow(JavaThread*, unsigned char*)+27 Interpreter+15968 void com.hsbc.efx.actor.dispatcher.SingleThreadDispatcher$DispatchThread.run()+788 kstack: syscall_trace_enter+686 syscall_trace_enter+686 do_syscall_64+326 entry_SYSCALL_64_after_hwframe+102 ustack: __lll_unlock_wake+26 CompileBroker::compile_method(methodHandle const&, int, int, methodHandle const&, int, CompileTask::CompileReason, JavaThread*)+85 CompilationPolicy::compile(methodHandle const&, int, CompLevel, JavaThread*)+456 CompilationPolicy::event(methodHandle const&, methodHandle const&, int, int, CompLevel, CompiledMethod*, JavaThread*)+553 InterpreterRuntime::frequency_counter_overflow_inner(JavaThread*, unsigned char*)+331 InterpreterRuntime::frequency_counter_overflow(JavaThread*, unsigned char*)+27 Interpreter+15968 void com.hsbc.efx.actor.dispatcher.SingleThreadDispatcher$DispatchThread.run()+788 kstack: syscall_slow_exit_work+179 syscall_slow_exit_work+179 do_syscall_64+365 entry_SYSCALL_64_after_hwframe+102 ustack: __lll_unlock_wake+26 CompilationPolicy::compile(methodHandle const&, int, CompLevel, 
JavaThread*)+456 CompilationPolicy::event(methodHandle const&, methodHandle const&, int, int, CompLevel, CompiledMethod*, JavaThread*)+553 InterpreterRuntime::frequency_counter_overflow_inner(JavaThread*, unsigned char*)+331 InterpreterRuntime::frequency_counter_overflow(JavaThread*, unsigned char*)+27 Interpreter+15968 void com.hsbc.efx.actor.dispatcher.SingleThreadDispatcher$DispatchThread.run()+788 kstack: syscall_trace_enter+686 syscall_trace_enter+686 do_syscall_64+326 entry_SYSCALL_64_after_hwframe+102 ustack: __lll_unlock_wake+26 CompilationPolicy::compile(methodHandle const&, int, CompLevel, JavaThread*)+456 CompilationPolicy::event(methodHandle const&, methodHandle const&, int, int, CompLevel, CompiledMethod*, JavaThread*)+553 InterpreterRuntime::frequency_counter_overflow_inner(JavaThread*, unsigned char*)+331 InterpreterRuntime::frequency_counter_overflow(JavaThread*, unsigned char*)+27 Interpreter+15968 void com.hsbc.efx.actor.dispatcher.SingleThreadDispatcher$DispatchThread.run()+788 kstack: syscall_slow_exit_work+179 syscall_slow_exit_work+179 do_syscall_64+365 entry_SYSCALL_64_after_hwframe+102 ustack: __lll_lock_wait+29 DirectivesStack::getMatchingDirective(methodHandle const&, AbstractCompiler*)+50 CompileBroker::compile_method(methodHandle const&, int, int, methodHandle const&, int, CompileTask::CompileReason, JavaThread*)+85 CompilationPolicy::compile(methodHandle const&, int, CompLevel, JavaThread*)+456 CompilationPolicy::event(methodHandle const&, methodHandle const&, int, int, CompLevel, CompiledMethod*, JavaThread*)+553 InterpreterRuntime::frequency_counter_overflow_inner(JavaThread*, unsigned char*)+331 InterpreterRuntime::frequency_counter_overflow(JavaThread*, unsigned char*)+27 Interpreter+15968 void com.hsbc.efx.actor.dispatcher.SingleThreadDispatcher$DispatchThread.run()+788 kstack: syscall_trace_enter+686 syscall_trace_enter+686 do_syscall_64+326 entry_SYSCALL_64_after_hwframe+102 ustack: __lll_lock_wait+29 
DirectivesStack::getMatchingDirective(methodHandle const&, AbstractCompiler*)+50
CompileBroker::compile_method(methodHandle const&, int, int, methodHandle const&, int, CompileTask::CompileReason, JavaThread*)+85
CompilationPolicy::compile(methodHandle const&, int, CompLevel, JavaThread*)+456
CompilationPolicy::event(methodHandle const&, methodHandle const&, int, int, CompLevel, CompiledMethod*, JavaThread*)+553
InterpreterRuntime::frequency_counter_overflow_inner(JavaThread*, unsigned char*)+331
InterpreterRuntime::frequency_counter_overflow(JavaThread*, unsigned char*)+27
Interpreter+15968
void com.hsbc.efx.actor.dispatcher.SingleThreadDispatcher$DispatchThread.run()+788
kstack:
syscall_slow_exit_work+179
syscall_slow_exit_work+179
do_syscall_64+365
entry_SYSCALL_64_after_hwframe+102

This usually starts happening a few minutes after we start the application, probably long enough to reach compilation thresholds. It can last from a few to tens of minutes, after which it might fix itself and all this activity disappears. This does not happen on any jdk17 version that we used.

My gut feeling is that this is some sort of livelock somewhere in the profiler?

We have limited means of reproducing it due to environment constraints, but if you'd like us to run with some extra flags or on a fastdebug build, we could arrange that. Also, I'm an OpenJDK Author with access to the bug tracker, if you think we should create an issue for this.

Thanks

Wojciech KUDLA
eFX eRisk Infrastructure
HSBC Bank plc
8 Canada Square, London E14 5HQ
Telephone: +44 (0)203 359 3827
Mobile: +44 7895 833 903
E-mail: wojciech.kudla at hsbc.com

From dlong at openjdk.org Fri Nov 14 09:35:09 2025
From: dlong at openjdk.org (Dean Long)
Date: Fri, 14 Nov 2025 09:35:09 GMT
Subject: RFR: 8347396: Efficient TypeFunc creations [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 27 Oct 2025 05:11:47 GMT, Harshit470250 wrote:

>> This PR do similar changes done by [JDK-8330851](https://bugs.openjdk.org/browse/JDK-8330851) on the GC TypeFunc creation as suggested by [JDK-8347396](https://bugs.openjdk.org/browse/JDK-8347396). As discussed in [https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686,](https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686) I have put guard on the shenandoah gc specific part of the code.
>
> Harshit470250 has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits:
>
> - Merge master
> - update make_barrier_type
> - Merge branch 'openjdk:master' into new_pr
> - Merge branch 'openjdk:master' into new_pr
> - My chages

I agree with Kim. It seems cleaner to leave Shenandoah code in shenandoahBarrierSetC2.cpp.

-------------

Changes requested by dlong (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/27279#pullrequestreview-3463931863 From dlong at openjdk.org Fri Nov 14 09:45:24 2025 From: dlong at openjdk.org (Dean Long) Date: Fri, 14 Nov 2025 09:45:24 GMT Subject: RFR: 8347396: Efficient TypeFunc creations [v2] In-Reply-To: References: Message-ID: On Mon, 27 Oct 2025 05:11:47 GMT, Harshit470250 wrote: >> This PR do similar changes done by [JDK-8330851](https://bugs.openjdk.org/browse/JDK-8330851) on the GC TypeFunc creation as suggested by [JDK-8347396](https://bugs.openjdk.org/browse/JDK-8347396). As discussed in [https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686,](https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686) I have put guard on the shenandoah gc specific part of the code. > > Harshit470250 has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge master > - update make_barrier_type > - Merge branch 'openjdk:master' into new_pr > - Merge branch 'openjdk:master' into new_pr > - My chages src/hotspot/share/opto/runtime.cpp line 2413: > 2411: _dtrace_object_alloc_Type = make_dtrace_object_alloc_Type(); > 2412: _clone_type_Type = make_clone_type_Type(); > 2413: #if INCLUDE_SHENANDOAHGC A lot of the initializations in this function could be skipped based on runtime flags. Should we check `UseShenandoahGC` here? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27279#discussion_r2526743350 From thartmann at openjdk.org Fri Nov 14 10:08:43 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 14 Nov 2025 10:08:43 GMT Subject: RFR: 8371709: Add CTW to hotspot_compiler testing In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 14:59:41 GMT, Aleksey Shipilev wrote: > CTW tests are for compiler testing, so it makes sense to run them as part of hotspot_compiler group. There are no external dependencies for CTW that processes JDK-s own modules, so we can add that. 
Looks reasonable to me.

-------------

Marked as reviewed by thartmann (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/28268#pullrequestreview-3464115756

From epeter at openjdk.org Fri Nov 14 10:25:45 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Fri, 14 Nov 2025 10:25:45 GMT
Subject: RFR: 8371709: Add CTW to hotspot_compiler testing
In-Reply-To:
References:
Message-ID:

On Wed, 12 Nov 2025 14:59:41 GMT, Aleksey Shipilev wrote:

> CTW tests are for compiler testing, so it makes sense to run them as part of hotspot_compiler group. There are no external dependencies for CTW that processes JDK-s own modules, so we can add that.

Seems reasonable to me too.

@TobiHartmann Just launched some internal tests, so please hold off with integration until we are sure those passed ;)

-------------

Marked as reviewed by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/28268#pullrequestreview-3464208653

From haosun at openjdk.org Fri Nov 14 10:57:31 2025
From: haosun at openjdk.org (Hao Sun)
Date: Fri, 14 Nov 2025 10:57:31 GMT
Subject: RFR: 8371161: [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family [v2]
In-Reply-To:
References: <9fGHvYHbn1M-_s63cxFAsz4EGPOnpEuFFKDbu7lwaNQ=.b882f8f4-17cd-415a-8fef-857860979c46@github.com>
Message-ID:

On Tue, 11 Nov 2025 18:11:20 GMT, Dhamoder Nalla wrote:

>> This PR makes two targeted AArch64 updates specific to Qualcomm silicon:
>>
>> 1. Corrects the CPU family enum name typo from CPU_QUALCOM to CPU_QUALCOMM.
>> 2. Enables UseSHA3Intrinsics for Qualcomm (CPU_QUALCOMM) in addition to Apple (CPU_APPLE), allowing Qualcomm-based systems to use hardware-optimized SHA-3 implementations.
>>
>> Performance testing:
>> The JMH test case MessageDigests.java is used to evaluate the performance improvements enabled by UseSHA3Intrinsics on Qualcomm CPUs.
>>
>> Benchmark | (digesterName) | (length) | (provider) | Mode | Cnt | Score (before change) | Error | Score (after change) | Error | Units | SHA3 Perf Improvement
>> -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
>> MessageDigests.digest | SHA3-256 | 64 | DEFAULT | thrpt | 15 | 4363.650 | ±682.413 | 5687.798 | ±855.826 | ops/ms | 30.34%
>> MessageDigests.digest | SHA3-256 | 16384 | DEFAULT | thrpt | 15 | 45.794 | ±0.069 | 58.735 | ±0.077 | ops/ms | 28.26%
>> MessageDigests.digest | SHA3-512 | 64 | DEFAULT | thrpt | 15 | 4008.741 | ±703.879 | 5145.512 | ±866.479 | ops/ms | 28.36%
>> MessageDigests.digest | SHA3-512 | 16384 | DEFAULT | thrpt | 15 | 23.991 | ±0.032 | 30.294 | ±0.040 | ops/ms | 26.27%
>> MessageDigests.getAndDigest | SHA3-256 | 64 | DEFAULT | thrpt | 15 | 1995.297 | ±396.007 | 2021.385 | ±486.581 | ops/ms | 1.31%
>> MessageDigests.getAndDigest | SHA3-256 | 16384 | DEFAULT | thrpt | 15 | 45.994 | ±0.051 | 58.283 | ±0.095 | ops/ms | 26.72%
>> MessageDigests.getAndDigest | SHA3-512 | 64 | DEFAULT | thrpt | 15 | 1889.550 | ±355.058 | 2173.164 | ±437.968 | ops/ms | 15.01%
>> MessageDigests.getAndDigest | SHA3-512 | 16384 | DEFAULT | thrpt | 15 | 24.411 | ±0.143 | 30.187 | ±0.035 | ops/ms | 23.66%
>
> Dhamoder Nalla has updated the pull request incrementally with two additional commits since the last revision:
>
> - [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family
> - [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family

GHA tests are green. Let me sponsor this patch.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28166#issuecomment-3532175138

From dhanalla at openjdk.org Fri Nov 14 10:57:32 2025
From: dhanalla at openjdk.org (Dhamoder Nalla)
Date: Fri, 14 Nov 2025 10:57:32 GMT
Subject: Integrated: 8371161: [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family
In-Reply-To: <9fGHvYHbn1M-_s63cxFAsz4EGPOnpEuFFKDbu7lwaNQ=.b882f8f4-17cd-415a-8fef-857860979c46@github.com>
References: <9fGHvYHbn1M-_s63cxFAsz4EGPOnpEuFFKDbu7lwaNQ=.b882f8f4-17cd-415a-8fef-857860979c46@github.com>
Message-ID:

On Wed, 5 Nov 2025 21:56:50 GMT, Dhamoder Nalla wrote:

> This PR makes two targeted AArch64 updates specific to Qualcomm silicon:
>
> 1. Corrects the CPU family enum name typo from CPU_QUALCOM to CPU_QUALCOMM.
> 2. Enables UseSHA3Intrinsics for Qualcomm (CPU_QUALCOMM) in addition to Apple (CPU_APPLE), allowing Qualcomm-based systems to use hardware-optimized SHA-3 implementations.
>
> Performance testing:
> The JMH test case MessageDigests.java is used to evaluate the performance improvements enabled by UseSHA3Intrinsics on Qualcomm CPUs.
>
> Benchmark | (digesterName) | (length) | (provider) | Mode | Cnt | Score (before change) | Error | Score (after change) | Error | Units | SHA3 Perf Improvement
> -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
> MessageDigests.digest | SHA3-256 | 64 | DEFAULT | thrpt | 15 | 4363.650 | ±682.413 | 5687.798 | ±855.826 | ops/ms | 30.34%
> MessageDigests.digest | SHA3-256 | 16384 | DEFAULT | thrpt | 15 | 45.794 | ±0.069 | 58.735 | ±0.077 | ops/ms | 28.26%
> MessageDigests.digest | SHA3-512 | 64 | DEFAULT | thrpt | 15 | 4008.741 | ±703.879 | 5145.512 | ±866.479 | ops/ms | 28.36%
> MessageDigests.digest | SHA3-512 | 16384 | DEFAULT | thrpt | 15 | 23.991 | ±0.032 | 30.294 | ±0.040 | ops/ms | 26.27%
> MessageDigests.getAndDigest | SHA3-256 | 64 | DEFAULT | thrpt | 15 | 1995.297 | ±396.007 | 2021.385 | ±486.581 | ops/ms | 1.31%
> MessageDigests.getAndDigest | SHA3-256 | 16384 | DEFAULT | thrpt | 15 | 45.994 | ±0.051 | 58.283 | ±0.095 | ops/ms | 26.72%
> MessageDigests.getAndDigest | SHA3-512 | 64 | DEFAULT | thrpt | 15 | 1889.550 | ±355.058 | 2173.164 | ±437.968 | ops/ms | 15.01%
> MessageDigests.getAndDigest | SHA3-512 | 16384 | DEFAULT | thrpt | 15 | 24.411 | ±0.143 | 30.187 | ±0.035 | ops/ms | 23.66%
This pull request has now been integrated. Changeset: 00f2c38e Author: Dhamoder Nalla Committer: Hao Sun URL: https://git.openjdk.org/jdk/commit/00f2c38e373f5ae58ad6593cc7b9d53b9596eb17 Stats: 5 lines in 3 files changed: 2 ins; 0 del; 3 mod 8371161: [AArch64] Enable CPU feature UseSHA3Intrinsics for the Qualcomm processor family Reviewed-by: aph, haosun ------------- PR: https://git.openjdk.org/jdk/pull/28166 From duke at openjdk.org Fri Nov 14 11:11:43 2025 From: duke at openjdk.org (Jonas Norlinder) Date: Fri, 14 Nov 2025 11:11:43 GMT Subject: RFR: 8368527: JMX: Add an MXBeans method to query GC CPU time [v10] In-Reply-To: References: Message-ID: On Wed, 29 Oct 2025 18:08:31 GMT, Jonas Norlinder wrote: >> Hi all, >> >> This PR augments the CPU time sampling measurement capabilities that a user can perform from Java code with the addition of `MemoryMXBean.getGcCpuTime()`. With this patch it will be possible for a user to measure process and GC CPU time during critical section or iterations in benchmarks to name a few. This new method complements the existing `OperatingSystemMXBean.getProcessCpuTime()` for a refined understanding. >> >> `CollectedHeap::gc_threads_do` may operate on terminated GC threads during shutdown, but thanks to JDK-8366865 by @walulyai we can piggyback on the new `Universe::is_shutting_down`. I have implemented a stress-test `test/jdk/java/lang/management/MemoryMXBean/GetGcCpuTime.java` that may identify reading CPU time of terminated threads. Synchronizing on `Universe::is_shutting_down` and `Heap_lock` resolves this problem. >> >> FWIW; To my understanding we don't want to add a `Universe::is_shutting_down` check in gc_threads_do as this may introduce a performance penalty that is unacceptable, therefore we must be careful about the few places where external users call upon gc_threads_do and may race with a terminating VM.
>> >> Tested: test/jdk/java/lang/management/MemoryMXBean/GetGcCpuTime.java, jdk/javax/management/mxbean hotspot/jtreg/vmTestbase/nsk/monitoring on Linux x64, Linux aarch64, Windows x64, macOS x64 and macOS aarch64 with release and fastdebug. > > Jonas Norlinder has updated the pull request incrementally with one additional commit since the last revision: > > Fix phohensee review comments Thanks Kevin! > Just a question on whether we really need to specify the GCs in the tests. I think it is required for the shutdown test and we need to keep them all. However, we may omit this for the trivial API test. Would that be a reasonable compromise? I'd suggest to keep Epsilon and G1 for `TestGetTotalGcCpuTime`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27537#issuecomment-3532230611 From duke at openjdk.org Fri Nov 14 11:29:48 2025 From: duke at openjdk.org (Jonas Norlinder) Date: Fri, 14 Nov 2025 11:29:48 GMT Subject: RFR: 8368527: JMX: Add an MXBeans method to query GC CPU time [v11] In-Reply-To: References: Message-ID: > Hi all, > > This PR augments the CPU time sampling measurement capabilities that a user can perform from Java code with the addition of `MemoryMXBean.getGcCpuTime()`. With this patch it will be possible for a user to measure process and GC CPU time during critical section or iterations in benchmarks to name a few. This new method complements the existing `OperatingSystemMXBean.getProcessCpuTime()` for a refined understanding. > > `CollectedHeap::gc_threads_do` may operate on terminated GC threads during shutdown, but thanks to JDK-8366865 by @walulyai we can piggyback on the new `Universe::is_shutting_down`. I have implemented a stress-test `test/jdk/java/lang/management/MemoryMXBean/GetGcCpuTime.java` that may identify reading CPU time of terminated threads. Synchronizing on `Universe::is_shutting_down` and `Heap_lock` resolves this problem. 
> > FWIW; To my understanding we don't want to add a `Universe::is_shutting_down` check in gc_threads_do as this may introduce a performance penalty that is unacceptable, therefore we must be careful about the few places where external users call upon gc_threads_do and may race with a terminating VM. > > Tested: test/jdk/java/lang/management/MemoryMXBean/GetGcCpuTime.java, jdk/javax/management/mxbean hotspot/jtreg/vmTestbase/nsk/monitoring on Linux x64, Linux aarch64, Windows x64, macOS x64 and macOS aarch64 with release and fastdebug. Jonas Norlinder has updated the pull request incrementally with one additional commit since the last revision: Reduce GC coverage for trivial API test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27537/files - new: https://git.openjdk.org/jdk/pull/27537/files/44f5d864..8fd1ee09 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27537&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27537&range=09-10 Stats: 36 lines in 1 file changed: 0 ins; 36 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/27537.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27537/head:pull/27537 PR: https://git.openjdk.org/jdk/pull/27537 From stuefe at openjdk.org Fri Nov 14 11:42:18 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Nov 2025 11:42:18 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v5] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: <3PgMYWEYdQEVJr2qVQ8vkiaIsBrG-qtcF63NPMS69Gk=.458b045e-91f0-4a31-9b9d-20c608ce28fe@github.com> On Tue, 11 Nov 2025 14:32:57 GMT, Severin Gehwolf wrote: >> Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. 
>> >> It gets rid of the three-fold use of the return value: 1) error, 2) unlimited values, 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that makes sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. >> >> All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. >> >> All usages of the API have been changed to use the revised API. There are no more usages of `OSCONTAINER_ERROR` (`-2`) in HotSpot code. >> >> While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. >> >> Testing (looking good): >> - [x] GHA >> - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. >> - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. >> >> Thoughts? Opinions? > > Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 24 commits: > > - Add space in trace log > - Merge branch 'master' into jdk-8365606-jlong-julong-refactor > - One more comment fix > - Extract OSContainer::available_swap_in_bytes() > - Simplify os::used_memory() > - Fix os::active_processor_count() > - os::free_memory => use 'value' directly > - os::available_memory() => use 'value' directly > - Fix pids_max printing in VM.info > - Better logging for -1 (cpu_shares) > - ... and 14 more: https://git.openjdk.org/jdk/compare/29100320...0958b10f New version looks good. ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27743#pullrequestreview-3464545040 From stuefe at openjdk.org Fri Nov 14 11:42:20 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Nov 2025 11:42:20 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: <5zLSizgJLZ-WPMqfgD2ox8fB76jYB6XJk1kUxh5BdXE=.e3ae7506-cbfb-49c1-9c76-622f7078218e@github.com> On Mon, 10 Nov 2025 13:27:59 GMT, Severin Gehwolf wrote: >> src/hotspot/os/linux/cgroupSubsystem_linux.hpp line 80: >> >>> 78: return false; \ >>> 79: } \ >>> 80: log_trace(os, container)(log_string " is: " UINT64_FORMAT, retval); \ >> >> Here and in other places: don't use raw UINT64_FORMAT; use `PHYS_MEM_TYPE_FORMAT` instead. > > This is intentional since the processor_count API doesn't use `physical_memory_size_type` (as it doesn't make sense in this context). See, for example, `CgroupV2CpuController::cpu_period()`. The common denominator is `uint64_t`. This is a bit awkward, but I don't know a better way to deal with this. The reading functions are shared, most of the API is used for memory value reading (but not exclusively, exceptions are `pid`, `cpu`). 
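For readers following the API change, the convention described in this thread (read the raw value as `uint64_t`, signal failure through the `bool` return, leave the out-parameter untouched on error, and map "no limit" to a sentinel) can be sketched in standalone C++. This is only an illustration under stated assumptions: `parse_limit` is a hypothetical helper, not a HotSpot function, and the numeric value chosen for `value_unlimited` here is an assumption; only the general shape follows the description above. The literal string `max` is how cgroup v2 interface files such as `memory.max` report "no limit".

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <string>

// Assumption: the real value_unlimited in HotSpot may be defined differently.
constexpr uint64_t value_unlimited = std::numeric_limits<uint64_t>::max();

// Hypothetical helper: parse the text of a cgroup-style interface file.
// Returns false on error and leaves 'value' untouched in that case.
bool parse_limit(const std::string& text, uint64_t& value) {
  if (text == "max") {            // cgroup v2 spelling of "no limit"
    value = value_unlimited;
    return true;
  }
  // Reject empty input, signs and leading whitespace up front, because
  // strtoull would silently accept "-2" and wrap it around.
  if (text.empty() || text[0] < '0' || text[0] > '9') {
    return false;
  }
  errno = 0;
  char* end = nullptr;
  unsigned long long parsed = std::strtoull(text.c_str(), &end, 10);
  if (errno != 0 || *end != '\0') { // overflow or trailing garbage
    return false;
  }
  value = static_cast<uint64_t>(parsed);
  return true;
}
```

A caller would then branch on the return value first and only afterwards interpret `value`, for example treating `value_unlimited` as "fall back to the host limit". Negative sentinels such as `-2` no longer have a representation in this scheme, which is the point of the refactoring.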
Okay ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2527193808 From alanb at openjdk.org Fri Nov 14 12:05:51 2025 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 14 Nov 2025 12:05:51 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v12] In-Reply-To: References: Message-ID: > Implementation changes for [JEP 500: Prepare to Make Final Mean Final](https://openjdk.org/jeps/500). > > Field.set (and Lookup.unreflectSetter) are changed to allow/warn/debug/deny when mutating a final instance field. A JFR event is recorded if a final field is mutated. Spec updates to Field.set, Field.setAccessible and Module.addOpens to align with the proposal in the JEP. > > HotSpot is updated to add support for the new command line options. To aid diagnosability, -Xcheck:jni reports a warning and -Xlog:jni=debug logs a message to help identify JNI code that mutates finals. For now, JNI code is allowed to set the "write-protected" fields System.in/out/err without a warning; we can revisit once we change the System.setIn/setOut/setErr methods to not use JNI (I prefer to keep this separate to this PR because there is a small startup regression to address when changing System.setXXX). > > There are many new tests. A small number of existing tests are changed to run /othervm as reflectively opening a package isn't sufficient. Changing the tests to /othervm means that jtreg will launch the agent with the command line options to open the package. > > Testing: tier1-6 Alan Bateman has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 59 commits: - Merge branch 'master' into JDK-8353835 - Cleanup - More cleanup of Field.set API docs, including some restructure from Alex - Cleanup - Merge branch 'master' into JDK-8353835 - Update mutateFinals/modules test to exercise exports and opens cases - Update Field.set spec to better align with setAccessible for public final field in public class in exported package - Fix typo in java man page - Add method to test if package exported - Remove dup end body tag - ... and 49 more: https://git.openjdk.org/jdk/compare/9eaa364a...7693e8fa ------------- Changes: https://git.openjdk.org/jdk/pull/25115/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25115&range=11 Stats: 5347 lines in 76 files changed: 5152 ins; 55 del; 140 mod Patch: https://git.openjdk.org/jdk/pull/25115.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25115/head:pull/25115 PR: https://git.openjdk.org/jdk/pull/25115 From thartmann at openjdk.org Fri Nov 14 12:09:58 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 14 Nov 2025 12:09:58 GMT Subject: RFR: 8371709: Add CTW to hotspot_compiler testing In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 14:59:41 GMT, Aleksey Shipilev wrote: > CTW tests are for compiler testing, so it makes sense to run them as part of hotspot_compiler group. There are no external dependencies for CTW that processes JDK-s own modules, so we can add that. All green. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28268#issuecomment-3532420188 From shade at openjdk.org Fri Nov 14 12:10:00 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 14 Nov 2025 12:10:00 GMT Subject: Integrated: 8371709: Add CTW to hotspot_compiler testing In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 14:59:41 GMT, Aleksey Shipilev wrote: > CTW tests are for compiler testing, so it makes sense to run them as part of hotspot_compiler group. 
There are no external dependencies for CTW that processes JDK-s own modules, so we can add that. This pull request has now been integrated. Changeset: ff851de8 Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/ff851de852673740542d922d1ee15a6c92b80473 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod 8371709: Add CTW to hotspot_compiler testing Reviewed-by: thartmann, epeter ------------- PR: https://git.openjdk.org/jdk/pull/28268 From shade at openjdk.org Fri Nov 14 12:09:59 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 14 Nov 2025 12:09:59 GMT Subject: RFR: 8371709: Add CTW to hotspot_compiler testing In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 14:59:41 GMT, Aleksey Shipilev wrote: > CTW tests are for compiler testing, so it makes sense to run them as part of hotspot_compiler group. There are no external dependencies for CTW that processes JDK-s own modules, so we can add that. Thank you both! Here goes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28268#issuecomment-3532425960 From kevinw at openjdk.org Fri Nov 14 12:11:11 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Fri, 14 Nov 2025 12:11:11 GMT Subject: RFR: 8368527: JMX: Add an MXBeans method to query GC CPU time [v10] In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 11:08:40 GMT, Jonas Norlinder wrote: > I think it is required for the shutdown test and we need to keep them all. However, we may omit this for the trivial API test. Would that be a reasonable compromise? I'd suggest to keep Epsilon and G1 for `TestGetTotalGcCpuTime`. OK that's great to cut it down a bit, thanks. I meant I think we need to remove all the specific GCs from that test, and let the test harness run with the variety of collectors. If test/jdk/java/lang/management/MemoryMXBean/TestGetTotalGcCpuTime.java specifies those two, doesn't it only run with those two? We want most code to be tested with a variety of collectors, we should get that if we just say nothing. 
8-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/27537#issuecomment-3532441029 From aartemov at openjdk.org Fri Nov 14 12:19:18 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Fri, 14 Nov 2025 12:19:18 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v2] In-Reply-To: References: Message-ID: > Hi, > > please consider the following changes: > > In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. > > Additionally, `SpinSingleSection` class is added, which allows to execute a payload code inside of a functor by only one thread. > > Tested in tiers 1 - 5. Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: - Merge remote-tracking branch 'origin/master' into JDK-8366671-refactor-spin-acquire-spin-release - Merge remote-tracking branch 'origin/master' into JDK-8366671-refactor-spin-acquire-spin-release - 8366671: Fixed whitespaces. - 8366671: Fixed whitespaces. - 8366671: Fixed whitespaces. 
- 8366671: Refactor SpinAcquire/SpinRelease into SpinCriticalSection ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28264/files - new: https://git.openjdk.org/jdk/pull/28264/files/d7f6f345..23b9f1d6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=00-01 Stats: 4731 lines in 117 files changed: 3559 ins; 666 del; 506 mod Patch: https://git.openjdk.org/jdk/pull/28264.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28264/head:pull/28264 PR: https://git.openjdk.org/jdk/pull/28264 From alanb at openjdk.org Fri Nov 14 12:35:49 2025 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 14 Nov 2025 12:35:49 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v12] In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 12:05:51 GMT, Alan Bateman wrote: >> Implementation changes for [JEP 500: Prepare to Make Final Mean Final](https://openjdk.org/jeps/500). >> >> Field.set (and Lookup.unreflectSetter) are changed to allow/warn/debug/deny when mutating a final instance field. A JFR event is recorded if a final field is mutated. Spec updates to Field.set, Field.setAccessible and Module.addOpens to align with the proposal in the JEP. >> >> HotSpot is updated to add support for the new command line options. To aid diagnosability, -Xcheck:jni reports a warning and -Xlog:jni=debug logs a message to help identify JNI code that mutates finals. For now, JNI code is allowed to set the "write-protected" fields System.in/out/err without a warning; we can revisit once we change the System.setIn/setOut/setErr methods to not use JNI (I prefer to keep this separate to this PR because there is a small startup regression to address when changing System.setXXX). >> >> There are many new tests. A small number of existing tests are changed to run /othervm as reflectively opening a package isn't sufficient.
Changing the tests to /othervm means that jtreg will launch the agent with the command line options to open the package. >> >> Testing: tier1-6 > > Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 59 commits: > > - Merge branch 'master' into JDK-8353835 > - Cleanup > - More cleanup of Field.set API docs, including some restructure from Alex > - Cleanup > - Merge branch 'master' into JDK-8353835 > - Update mutateFinals/modules test to exercise exports and opens cases > - Update Field.set spec to better align with setAccessible for public final field in public class in exported package > - Fix typo in java man page > - Add method to test if package exported > - Remove dup end body tag > - ... and 49 more: https://git.openjdk.org/jdk/compare/9eaa364a...7693e8fa Just to follow up on this discussion. The checks done by Field.set on a final field need to be the same as, or more restrictive than, the checks done by setAccessible; this is important to preserve traceability. A caller of setAccessible can suppress access checks on a public final field in a public class in a package that is exported to at least the caller. So your observation that it is "surprising" to require the package be opened to the caller in order to mutate the field when it is final is a good observation. It's not wrong, it's just more draconian than it should be. I discussed with Alex and Ron and we agreed to adjust the spec for this. We will need to re-submit the CSR with the (small) update.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25115#issuecomment-3532526777 From sgehwolf at openjdk.org Fri Nov 14 13:18:53 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Fri, 14 Nov 2025 13:18:53 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v5] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: On Tue, 11 Nov 2025 14:32:57 GMT, Severin Gehwolf wrote: >> Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. >> >> It gets rid of the three-fold use of the return value: 1) error, 2) unlimited values, 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that makes sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. >> >> All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. >> >> All usages of the API have been changed to use the revised API. There are no more usages of `OSCONTAINER_ERROR` (`-2`) in HotSpot code. >> >> While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g.
`os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. >> >> Testing (looking good): >> - [x] GHA >> - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. >> - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. >> >> Thoughts? Opinions? > > Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 24 commits: > > - Add space in trace log > - Merge branch 'master' into jdk-8365606-jlong-julong-refactor > - One more comment fix > - Extract OSContainer::available_swap_in_bytes() > - Simplify os::used_memory() > - Fix os::active_processor_count() > - os::free_memory => use 'value' directly > - os::available_memory() => use 'value' directly > - Fix pids_max printing in VM.info > - Better logging for -1 (cpu_shares) > - ... and 14 more: https://git.openjdk.org/jdk/compare/29100320...0958b10f Thanks for the reviews! If there are no objections to move forward with this I'll integrate Monday. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27743#issuecomment-3532695447 From stuefe at openjdk.org Fri Nov 14 13:23:12 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Nov 2025 13:23:12 GMT Subject: RFR: 8371885: Mark UseCompressedClassPointers as obsolete for JDK 27 Message-ID: Trivial patch to move the obsoletion of `UseCompressedClassPointers` to jdk 27. Tested locally by adjusting SPECIAL_FLAG_VALIDATION_BUILD and running the `special_flags` gtest. 
------------- Commit messages: - Start Changes: https://git.openjdk.org/jdk/pull/28322/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28322&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371885 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28322.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28322/head:pull/28322 PR: https://git.openjdk.org/jdk/pull/28322 From stuefe at openjdk.org Fri Nov 14 13:23:12 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Nov 2025 13:23:12 GMT Subject: RFR: 8371885: Mark UseCompressedClassPointers as obsolete for JDK 27 In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 12:52:07 GMT, Thomas Stuefe wrote: > Trivial patch to move the obsoletion of `UseCompressedClassPointers` to jdk 27. > > Tested locally by adjusting SPECIAL_FLAG_VALIDATION_BUILD and running the `special_flags` gtest. Ping @dholmes-ora ------------- PR Comment: https://git.openjdk.org/jdk/pull/28322#issuecomment-3532711475 From mdoerr at openjdk.org Fri Nov 14 13:44:46 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 14 Nov 2025 13:44:46 GMT Subject: RFR: 8371885: Mark UseCompressedClassPointers as obsolete for JDK 27 In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 12:52:07 GMT, Thomas Stuefe wrote: > Trivial patch to move the obsoletion of `UseCompressedClassPointers` to jdk 27. > > Tested locally by adjusting SPECIAL_FLAG_VALIDATION_BUILD and running the `special_flags` gtest. LGTM. ------------- Marked as reviewed by mdoerr (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28322#pullrequestreview-3465005537 From kevinw at openjdk.org Fri Nov 14 14:52:16 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Fri, 14 Nov 2025 14:52:16 GMT Subject: RFR: 8368527: JMX: Add an MXBeans method to query GC CPU time [v11] In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 11:29:48 GMT, Jonas Norlinder wrote: >> Hi all, >> >> This PR augments the CPU time sampling measurement capabilities that a user can perform from Java code with the addition of `MemoryMXBean.getGcCpuTime()`. With this patch it will be possible for a user to measure process and GC CPU time during critical section or iterations in benchmarks to name a few. This new method complements the existing `OperatingSystemMXBean.getProcessCpuTime()` for a refined understanding. >> >> `CollectedHeap::gc_threads_do` may operate on terminated GC threads during shutdown, but thanks to JDK-8366865 by @walulyai we can piggyback on the new `Universe::is_shutting_down`. I have implemented a stress-test `test/jdk/java/lang/management/MemoryMXBean/GetGcCpuTime.java` that may identify reading CPU time of terminated threads. Synchronizing on `Universe::is_shutting_down` and `Heap_lock` resolves this problem. >> >> FWIW; To my understanding we don't want to add a `Universe::is_shutting_down` check in gc_threads_do as this may introduce a performance penalty that is unacceptable, therefore we must be careful about the few places where external users call upon gc_threads_do and may race with a terminating VM. >> >> Tested: test/jdk/java/lang/management/MemoryMXBean/GetGcCpuTime.java, jdk/javax/management/mxbean hotspot/jtreg/vmTestbase/nsk/monitoring on Linux x64, Linux aarch64, Windows x64, macOS x64 and macOS aarch64 with release and fastdebug. 
> > Jonas Norlinder has updated the pull request incrementally with one additional commit since the last revision: > > Reduce GC coverage for trivial API test Or, if we don't want to remove the GC specifics and go with what the framework sets, then we should revert that last delete (sorry); otherwise it will just not be run with all GCs. I was thinking it was more efficient to say nothing in the test, but can leave this to be done with specifics in the test if you like. 8-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/27537#issuecomment-3533138317 From coleenp at openjdk.org Fri Nov 14 14:56:16 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 14 Nov 2025 14:56:16 GMT Subject: RFR: 8371885: Mark UseCompressedClassPointers as obsolete for JDK 27 In-Reply-To: References: Message-ID: <_hmQW08bDcIKwICvLr3DgcJYlPRN-IaIVG7qKoRG2GQ=.231c3222-d51c-49c0-aeee-0ab6f7e1dea2@github.com> On Fri, 14 Nov 2025 12:52:07 GMT, Thomas Stuefe wrote: > Trivial patch to move the obsoletion of `UseCompressedClassPointers` to jdk 27.
>> >> Tested locally by adjusting SPECIAL_FLAG_VALIDATION_BUILD and running the `special_flags` gtest. > > Marked as reviewed by coleenp (Reviewer). Thanks @coleenp and @TheRealMDoerr ! ------------- PR Comment: https://git.openjdk.org/jdk/pull/28322#issuecomment-3533146805 From stuefe at openjdk.org Fri Nov 14 14:56:18 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Nov 2025 14:56:18 GMT Subject: Integrated: 8371885: Mark UseCompressedClassPointers as obsolete for JDK 27 In-Reply-To: References: Message-ID: <-HuaSLGnpsjls59We7Q1mtN94e9ktiyNbJRYWt-sBYQ=.7ecd80f5-2dbc-4559-a10a-b60c2e53b95a@github.com> On Fri, 14 Nov 2025 12:52:07 GMT, Thomas Stuefe wrote: > Trivial patch to move the obsoletion of `UseCompressedClassPointers` to jdk 27. > > Tested locally by adjusting SPECIAL_FLAG_VALIDATION_BUILD and running the `special_flags` gtest. This pull request has now been integrated. Changeset: 466cb383 Author: Thomas Stuefe URL: https://git.openjdk.org/jdk/commit/466cb383144edf0baa202dc5a2cac37e7572e2db Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8371885: Mark UseCompressedClassPointers as obsolete for JDK 27 Reviewed-by: mdoerr, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/28322 From aartemov at openjdk.org Fri Nov 14 15:05:56 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Fri, 14 Nov 2025 15:05:56 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v3] In-Reply-To: References: Message-ID: On Thu, 13 Nov 2025 22:11:06 GMT, Coleen Phillimore wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8366671: Addressed reviewer's comments. > > src/hotspot/share/runtime/objectMonitor.hpp line 379: > >> 377: SetObjectStrongFunctor(OopHandle* object_strong, WeakHandle const* object); >> 378: void operator()(); >> 379: }; > > Does this need to be in the header file? I removed the functor completely, now a lambda function is used. 
> src/hotspot/share/runtime/park.cpp line 95: > >> 93: { >> 94: SpinCriticalSection scs(&ListLock); >> 95: { > > Do we need these extra {} here and at line 98? No, we don't need those. Removed. > src/hotspot/share/utilities/spinCriticalSection.cpp line 33: > >> 31: // short-duration critical sections where we're concerned >> 32: // about native mutex_t or HotSpot Mutex:: latency. >> 33: void SpinCriticalSectionHelper::SpinAcquire(volatile int* adr) { > > Now we can follow the coding style by calling this method spin_acquire. Adjusted in the latest commit. > src/hotspot/share/utilities/spinCriticalSection.cpp line 40: > >> 38: // Slow-path : We've encountered contention -- Spin/Yield/Block strategy. >> 39: int ctr = 0; >> 40: int Yields = 0; > > lower case y. Adjusted in the latest commit. > src/hotspot/share/utilities/spinCriticalSection.hpp line 37: > >> 35: static void SpinAcquire(volatile int* Lock); >> 36: static void SpinRelease(volatile int* Lock); >> 37: static bool TrySpinAcquire(volatile int* Lock); > > All these names can now follow the coding style (snake case). Adjusted in the latest commit. > src/hotspot/share/utilities/spinCriticalSection.hpp line 56: > >> 54: // A short section which is to be executed by only one thread. >> 55: // The payload code is to be put into an object inherited from the Functor class. >> 56: class SpinSingleSection { > > Could this instead be a template instead and the code can pass a lambda to it rather than this Functor type? It can be done with a template and a lambda function. I changed the implementation in the latest commit. It does not look any better, however. > test/hotspot/gtest/jfr/test_adaptiveSampler.cpp line 43: > >> 41: #include "runtime/atomicAccess.hpp" >> 42: #include "utilities/globalDefinitions.hpp" >> 43: #include "utilities/spinCriticalSection.hpp" > > Why this include? I used to have a different layout, where it was necessary; now it is not. Removed.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2527821629 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2527816701 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2527817406 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2527817745 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2527819831 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2527820396 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2527819486 From aartemov at openjdk.org Fri Nov 14 15:05:51 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Fri, 14 Nov 2025 15:05:51 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v3] In-Reply-To: References: Message-ID: > Hi, > > please consider the following changes: > > In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. > > Additionally, `SpinSingleSection` class is added, which allows to execute a payload code inside of a functor by only one thread. > > Tested in tiers 1 - 5. Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8366671: Addressed reviewer's comments. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/28264/files - new: https://git.openjdk.org/jdk/pull/28264/files/23b9f1d6..08e03075 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=01-02 Stats: 616 lines in 7 files changed: 572 ins; 24 del; 20 mod Patch: https://git.openjdk.org/jdk/pull/28264.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28264/head:pull/28264 PR: https://git.openjdk.org/jdk/pull/28264 From aartemov at openjdk.org Fri Nov 14 15:08:14 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Fri, 14 Nov 2025 15:08:14 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v3] In-Reply-To: References: Message-ID: On Thu, 13 Nov 2025 22:11:12 GMT, Coleen Phillimore wrote: > I have some initial comments and I haven't really figured out why we want the SpinSingleSection version vs. the CAS that is there now. The CAS makes a lot more sense to me. The main purpose is unification, CAS creates a section to be executed with only 1 thread, whereas a critical section is executed 1-by-1 thread. I tried to make it look very similarly. However, I agree that simple CAS is way more readable. I did not find any other places where a single section would be required. So maybe it is just overengineering. If there is no other use-cases, I am ok to remove it. 
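The distinction drawn above (a CAS-guarded section that only one thread ever runs, versus a critical section that every thread runs, one at a time) can be illustrated with a minimal sketch. It assumes `std::atomic` in place of HotSpot's own atomic primitives, and the names `run_once`, `SpinLock`, `Counts` and `demo` are hypothetical; none of this is the code from the PR.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical illustration only; not the HotSpot SpinCriticalSection code.

// "Single section": exactly one thread wins the CAS and runs the payload;
// every other thread skips it.
template <typename F>
bool run_once(std::atomic<int>& claimed, F&& payload) {
  int expected = 0;
  if (claimed.compare_exchange_strong(expected, 1)) {
    payload();
    return true;
  }
  return false;
}

// "Critical section": every thread runs the payload, but one at a time.
class SpinLock {
  std::atomic<int> _lock{0};
 public:
  void acquire() {
    int expected = 0;
    while (!_lock.compare_exchange_weak(expected, 1, std::memory_order_acquire)) {
      expected = 0;               // a failed CAS overwrote 'expected'; reset it
      std::this_thread::yield();  // back off instead of burning the CPU
    }
  }
  void release() { _lock.store(0, std::memory_order_release); }
};

struct Counts { int once; int serialized; };

Counts demo(int nthreads) {
  std::atomic<int> claimed{0};
  SpinLock lock;
  int once = 0;        // written only by the single CAS winner
  int serialized = 0;  // protected by 'lock'
  std::vector<std::thread> threads;
  for (int i = 0; i < nthreads; i++) {
    threads.emplace_back([&] {
      run_once(claimed, [&] { once++; });
      lock.acquire();
      serialized++;
      lock.release();
    });
  }
  for (auto& t : threads) t.join();
  return {once, serialized};
}
```

With N threads, the first counter ends at 1 and the second at N, which is the behavioural difference under discussion.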
------------- PR Comment: https://git.openjdk.org/jdk/pull/28264#issuecomment-3533194228 From sgehwolf at openjdk.org Fri Nov 14 15:11:02 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Fri, 14 Nov 2025 15:11:02 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v6] In-Reply-To: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: <7K-YvWpUSK96IGcBjhrcsYRqLz-xsdq_FrzSvOi4d68=.15a535d5-dca0-4aef-9714-91747a6b4fad@github.com> > Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. > > It gets rid of the threefold use of the return value: 1) error, 2) unlimited values, 3) actual numbers/values/limits. Instead, all container related values are now read from the interface files as `uint64_t` and afterwards interpreted in the way that makes sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size, the value `value_unlimited` is returned. > > All error cases have been changed to return `false` in the API functions (and no value is set in the passed-in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. > > All usages of the API have been changed to use the revised API. There are no more usages of `OSCONTAINER_ERROR` (`-2`) in HotSpot code.
> > While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. > > Testing (looking good): > - [x] GHA > - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. > - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. > > Thoughts? Opinions? Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits: - Merge branch 'master' into jdk-8365606-jlong-julong-refactor - Add space in trace log - Merge branch 'master' into jdk-8365606-jlong-julong-refactor - One more comment fix - Extract OSContainer::available_swap_in_bytes() - Simplify os::used_memory() - Fix os::active_processor_count() - os::free_memory => use 'value' directly - os::available_memory() => use 'value' directly - Fix pids_max printing in VM.info - ... 
and 15 more: https://git.openjdk.org/jdk/compare/5d65c23c...9a5f3eb5 ------------- Changes: https://git.openjdk.org/jdk/pull/27743/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27743&range=05 Stats: 1308 lines in 16 files changed: 514 ins; 106 del; 688 mod Patch: https://git.openjdk.org/jdk/pull/27743.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27743/head:pull/27743 PR: https://git.openjdk.org/jdk/pull/27743 From cnorrbin at openjdk.org Fri Nov 14 15:17:39 2025 From: cnorrbin at openjdk.org (Casper Norrbin) Date: Fri, 14 Nov 2025 15:17:39 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v6] In-Reply-To: <7K-YvWpUSK96IGcBjhrcsYRqLz-xsdq_FrzSvOi4d68=.15a535d5-dca0-4aef-9714-91747a6b4fad@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <7K-YvWpUSK96IGcBjhrcsYRqLz-xsdq_FrzSvOi4d68=.15a535d5-dca0-4aef-9714-91747a6b4fad@github.com> Message-ID: On Fri, 14 Nov 2025 15:11:02 GMT, Severin Gehwolf wrote: >> Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. >> >> It gets rid of the trifold-use of return value: 1.) error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that make sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. 
>> >> All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. >> >> All usages of the API have been changed to use the revised API. There is no more usages for `OSCONTAINER_ERROR` (`-2) in HotSpot code. >> >> While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. >> >> Testing (looking good): >> - [x] GHA >> - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. >> - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. >> >> Thoughts? Opinions? > > Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits: > > - Merge branch 'master' into jdk-8365606-jlong-julong-refactor > - Add space in trace log > - Merge branch 'master' into jdk-8365606-jlong-julong-refactor > - One more comment fix > - Extract OSContainer::available_swap_in_bytes() > - Simplify os::used_memory() > - Fix os::active_processor_count() > - os::free_memory => use 'value' directly > - os::available_memory() => use 'value' directly > - Fix pids_max printing in VM.info > - ... and 15 more: https://git.openjdk.org/jdk/compare/5d65c23c...9a5f3eb5 Marked as reviewed by cnorrbin (Committer). 
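The API shape described in the quoted PR text (a `bool` return for success or failure, the result written through a reference, and a dedicated "unlimited" value) might be sketched as follows. This is only an illustration under assumptions: the definitions given here for `physical_memory_size_type` and `value_unlimited` are placeholders, and `parse_memory_limit` is a hypothetical helper, not a function from the patch.

```cpp
#include <cassert>
#include <cctype>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Assumption: placeholder definitions; the real ones live in HotSpot.
using physical_memory_size_type = uint64_t;
constexpr physical_memory_size_type value_unlimited = UINT64_MAX;

// Hypothetical parser for a cgroup-v2-style limit string such as the
// contents of memory.max: either "max" or a decimal byte count.
// Returns false on error and leaves 'value' untouched in that case.
bool parse_memory_limit(const char* text, physical_memory_size_type& value) {
  if (text == nullptr) {
    return false;
  }
  if (strncmp(text, "max", 3) == 0) {
    value = value_unlimited;  // no limit configured
    return true;
  }
  // Reject anything not starting with a digit (strtoull would accept "-1").
  if (!isdigit(static_cast<unsigned char>(text[0]))) {
    return false;
  }
  errno = 0;
  char* end = nullptr;
  unsigned long long parsed = strtoull(text, &end, 10);
  if (end == text || errno == ERANGE) {
    return false;
  }
  value = parsed;
  return true;
}
```

A caller checks the return value first and only then looks at the result, so error, "unlimited" and an actual limit are three distinct outcomes rather than three ranges of one signed return value.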
------------- PR Review: https://git.openjdk.org/jdk/pull/27743#pullrequestreview-3465404828 From aartemov at openjdk.org Fri Nov 14 15:44:56 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Fri, 14 Nov 2025 15:44:56 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v4] In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 15:00:58 GMT, Anton Artemov wrote: >> test/hotspot/gtest/jfr/test_adaptiveSampler.cpp line 43: >> >>> 41: #include "runtime/atomicAccess.hpp" >>> 42: #include "utilities/globalDefinitions.hpp" >>> 43: #include "utilities/spinCriticalSection.hpp" >> >> Why this include? > > I used to have a different layout, where it was necessary, not it is not. Removed. UPD: Apparently this include is needed. Without it, some weird build errors show up. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2527939125 From aartemov at openjdk.org Fri Nov 14 15:44:53 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Fri, 14 Nov 2025 15:44:53 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v4] In-Reply-To: References: Message-ID: > Hi, > > please consider the following changes: > > In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. > > Additionally, `SpinSingleSection` class is added, which allows to execute a payload code inside of a functor by only one thread. > > Tested in tiers 1 - 5. Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8366671: Fixed build problem. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/28264/files - new: https://git.openjdk.org/jdk/pull/28264/files/08e03075..0e78affd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=02-03 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28264.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28264/head:pull/28264 PR: https://git.openjdk.org/jdk/pull/28264 From epeter at openjdk.org Fri Nov 14 16:07:06 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 14 Nov 2025 16:07:06 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization In-Reply-To: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> Message-ID: On Thu, 13 Nov 2025 21:34:30 GMT, Hamlin Li wrote: > Hi, > > This pr add CMoveF/D on riscv, which enable vectorization of statement like: `op_1 bop op_2 ? res_f_d_1 : res_f_d_2 in a loop`. > > This pr is also a preparation for further vectorization in https://github.com/openjdk/jdk/pull/28231. > > Previously it's https://github.com/openjdk/jdk/pull/25341, but at that time, C2 SLP has some issue with unsigned comparison, which is now fixed, so it's good to continue the work. > > # Test > ## Jtreg > > in progress... > > ## Performance > > Column names meanings: > * p: with patch > * p+v: with patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on > * m: without patch > * m+v: without patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on > > #### Average improvement > > NOTE: With only this PR, it brings performance benefit in case of `CMoveF+CmpF`, `CMoveD+ComD`, `CMoveF+CmpI`, `CMoveD+CmpL`. 
The data below is based on fullly implmenting the vectorization of `CMoveI/L/F/D+CmpI/L/F/D`, which will be achieved by https://github.com/openjdk/jdk/pull/28231. > > For details, check the performance data in https://github.com/openjdk/jdk/pull/25341 on riscv. > > Opt (m/p) | Opt (m+v/p+v) | Opt (p/p+v) | Opt (m/p+v) > -- | -- | -- | -- > 1.022782609 | 2.198717391 | 2.162673913 | 2.199 > > I won't be able to review the RISCV part, so you'll have to find someone else there. I just dropped 2 drive-by comments about the tests :) test/hotspot/jtreg/compiler/c2/irTests/TestScalarConditionalMove.java line 36: > 34: * @test > 35: * @summary Test conditional move. > 36: * @requires vm.simpleArch == "riscv64" I would prefer if you could enable the test on all platforms, but just require the specific platform on the IR rules. What would be even more fantastic: if you were able to also enable the IR rules for `x64` and `aarch64`, but we can also file a follow-up RFE for that. test/hotspot/jtreg/compiler/c2/irTests/TestScalarConditionalMove.java line 49: > 47: "-XX:+UnlockExperimentalVMOptions", "-XX:-UseCompactObjectHeaders"); > 48: TestFramework.runWithFlags("-XX:+UseCMoveUnconditionally", "-XX:-UseVectorCmov", > 49: "-XX:+UnlockExperimentalVMOptions", "-XX:+UseCompactObjectHeaders"); Wait. Is this just a copy of the existing vector test, but run with CMove vectorization disabled? If so, we could just add these additional runs to the existing test, and guard the IR test with corresponding flags: Have an IR rule for `-XX:-UseVectorCmov` and one for `-XX:+UseVectorCmov`. That would allow us to reduce some code duplication. And it would also avoid letting the two tests go out of sync when people add more to one but not the other. What do you think? 
------------- PR Review: https://git.openjdk.org/jdk/pull/28309#pullrequestreview-3465590460 PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2528003621 PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2528011154 From kurt at openjdk.org Fri Nov 14 17:53:22 2025 From: kurt at openjdk.org (Kurt Miller) Date: Fri, 14 Nov 2025 17:53:22 GMT Subject: RFR: 8371918: aarch64: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 17:41:56 GMT, Kurt Miller wrote: > ?rGenerator::generate_native_entry > > I believe there's a incorrect pointer deference in `TemplateInterpreterGenerator::generate_native_entry()` in this part of the code: > > > // get native function entry point in r10 > { > Label L; > __ ldr(r10, Address(rmethod, Method::native_function_offset())); > ExternalAddress unsatisfied(SharedRuntime::native_method_throw_unsatisfied_link_error_entry()); > __ lea(rscratch2, unsatisfied); > __ ldr(rscratch2, rscratch2); > __ cmp(r10, rscratch2); > __ br(Assembler::NE, L); > __ call_VM(noreg, > CAST_FROM_FN_PTR(address, > InterpreterRuntime::prepare_native_call), > rmethod); > __ get_method(rmethod); > __ ldr(r10, Address(rmethod, Method::native_function_offset())); > __ bind(L); > } > > > If I understand this correctly, the entry point for unsatisfied link error is loaded into `rscratch2`. The next instruction, `ldr(rscratch2, rscratch2)`, dereferences that pointer and reads from the text segment the initial instructions at the entry point into `rscratch2`. It then compares the native method entry point in `r10` with the initial instructions loaded into `rscratch2` which will never match. I believe the intent here was to compare the native method entry point with the unsatisfied link error entry point and the `ldr(rscratch2, rscratch2)` instruction should be removed. > > This was found on OpenBSD/aarch64. 
OpenBSD has a security feature where the text segments are marked execute only and do not allow reads independent of execution. the` ldr(rscratch2, rscratch2)` instruction causes a segfault because it is reading the text segment. While this bug was found on OpenBSD I believe it applies to all OS on aaarch64. > > This change removes the errant aarch64 hotspot assembly instruction that was reading from libjvm.so .text segment. > > Updated comment with markdown for code. Here is a gdb session on OpenBSD/aarch64 without this patch showing the SEGFAULT where the `ldr(rscratch2, rscratch2)` instruction is attempting to load the initial instructions (`paciasp`) for `throw_unsatisfied_link_error` into `x9`: Core was generated by `javac'. Program terminated with signal SIGABRT, Aborted. #0 thrkill () at /tmp/-:3 warning: 3 /tmp/-: No such file or directory [Current thread is 1 (process 121574)] (gdb) bt #0 thrkill () at /tmp/-:3 #1 0x000000190c19873c in _libc_abort () at /usr/src/lib/libc/stdlib/abort.c:51 #2 0x00000019249f1470 [PAC] in os::abort (dump_core=true, siginfo=, context=) at /home/truk/jdk/jdk/src/hotspot/os/posix/os_posix.cpp:2226 #3 0x0000001924f43660 [PAC] in VMError::report_and_die (id=id at entry=11, message=, message at entry=0x1911d160d0 "", detail_fmt=, detail_args=..., thread=, pc=0x19c269d9b8 ")\001@\371_\001\t\353A\004", siginfo=0x1911d16648, context=0x1911d16528, filename=0x0, lineno=0, size=0) at /home/truk/jdk/jdk/src/hotspot/share/utilities/vmError.cpp:1992 #4 0x0000001924f42df0 [PAC] in VMError::report_and_die (thread=0x1931a44800, sig=0, sig at entry=11, pc=0x0, siginfo=0x65, context=0x31, detail_fmt=0x0) at /home/truk/jdk/jdk/src/hotspot/share/utilities/vmError.cpp:1631 #5 0x0000001924f43d50 [PAC] in VMError::report_and_die (thread=, sig=11, pc=, siginfo=, context=) at /home/truk/jdk/jdk/src/hotspot/share/utilities/vmError.cpp:1651 #6 0x0000001924d0e098 [PAC] in JVM_handle_bsd_signal (sig=11, info=0x1911d16648, ucVoid=0x1911d16528, 
abort_if_unrecognized=1) at /home/truk/jdk/jdk/src/hotspot/os/posix/signals_posix.cpp:652 #7 #8 0x00000019c269d9b8 in ?? () #9 0x00000019c269ae70 in ?? () Backtrace stopped: previous frame inner to this frame (corrupt stack?) (gdb) frame 8 #8 0x00000019c269d9b8 in ?? () (gdb) x/i $pc => 0x19c269d9b8: ldr x9, [x9] (gdb) x/i $x9 0x1924b3fc08 : paciasp (gdb) x/x $x9 0x1924b3fc08 : 0xd503233f (gdb) x/i $x10 0x192aaad88c : bti c ------------- PR Comment: https://git.openjdk.org/jdk/pull/28327#issuecomment-3533895412 From mli at openjdk.org Fri Nov 14 18:03:53 2025 From: mli at openjdk.org (Hamlin Li) Date: Fri, 14 Nov 2025 18:03:53 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v2] In-Reply-To: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> Message-ID: <8Y3gUUVCNU1ZpfRkZeJqgIUomP6NCDIQqqgN-lRgk5A=.60177ffe-52ba-46de-a099-57d73f096a49@github.com> > Hi, > > This pr add CMoveF/D on riscv, which enable vectorization of statement like: `op_1 bop op_2 ? res_f_d_1 : res_f_d_2 in a loop`. > > This pr is also a preparation for further vectorization in https://github.com/openjdk/jdk/pull/28231. > > Previously it's https://github.com/openjdk/jdk/pull/25341, but at that time, C2 SLP has some issue with unsigned comparison, which is now fixed, so it's good to continue the work. > > # Test > ## Jtreg > > in progress... > > ## Performance > > Column names meanings: > * p: with patch > * p+v: with patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on > * m: without patch > * m+v: without patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on > > #### Average improvement > > NOTE: With only this PR, it brings performance benefit in case of `CMoveF+CmpF`, `CMoveD+ComD`, `CMoveF+CmpI`, `CMoveD+CmpL`. 
The data below is based on fullly implmenting the vectorization of `CMoveI/L/F/D+CmpI/L/F/D`, which will be achieved by https://github.com/openjdk/jdk/pull/28231. > > For details, check the performance data in https://github.com/openjdk/jdk/pull/25341 on riscv. > > Opt (m/p) | Opt (m+v/p+v) | Opt (p/p+v) | Opt (m/p+v) > -- | -- | -- | -- > 1.022782609 | 2.198717391 | 2.162673913 | 2.199 > > Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: - add CMove+CmpP/N tests - fix cmovF/D_cmpP ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28309/files - new: https://git.openjdk.org/jdk/pull/28309/files/ec0d8cc4..5c0d645d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28309&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28309&range=00-01 Stats: 359 lines in 2 files changed: 357 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28309/head:pull/28309 PR: https://git.openjdk.org/jdk/pull/28309 From kurt at openjdk.org Fri Nov 14 17:53:21 2025 From: kurt at openjdk.org (Kurt Miller) Date: Fri, 14 Nov 2025 17:53:21 GMT Subject: RFR: 8371918: aarch64: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry Message-ID: ?rGenerator::generate_native_entry I believe there's a incorrect pointer deference in `TemplateInterpreterGenerator::generate_native_entry()` in this part of the code: // get native function entry point in r10 { Label L; __ ldr(r10, Address(rmethod, Method::native_function_offset())); ExternalAddress unsatisfied(SharedRuntime::native_method_throw_unsatisfied_link_error_entry()); __ lea(rscratch2, unsatisfied); __ ldr(rscratch2, rscratch2); __ cmp(r10, rscratch2); __ br(Assembler::NE, L); __ call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::prepare_native_call), rmethod); __ get_method(rmethod); __ ldr(r10, Address(rmethod, Method::native_function_offset())); __ 
bind(L); } If I understand this correctly, the entry point for the unsatisfied link error is loaded into `rscratch2`. The next instruction, `ldr(rscratch2, rscratch2)`, dereferences that pointer and reads the initial instructions at the entry point from the text segment into `rscratch2`. It then compares the native method entry point in `r10` with the initial instructions loaded into `rscratch2`, which will never match. I believe the intent here was to compare the native method entry point with the unsatisfied link error entry point, and the `ldr(rscratch2, rscratch2)` instruction should be removed. This was found on OpenBSD/aarch64. OpenBSD has a security feature where the text segments are marked execute-only and do not allow reads independent of execution. The `ldr(rscratch2, rscratch2)` instruction causes a segfault because it is reading the text segment. While this bug was found on OpenBSD, I believe it applies to all OSes on aarch64. This change removes the errant aarch64 hotspot assembly instruction that was reading from the libjvm.so .text segment. Updated comment with markdown for code.
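The difference between comparing against the entry address and comparing against the bytes stored at that address can be shown with a small sketch. It is a plain C++ stand-in, not the generated assembly: `fake_entry_code`, `matches_buggy` and `matches_fixed` are hypothetical names, and the "code" is an ordinary readable array so that the example runs anywhere.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Stand-in for the instructions at the entry point (hypothetical data).
// The first word is the `paciasp` encoding shown in the gdb session; the
// second is the AArch64 `ret` encoding, used here only as filler.
static const uint32_t fake_entry_code[] = {0xd503233f, 0xd65f03c0};

// Mirrors the sequence with the extra load: take the entry address, load
// the pointer-sized word stored *at* that address, then compare it with
// the native function pointer.
bool matches_buggy(const void* native_function, const void* unsatisfied_entry) {
  uintptr_t loaded;
  memcpy(&loaded, unsatisfied_entry, sizeof(loaded));  // reads instruction bytes
  return reinterpret_cast<uintptr_t>(native_function) == loaded;
}

// Mirrors the sequence without the extra load: compare the two addresses.
bool matches_fixed(const void* native_function, const void* unsatisfied_entry) {
  return native_function == unsatisfied_entry;
}
```

Even when both arguments are the same entry point, the first variant compares an address with instruction bytes, so in practice it does not report a match; the second variant does.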
------------- Commit messages: - 8371918: aarch64: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry Changes: https://git.openjdk.org/jdk/pull/28327/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28327&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371918 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28327.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28327/head:pull/28327 PR: https://git.openjdk.org/jdk/pull/28327 From mli at openjdk.org Fri Nov 14 18:15:07 2025 From: mli at openjdk.org (Hamlin Li) Date: Fri, 14 Nov 2025 18:15:07 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v2] In-Reply-To: References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> Message-ID: <5PzMJntiu2waMvciTLvXaUH15Fm3dXZPsDVvkuqWPI0=.68c6456a-e5d3-413e-bef8-d8da95de40bd@github.com> On Fri, 14 Nov 2025 15:59:18 GMT, Emanuel Peter wrote: >> Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: >> >> - add CMove+CmpP/N tests >> - fix cmovF/D_cmpP > > test/hotspot/jtreg/compiler/c2/irTests/TestScalarConditionalMove.java line 36: > >> 34: * @test >> 35: * @summary Test conditional move. >> 36: * @requires vm.simpleArch == "riscv64" > > I would prefer if you could enable the test on all platforms, but just require the specific platform on the IR rules. > What would be even more fantastic: if you were able to also enable the IR rules for `x64` and `aarch64`, but we can also file a follow-up RFE for that. Make sense. I filed https://bugs.openjdk.org/browse/JDK-8371920 to track the task, will do it later after this pr. 
> test/hotspot/jtreg/compiler/c2/irTests/TestScalarConditionalMove.java line 49: > >> 47: "-XX:+UnlockExperimentalVMOptions", "-XX:-UseCompactObjectHeaders"); >> 48: TestFramework.runWithFlags("-XX:+UseCMoveUnconditionally", "-XX:-UseVectorCmov", >> 49: "-XX:+UnlockExperimentalVMOptions", "-XX:+UseCompactObjectHeaders"); > > Wait. Is this just a copy of the existing vector test, but run with CMove vectorization disabled? > If so, we could just add these additional runs to the existing test, and guard the IR test with corresponding flags: > Have an IR rule for `-XX:-UseVectorCmov` and one for `-XX:+UseVectorCmov`. > > That would allow us to reduce some code duplication. And it would also avoid letting the two tests go out of sync when people add more to one but not the other. > > What do you think? Good idea! I can do it. What do you think about the name of the merged tests? `TestConditionalMove.java` or `TestScalarAndVectorConditionalMove.java` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2528463608 PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2528467634 From kbarrett at openjdk.org Fri Nov 14 18:42:02 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 14 Nov 2025 18:42:02 GMT Subject: RFR: 8371923: Update LockFreeStack for Atomic Message-ID: <8T9_DwoQj-__PXY0iLOJTyWMbXp3sqSz-CXGxZVbrRk=.f61223e7-fb77-41ae-bb5d-f32244505b5b@github.com> Please review this change to the `LockFreeStack` utility to allow clients to use `Atomic` as the type of the "next" member used in the linked-list representation of the stack. It also continues to allow clients to use the old (pre-`Atomic`) form where the "next" member is volatile. This allows clients to be updated incrementally after this change, rather than requiring all clients to be updated in conjunction with the update of this class. Once all clients have been updated, support for the old form can be removed. 
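As a rough picture of what a linked-list stack with an atomic "next" member looks like, here is a minimal sketch. It assumes `std::atomic` rather than HotSpot's `Atomic`, the names `Node`, `next_ptr` and `SketchStack` are hypothetical, and it is not the `LockFreeStack` implementation; in particular this naive `pop` is exposed to the ABA problem when several threads pop concurrently.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical element type: the stack links elements through 'next'.
struct Node {
  int value;
  std::atomic<Node*> next{nullptr};
  explicit Node(int v) : value(v) {}
  // Accessor handed to the stack so it can find the link field.
  static std::atomic<Node*>* next_ptr(Node& n) { return &n.next; }
};

// Intrusive stack sketch, parameterized on the element type and on the
// accessor for its atomic link field.
template <typename T, std::atomic<T*>* (*next_ptr)(T&)>
class SketchStack {
  std::atomic<T*> _top{nullptr};
 public:
  void push(T& node) {
    T* old_top = _top.load(std::memory_order_relaxed);
    do {
      // Link the new node in front of the current top, then try to publish.
      next_ptr(node)->store(old_top, std::memory_order_relaxed);
    } while (!_top.compare_exchange_weak(old_top, &node,
                                         std::memory_order_release,
                                         std::memory_order_relaxed));
  }

  // Returns the most recently pushed element, or nullptr if empty.
  T* pop() {
    T* old_top = _top.load(std::memory_order_acquire);
    while (old_top != nullptr) {
      T* next = next_ptr(*old_top)->load(std::memory_order_relaxed);
      if (_top.compare_exchange_weak(old_top, next,
                                     std::memory_order_acquire,
                                     std::memory_order_acquire)) {
        next_ptr(*old_top)->store(nullptr, std::memory_order_relaxed);
        return old_top;
      }
      // On failure 'old_top' holds the current top; retry with it.
    }
    return nullptr;
  }
};
```

Passing the accessor as a template parameter keeps the stack independent of where the element stores its link, which is one way a container could accept more than one representation of the "next" member.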
The associated gtests have been updated to use `Atomic`, and testing of the old form is no longer being done. The non-updated uses provide some testing, and that's all expected to go away soon. So parameterizing the gtests for both forms seems like a bunch of work that will just be deleted soon, with very little benefit. Testing: mach5 tier1 ------------- Commit messages: - LockFreeStack supports Atomic Changes: https://git.openjdk.org/jdk/pull/28329/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28329&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371923 Stats: 53 lines in 2 files changed: 21 ins; 0 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/28329.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28329/head:pull/28329 PR: https://git.openjdk.org/jdk/pull/28329 From kbarrett at openjdk.org Fri Nov 14 18:50:49 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 14 Nov 2025 18:50:49 GMT Subject: RFR: 8371922: Remove unused NonblockingQueue class Message-ID: Please review this trivial change that removes the unused NonblockingQueue utility class and its associated gtests.
Testing: mach5 tier1 ------------- Commit messages: - remove NonblockingQueue Changes: https://git.openjdk.org/jdk/pull/28330/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28330&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371922 Stats: 667 lines in 3 files changed: 0 ins; 667 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28330.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28330/head:pull/28330 PR: https://git.openjdk.org/jdk/pull/28330 From coleenp at openjdk.org Fri Nov 14 18:56:07 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 14 Nov 2025 18:56:07 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v4] In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 15:01:17 GMT, Anton Artemov wrote: >> src/hotspot/share/utilities/spinCriticalSection.hpp line 56: >> >>> 54: // A short section which is to be executed by only one thread. >>> 55: // The payload code is to be put into an object inherited from the Functor class. >>> 56: class SpinSingleSection { >> >> Could this instead be a template instead and the code can pass a lambda to it rather than this Functor type? > > It can be done with a template and a lambda function. I changed the implementation in the latest commit. It does not look any better, however. Actually it doesn't really look better with a lambda. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2528570773 From coleenp at openjdk.org Fri Nov 14 18:56:05 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 14 Nov 2025 18:56:05 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v4] In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 15:44:53 GMT, Anton Artemov wrote: >> Hi, >> >> please consider the following changes: >> >> In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. 
The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. >> >> Additionally, `SpinSingleSection` class is added, which allows to execute a payload code inside of a lambda function by only one thread. >> >> Tested in tiers 1 - 5. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8366671: Fixed build problem. I think the SpinSingleSection might be a bit of over-engineering for something that's done a lot in the code base. Let's see what @pchilano thinks. patch.txt line 3: > 1: diff --git a/src/hotspot/share/runtime/objectMonitor.cpp b/src/hotspot/share/runtime/objectMonitor.cpp > 2: index ee7629ec6f5..b1c806308ff 100644 > 3: --- a/src/hotspot/share/runtime/objectMonitor.cpp I don't think this was supposed to be added. src/hotspot/share/runtime/objectMonitor.cpp line 320: > 318: check_object_context(); > 319: if (_object_strong.is_empty()) { > 320: auto setObjectStrongLambda = [&](OopHandle& object_strong, const WeakHandle& object) { Doesn't the lambda capture the _object and _object_strong values from the [&] ? And maybe instead of a class SpinSingleSection it should be a template function that passes in the lambda? src/hotspot/share/runtime/objectMonitor.hpp line 36: > 34: #include "utilities/checkedCast.hpp" > 35: #include "utilities/globalDefinitions.hpp" > 36: #include "utilities/spinCriticalSection.hpp" Include shouldn't be needed here. test/hotspot/gtest/jfr/test_adaptiveSampler.cpp line 43: > 41: #include "runtime/atomicAccess.hpp" > 42: #include "utilities/globalDefinitions.hpp" > 43: #include "utilities/spinCriticalSection.hpp" Why is this include needed here? ------------- Changes requested by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28264#pullrequestreview-3466310630 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2528558216 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2528580597 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2528566456 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2528572083 From iklam at openjdk.org Fri Nov 14 19:13:04 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 14 Nov 2025 19:13:04 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch [v2] In-Reply-To: References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: <09Z7-ZZkmzO2T0nkSl3czZlkhJPzun79PkiNngWrQcU=.472c6dd5-5f54-42f2-ae16-62cc74c197fe@github.com> On Thu, 13 Nov 2025 11:35:04 GMT, Albert Mingkun Yang wrote: >> We have seen crashes on many platforms (including x64) while running `make run-test TEST=runtime/cds/appcds/aotClassLinking/LambdaInExcludedClass.java JTREG="VM_OPTIONS=-XX:+UseCompactObjectHeaders"`: >> >> SIGSEGV (0xb) at pc=0x00007f2f95a61e7a, pid=18554, tid=18557 >> V [libjvm.so+0x15bfe7a] MemAllocator::finish(HeapWordImpl**) const+0xca (klass.inline.hpp:72) >> V [libjvm.so+0x15c029f] ObjAllocator::initialize(HeapWordImpl**) const+0x2f (memAllocator.cpp:391) >> V [libjvm.so+0xb0630b] CollectedHeap::fill_with_object(HeapWordImpl**, unsigned long, bool)+0x27b (collectedHeap.cpp:491) >> V [libjvm.so+0x1c7a0bb] ThreadLocalAllocBuffer::retire(ThreadLocalAllocStats*)+0x11b (threadLocalAllocBuffer.cpp:118) >> V [libjvm.so+0x15c0b14] MemAllocator::mem_allocate_inside_tlab_slow(MemAllocator::Allocation&) const+0x84 (memAllocator.cpp:286) >> V [libjvm.so+0x15c13ab] MemAllocator::mem_allocate(MemAllocator::Allocation&) const+0xbb (memAllocator.cpp:340) >> V [libjvm.so+0x15c14f9] MemAllocator::allocate() const+0xa9 (memAllocator.cpp:353) >> V [libjvm.so+0x1cc052e] 
TypeArrayKlass::allocate_common(int, bool, JavaThread*)+0x13e (collectedHeap.inline.hpp:41) >> V [libjvm.so+0x16fbc98] oopFactory::new_typeArray(BasicType, int, JavaThread*)+0x38 (typeArrayKlass.hpp:51) >> V [libjvm.so+0x106b0f3] java_lang_Class::restore_archived_mirror(Klass*, Handle, Handle, Handle, JavaThread*)+0x413 (javaClasses.cpp:1246) >> V [libjvm.so+0x14100bc] Klass::restore_unshareable_info(ClassLoaderData*, Handle, JavaThread*)+0x66c (klass.cpp:903) >> V [libjvm.so+0xfe2cb1] InstanceKlass::restore_unshareable_info(ClassLoaderData*, Handle, PackageEntry*, JavaThread*)+0x81 (instanceKlass.cpp:2823) >> V [libjvm.so+0x1c0f5ad] SystemDictionary::preload_class(Handle, InstanceKlass*, JavaThread*)+0x1ed (systemDictionary.cpp:1198) >> V [libjvm.so+0x676e83] AOTLinkedClassBulkLoader::preload_classes_in_table(Array*, char const*, Handle, JavaThread*)+0x1a3 (aotLinkedClassBulkLoader.cpp:103) >> V [libjvm.so+0x679af5] AOTLinkedClassBulkLoader::preload_classes_impl(JavaThread*)+0x165 (aotLinkedClassBulkLoader.cpp:76) >> V [libjvm.so+0x67c371] AOTLinkedClassBulkLoader::preload_classes(JavaThread*)+0x11 (aotLinkedClassBulkLoader.cpp:61) >> V [libjvm.so+0x1d5bf30] vmClasses::resolve_all(JavaThread*)+0x3e0 (vmClasses.cpp:126) >> V [libjvm.so+0x1c0f28c] SystemDictionary::initialize(JavaThread*)+0x10c (systemDictionary.cpp:1623) >> V [libjvm.so+0x1cc74ca] Uni... > >> ... make run-test TEST=runtime/cds/appcds/aotClassLinking/LambdaInExcludedClass.java JTREG="VM_OPTIONS=-XX:+UseCompactObjectHeaders" > > I suspect the crash is caused by a preexisting issue that is exposed by this patch. 
> > In `vmClasses::resolve_all`: > > #if INCLUDE_CDS > if (CDSConfig::is_using_aot_linked_classes()) { > AOTLinkedClassBulkLoader::preload_classes(THREAD); > } > #endif > > // Preload commonly used klasses > vmClassID scan = vmClassID::FIRST; > // first do Object, then String, Class > resolve_through(VM_CLASS_ID(Object_klass), scan, CHECK); > CollectedHeap::set_filler_object_klass(vmClasses::Object_klass()); > > > The filler-klass is not initialized when `preload_classes` is invoked, but `preload_classes` use heap-allocation, which may require filler-obj. > > @iklam What do you think? @albertnetymk I've pushed https://github.com/openjdk/jdk/pull/28315. Please verify if it fixes the crash before integrating this PR. Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/28240#issuecomment-3534185759 From ayang at openjdk.org Fri Nov 14 19:22:39 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 14 Nov 2025 19:22:39 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch [v3] In-Reply-To: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: > Trivial removing obsoleted code for unsupported arch. > > Test: tier1 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - Merge branch 'master' into remove-tlab-reserve - review - remove-tlab-reserve ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28240/files - new: https://git.openjdk.org/jdk/pull/28240/files/0e447848..d6c34da7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28240&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28240&range=01-02 Stats: 10141 lines in 159 files changed: 5771 ins; 3554 del; 816 mod Patch: https://git.openjdk.org/jdk/pull/28240.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28240/head:pull/28240 PR: https://git.openjdk.org/jdk/pull/28240 From amenkov at openjdk.org Fri Nov 14 19:42:21 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 14 Nov 2025 19:42:21 GMT Subject: Integrated: 8371083: FollowReferences reports non-class objects as JVMTI_HEAP_REFERENCE_SYSTEM_CLASS In-Reply-To: <6rHIf2ptpmqaSNTLtKNv6bZSQCENxBpxnk3i193hOHE=.dbc266c3-e9b5-4c27-ac5a-5d9ca399ea87@github.com> References: <6rHIf2ptpmqaSNTLtKNv6bZSQCENxBpxnk3i193hOHE=.dbc266c3-e9b5-4c27-ac5a-5d9ca399ea87@github.com> Message-ID: On Mon, 10 Nov 2025 20:54:56 GMT, Alex Menkov wrote: > FollowReferences with null initial_object starts heap walking from "heap roots", which include system classes. > All oops from ClassLoaderDataGraph are reported with JVMTI_HEAP_REFERENCE_SYSTEM_CLASS kind, but some of the objects are not classes. > The fix updates FollowReferences to report non-class objects from ClassLoaderDataGraph as JVMTI_HEAP_REFERENCE_OTHER > > Testing: tier1..4,hs-tier5-svc This pull request has now been integrated. 
Changeset: 3924a28a Author: Alex Menkov URL: https://git.openjdk.org/jdk/commit/3924a28a2281bbdb13fe9f1e0b5347d57197f8dc Stats: 209 lines in 3 files changed: 206 ins; 0 del; 3 mod 8371083: FollowReferences reports non-class objects as JVMTI_HEAP_REFERENCE_SYSTEM_CLASS Reviewed-by: lmesnik, sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/28224 From sviswanathan at openjdk.org Fri Nov 14 19:51:13 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Fri, 14 Nov 2025 19:51:13 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 16:38:49 GMT, Volodymyr Paprotski wrote: > - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline > - `SignatureBench.MLDSA` is 1.2x-2.2x faster > - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) > - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version > - `SignatureBench.MLDSA` is upto 5% faster, never slower > > Note on intrinsic: > - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. 
> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 > > Tests and benchmarks: > - Added a fuzz test to ensure Java and intrinsic produces exactly same result > - Added benchmark to measure the performance of intrinsic itself > > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" src/hotspot/cpu/x86/assembler_x86.cpp line 3865: > 3863: void Assembler::vmovsldup(XMMRegister dst, XMMRegister src, int vector_len) { > 3864: assert(vector_len == AVX_128bit ? VM_Version::supports_avx() : > 3865: (vector_len == AVX_256bit ? VM_Version::supports_avx2() : Vector length 256 bit is supported by AVX=1. src/hotspot/cpu/x86/assembler_x86.cpp line 3874: > 3872: void Assembler::vmovshdup(XMMRegister dst, XMMRegister src, int vector_len) { > 3873: assert(vector_len == AVX_128bit ? VM_Version::supports_avx() : > 3874: (vector_len == AVX_256bit ? VM_Version::supports_avx2() : Vector length 256 bit is supported by AVX=1. src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 83: > 81: // size 0 and 1 are used for initial and final shuffles respectivelly of > 82: // dilithiumAlmostInverseNtt and dilithiumAlmostNtt. > 83: // NOTE: For size 0 and 1, input1[] and input2[] are modified in-place what is the size-in-bits when size is 0 and 1? What is the difference between size 0 and size1? The overloading of size makes it confusing. 
src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 137: > 135: for (int i = 0; i < regCnt; i++) { > 136: // 0b-1-2-3-1 > 137: __ vshufps(output2[i], input1[i], input2[i], 0b11011101, vector_len); Did you mean this to be //0b-1-3-1-3? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2528279719 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2528288894 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2528416321 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2528610634 From sviswanathan at openjdk.org Fri Nov 14 19:51:14 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Fri, 14 Nov 2025 19:51:14 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 17:51:24 GMT, Sandhya Viswanathan wrote: >> - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline >> - `SignatureBench.MLDSA` is 1.2x-2.2x faster >> - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) >> - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version >> - `SignatureBench.MLDSA` is upto 5% faster, never slower >> >> Note on intrinsic: >> - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. 
>> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 >> >> Tests and benchmarks: >> - Added a fuzz test to ensure Java and intrinsic produces exactly same result >> - Added benchmark to measure the performance of intrinsic itself >> >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" > > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 83: > >> 81: // size 0 and 1 are used for initial and final shuffles respectivelly of >> 82: // dilithiumAlmostInverseNtt and dilithiumAlmostNtt. >> 83: // NOTE: For size 0 and 1, input1[] and input2[] are modified in-place > > what is the size-in-bits when size is 0 and 1? What is the difference between size 0 and size1? The overloading of size makes it confusing. size 0 seems to be doing a different shuffle than what is described in the diagram. > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 137: > >> 135: for (int i = 0; i < regCnt; i++) { >> 136: // 0b-1-2-3-1 >> 137: __ vshufps(output2[i], input1[i], input2[i], 0b11011101, vector_len); > > Did you mean this to be //0b-1-3-1-3? or 3-1-3-1. 
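To check which element order the `0b11011101` immediate selects, the shuffle can be modelled in plain C++. This is only an illustrative sketch: `shufps_lane` and `selectors_msb_first` are made-up helpers, not HotSpot code. The model assumes the usual `shufps` semantics for one 128-bit lane: the two low result elements come from the first source, the two high ones from the second source, and the 2-bit selectors are read from the low bits of the immediate upward.

```cpp
#include <array>
#include <cstdint>

using Lane = std::array<int, 4>;

// Model of one 128-bit lane of (v)shufps:
// result[0..1] are picked from src1, result[2..3] from src2.
Lane shufps_lane(const Lane& src1, const Lane& src2, std::uint8_t imm) {
  Lane r{};
  for (int i = 0; i < 4; i++) {
    int sel = (imm >> (2 * i)) & 0x3;  // selector i lives in bits [2i+1 : 2i]
    r[i] = (i < 2) ? src1[sel] : src2[sel];
  }
  return r;
}

// The four selectors in the order they appear in a 0b... literal (high bits first).
Lane selectors_msb_first(std::uint8_t imm) {
  return { (imm >> 6) & 3, (imm >> 4) & 3, (imm >> 2) & 3, imm & 3 };
}
```

With sources `{10, 11, 12, 13}` and `{20, 21, 22, 23}`, the immediate `0b11011101` yields `{11, 13, 21, 23}`, that is, the odd-indexed elements of each source. Read from the high bits down, the selectors are 3-1-3-1, which matches the second reading proposed in the comment.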
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2528747938 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2528753295 From coleenp at openjdk.org Fri Nov 14 20:01:01 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 14 Nov 2025 20:01:01 GMT Subject: RFR: 8371922: Remove unused NonblockingQueue class In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 18:43:04 GMT, Kim Barrett wrote: > Please review this trivial change that removes the unused NonblockingQueue > utility class and its associated gtests. > > Testing: mach5 tier1 Thank you. I tried to use this once because it sounded cool but it didn't solve my problem iirc. It is a trivial change. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28330#pullrequestreview-3466602949 PR Comment: https://git.openjdk.org/jdk/pull/28330#issuecomment-3534353318 From kbarrett at openjdk.org Fri Nov 14 20:36:12 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 14 Nov 2025 20:36:12 GMT Subject: RFR: 8371922: Remove unused NonblockingQueue class In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 19:56:45 GMT, Coleen Phillimore wrote: > Thank you. I tried to use this once because it sounded cool but it didn't solve my problem iirc. I kind of remember that. Thanks for reviewing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28330#issuecomment-3534461098 From kbarrett at openjdk.org Fri Nov 14 20:36:14 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 14 Nov 2025 20:36:14 GMT Subject: Integrated: 8371922: Remove unused NonblockingQueue class In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 18:43:04 GMT, Kim Barrett wrote: > Please review this trivial change that removes the unused NonblockingQueue > utility class and its associated gtests. > > Testing: mach5 tier1 This pull request has now been integrated. 
Changeset: 91b97a49 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/91b97a49d48ee8528b34486172293fd3a68ae3c7 Stats: 667 lines in 3 files changed: 0 ins; 667 del; 0 mod 8371922: Remove unused NonblockingQueue class Reviewed-by: coleenp ------------- PR: https://git.openjdk.org/jdk/pull/28330 From duke at openjdk.org Fri Nov 14 21:21:16 2025 From: duke at openjdk.org (Jonas Norlinder) Date: Fri, 14 Nov 2025 21:21:16 GMT Subject: RFR: 8368527: JMX: Add an MXBeans method to query GC CPU time [v12] In-Reply-To: References: Message-ID: > Hi all, > > This PR augments the CPU time sampling measurement capabilities that a user can perform from Java code with the addition of `MemoryMXBean.getGcCpuTime()`. With this patch it will be possible for a user to measure process and GC CPU time during critical section or iterations in benchmarks to name a few. This new method complements the existing `OperatingSystemMXBean.getProcessCpuTime()` for a refined understanding. > > `CollectedHeap::gc_threads_do` may operate on terminated GC threads during shutdown, but thanks to JDK-8366865 by @walulyai we can piggyback on the new `Universe::is_shutting_down`. I have implemented a stress-test `test/jdk/java/lang/management/MemoryMXBean/GetGcCpuTime.java` that may identify reading CPU time of terminated threads. Synchronizing on `Universe::is_shutting_down` and `Heap_lock` resolves this problem. > > FWIW; To my understanding we don't want to add a `Universe::is_shutting_down` check in gc_threads_do as this may introduce a performance penalty that is unacceptable, therefore we must be careful about the few places where external users call upon gc_threads_do and may race with a terminating VM. > > Tested: test/jdk/java/lang/management/MemoryMXBean/GetGcCpuTime.java, jdk/javax/management/mxbean hotspot/jtreg/vmTestbase/nsk/monitoring on Linux x64, Linux aarch64, Windows x64, macOS x64 and macOS aarch64 with release and fastdebug. 
Jonas Norlinder has updated the pull request incrementally with one additional commit since the last revision: Revert "Reduce GC coverage for trivial API test" This reverts commit 8fd1ee093066138c9aa5602dcac0e7db1916db6b. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27537/files - new: https://git.openjdk.org/jdk/pull/27537/files/8fd1ee09..97978d04 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27537&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27537&range=10-11 Stats: 36 lines in 1 file changed: 36 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/27537.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27537/head:pull/27537 PR: https://git.openjdk.org/jdk/pull/27537 From duke at openjdk.org Fri Nov 14 21:23:22 2025 From: duke at openjdk.org (duke) Date: Fri, 14 Nov 2025 21:23:22 GMT Subject: RFR: 8368527: JMX: Add an MXBeans method to query GC CPU time [v11] In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 11:29:48 GMT, Jonas Norlinder wrote: >> Hi all, >> >> This PR augments the CPU time sampling measurement capabilities that a user can perform from Java code with the addition of `MemoryMXBean.getGcCpuTime()`. With this patch it will be possible for a user to measure process and GC CPU time during critical section or iterations in benchmarks to name a few. This new method complements the existing `OperatingSystemMXBean.getProcessCpuTime()` for a refined understanding. >> >> `CollectedHeap::gc_threads_do` may operate on terminated GC threads during shutdown, but thanks to JDK-8366865 by @walulyai we can piggyback on the new `Universe::is_shutting_down`. I have implemented a stress-test `test/jdk/java/lang/management/MemoryMXBean/GetGcCpuTime.java` that may identify reading CPU time of terminated threads. Synchronizing on `Universe::is_shutting_down` and `Heap_lock` resolves this problem. 
>> >> FWIW; To my understanding we don't want to add a `Universe::is_shutting_down` check in gc_threads_do as this may introduce a performance penalty that is unacceptable, therefore we must be careful about the few places where external users call upon gc_threads_do and may race with a terminating VM. >> >> Tested: test/jdk/java/lang/management/MemoryMXBean/GetGcCpuTime.java, jdk/javax/management/mxbean hotspot/jtreg/vmTestbase/nsk/monitoring on Linux x64, Linux aarch64, Windows x64, macOS x64 and macOS aarch64 with release and fastdebug. > > Jonas Norlinder has updated the pull request incrementally with one additional commit since the last revision: > > Reduce GC coverage for trivial API test @JonasNorlinder Your change (at version 97978d04c241061cd0fdf656748bfe08da88017a) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27537#issuecomment-3534600306 From dlong at openjdk.org Sat Nov 15 02:36:02 2025 From: dlong at openjdk.org (Dean Long) Date: Sat, 15 Nov 2025 02:36:02 GMT Subject: RFR: 8371918: aarch64: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 17:41:56 GMT, Kurt Miller wrote: > ?rGenerator::generate_native_entry > > I believe there's a incorrect pointer deference in `TemplateInterpreterGenerator::generate_native_entry()` in this part of the code: > > > // get native function entry point in r10 > { > Label L; > __ ldr(r10, Address(rmethod, Method::native_function_offset())); > ExternalAddress unsatisfied(SharedRuntime::native_method_throw_unsatisfied_link_error_entry()); > __ lea(rscratch2, unsatisfied); > __ ldr(rscratch2, rscratch2); > __ cmp(r10, rscratch2); > __ br(Assembler::NE, L); > __ call_VM(noreg, > CAST_FROM_FN_PTR(address, > InterpreterRuntime::prepare_native_call), > rmethod); > __ get_method(rmethod); > __ ldr(r10, Address(rmethod, Method::native_function_offset())); > __ bind(L); > } > > > If 
I understand this correctly, the entry point for unsatisfied link error is loaded into `rscratch2`. The next instruction, `ldr(rscratch2, rscratch2)`, dereferences that pointer and reads from the text segment the initial instructions at the entry point into `rscratch2`. It then compares the native method entry point in `r10` with the initial instructions loaded into `rscratch2`, which will never match. I believe the intent here was to compare the native method entry point with the unsatisfied link error entry point and the `ldr(rscratch2, rscratch2)` instruction should be removed. > > This was found on OpenBSD/aarch64. OpenBSD has a security feature where the text segments are marked execute only and do not allow reads independent of execution. The `ldr(rscratch2, rscratch2)` instruction causes a segfault because it is reading the text segment. While this bug was found on OpenBSD, I believe it applies to all OSes on aarch64. > > This change removes the errant aarch64 hotspot assembly instruction that was reading from libjvm.so .text segment. > > Updated comment with markdown for code. It looks like RISCV is broken in the same way. According to InterpreterRuntime::prepare_native_call(), if there is a signal handler, which is checked first, then there should be a native function. So I wonder if we can remove the check for the native function from all CPU ports. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28327#issuecomment-3535430902 PR Comment: https://git.openjdk.org/jdk/pull/28327#issuecomment-3535434036 From dean.long at oracle.com Sat Nov 15 02:54:01 2025 From: dean.long at oracle.com (Dean Long) Date: Fri, 14 Nov 2025 18:54:01 -0800 Subject: Potential OSR-related performance issue in jdk21 In-Reply-To: References: Message-ID: <3bbccd70-41f8-498b-a62b-0ade71464ac7@oracle.com> It wouldn't hurt to file a bug. If the same method keeps getting compiled over and over then it would be good to find out why.
Some flags that might help: -XX:+PrintCompilation -XX:+LogCompilation and also maybe -XX:+TraceDeoptimization. It could be OSR if DispatchThread.run()+788 is a branch, or just a regular compile if it is a method call.

dl

On 11/14/25 12:28 AM, Wojciech KUDLA wrote:
> Hi,
>
> (Somehow this message didn't make it through to hotspot-compiler-dev, so trying here)
>
> We've observed a strange performance issue within our latency-sensitive application when running on jdk21 (tested with 21.0.3 and 21.0.8).
> The impacted threads are pinned to their respective isolated cores so not expected to experience any disruptions with the exception of a regular hrtick but that executes in user context and is always a sub-microsecond thing.
> First, we noticed that these threads started showing voluntary context switching. Since we avoid any non-vdso syscalls this was a strong indicator of some unintended locking going on and so we decided to capture user- and kernel-space stack traces for one such thread with a bit of eBPF.
> It looks like we go into this death loop of OSRs (the only type of compilation activity known to me to halt execution on the impacted thread) and this happens tens of times per second.
> Here are the stack traces:
>
> ustack:
>   __lll_unlock_wake+26
>   CompileBroker::compile_method(methodHandle const&, int, int, methodHandle const&, int, CompileTask::CompileReason, JavaThread*)+85
>   CompilationPolicy::compile(methodHandle const&, int, CompLevel, JavaThread*)+456
>   CompilationPolicy::event(methodHandle const&, methodHandle const&, int, int, CompLevel, CompiledMethod*, JavaThread*)+553
>   InterpreterRuntime::frequency_counter_overflow_inner(JavaThread*, unsigned char*)+331
>   InterpreterRuntime::frequency_counter_overflow(JavaThread*, unsigned char*)+27
>   Interpreter+15968
>   void com.hsbc.efx.actor.dispatcher.SingleThreadDispatcher$DispatchThread.run()+788
> kstack:
>   syscall_trace_enter+686
>   syscall_trace_enter+686
>   do_syscall_64+326
>   entry_SYSCALL_64_after_hwframe+102
>
> ustack:
>   __lll_unlock_wake+26
>   CompileBroker::compile_method(methodHandle const&, int, int, methodHandle const&, int, CompileTask::CompileReason, JavaThread*)+85
>   CompilationPolicy::compile(methodHandle const&, int, CompLevel, JavaThread*)+456
>   CompilationPolicy::event(methodHandle const&, methodHandle const&, int, int, CompLevel, CompiledMethod*, JavaThread*)+553
>   InterpreterRuntime::frequency_counter_overflow_inner(JavaThread*, unsigned char*)+331
>   InterpreterRuntime::frequency_counter_overflow(JavaThread*, unsigned char*)+27
>   Interpreter+15968
>   void com.hsbc.efx.actor.dispatcher.SingleThreadDispatcher$DispatchThread.run()+788
> kstack:
>   syscall_slow_exit_work+179
>   syscall_slow_exit_work+179
>   do_syscall_64+365
>   entry_SYSCALL_64_after_hwframe+102
>
> ustack:
>   __lll_unlock_wake+26
>   CompilationPolicy::compile(methodHandle const&, int, CompLevel, JavaThread*)+456
>   CompilationPolicy::event(methodHandle const&, methodHandle const&, int, int, CompLevel, CompiledMethod*, JavaThread*)+553
>   InterpreterRuntime::frequency_counter_overflow_inner(JavaThread*, unsigned char*)+331
>   InterpreterRuntime::frequency_counter_overflow(JavaThread*, unsigned char*)+27
>   Interpreter+15968
>   void com.hsbc.efx.actor.dispatcher.SingleThreadDispatcher$DispatchThread.run()+788
> kstack:
>   syscall_trace_enter+686
>   syscall_trace_enter+686
>   do_syscall_64+326
>   entry_SYSCALL_64_after_hwframe+102
>
> ustack:
>   __lll_unlock_wake+26
>   CompilationPolicy::compile(methodHandle const&, int, CompLevel, JavaThread*)+456
>   CompilationPolicy::event(methodHandle const&, methodHandle const&, int, int, CompLevel, CompiledMethod*, JavaThread*)+553
>   InterpreterRuntime::frequency_counter_overflow_inner(JavaThread*, unsigned char*)+331
>   InterpreterRuntime::frequency_counter_overflow(JavaThread*, unsigned char*)+27
>   Interpreter+15968
>   void com.hsbc.efx.actor.dispatcher.SingleThreadDispatcher$DispatchThread.run()+788
> kstack:
>   syscall_slow_exit_work+179
>   syscall_slow_exit_work+179
>   do_syscall_64+365
>   entry_SYSCALL_64_after_hwframe+102
>
> ustack:
>   __lll_lock_wait+29
>   DirectivesStack::getMatchingDirective(methodHandle const&, AbstractCompiler*)+50
>   CompileBroker::compile_method(methodHandle const&, int, int, methodHandle const&, int, CompileTask::CompileReason, JavaThread*)+85
>   CompilationPolicy::compile(methodHandle const&, int, CompLevel, JavaThread*)+456
>   CompilationPolicy::event(methodHandle const&, methodHandle const&, int, int, CompLevel, CompiledMethod*, JavaThread*)+553
>   InterpreterRuntime::frequency_counter_overflow_inner(JavaThread*, unsigned char*)+331
>   InterpreterRuntime::frequency_counter_overflow(JavaThread*, unsigned char*)+27
>   Interpreter+15968
>   void com.hsbc.efx.actor.dispatcher.SingleThreadDispatcher$DispatchThread.run()+788
> kstack:
>   syscall_trace_enter+686
>   syscall_trace_enter+686
>   do_syscall_64+326
>   entry_SYSCALL_64_after_hwframe+102
>
> ustack:
>   __lll_lock_wait+29
>   DirectivesStack::getMatchingDirective(methodHandle const&, AbstractCompiler*)+50
>   CompileBroker::compile_method(methodHandle const&, int, int, methodHandle const&, int, CompileTask::CompileReason, JavaThread*)+85
>   CompilationPolicy::compile(methodHandle const&, int, CompLevel, JavaThread*)+456
>   CompilationPolicy::event(methodHandle const&, methodHandle const&, int, int, CompLevel, CompiledMethod*, JavaThread*)+553
>   InterpreterRuntime::frequency_counter_overflow_inner(JavaThread*, unsigned char*)+331
>   InterpreterRuntime::frequency_counter_overflow(JavaThread*, unsigned char*)+27
>   Interpreter+15968
>   void com.hsbc.efx.actor.dispatcher.SingleThreadDispatcher$DispatchThread.run()+788
> kstack:
>   syscall_slow_exit_work+179
>   syscall_slow_exit_work+179
>   do_syscall_64+365
>   entry_SYSCALL_64_after_hwframe+102
>
> This usually starts happening a few minutes after we start the application, probably long enough to reach compilation thresholds. It can last for a few to tens of minutes, after which it might fix itself and all this activity disappears.
> This does not happen on any jdk17 version that we used. My gut feeling is some sort of a live lock somewhere in the profiler? We have limited means of reproducing it due to environment constraints but if you'd like us to run with some extra flags or on a fast debug build, we could arrange that.
> Also, I'm an OpenJDK author with access to the bug tracker if you think we should create an issue for this.
> > Thanks > > *Wojciech KUDLA* From jpai at openjdk.org Sat Nov 15 05:54:10 2025 From: jpai at openjdk.org (Jaikiran Pai) Date: Sat, 15 Nov 2025 05:54:10 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v15] In-Reply-To: References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: On Fri, 14 Nov 2025 04:38:43 GMT, Nityanand Rai wrote: >> Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs > > Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: > > cleanup src/hotspot/os/bsd/os_bsd.hpp line 43: > 41: // Bsd_OS defines the interface to Bsd operating systems > 42: > 43: static constexpr int bsd_mmap_fd = I don't have familiarity in this area, but looking at the man page of `mmap` on my local macos, it states this: void * mmap(void *addr, size_t len, int prot, int flags, int fd, off_t offset); ... MAP_ANON Map anonymous memory not associated with any specific file. The offset argument is ignored.
Mac OS X specific: the file descriptor used for creating MAP_ANON regions can be used to pass some Mach VM flags, and can be specified as -1 if no such flags are associated with the region. Mach VM flags are defined in and the ones that currently apply to mmap are: VM_FLAGS_PURGABLE to create Mach purgable (i.e. volatile) memory. VM_MAKE_TAG(tag) to associate an 8-bit tag with the region. defines some preset tags (with a VM_MEMORY_ prefix). Users are encouraged to use tags between 240 and 255. Tags are used by tools such as vmmap(1) to help identify specific memory regions. So this special value handling of `fd` value is only applicable if `MAP_ANON` is part of the `flags`. Given this, should the name of constexpr be a bit more specific and the call sites, where this gets used, verify/assert that the flags indeed contains the `MAP_ANON` flag? Plus, this is very macos specific, calling it `bsd_...` feels much more generic. Maybe we should consider naming it `macos_mmap_anon_fd`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27868#discussion_r2529614816 From liach at openjdk.org Sat Nov 15 08:59:19 2025 From: liach at openjdk.org (Chen Liang) Date: Sat, 15 Nov 2025 08:59:19 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v12] In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 12:05:51 GMT, Alan Bateman wrote: >> Implementation changes for [JEP 500: Prepare to Make Final Mean Final](https://openjdk.org/jeps/500). >> >> Field.set (and Lookup.unreflectSetter) are changed to allow/warn/debug/deny when mutating a final instance field. JFR event recorded if final field mutated. Spec updates to Field.set, Field.setAccessible and Module.addOpens to align with the proposal in the JEP. >> >> HotSpot is updated to add support for the new command line options. To aid diagnosability, -Xcheck:jni reports a warning and -Xlog:jni=debug logs a message to help identity JNI code that mutates finals. 
For now, JNI code is allowed to set the "write-protected" fields System.in/out/err without a warning, we can re-visit once we change the System.setIn/setOut/setErr methods to not use JNI (I prefer to keep this separate to this PR because there is a small startup regression to address when changing System.setXXX). >> >> There are many new tests. A small number of existing tests are changed to run /othervm as reflectively opening a package isn't sufficient. Changing the tests to /othervm means that jtreg will launch the agent with the command line options to open the package. >> >> Testing: tier1-6 > > Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 59 commits: > > - Merge branch 'master' into JDK-8353835 > - Cleanup > - More cleanup of Field.set API docs, including some restructure from Alex > - Cleanup > - Merge branch 'master' into JDK-8353835 > - Update mutateFinals/modules test to exercise exports and opens cases > - Update Field.set spec to better align with setAccessible for public final field in public class in exported package > - Fix typo in java man page > - Add method to test if package exported > - Remove dup end body tag > - ... and 49 more: https://git.openjdk.org/jdk/compare/9eaa364a...7693e8fa src/java.base/share/classes/java/lang/Module.java line 1032: > 1030: * Updates this module to export a package to another module. > 1031: * > 1032: * @apiNote Used addExports, Instrumentation::redefineModule, and --add-exports Suggestion: * @apiNote Used by addExports, Instrumentation::redefineModule, and --add-exports src/java.base/share/classes/java/lang/reflect/Field.java line 1621: > 1619: private String notAccessibleToCallerMessage(Class caller, boolean unreflect) { > 1620: String exportOrOpen = Modifier.isPublic(modifiers) > 1621: && Modifier.isPublic(clazz.getModifiers()) ? "exports" : "open"; Suggestion: && Modifier.isPublic(clazz.getModifiers()) ? 
"export" : "open"; ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2529101056 PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2529099290 From alanb at openjdk.org Sat Nov 15 08:59:20 2025 From: alanb at openjdk.org (Alan Bateman) Date: Sat, 15 Nov 2025 08:59:20 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v12] In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 22:05:47 GMT, Chen Liang wrote: >> Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 59 commits: >> >> - Merge branch 'master' into JDK-8353835 >> - Cleanup >> - More cleanup of Field.set API docs, including some restructure from Alex >> - Cleanup >> - Merge branch 'master' into JDK-8353835 >> - Update mutateFinals/modules test to exercise exports and opens cases >> - Update Field.set spec to better align with setAccessible for public final field in public class in exported package >> - Fix typo in java man page >> - Add method to test if package exported >> - Remove dup end body tag >> - ... and 49 more: https://git.openjdk.org/jdk/compare/9eaa364a...7693e8fa > > src/java.base/share/classes/java/lang/reflect/Field.java line 1621: > >> 1619: private String notAccessibleToCallerMessage(Class caller, boolean unreflect) { >> 1620: String exportOrOpen = Modifier.isPublic(modifiers) >> 1621: && Modifier.isPublic(clazz.getModifiers()) ? "exports" : "open"; > > Suggestion: > > && Modifier.isPublic(clazz.getModifiers()) ? "export" : "open"; With InaccessibleObjectException we put the (contextual) keyword in double quotes so that the exception message has `"exports" $P` or `"opens" $P`, and hopefully guide the developer to the module declaration. For this IllegalAccessException case then it should probably be the same so that the message has `"exports" $P to module $M`, in which case it should be "opens" rather than "open". 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2529713811 From stuefe at openjdk.org Sat Nov 15 10:33:08 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 15 Nov 2025 10:33:08 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v15] In-Reply-To: References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: On Sat, 15 Nov 2025 05:51:03 GMT, Jaikiran Pai wrote: >> Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: >> >> cleanup > > src/hotspot/os/bsd/os_bsd.hpp line 43: > >> 41: // Bsd_OS defines the interface to Bsd operating systems >> 42: >> 43: static constexpr int bsd_mmap_fd = > > I don't have familiarity in this area, but looking at the man page of `mmap` on my local macos, it states this: > > > void * > mmap(void *addr, size_t len, int prot, int flags, int fd, off_t offset); > ... > MAP_ANON Map anonymous memory not associated with any specific file. The offset argument is ignored. Mac OS X specific: the file descriptor used for > creating MAP_ANON regions can be used to pass some Mach VM flags, and can be specified as -1 if no such flags are associated with the region. > Mach VM flags are defined in and the ones that currently apply to mmap are: > > VM_FLAGS_PURGABLE to create Mach purgable (i.e. volatile) memory. > > VM_MAKE_TAG(tag) to associate an 8-bit tag with the region. > defines some preset tags (with a VM_MEMORY_ prefix). Users are encouraged to use tags between 240 and 255. Tags are used > by tools such as vmmap(1) to help identify specific memory regions. > > > So this special value handling of `fd` value is only applicable if `MAP_ANON` is part of the `flags`. > Given this, should the name of constexpr be a bit more specific and the call sites, where this gets used, verify/assert that the flags indeed contains the `MAP_ANON` flag? 
> Plus, this is very macos specific, calling it `bsd_...` feels much more generic. Maybe we should consider naming it `macos_mmap_anon_fd`? @jaikiran If `MAP_ANON` is not set, a file descriptor will be given. Contract for MAP_ANON (or MAP_ANONYMOUS on Linux). About the bsd.. naming: I think this is okay. The source split between xxBSD and Mac is long overdue. I think there is (was?) someone from the FreeBSD foundation working on it. That said, I think this should not be a global constant but live in the os::BSD namespace/class. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27868#discussion_r2529763594 From stuefe at openjdk.org Sat Nov 15 10:45:14 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 15 Nov 2025 10:45:14 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v15] In-Reply-To: References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: On Fri, 14 Nov 2025 04:38:43 GMT, Nityanand Rai wrote: >> Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs > > Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: > > cleanup What is the benefit of just tagging all mappings the JVM process does in HotSpot somewhat indiscriminately as "java"? Would it not typically be just a list of, well, all anonymous mappings done with mmap by the hotspot library? But then, it leaves other mappings out somewhat arbitrarily: mmaps from JDK libraries (e.g. NIO, zlib), thread stacks, or mmaps done with a backing file (e.g. if heap is created on NVRAM). Note that we added a diagnostic command back in 2021 (https://bugs.openjdk.org/browse/JDK-8318636, `System.map`). That command shows all mappings in great detail. It was ported to MacOS (https://bugs.openjdk.org/browse/JDK-8319875) and should show a complete list of all memory mappings, decorated with NMT tags. 
So you would see all mappings and what they were created for (e.g. you see Java Heap, Code cache, Metaspace, Thread stacks etc). Would this not already give us more detail than this tagging would provide us with? src/hotspot/os/bsd/os_bsd.hpp line 43: > 41: // Bsd_OS defines the interface to Bsd operating systems > 42: > 43: static constexpr int bsd_mmap_fd = @nityarai08 Can this live in os::Bsd? ------------- PR Review: https://git.openjdk.org/jdk/pull/27868#pullrequestreview-3467925797 PR Review Comment: https://git.openjdk.org/jdk/pull/27868#discussion_r2529764035 From jpai at openjdk.org Sat Nov 15 11:39:06 2025 From: jpai at openjdk.org (Jaikiran Pai) Date: Sat, 15 Nov 2025 11:39:06 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v15] In-Reply-To: References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: <_5s6rTKQFmxHZHrBuvpTRzKy35lX45_TwBrymm7uxIU=.df72a0f6-a15f-41f2-98ef-60b3677f7d30@github.com> On Sat, 15 Nov 2025 10:29:03 GMT, Thomas Stuefe wrote: >> src/hotspot/os/bsd/os_bsd.hpp line 43: >> >>> 41: // Bsd_OS defines the interface to Bsd operating systems >>> 42: >>> 43: static constexpr int bsd_mmap_fd = >> >> I don't have familiarity in this area, but looking at the man page of `mmap` on my local macos, it states this: >> >> >> void * >> mmap(void *addr, size_t len, int prot, int flags, int fd, off_t offset); >> ... >> MAP_ANON Map anonymous memory not associated with any specific file. The offset argument is ignored. Mac OS X specific: the file descriptor used for >> creating MAP_ANON regions can be used to pass some Mach VM flags, and can be specified as -1 if no such flags are associated with the region. >> Mach VM flags are defined in and the ones that currently apply to mmap are: >> >> VM_FLAGS_PURGABLE to create Mach purgable (i.e. volatile) memory. >> >> VM_MAKE_TAG(tag) to associate an 8-bit tag with the region.
>> defines some preset tags (with a VM_MEMORY_ prefix). Users are encouraged to use tags between 240 and 255. Tags are used >> by tools such as vmmap(1) to help identify specific memory regions. >> >> >> So this special value handling of `fd` value is only applicable if `MAP_ANON` is part of the `flags`. >> Given this, should the name of constexpr be a bit more specific and the call sites, where this gets used, verify/assert that the flags indeed contains the `MAP_ANON` flag? >> Plus, this is very macos specific, calling it `bsd_...` feels much more generic. Maybe we should consider naming it `macos_mmap_anon_fd`? > > @jaikiran If `MAP_ANON` is not set, a file descriptor will be given. Contract for MAP_ANON (or MAP_ANONYMOUS on Linux). > > About the bsd.. naming: > > I think this is okay. The source split between xxBSD and Mac is long overdue. I think there is (was?) someone from the FreeBSD foundation working on it. > > That said, I think this should not be a global constant but live in the os::BSD namespace/class. Hello Thomas, > If MAP_ANON is not set, a file descriptor will be given. Contract for MAP_ANON (or MAP_ANONYMOUS on Linux). I was thinking about call sites like this https://github.com/openjdk/jdk/pull/27868/files#diff-1f93205c2e57bee432f8fb7a0725ba1dfdbe5b901ac63010ea0b43922e34ac12R1692 : char* addr = (char*)::mmap(requested_addr, bytes, PROT_NONE, flags, bsd_mmap_fd, 0); where the `flags` may be defined a few lines away and it may not be easy to guarantee/spot that those `flags` have `MAP_ANON`. Using this new `bsd_mmap_fd` in such places without additional checks on `flags`, I suspect, might lead to subtle issues (if for example, those flags are updated in future)? I don't have any prior knowledge of the code that's being updated here, so my concern may not be practical after all. 
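A minimal sketch of the call-site check asked about above, under stated assumptions: `anon_mmap_fd` and `reserve_anon` are hypothetical helpers invented for illustration and are not part of the PR. The idea is simply that the tagged descriptor is only handed out when `MAP_ANON` is present in `flags`. On macOS the sketch assumes `VM_MAKE_TAG` and `VM_MEMORY_JAVA` from `<mach/vm_statistics.h>`; on other platforms it falls back to -1.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#if defined(__APPLE__)
#include <mach/vm_statistics.h>  // VM_MAKE_TAG, VM_MEMORY_JAVA (assumed available)
#endif

// Some platforms only spell the flag MAP_ANONYMOUS.
#if !defined(MAP_ANON) && defined(MAP_ANONYMOUS)
#define MAP_ANON MAP_ANONYMOUS
#endif

// Hypothetical helper (not from the PR): returns the value to pass as the
// fd argument of mmap for an anonymous mapping, and asserts that the caller
// really asked for an anonymous mapping.
inline int anon_mmap_fd(int flags) {
  assert((flags & MAP_ANON) != 0 && "tagged fd is only meaningful with MAP_ANON");
#if defined(__APPLE__)
  return VM_MAKE_TAG(VM_MEMORY_JAVA);  // Mach VM tag carried in the fd slot
#else
  return -1;                           // conventional fd for anonymous mappings
#endif
}

// Hypothetical call site: reserve (but do not commit) anonymous memory.
inline void* reserve_anon(std::size_t bytes) {
  const int flags = MAP_PRIVATE | MAP_ANON;
  void* p = ::mmap(nullptr, bytes, PROT_NONE, flags, anon_mmap_fd(flags), 0);
  return p == MAP_FAILED ? nullptr : p;
}
```

Routing the descriptor through a function rather than a bare constant keeps the `MAP_ANON` requirement next to the value it guards, so a later change to `flags` at a call site would trip the assert in debug builds.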
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27868#discussion_r2529831333 From kbarrett at openjdk.org Sun Nov 16 01:03:06 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 16 Nov 2025 01:03:06 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v15] In-Reply-To: References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: On Fri, 14 Nov 2025 04:38:43 GMT, Nityanand Rai wrote: >> Add VM_MEMORY_JAVA tag to mmap calls in os_bsd.cpp for better memory tracking of java process on macOs > > Nityanand Rai has updated the pull request incrementally with one additional commit since the last revision: > > cleanup test/hotspot/gtest/testutils.cpp line 70: > 68: } > 69: > 70: #if APPLE_MEMORY_TAGGING_AVAILABLE Why is this stuff in testutils, rather than just being co-located with the one using test? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27868#discussion_r2530716514 From kbarrett at openjdk.org Sun Nov 16 01:10:40 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 16 Nov 2025 01:10:40 GMT Subject: RFR: 8369187: Add wrapper for that forbids use of global allocation and deallocation functions [v3] In-Reply-To: References: Message-ID: > 8369187: Add wrapper for that forbids use of global allocation and deallocation functions > > Please review this change that adds `cppstdlib/new.hpp` as a wrapper for > including ``. All existing inclusions of `` are changed to include > the new wrapper. > > In additional to including ``, this wrapper also provides deprecation > declarations to prevent the use of some facilities by HotSpot code. > > However, those deprecations need to be conditionalized to not apply to gtests, > so this change also adds a macro definition provided by the build system for > use in detecting that a header is being included by a gtest. 
> > Testing: mach5 tier1 Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: - Merge branch 'master' into wrap-stdlib-new - poison implicit alloc/dealloc in globalDefinitions - Merge branch 'master' into wrap-stdlib-new - further conditionalize deprecation of hardare interference sizes - add wrapper for ------------- Changes: https://git.openjdk.org/jdk/pull/28250/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28250&range=02 Stats: 207 lines in 15 files changed: 187 ins; 20 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28250.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28250/head:pull/28250 PR: https://git.openjdk.org/jdk/pull/28250 From kbarrett at openjdk.org Sun Nov 16 01:10:41 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 16 Nov 2025 01:10:41 GMT Subject: RFR: 8369187: Add wrapper for that forbids use of global allocation and deallocation functions [v2] In-Reply-To: References: Message-ID: On Thu, 13 Nov 2025 05:58:25 GMT, Kim Barrett wrote: >> 8369187: Add wrapper for that forbids use of global allocation and deallocation functions >> >> Please review this change that adds `cppstdlib/new.hpp` as a wrapper for >> including ``. All existing inclusions of `` are changed to include >> the new wrapper. >> >> In additional to including ``, this wrapper also provides deprecation >> declarations to prevent the use of some facilities by HotSpot code. >> >> However, those deprecations need to be conditionalized to not apply to gtests, >> so this change also adds a macro definition provided by the build system for >> use in detecting that a header is being included by a gtest. >> >> Testing: mach5 tier1 > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains three commits: > > - Merge branch 'master' into wrap-stdlib-new > - further conditionalize deprecation of hardare interference sizes > - add wrapper for I added to globalDefinitions.hpp deprecating declarations for some of the implicitly declared allocation and deallocation functions. This provides better coverage for the usage poisoning. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28250#issuecomment-3537249708 From kbarrett at openjdk.org Sun Nov 16 01:17:16 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 16 Nov 2025 01:17:16 GMT Subject: RFR: 8370254: Add VM_MEMORY_JAVA mmap tag to MacOS mmap calls [v15] In-Reply-To: References: <3oC_tXLUghBm6DYolHcOZf5ne2kmFbeK2xmEe08GB6w=.72a04750-20bc-4ef7-9d2c-5f411b2e70ef@github.com> Message-ID: On Sat, 15 Nov 2025 10:42:03 GMT, Thomas Stuefe wrote: > What is the benefit of just tagging all mappings the JVM process does in HotSpot somewhat indiscriminately as "java"? I was wondering much the same thing as @tstuefe , but with less knowledge than him about what we already have. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27868#issuecomment-3537265187 From thartmann at openjdk.org Sun Nov 16 10:35:46 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Sun, 16 Nov 2025 10:35:46 GMT Subject: Integrated: 8371958: [BACKOUT] 8371709: Add CTW to hotspot_compiler testing Message-ID: Clean backout because the change is causing massive failures in our CI at higher tiers. We need to fix the bugs first that are triggered by this. 
Thanks, Tobias ------------- Commit messages: - Revert "8371709: Add CTW to hotspot_compiler testing" Changes: https://git.openjdk.org/jdk/pull/28339/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28339&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371958 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28339.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28339/head:pull/28339 PR: https://git.openjdk.org/jdk/pull/28339 From ayang at openjdk.org Sun Nov 16 10:35:46 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Sun, 16 Nov 2025 10:35:46 GMT Subject: Integrated: 8371958: [BACKOUT] 8371709: Add CTW to hotspot_compiler testing In-Reply-To: References: Message-ID: On Sun, 16 Nov 2025 10:25:42 GMT, Tobias Hartmann wrote: > Clean backout because the change is causing massive failures in our CI at higher tiers. We need to fix the bugs first that are triggered by this. > > Thanks, > Tobias Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28339#pullrequestreview-3470034575 From thartmann at openjdk.org Sun Nov 16 10:35:46 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Sun, 16 Nov 2025 10:35:46 GMT Subject: Integrated: 8371958: [BACKOUT] 8371709: Add CTW to hotspot_compiler testing In-Reply-To: References: Message-ID: On Sun, 16 Nov 2025 10:27:49 GMT, Albert Mingkun Yang wrote: >> Clean backout because the change is causing massive failures in our CI at higher tiers. We need to fix the bugs first that are triggered by this. >> >> Thanks, >> Tobias > > Marked as reviewed by ayang (Reviewer). Thanks for the review @albertnetymk! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28339#issuecomment-3538508926 From thartmann at openjdk.org Sun Nov 16 10:35:46 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Sun, 16 Nov 2025 10:35:46 GMT Subject: Integrated: 8371958: [BACKOUT] 8371709: Add CTW to hotspot_compiler testing In-Reply-To: References: Message-ID: On Sun, 16 Nov 2025 10:25:42 GMT, Tobias Hartmann wrote: > Clean backout because the change is causing massive failures in our CI at higher tiers. We need to fix the bugs first that are triggered by this. > > Thanks, > Tobias This pull request has now been integrated. Changeset: 7d35a283 Author: Tobias Hartmann URL: https://git.openjdk.org/jdk/commit/7d35a283cf2497565d230e3d5426f563f7e5870d Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod 8371958: [BACKOUT] 8371709: Add CTW to hotspot_compiler testing Reviewed-by: ayang ------------- PR: https://git.openjdk.org/jdk/pull/28339 From aph at openjdk.org Sun Nov 16 11:42:01 2025 From: aph at openjdk.org (Andrew Haley) Date: Sun, 16 Nov 2025 11:42:01 GMT Subject: RFR: 8371918: aarch64: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 17:41:56 GMT, Kurt Miller wrote: > …rGenerator::generate_native_entry > > I believe there's an incorrect pointer dereference in `TemplateInterpreterGenerator::generate_native_entry()` in this part of the code: > > > // get native function entry point in r10 > { > Label L; > __ ldr(r10, Address(rmethod, Method::native_function_offset())); > ExternalAddress unsatisfied(SharedRuntime::native_method_throw_unsatisfied_link_error_entry()); > __ lea(rscratch2, unsatisfied); > __ ldr(rscratch2, rscratch2); > __ cmp(r10, rscratch2); > __ br(Assembler::NE, L); > __ call_VM(noreg, > CAST_FROM_FN_PTR(address, > InterpreterRuntime::prepare_native_call), > rmethod); > __ get_method(rmethod); > __ ldr(r10, Address(rmethod, Method::native_function_offset())); > __
bind(L); > } > > > If I understand this correctly, the entry point for unsatisfied link error is loaded into `rscratch2`. The next instruction, `ldr(rscratch2, rscratch2)`, dereferences that pointer and reads from the text segment the initial instructions at the entry point into `rscratch2`. It then compares the native method entry point in `r10` with the initial instructions loaded into `rscratch2`, which will never match. I believe the intent here was to compare the native method entry point with the unsatisfied link error entry point and the `ldr(rscratch2, rscratch2)` instruction should be removed. > > This was found on OpenBSD/aarch64. OpenBSD has a security feature where the text segments are marked execute-only and do not allow reads independent of execution. The `ldr(rscratch2, rscratch2)` instruction causes a segfault because it is reading the text segment. While this bug was found on OpenBSD I believe it applies to all OSes on aarch64. > > This change removes the errant aarch64 hotspot assembly instruction that was reading from libjvm.so .text segment. > > Updated comment with markdown for code. Ah yes, I see. The mistake was mine: I thought `__ cmpptr(rax, unsatisfied.addr(), rscratch1)` meant `__ cmpptr(rax, unsatisfied, rscratch1)`. In other words, I missed the significance of `.addr()`. ------------- Marked as reviewed by aph (Reviewer).
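To make the address-versus-contents distinction in the 8371918 thread above concrete, here is a toy model in plain C++. It is not HotSpot code: `ToyMemory`, the function names and the numeric addresses are all invented for illustration. It contrasts comparing the native entry against the stub's address (the intended check) with comparing it against the word loaded from that address (what the extra `ldr` did).

```cpp
#include <cstdint>
#include <unordered_map>

// Toy address space: maps an address to the 64-bit word stored there.
// Stands in for "memory" so that no real code bytes need to be read.
struct ToyMemory {
  std::unordered_map<std::uint64_t, std::uint64_t> words;
  std::uint64_t load(std::uint64_t addr) const { return words.at(addr); }
};

// Models the reported sequence: take the stub's address, load the word
// stored at it, and compare that word with the method's native entry.
// With execute-only text, this load is also where the fault would occur.
inline bool is_unlinked_buggy(const ToyMemory& mem,
                              std::uint64_t native_entry,
                              std::uint64_t unsatisfied_stub) {
  return native_entry == mem.load(unsatisfied_stub);
}

// Models the intended check: compare the two entry addresses directly.
inline bool is_unlinked_fixed(std::uint64_t native_entry,
                              std::uint64_t unsatisfied_stub) {
  return native_entry == unsatisfied_stub;
}
```

In this model a method whose native entry still points at the stub is recognised by the fixed check but not by the buggy one, because the word stored at the stub's address is instruction data rather than the address itself.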
PR Review: https://git.openjdk.org/jdk/pull/28327#pullrequestreview-3470092793 From mdoerr at openjdk.org Sun Nov 16 13:23:07 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Sun, 16 Nov 2025 13:23:07 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch [v2] In-Reply-To: References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: On Thu, 13 Nov 2025 11:35:04 GMT, Albert Mingkun Yang wrote: >> We have seen crashes on many platforms (including x64) while running `make run-test TEST=runtime/cds/appcds/aotClassLinking/LambdaInExcludedClass.java JTREG="VM_OPTIONS=-XX:+UseCompactObjectHeaders"`: >> >> SIGSEGV (0xb) at pc=0x00007f2f95a61e7a, pid=18554, tid=18557 >> V [libjvm.so+0x15bfe7a] MemAllocator::finish(HeapWordImpl**) const+0xca (klass.inline.hpp:72) >> V [libjvm.so+0x15c029f] ObjAllocator::initialize(HeapWordImpl**) const+0x2f (memAllocator.cpp:391) >> V [libjvm.so+0xb0630b] CollectedHeap::fill_with_object(HeapWordImpl**, unsigned long, bool)+0x27b (collectedHeap.cpp:491) >> V [libjvm.so+0x1c7a0bb] ThreadLocalAllocBuffer::retire(ThreadLocalAllocStats*)+0x11b (threadLocalAllocBuffer.cpp:118) >> V [libjvm.so+0x15c0b14] MemAllocator::mem_allocate_inside_tlab_slow(MemAllocator::Allocation&) const+0x84 (memAllocator.cpp:286) >> V [libjvm.so+0x15c13ab] MemAllocator::mem_allocate(MemAllocator::Allocation&) const+0xbb (memAllocator.cpp:340) >> V [libjvm.so+0x15c14f9] MemAllocator::allocate() const+0xa9 (memAllocator.cpp:353) >> V [libjvm.so+0x1cc052e] TypeArrayKlass::allocate_common(int, bool, JavaThread*)+0x13e (collectedHeap.inline.hpp:41) >> V [libjvm.so+0x16fbc98] oopFactory::new_typeArray(BasicType, int, JavaThread*)+0x38 (typeArrayKlass.hpp:51) >> V [libjvm.so+0x106b0f3] java_lang_Class::restore_archived_mirror(Klass*, Handle, Handle, Handle, JavaThread*)+0x413 (javaClasses.cpp:1246) >> V [libjvm.so+0x14100bc] Klass::restore_unshareable_info(ClassLoaderData*, 
Handle, JavaThread*)+0x66c (klass.cpp:903) >> V [libjvm.so+0xfe2cb1] InstanceKlass::restore_unshareable_info(ClassLoaderData*, Handle, PackageEntry*, JavaThread*)+0x81 (instanceKlass.cpp:2823) >> V [libjvm.so+0x1c0f5ad] SystemDictionary::preload_class(Handle, InstanceKlass*, JavaThread*)+0x1ed (systemDictionary.cpp:1198) >> V [libjvm.so+0x676e83] AOTLinkedClassBulkLoader::preload_classes_in_table(Array*, char const*, Handle, JavaThread*)+0x1a3 (aotLinkedClassBulkLoader.cpp:103) >> V [libjvm.so+0x679af5] AOTLinkedClassBulkLoader::preload_classes_impl(JavaThread*)+0x165 (aotLinkedClassBulkLoader.cpp:76) >> V [libjvm.so+0x67c371] AOTLinkedClassBulkLoader::preload_classes(JavaThread*)+0x11 (aotLinkedClassBulkLoader.cpp:61) >> V [libjvm.so+0x1d5bf30] vmClasses::resolve_all(JavaThread*)+0x3e0 (vmClasses.cpp:126) >> V [libjvm.so+0x1c0f28c] SystemDictionary::initialize(JavaThread*)+0x10c (systemDictionary.cpp:1623) >> V [libjvm.so+0x1cc74ca] Uni... > >> ... make run-test TEST=runtime/cds/appcds/aotClassLinking/LambdaInExcludedClass.java JTREG="VM_OPTIONS=-XX:+UseCompactObjectHeaders" > > I suspect the crash is caused by a preexisting issue that is exposed by this patch. > > In `vmClasses::resolve_all`: > > #if INCLUDE_CDS > if (CDSConfig::is_using_aot_linked_classes()) { > AOTLinkedClassBulkLoader::preload_classes(THREAD); > } > #endif > > // Preload commonly used klasses > vmClassID scan = vmClassID::FIRST; > // first do Object, then String, Class > resolve_through(VM_CLASS_ID(Object_klass), scan, CHECK); > CollectedHeap::set_filler_object_klass(vmClasses::Object_klass()); > > > The filler-klass is not initialized when `preload_classes` is invoked, but `preload_classes` use heap-allocation, which may require filler-obj. > > @iklam What do you think? > @albertnetymk I've pushed #28315. Please verify if it fixes the crash before integrating this PR. Thanks! The crashes are fixed. Thanks! 
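The ordering issue quoted above (heap allocation during `preload_classes` before the filler klass has been set) can be pictured with a small toy model. This is only a sketch under stated assumptions: `ToyKlass` and `ToyHeap` are invented stand-ins, not HotSpot types, and the sketch does not claim to show what #28315 actually changed.

```cpp
// Toy stand-ins; none of these are the real HotSpot classes.
struct ToyKlass {
  const char* name;
};

class ToyHeap {
  const ToyKlass* _filler_klass = nullptr;

 public:
  // Corresponds loosely to the point where the filler klass becomes known.
  void set_filler_klass(const ToyKlass* k) { _filler_klass = k; }

  bool has_filler_klass() const { return _filler_klass != nullptr; }

  // Retiring an allocation buffer pads its unused tail with a filler
  // object, which needs the filler klass. Returns false when that is not
  // possible yet; in the quoted stack trace this is the path that faulted.
  bool retire_buffer() const {
    if (_filler_klass == nullptr) {
      return false;
    }
    return true;  // a real heap would write the filler object here
  }
};
```

In the toy, any allocation path that can retire a buffer is only safe once `set_filler_klass` has run, which is the dependency the quoted analysis points at.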
------------- PR Comment: https://git.openjdk.org/jdk/pull/28240#issuecomment-3538748578 From mdoerr at openjdk.org Sun Nov 16 13:42:02 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Sun, 16 Nov 2025 13:42:02 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch In-Reply-To: <8ikwqN309ZPAORjL2YvE1hgvChrTfhi3slz1r4XIK5E=.41d37c1b-ae3e-43f5-9fe0-43ae2294e57c@github.com> References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> <8ikwqN309ZPAORjL2YvE1hgvChrTfhi3slz1r4XIK5E=.41d37c1b-ae3e-43f5-9fe0-43ae2294e57c@github.com> Message-ID: On Tue, 11 Nov 2025 16:18:04 GMT, Vladimir Kozlov wrote: >> Trivial removing obsoleted code for unsupported arch. >> >> Test: tier1 > > Please ask all OpenJDK platforms supporters to test these changes. > > Note, when this code was introduced we did not have so many platforms. This change looks incomplete to me. @vnkozlov: Shouldn't we remove `AllocatePrefetchStyle == 3` completely? `PhaseMacroExpand::prefetch_allocation` still mentions "BIS instruction is used on SPARC as prefetch". Please note that PPC64 also still has an implementation for it (nodes with `predicate(AllocatePrefetchStyle == 3)`). I guess that we don't need it any more. Maybe we should check performance again. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28240#issuecomment-3538765487 From ayang at openjdk.org Sun Nov 16 17:19:13 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Sun, 16 Nov 2025 17:19:13 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch [v3] In-Reply-To: References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: On Fri, 14 Nov 2025 19:22:39 GMT, Albert Mingkun Yang wrote: >> Trivial removing obsoleted code for unsupported arch. 
>> >> Test: tier1 > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into remove-tlab-reserve > - review > - remove-tlab-reserve `AllocatePrefetchStyle == 3` means "generate one prefetch instruction per cache line". This patch removes the reserved alignment that was required for `AllocatePrefetchStyle == 3` on SPARC, but that does not imply that `AllocatePrefetchStyle == 3` itself should be removed. (Maybe a separate ticket if it's indeed deemed useless.) > PhaseMacroExpand::prefetch_allocation still mentions "BIS instruction is used on SPARC as prefetch". Will remove it in the next revision. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28240#issuecomment-3538970541 From mpowers at openjdk.org Sun Nov 16 17:24:15 2025 From: mpowers at openjdk.org (Mark Powers) Date: Sun, 16 Nov 2025 17:24:15 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements In-Reply-To: References: Message-ID: <_TeZd3joeNkWYg7ZOgYRwzRJJjwMcUVOfe-pdXzJTv4=.d413a241-c8de-4267-8b98-0b41c7629371@github.com> On Tue, 4 Nov 2025 16:38:49 GMT, Volodymyr Paprotski wrote: > - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline > - `SignatureBench.MLDSA` is 1.2x-2.2x faster > - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) > - AVX512 intrinsic improvements are 1.24x-1.5x faster than current version > - `SignatureBench.MLDSA` is up to 5% faster, never slower > > Note on intrinsic: > - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill.
> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 > > Tests and benchmarks: > - Added a fuzz test to ensure Java and intrinsic produces exactly same result > - Added benchmark to measure the performance of intrinsic itself > > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" You might want to have @kuksenko or @ericcaspole look at MLDSABench.java. test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 29: > 27: import java.lang.invoke.MethodHandle; > 28: import java.lang.invoke.MethodHandles; > 29: import java.lang.reflect.Field; unused import statement test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 31: > 29: import java.lang.reflect.Field; > 30: import java.lang.reflect.Method; > 31: import java.lang.reflect.Constructor; unused import test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 123: > 121: try { > 122: for (int i = 0; i < repeat; i++) { > 123: // seed = rnd.nextLong(); 2 lines commented out test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 219: > 217: int[] coeffs3 = new int[ML_DSA_N]; > 218: for (int j = 0; j 219: coeffs3[j] = `coeffs3` is written to but never read test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 517: > 515: }; > 516: } > 517: // java --add-opens java.base/sun.security.provider=ALL-UNNAMED -XX:+UseDilithiumIntrinsics 
test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java This line is useful. Not sure I would hide it at the bottom of the file. test/micro/org/openjdk/bench/javax/crypto/full/MLDSABench.java line 2: > 1: /* > 2: * Copyright (c) 2015, 2018, Oracle and/or its affiliates. All rights reserved. Copyright date. ------------- Marked as reviewed by mpowers (Committer). PR Review: https://git.openjdk.org/jdk/pull/28136#pullrequestreview-3470287661 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532070492 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532071025 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532075447 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532074544 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532078122 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532078790 From alanb at openjdk.org Sun Nov 16 17:26:50 2025 From: alanb at openjdk.org (Alan Bateman) Date: Sun, 16 Nov 2025 17:26:50 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v13] In-Reply-To: References: Message-ID: > Implementation changes for [JEP 500: Prepare to Make Final Mean Final](https://openjdk.org/jeps/500). > > Field.set (and Lookup.unreflectSetter) are changed to allow/warn/debug/deny when mutating a final instance field. JFR event recorded if final field mutated. Spec updates to Field.set, Field.setAccessible and Module.addOpens to align with the proposal in the JEP. > > HotSpot is updated to add support for the new command line options. To aid diagnosability, -Xcheck:jni reports a warning and -Xlog:jni=debug logs a message to help identity JNI code that mutates finals.
For now, JNI code is allowed to set the "write-protected" fields System.in/out/err without a warning, we can re-visit once we change the System.setIn/setOut/setErr methods to not use JNI (I prefer to keep this separate to this PR because there is a small startup regression to address when changing System.setXXX). > > There are many new tests. A small number of existing tests are changed to run /othervm as reflectively opening a package isn't sufficient. Changing the tests to /othervm means that jtreg will launch the agent with the command line options to open the package. > > Testing: tier1-6 Alan Bateman has updated the pull request incrementally with two additional commits since the last revision: - More wordsmithing - Improve IAE exception message ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25115/files - new: https://git.openjdk.org/jdk/pull/25115/files/7693e8fa..e935c32e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25115&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25115&range=11-12 Stats: 38 lines in 4 files changed: 18 ins; 0 del; 20 mod Patch: https://git.openjdk.org/jdk/pull/25115.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25115/head:pull/25115 PR: https://git.openjdk.org/jdk/pull/25115 From liach at openjdk.org Sun Nov 16 19:29:21 2025 From: liach at openjdk.org (Chen Liang) Date: Sun, 16 Nov 2025 19:29:21 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v13] In-Reply-To: References: Message-ID: On Sun, 16 Nov 2025 17:26:50 GMT, Alan Bateman wrote: >> Implementation changes for [JEP 500: Prepare to Make Final Mean Final](https://openjdk.org/jeps/500). >> >> Field.set (and Lookup.unreflectSetter) are changed to allow/warn/debug/deny when mutating a final instance field. JFR event recorded if final field mutated. Spec updates to Field.set, Field.setAccessible and Module.addOpens to align with the proposal in the JEP. 
>> >> HotSpot is updated to add support for the new command line options. To aid diagnosability, -Xcheck:jni reports a warning and -Xlog:jni=debug logs a message to help identity JNI code that mutates finals. For now, JNI code is allowed to set the "write-protected" fields System.in/out/err without a warning, we can re-visit once we change the System.setIn/setOut/setErr methods to not use JNI (I prefer to keep this separate to this PR because there is a small startup regression to address when changing System.setXXX). >> >> There are many new tests. A small number of existing tests are changed to run /othervm as reflectively opening a package isn't sufficient. Changing the tests to /othervm means that jtreg will launch the agent with the command line options to open the package. >> >> Testing: tier1-6 > > Alan Bateman has updated the pull request incrementally with two additional commits since the last revision: > > - More wordsmithing > - Improve IAE exception message src/java.base/share/man/java.md line 471: > 469: : Mutation of final fields is possible with the reflection API of the Java Platform. > 470: _However, it compromises safety and performance in all programs. > 471: This option allows code_ in the specified modules to mutate final fields by reflection. Intended that this emphasis ends at "code" instead of the end of the last sentence? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25115#discussion_r2532177598 From mdoerr at openjdk.org Sun Nov 16 21:13:05 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Sun, 16 Nov 2025 21:13:05 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch [v3] In-Reply-To: References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: On Fri, 14 Nov 2025 19:22:39 GMT, Albert Mingkun Yang wrote: >> Trivial removing obsoleted code for unsupported arch. 
>> >> Test: tier1 > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into remove-tlab-reserve > - review > - remove-tlab-reserve At least PPC64 will need an update because your change breaks the following nodes: diff --git a/src/hotspot/cpu/ppc/ppc.ad b/src/hotspot/cpu/ppc/ppc.ad index 7fcd096d2ad..c169d673aaf 100644 --- a/src/hotspot/cpu/ppc/ppc.ad +++ b/src/hotspot/cpu/ppc/ppc.ad @@ -6328,36 +6328,8 @@ instruct loadConD_Ex(regD dst, immD src) %{ // Prefetch instructions. // Must be safe to execute with invalid address (cannot fault). -// Special prefetch versions which use the dcbz instruction. -instruct prefetch_alloc_zero(indirectMemory mem, iRegLsrc src) %{ - match(PrefetchAllocation (AddP mem src)); - predicate(AllocatePrefetchStyle == 3); - ins_cost(MEMORY_REF_COST); - - format %{ "PREFETCH $mem, 2, $src \t// Prefetch write-many with zero" %} - size(4); - ins_encode %{ - __ dcbz($src$$Register, $mem$$base$$Register); - %} - ins_pipe(pipe_class_memory); -%} - -instruct prefetch_alloc_zero_no_offset(indirectMemory mem) %{ - match(PrefetchAllocation mem); - predicate(AllocatePrefetchStyle == 3); - ins_cost(MEMORY_REF_COST); - - format %{ "PREFETCH $mem, 2 \t// Prefetch write-many with zero" %} - size(4); - ins_encode %{ - __ dcbz($mem$$base$$Register); - %} - ins_pipe(pipe_class_memory); -%} - instruct prefetch_alloc(indirectMemory mem, iRegLsrc src) %{ match(PrefetchAllocation (AddP mem src)); - predicate(AllocatePrefetchStyle != 3); ins_cost(MEMORY_REF_COST); format %{ "PREFETCH $mem, 2, $src \t// Prefetch write-many" %} @@ -6370,7 +6342,6 @@ instruct prefetch_alloc(indirectMemory mem, iRegLsrc src) %{ instruct prefetch_alloc_no_offset(indirectMemory mem) %{ match(PrefetchAllocation mem); - 
predicate(AllocatePrefetchStyle != 3); ins_cost(MEMORY_REF_COST); format %{ "PREFETCH $mem, 2 \t// Prefetch write-many" %} ------------- PR Comment: https://git.openjdk.org/jdk/pull/28240#issuecomment-3539357960 From dholmes at openjdk.org Mon Nov 17 01:26:14 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 17 Nov 2025 01:26:14 GMT Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer [v11] In-Reply-To: References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> Message-ID: <1p_wEewR-A5FFkJasTnjbE4brFCQIUNp7hmP8WfhV6g=.9ab47c1c-b749-484c-b282-9ad678a06d13@github.com> On Thu, 30 Oct 2025 12:06:00 GMT, Afshin Zafari wrote: >> The issue happens when the HeapBaseMinAddress option gets 0 as input value. Since this option is used as an address, using 0 in pointer arithmetic is UB. >> The fix is using `uintptr_t` instead of `address`/`char*`, etc. In doing that, it was found that an overflow check does not work in all cases due to checking more conditions. That overflow check is changed too. We also need to check overflow after aligning addresses and sizes of memory regions in this context. Assertions are added to check these cases. >> >> Tests: >> linux-x64 tier1 > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > fix arguments.cpp for HeapMinBaseAddress type. Sorry but I am finding that the actual fix here is getting lost in a lot of not-obviously-needed changes to variable type declarations. src/hotspot/share/memory/memoryReserver.cpp line 549: > 547: const size_t attach_point_alignment = lcm(alignment, os_attach_point_alignment); > 548: > 549: uintptr_t aligned_heap_base_min_address = align_up(MAX2(HeapBaseMinAddress, alignment), alignment); Just to be clear, this is the crux of the fix, where we ensure the min-address is now never zero - right?
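A minimal standalone sketch of the idea behind that line, assuming plain integer (`uintptr_t`) arithmetic rather than pointer arithmetic. The names `align_up_uint` and `clamp_and_align_base` are hypothetical stand-ins chosen for illustration; they are not HotSpot's `align_up` or `MAX2`, and the sketch assumes a non-zero power-of-two alignment and no overflow:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: round value up to the next multiple of alignment.
// Assumes alignment is a non-zero power of two and the sum does not overflow.
constexpr std::uintptr_t align_up_uint(std::uintptr_t value, std::uintptr_t alignment) {
  return (value + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper in the spirit of align_up(MAX2(base, alignment), alignment):
// a requested base of 0 is first raised to alignment, so the result is never zero
// and no arithmetic is ever performed on a null pointer.
constexpr std::uintptr_t clamp_and_align_base(std::uintptr_t requested_base,
                                              std::uintptr_t alignment) {
  return align_up_uint(std::max(requested_base, alignment), alignment);
}
```

With these assumptions, a requested base of 0 and an alignment of 4096 yields 4096, while an already aligned non-zero base is returned unchanged.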
src/hotspot/share/memory/memoryReserver.cpp line 586: > 584: lowest_start, highest_start); > 585: reserved = try_reserve_range((char*)highest_start, (char*)lowest_start, attach_point_alignment, > 586: (char*)aligned_heap_base_min_address, (char*)UnscaledOopHeapMax, size, alignment, page_size); Not obvious to me this actually improves anything - what is it fixing? src/hotspot/share/memory/memoryReserver.cpp line 590: > 588: > 589: // zerobased: Attempt to allocate in the lower 32G. > 590: size_t zerobased_max = OopEncodingHeapMax; Again not obvious what this improves. We obviously have very inconsistent use of types here in that we loosely use `char*`, `uint64_t` and `size_t` to all mean a 64-bit unsigned value, and no matter what types we use in the declarations we have to cast something somewhere. ------------- PR Review: https://git.openjdk.org/jdk/pull/26955#pullrequestreview-3470648246 PR Review Comment: https://git.openjdk.org/jdk/pull/26955#discussion_r2532394079 PR Review Comment: https://git.openjdk.org/jdk/pull/26955#discussion_r2532395457 PR Review Comment: https://git.openjdk.org/jdk/pull/26955#discussion_r2532399134 From fyang at openjdk.org Mon Nov 17 02:24:09 2025 From: fyang at openjdk.org (Fei Yang) Date: Mon, 17 Nov 2025 02:24:09 GMT Subject: RFR: 8371918: aarch64: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry In-Reply-To: References: Message-ID: On Sat, 15 Nov 2025 02:32:07 GMT, Dean Long wrote: > It looks like RISCV is broken in the same way.
Fired new JBS for RISC-V: https://bugs.openjdk.org/browse/JDK-8371966 ------------- PR Comment: https://git.openjdk.org/jdk/pull/28327#issuecomment-3539694675 From wenanjian at openjdk.org Mon Nov 17 02:56:31 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Mon, 17 Nov 2025 02:56:31 GMT Subject: RFR: 8371966: RISC-V: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry Message-ID: Do the same fix as aarch64 in `TemplateInterpreterGenerator::generate_native_entry()` [JDK-8371918](https://bugs.openjdk.org/browse/JDK-8371918) ------------- Commit messages: - RISC-V: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry Changes: https://git.openjdk.org/jdk/pull/28343/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28343&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371966 Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28343.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28343/head:pull/28343 PR: https://git.openjdk.org/jdk/pull/28343 From fyang at openjdk.org Mon Nov 17 03:06:13 2025 From: fyang at openjdk.org (Fei Yang) Date: Mon, 17 Nov 2025 03:06:13 GMT Subject: RFR: 8371966: RISC-V: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry In-Reply-To: References: Message-ID: <5XIH1aOd84FnYAYPrzlByJH2iYmGnqSJLUDsS1C4RE0=.84a76596-bdab-44fd-8674-efe6e7c987f6@github.com> On Mon, 17 Nov 2025 02:49:52 GMT, Anjian Wen wrote: > Do the same fix as aarch64 in `TemplateInterpreterGenerator::generate_native_entry()` [JDK-8371918](https://bugs.openjdk.org/browse/JDK-8371918) Thanks! ------------- Marked as reviewed by fyang (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28343#pullrequestreview-3470780901 From fyang at openjdk.org Mon Nov 17 03:40:33 2025 From: fyang at openjdk.org (Fei Yang) Date: Mon, 17 Nov 2025 03:40:33 GMT Subject: RFR: 8371869: RISC-V: too many warnings when build on BPI-F3 SBC Message-ID: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com> Hi, please consider this riscv-specific change. I witnessed 400+ warning messages when doing a native build on BPI-F3 SBC running kernel 6.6.63. `OpenJDK 64-Bit Server VM warning: Cannot enable UseZvfh, it's missing dependent extension(s) v (disabled), Zfh (enabled)` The warning messages indicate that we won't auto-enable extensions like `Zvfh` due to lack of vector support on old kernels. I think these warning messages could be confusing to people. It might be more reasonable to just log these messages. This also unifies the way of logging prefering `log_info`. It doesn't seem necessary to me to use `log_debug` in this case. After this change, the log looks like: $ java -Xlog:all -version ...... [0.011s][info][os ] Linux kernels before 6.8.5 (current 6.6.63) have a known bug when using Vector and signals. [0.011s][info][os ] Vector not enabled automatically via hwprobe, but can be turned on with -XX:+UseRVV. 
[0.011s][info][os,cpu ] Enabled RV64 feature "a" [0.011s][info][os,cpu ] Enabled RV64 feature "c" [0.011s][info][os,cpu ] Enabled RV64 feature "d" [0.011s][info][os,cpu ] Enabled RV64 feature "f" [0.011s][info][os,cpu ] Enabled RV64 feature "i" [0.011s][info][os,cpu ] Enabled RV64 feature "m" [0.011s][info][os,cpu ] Enabled RV64 feature "Zba" [0.011s][info][os,cpu ] Enabled RV64 feature "Zbb" [0.011s][info][os,cpu ] Enabled RV64 feature "Zbs" [0.011s][info][os,cpu ] Enabled RV64 feature "Zfh" [0.011s][info][os,cpu ] Enabled RV64 feature "Zfhmin" [0.011s][info][os,cpu ] Disabled RV64 feature "Zvfh" (missing dependent extension(s): v (disabled), Zfh (enabled)) [0.011s][info][os,cpu ] Enabled RV64 feature "marchid" (-9223372035378380799) [0.011s][info][os,cpu ] Enabled RV64 feature "mimpid" (1152921505839391232) [0.011s][info][os,cpu ] Enabled RV64 feature "mvendorid" (1808) [0.011s][info][os,cpu ] Enabled RV64 feature "satp_mode" (39) [0.011s][info][os,cpu ] Enabled RV64 feature "unaligned_scalar" (3) [0.011s][info][os,cpu ] Enabled RV64 feature "zicboz_block_size" (64) [0.011s][info][os,cpu ] Zifencei not found, required by Linux, enabling. [0.012s][info][os,cpu ] CPU: total 8 (initial active 8) spacemit,x60 rv64 rva rvc rvd rvf rvi rvm zba zbb zbs zfh zfhmin ...... 
------------- Commit messages: - 8371869: RISC-V: too many warnings when build on BPI-F3 SBC Changes: https://git.openjdk.org/jdk/pull/28340/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28340&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371869 Stats: 36 lines in 2 files changed: 27 ins; 6 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/28340.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28340/head:pull/28340 PR: https://git.openjdk.org/jdk/pull/28340 From fyang at openjdk.org Mon Nov 17 06:41:08 2025 From: fyang at openjdk.org (Fei Yang) Date: Mon, 17 Nov 2025 06:41:08 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v2] In-Reply-To: <8Y3gUUVCNU1ZpfRkZeJqgIUomP6NCDIQqqgN-lRgk5A=.60177ffe-52ba-46de-a099-57d73f096a49@github.com> References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> <8Y3gUUVCNU1ZpfRkZeJqgIUomP6NCDIQqqgN-lRgk5A=.60177ffe-52ba-46de-a099-57d73f096a49@github.com> Message-ID: On Fri, 14 Nov 2025 18:03:53 GMT, Hamlin Li wrote: >> Hi, >> >> This pr add CMoveF/D on riscv, which enable vectorization of statement like: `op_1 bop op_2 ? res_f_d_1 : res_f_d_2 in a loop`. >> >> This pr is also a preparation for further vectorization in https://github.com/openjdk/jdk/pull/28231. >> >> Previously it's https://github.com/openjdk/jdk/pull/25341, but at that time, C2 SLP has some issue with unsigned comparison, which is now fixed, so it's good to continue the work. >> >> # Test >> ## Jtreg >> >> in progress... >> >> ## Performance >> >> Column names meanings: >> * p: with patch >> * p+v: with patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on >> * m: without patch >> * m+v: without patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on >> >> #### Average improvement >> >> NOTE: With only this PR, it brings performance benefit in case of `CMoveF+CmpF`, `CMoveD+ComD`, `CMoveF+CmpI`, `CMoveD+CmpL`. 
The data below is based on fullly implmenting the vectorization of `CMoveI/L/F/D+CmpI/L/F/D`, which will be achieved by https://github.com/openjdk/jdk/pull/28231. >> >> For details, check the performance data in https://github.com/openjdk/jdk/pull/25341 on riscv. >> >> Opt (m/p) | Opt (m+v/p+v) | Opt (p/p+v) | Opt (m/p+v) >> -- | -- | -- | -- >> 1.022782609 | 2.198717391 | 2.162673913 | 2.199 >> >> > > Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: > > - add CMove+CmpP/N tests > - fix cmovF/D_cmpP src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2133: > 2131: break; > 2132: case BoolTest::ge: > 2133: assert(false, "Should go to BoolTest::le case"); I am not sure if it's safe to have these assertions for `ge` and `gt`. It seems to me that we should handle all possible condition codes here. Check this bug: https://bugs.openjdk.org/browse/JDK-8358892. We have added handling for `ge` and `gt` in `C2_MacroAssembler::enc_cmove_cmp_fp` to fix it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2532878358 From jbhateja at openjdk.org Mon Nov 17 07:16:08 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 17 Nov 2025 07:16:08 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 16:38:49 GMT, Volodymyr Paprotski wrote: > - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline > - `SignatureBench.MLDSA` is 1.2x-2.2x faster > - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) > - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version > - `SignatureBench.MLDSA` is upto 5% faster, never slower > > Note on intrinsic: > - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. 
> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 > > Tests and benchmarks: > - Added a fuzz test to ensure Java and intrinsic produces exactly same result > - Added benchmark to measure the performance of intrinsic itself > > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" Minor initial comments src/hotspot/cpu/x86/assembler_x86.cpp line 3867: > 3865: (vector_len == AVX_256bit ? VM_Version::supports_avx2() : > 3866: (vector_len == AVX_512bit ? VM_Version::supports_evex() : false)), ""); > 3867: InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true); When you check for AVX512-VL you allow accessing 128/256 bit registers from the higher register bank [X/Y]MM(16-31) But your assertions are nowhere checking this. src/hotspot/cpu/x86/assembler_x86.cpp line 3876: > 3874: (vector_len == AVX_256bit ? VM_Version::supports_avx2() : > 3875: (vector_len == AVX_512bit ? VM_Version::supports_evex() : false)), ""); > 3876: InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true); When you check for AVX512-VL you allow accessing 128/256 bit registers from the higher register bank [X/Y]MM(16-31) But your assertions are nowhere checking this. 
src/hotspot/cpu/x86/assembler_x86.cpp line 3882: > 3880: > 3881: void Assembler::evmovsldup(XMMRegister dst, KRegister mask, XMMRegister src, bool merge, int vector_len) { > 3882: assert(VM_Version::supports_evex(), ""); Suggestion: assert(vector_len == AVX_512bit || VM_Version::supports_avx512vl(), ""); src/hotspot/cpu/x86/assembler_x86.cpp line 3894: > 3892: > 3893: void Assembler::evmovshdup(XMMRegister dst, KRegister mask, XMMRegister src, bool merge, int vector_len) { > 3894: assert(VM_Version::supports_evex(), ""); Same as above src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 397: > 395: // > 396: static address generate_dilithiumAlmostNtt_avx(StubGenerator *stubgen, > 397: int vector_len, MacroAssembler *_masm) { Indentation correctness test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 2: > 1: /* > 2: * Copyright (c) 2024, 2025, Oracle and/or its affiliates. All rights reserved. Suggestion: * Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved. test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 114: > 112: rnd.setSeed(seed); > 113: //Note: it might be useful to increase this number during development of new intrinsics > 114: final int repeat = 10000000; Instead of a high repetition count, can you try tuning the tiered compilation threshold?
test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 145: > 143: coeffs1[j] = rnd.nextInt(); > 144: coeffs2[j] = rnd.nextInt(); > 145: } You can use generators for random initialization of the array ------------- PR Review: https://git.openjdk.org/jdk/pull/28136#pullrequestreview-3471195396 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532894350 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532894989 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532900199 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532901821 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532910907 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532868326 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532875974 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532872372 From epeter at openjdk.org Mon Nov 17 07:30:14 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 17 Nov 2025 07:30:14 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v2] In-Reply-To: <5PzMJntiu2waMvciTLvXaUH15Fm3dXZPsDVvkuqWPI0=.68c6456a-e5d3-413e-bef8-d8da95de40bd@github.com> References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> <5PzMJntiu2waMvciTLvXaUH15Fm3dXZPsDVvkuqWPI0=.68c6456a-e5d3-413e-bef8-d8da95de40bd@github.com> Message-ID: <46DWeMiCRNMC58wGr4T52KXbtRjU0PxQ4L6LuVFMZEo=.867fcc86-edd1-4492-9c1a-58f83d135969@github.com> On Fri, 14 Nov 2025 18:11:28 GMT, Hamlin Li wrote: >> test/hotspot/jtreg/compiler/c2/irTests/TestScalarConditionalMove.java line 36: >> >>> 34: * @test >>> 35: * @summary Test conditional move. >>> 36: * @requires vm.simpleArch == "riscv64" >> >> I would prefer if you could enable the test on all platforms, but just require the specific platform on the IR rules.
>> What would be even more fantastic: if you were able to also enable the IR rules for `x64` and `aarch64`, but we can also file a follow-up RFE for that. > > Make sense. I filed https://bugs.openjdk.org/browse/JDK-8371920 to track the task, will do it later after this pr. I would suggest that you already make the move from `@requires` to IR rule level restrictions. But we can look at adding `x64` and `aarch64` in the separate RFE. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2532986110 From epeter at openjdk.org Mon Nov 17 07:42:13 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 17 Nov 2025 07:42:13 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v2] In-Reply-To: <8Y3gUUVCNU1ZpfRkZeJqgIUomP6NCDIQqqgN-lRgk5A=.60177ffe-52ba-46de-a099-57d73f096a49@github.com> References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> <8Y3gUUVCNU1ZpfRkZeJqgIUomP6NCDIQqqgN-lRgk5A=.60177ffe-52ba-46de-a099-57d73f096a49@github.com> Message-ID: On Fri, 14 Nov 2025 18:03:53 GMT, Hamlin Li wrote: >> Hi, >> >> This pr add CMoveF/D on riscv, which enable vectorization of statement like: `op_1 bop op_2 ? res_f_d_1 : res_f_d_2 in a loop`. >> >> This pr is also a preparation for further vectorization in https://github.com/openjdk/jdk/pull/28231. >> >> Previously it's https://github.com/openjdk/jdk/pull/25341, but at that time, C2 SLP has some issue with unsigned comparison, which is now fixed, so it's good to continue the work. >> >> # Test >> ## Jtreg >> >> in progress... 
>> >> ## Performance >> >> Column names meanings: >> * p: with patch >> * p+v: with patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on >> * m: without patch >> * m+v: without patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on >> >> #### Average improvement >> >> NOTE: With only this PR, it brings performance benefit in case of `CMoveF+CmpF`, `CMoveD+ComD`, `CMoveF+CmpI`, `CMoveD+CmpL`. The data below is based on fullly implmenting the vectorization of `CMoveI/L/F/D+CmpI/L/F/D`, which will be achieved by https://github.com/openjdk/jdk/pull/28231. >> >> For details, check the performance data in https://github.com/openjdk/jdk/pull/25341 on riscv. >> >> Opt (m/p) | Opt (m+v/p+v) | Opt (p/p+v) | Opt (m/p+v) >> -- | -- | -- | -- >> 1.022782609 | 2.198717391 | 2.162673913 | 2.199 >> >> > > Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: > > - add CMove+CmpP/N tests > - fix cmovF/D_cmpP test/hotspot/jtreg/compiler/c2/irTests/TestScalarConditionalMoveCmpObj.java line 131: > 129: // applyIf = {"UseCompressedOops", "false"}) > 130: // @IR(counts = {IRNode.CMOVE_L, ">0", IRNode.CMP_N, ">0"}, > 131: // applyIf = {"UseCompressedOops", "true"}) Do you plan to still do this in this PR? Probably a future RFE would be better. It could be nice if you could link to the RFE with the issue number from this comment. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2533013052 From epeter at openjdk.org Mon Nov 17 07:46:12 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 17 Nov 2025 07:46:12 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v2] In-Reply-To: <5PzMJntiu2waMvciTLvXaUH15Fm3dXZPsDVvkuqWPI0=.68c6456a-e5d3-413e-bef8-d8da95de40bd@github.com> References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> <5PzMJntiu2waMvciTLvXaUH15Fm3dXZPsDVvkuqWPI0=.68c6456a-e5d3-413e-bef8-d8da95de40bd@github.com> Message-ID: On Fri, 14 Nov 2025 18:12:56 GMT, Hamlin Li wrote: >> test/hotspot/jtreg/compiler/c2/irTests/TestScalarConditionalMove.java line 49: >> >>> 47: "-XX:+UnlockExperimentalVMOptions", "-XX:-UseCompactObjectHeaders"); >>> 48: TestFramework.runWithFlags("-XX:+UseCMoveUnconditionally", "-XX:-UseVectorCmov", >>> 49: "-XX:+UnlockExperimentalVMOptions", "-XX:+UseCompactObjectHeaders"); >> >> Wait. Is this just a copy of the existing vector test, but run with CMove vectorization disabled? >> If so, we could just add these additional runs to the existing test, and guard the IR test with corresponding flags: >> Have an IR rule for `-XX:-UseVectorCmov` and one for `-XX:+UseVectorCmov`. >> >> That would allow us to reduce some code duplication. And it would also avoid letting the two tests go out of sync when people add more to one but not the other. >> >> What do you think? > > Good idea! > I can do it. What do you think about the name of the merged tests? `TestConditionalMove.java` or `TestScalarAndVectorConditionalMove.java` `TestConditionalMove.java` sounds good :) It would also be nice if we could move it out of the `irTests` directory, we would like to eventually move all tests away from it, and rather sort the tests by what they test and not by how we test them. Though now it's a little tricky because we check for both vector and scalar things. 
Still, I would propose that you move it under `c2/vectorization` or `c2/loopopts/superword`, since they do include vectorization tests. An alternative could also be in a new `c2/cmove` directory. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2533020809 From aartemov at openjdk.org Mon Nov 17 09:04:38 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Mon, 17 Nov 2025 09:04:38 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v5] In-Reply-To: References: Message-ID: > Hi, > > please consider the following changes: > > In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. > > Additionally, `SpinSingleSection` class is added, which allows to execute a payload code inside of a lambda function by only one thread. > > Tested in tiers 1 - 5. Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8366671: Removed the file mistakenly checked in. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/28264/files - new: https://git.openjdk.org/jdk/pull/28264/files/0e78affd..74cfcaea Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=03-04 Stats: 566 lines in 1 file changed: 0 ins; 566 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28264.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28264/head:pull/28264 PR: https://git.openjdk.org/jdk/pull/28264 From aartemov at openjdk.org Mon Nov 17 09:04:40 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Mon, 17 Nov 2025 09:04:40 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v4] In-Reply-To: References: Message-ID: <3UPdQsdoqZ42_JYICX1_lbBZYnRjM46BRD27H2Ch0yo=.2840a091-0518-4d9f-bf3e-f54a18edb7c3@github.com> On Fri, 14 Nov 2025 18:44:43 GMT, Coleen Phillimore wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8366671: Fixed build problem. > > patch.txt line 3: > >> 1: diff --git a/src/hotspot/share/runtime/objectMonitor.cpp b/src/hotspot/share/runtime/objectMonitor.cpp >> 2: index ee7629ec6f5..b1c806308ff 100644 >> 3: --- a/src/hotspot/share/runtime/objectMonitor.cpp > > I don't think this was supposed to be added. Yes, it was checked in by mistake. Removed now. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2533249687 From stefank at openjdk.org Mon Nov 17 09:13:43 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 17 Nov 2025 09:13:43 GMT Subject: RFR: 8369187: Add wrapper for <new> that forbids use of global allocation and deallocation functions [v3] In-Reply-To: References: Message-ID: On Sun, 16 Nov 2025 01:10:40 GMT, Kim Barrett wrote: >> 8369187: Add wrapper for <new> that forbids use of global allocation and deallocation functions >> >> Please review this change that adds `cppstdlib/new.hpp` as a wrapper for >> including `<new>`. All existing inclusions of `<new>` are changed to include >> the new wrapper. >> >> In addition to including `<new>`, this wrapper also provides deprecation >> declarations to prevent the use of some facilities by HotSpot code. >> >> However, those deprecations need to be conditionalized to not apply to gtests, >> so this change also adds a macro definition provided by the build system for >> use in detecting that a header is being included by a gtest. >> >> Testing: mach5 tier1 > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge branch 'master' into wrap-stdlib-new > - poison implicit alloc/dealloc in globalDefinitions > - Merge branch 'master' into wrap-stdlib-new > - further conditionalize deprecation of hardare interference sizes > - add wrapper for <new> The last change seems reasonable to me. ------------- Marked as reviewed by stefank (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/28250#pullrequestreview-3471725618 From ayang at openjdk.org Mon Nov 17 09:35:49 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 17 Nov 2025 09:35:49 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch [v4] In-Reply-To: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: > Trivial removing obsoleted code for unsupported arch. > > Test: tier1 Albert Mingkun Yang has updated the pull request incrementally with two additional commits since the last revision: - review - patch ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28240/files - new: https://git.openjdk.org/jdk/pull/28240/files/d6c34da7..6c5a07ec Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28240&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28240&range=02-03 Stats: 31 lines in 2 files changed: 0 ins; 30 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28240.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28240/head:pull/28240 PR: https://git.openjdk.org/jdk/pull/28240 From mli at openjdk.org Mon Nov 17 09:43:06 2025 From: mli at openjdk.org (Hamlin Li) Date: Mon, 17 Nov 2025 09:43:06 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v2] In-Reply-To: References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> <8Y3gUUVCNU1ZpfRkZeJqgIUomP6NCDIQqqgN-lRgk5A=.60177ffe-52ba-46de-a099-57d73f096a49@github.com> Message-ID: On Mon, 17 Nov 2025 07:39:19 GMT, Emanuel Peter wrote: >> Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: >> >> - add CMove+CmpP/N tests >> - fix cmovF/D_cmpP > > test/hotspot/jtreg/compiler/c2/irTests/TestScalarConditionalMoveCmpObj.java line 131: > >> 129: // applyIf = 
{"UseCompressedOops", "false"}) >> 130: // @IR(counts = {IRNode.CMOVE_L, ">0", IRNode.CMP_N, ">0"}, >> 131: // applyIf = {"UseCompressedOops", "true"}) > > Do you plan to still do this in this PR? Probably a future RFE would be better. It could be nice if you could link to the RFE with the issue number from this comment. In this PR, no, this one will only implement CMoveF/D and enable the vectorization of CMoveF/D, so do some preparation for https://github.com/openjdk/jdk/pull/28231. To guarantee the generation of CMoveI/L, seems to me we need to improve the cost model when transforming a phi to a conditional move. I can have an investigation later, as this impacts how & whether CMoveL/I can be generated and be vectorized accordingly. Filed https://bugs.openjdk.org/browse/JDK-8371984 to track it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2533384835 From aartemov at openjdk.org Mon Nov 17 09:45:17 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Mon, 17 Nov 2025 09:45:17 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v4] In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 18:53:06 GMT, Coleen Phillimore wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8366671: Fixed build problem. > > src/hotspot/share/runtime/objectMonitor.cpp line 320: > >> 318: check_object_context(); >> 319: if (_object_strong.is_empty()) { >> 320: auto setObjectStrongLambda = [&](OopHandle& object_strong, const WeakHandle& object) { > > Doesn't the lambda capture the _object and _object_strong values from the [&] ? > And maybe instead of a class SpinSingleSection it should be a template function that passes in the lambda? According to the C++ standard, evaluation of a lambda expression creates a closure object; invoking the closure object executes the lambda expression.
The closure object behaves as a function object, and its call operator, constructor and data members are defined by the lambda expression. The capture type tells how variables are passed into the closure object, and nothing more. Variables are passed when the lambda is executed. We can of course have a template function instead of a class, but then it somewhat breaks the idea of having an object representing the critical/single section, where the length of the section is defined by the lifetime of the object. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2533384738 From mdoerr at openjdk.org Mon Nov 17 09:53:16 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 17 Nov 2025 09:53:16 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch [v4] In-Reply-To: References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: On Mon, 17 Nov 2025 09:35:49 GMT, Albert Mingkun Yang wrote: >> Trivial removing obsoleted code for unsupported arch. >> >> Test: tier1 > > Albert Mingkun Yang has updated the pull request incrementally with two additional commits since the last revision: > > - review > - patch Thanks for the updates! `make run-test TEST=test/hotspot/jtreg/compiler/c2 JTREG="VM_OPTIONS=-XX:AllocatePrefetchStyle=3"` has passed on PPC64. I agree, the general AllocatePrefetchStyle==3 topic can be discussed separately. ------------- Marked as reviewed by mdoerr (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/28240#pullrequestreview-3471900133 From epeter at openjdk.org Mon Nov 17 09:55:06 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 17 Nov 2025 09:55:06 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v2] In-Reply-To: References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> <8Y3gUUVCNU1ZpfRkZeJqgIUomP6NCDIQqqgN-lRgk5A=.60177ffe-52ba-46de-a099-57d73f096a49@github.com> Message-ID: On Mon, 17 Nov 2025 09:40:29 GMT, Hamlin Li wrote: >> test/hotspot/jtreg/compiler/c2/irTests/TestScalarConditionalMoveCmpObj.java line 131: >> >>> 129: // applyIf = {"UseCompressedOops", "false"}) >>> 130: // @IR(counts = {IRNode.CMOVE_L, ">0", IRNode.CMP_N, ">0"}, >>> 131: // applyIf = {"UseCompressedOops", "true"}) >> >> Do you plan to still do this in this PR? Probably a future RFE would be better. It could be nice if you could link to the RFE with the issue number from this comment. > > In this PR, no, this one will only implement CMoveF/D and enable the vectorization of CMoveF/D, so do some preparation for https://github.com/openjdk/jdk/pull/28231. > To guarantee the generation of CMoveI/L, seems to me we need to improve the cost model when transforming a phi to a conditional move. I can have an investigation later, as this impacts how & whether CMoveL/I can be generated and be vectorized accordingly. Filed https://bugs.openjdk.org/browse/JDK-8371984 to track it. Ok. Sounds good. Just note: getting the cost model right here can be really difficult. People have played with the cost model in recent years, and it has also led to regressions in some cases.
Just FYI, I'm not stopping you from trying if you like ;) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2533420545 From mli at openjdk.org Mon Nov 17 10:27:06 2025 From: mli at openjdk.org (Hamlin Li) Date: Mon, 17 Nov 2025 10:27:06 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v2] In-Reply-To: References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> <8Y3gUUVCNU1ZpfRkZeJqgIUomP6NCDIQqqgN-lRgk5A=.60177ffe-52ba-46de-a099-57d73f096a49@github.com> Message-ID: On Mon, 17 Nov 2025 09:51:39 GMT, Emanuel Peter wrote: >> In this PR, no, this one will only implement CMoveF/D and enable the vectorization of CMoveF/D, so do some preparation for https://github.com/openjdk/jdk/pull/28231. >> To guarantee the generation of CMoveI/L, seems to me we need to improve the cost model when transforming a phi to a conditional move. I can have an investigation later, as this impacts how & whether CMoveL/I can be generated and be vectorized accordingly. Filed https://bugs.openjdk.org/browse/JDK-8371984 to track it. > > Ok. Sounds good. Just note: getting the cost model right here can be really difficult. People have played with the cost model in recent years, and it has also led to regressions in some cases. Just FYI, I'm not stopping you from trying if you like ;) Thanks for the reminder! :) That's also the reason I won't do it in this PR.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2533522249 From kevinw at openjdk.org Mon Nov 17 10:45:34 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Mon, 17 Nov 2025 10:45:34 GMT Subject: RFR: 8368527: JMX: Add an MXBeans method to query GC CPU time [v12] In-Reply-To: References: Message-ID: <4qjsyzSPqc8GERYbwMkCVTvoa547lAHi2in0d1MotCY=.7f7d0fcd-bc20-49e6-833a-3ec2404006f8@github.com> On Fri, 14 Nov 2025 21:21:16 GMT, Jonas Norlinder wrote: >> Hi all, >> >> This PR augments the CPU time sampling measurement capabilities that a user can perform from Java code with the addition of `MemoryMXBean.getGcCpuTime()`. With this patch it will be possible for a user to measure process and GC CPU time during critical section or iterations in benchmarks to name a few. This new method complements the existing `OperatingSystemMXBean.getProcessCpuTime()` for a refined understanding. >> >> `CollectedHeap::gc_threads_do` may operate on terminated GC threads during shutdown, but thanks to JDK-8366865 by @walulyai we can piggyback on the new `Universe::is_shutting_down`. I have implemented a stress-test `test/jdk/java/lang/management/MemoryMXBean/GetGcCpuTime.java` that may identify reading CPU time of terminated threads. Synchronizing on `Universe::is_shutting_down` and `Heap_lock` resolves this problem. >> >> FWIW; To my understanding we don't want to add a `Universe::is_shutting_down` check in gc_threads_do as this may introduce a performance penalty that is unacceptable, therefore we must be careful about the few places where external users call upon gc_threads_do and may race with a terminating VM. >> >> Tested: test/jdk/java/lang/management/MemoryMXBean/GetGcCpuTime.java, jdk/javax/management/mxbean hotspot/jtreg/vmTestbase/nsk/monitoring on Linux x64, Linux aarch64, Windows x64, macOS x64 and macOS aarch64 with release and fastdebug. 
> > Jonas Norlinder has updated the pull request incrementally with one additional commit since the last revision: > > Revert "Reduce GC coverage for trivial API test" > > This reverts commit 8fd1ee093066138c9aa5602dcac0e7db1916db6b. Marked as reviewed by kevinw (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/27537#pullrequestreview-3472086274 From duke at openjdk.org Mon Nov 17 10:45:36 2025 From: duke at openjdk.org (Jonas Norlinder) Date: Mon, 17 Nov 2025 10:45:36 GMT Subject: Integrated: 8368527: JMX: Add an MXBeans method to query GC CPU time In-Reply-To: References: Message-ID: On Sat, 27 Sep 2025 11:18:58 GMT, Jonas Norlinder wrote: > Hi all, > > This PR augments the CPU time sampling measurement capabilities that a user can perform from Java code with the addition of `MemoryMXBean.getGcCpuTime()`. With this patch it will be possible for a user to measure process and GC CPU time during critical section or iterations in benchmarks to name a few. This new method complements the existing `OperatingSystemMXBean.getProcessCpuTime()` for a refined understanding. > > `CollectedHeap::gc_threads_do` may operate on terminated GC threads during shutdown, but thanks to JDK-8366865 by @walulyai we can piggyback on the new `Universe::is_shutting_down`. I have implemented a stress-test `test/jdk/java/lang/management/MemoryMXBean/GetGcCpuTime.java` that may identify reading CPU time of terminated threads. Synchronizing on `Universe::is_shutting_down` and `Heap_lock` resolves this problem. > > FWIW; To my understanding we don't want to add a `Universe::is_shutting_down` check in gc_threads_do as this may introduce a performance penalty that is unacceptable, therefore we must be careful about the few places where external users call upon gc_threads_do and may race with a terminating VM. 
> > Tested: test/jdk/java/lang/management/MemoryMXBean/GetGcCpuTime.java, jdk/javax/management/mxbean hotspot/jtreg/vmTestbase/nsk/monitoring on Linux x64, Linux aarch64, Windows x64, macOS x64 and macOS aarch64 with release and fastdebug. This pull request has now been integrated. Changeset: 812add27 Author: Jonas Norlinder Committer: Kevin Walls URL: https://git.openjdk.org/jdk/commit/812add27abdc70bc52ca105bc9430494a6491ecd Stats: 305 lines in 12 files changed: 302 ins; 1 del; 2 mod 8368527: JMX: Add an MXBeans method to query GC CPU time Reviewed-by: phh, kevinw ------------- PR: https://git.openjdk.org/jdk/pull/27537 From alanb at openjdk.org Mon Nov 17 10:48:38 2025 From: alanb at openjdk.org (Alan Bateman) Date: Mon, 17 Nov 2025 10:48:38 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v14] In-Reply-To: References: Message-ID: <33vXUyBAxy-_mh1VPp7hwz3K5GAur0YpkuzltVztiFU=.e2705104-44f7-4fdb-958c-aec66654ad7e@github.com> > Implementation changes for [JEP 500: Prepare to Make Final Mean Final](https://openjdk.org/jeps/500). > > Field.set (and Lookup.unreflectSetter) are changed to allow/warn/debug/deny when mutating a final instance field. JFR event recorded if final field mutated. Spec updates to Field.set, Field.setAccessible and Module.addOpens to align with the proposal in the JEP. > > HotSpot is updated to add support for the new command line options. To aid diagnosability, -Xcheck:jni reports a warning and -Xlog:jni=debug logs a message to help identity JNI code that mutates finals. For now, JNI code is allowed to set the "write-protected" fields System.in/out/err without a warning, we can re-visit once we change the System.setIn/setOut/setErr methods to not use JNI (I prefer to keep this separate to this PR because there is a small startup regression to address when changing System.setXXX). > > There are many new tests. 
A small number of existing tests are changed to run /othervm as reflectively opening a package isn't sufficient. Changing the tests to /othervm means that jtreg will launch the agent with the command line options to open the package. > > Testing: tier1-6 Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 63 commits: - Merge branch 'master' into JDK-8353835 - Spurious italics - More wordsmithing - Improve IAE exception message - Merge branch 'master' into JDK-8353835 - Cleanup - More cleanup of Field.set API docs, including some restructure from Alex - Cleanup - Merge branch 'master' into JDK-8353835 - Update mutateFinals/modules test to exercise exports and opens cases - ... and 53 more: https://git.openjdk.org/jdk/compare/8690d263...c3c3cfff ------------- Changes: https://git.openjdk.org/jdk/pull/25115/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25115&range=13 Stats: 5365 lines in 76 files changed: 5170 ins; 55 del; 140 mod Patch: https://git.openjdk.org/jdk/pull/25115.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25115/head:pull/25115 PR: https://git.openjdk.org/jdk/pull/25115 From eastigeevich at openjdk.org Mon Nov 17 10:49:20 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 17 Nov 2025 10:49:20 GMT Subject: RFR: 8371649: ZGC: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation In-Reply-To: References: Message-ID: <7OJuT0DovVEX2lEEqi368Jhsc8qLedTeZCqrepXchsE=.0ad1f26c-792c-4014-9a6d-e8675b844f01@github.com> On Thu, 13 Nov 2025 19:35:53 GMT, Erik Österlund wrote: > > @fisk , I'm assuming that no other thread is executing the target instructions while we're patching them. > > Indeed; no concurrent thread is executing the instructions being modified. So, this confirms the redundancy of the `fence`, doesn't it?
------------- PR Comment: https://git.openjdk.org/jdk/pull/28244#issuecomment-3541115900 From aartemov at openjdk.org Mon Nov 17 11:59:29 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Mon, 17 Nov 2025 11:59:29 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v6] In-Reply-To: References: Message-ID: > Hi, > > please consider the following changes: > > In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. > > Additionally, `SpinSingleSection` class is added, which allows to execute a payload code inside of a lambda function by only one thread. > > Tested in tiers 1 - 5. Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8366671: Removed redundant include. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/28264/files - new: https://git.openjdk.org/jdk/pull/28264/files/74cfcaea..e9866cdf Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28264.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28264/head:pull/28264 PR: https://git.openjdk.org/jdk/pull/28264 From aartemov at openjdk.org Mon Nov 17 11:59:33 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Mon, 17 Nov 2025 11:59:33 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v4] In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 18:47:11 GMT, Coleen Phillimore wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8366671: Fixed build problem. > > src/hotspot/share/runtime/objectMonitor.hpp line 36: > >> 34: #include "utilities/checkedCast.hpp" >> 35: #include "utilities/globalDefinitions.hpp" >> 36: #include "utilities/spinCriticalSection.hpp" > > Include shouldn't be needed here. Correct, it was needed here in the presence of a Functor-derived inner class, but now it is not needed. > test/hotspot/gtest/jfr/test_adaptiveSampler.cpp line 43: > >> 41: #include "runtime/atomicAccess.hpp" >> 42: #include "utilities/globalDefinitions.hpp" >> 43: #include "utilities/spinCriticalSection.hpp" > > Why is this include needed here? For the same reason `jfrSpinlockHelper.hpp` was included. It looks like the two includes above it are redundant and can be removed.
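For readers following the 8366671 review, the two designs discussed in this thread (an object whose lifetime defines the critical section, versus a function template that takes the lambda) can be sketched in standard C++. This is a minimal illustration only: `ScopedSpinSection`, `with_spin_section` and `section_is_scoped` are hypothetical names, the lock is a plain `std::atomic<int>`, and nothing here is taken from the actual `SpinCriticalSection` or `SpinSingleSection` code in the PR.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for a scoped spin section; NOT the HotSpot class.
// The lock word is 0 when free and 1 when held.
class ScopedSpinSection {
  std::atomic<int>& _lock;
 public:
  explicit ScopedSpinSection(std::atomic<int>& lock) : _lock(lock) {
    int expected = 0;
    // Spin until this thread swings the lock word from 0 to 1.
    while (!_lock.compare_exchange_weak(expected, 1,
                                        std::memory_order_acquire,
                                        std::memory_order_relaxed)) {
      expected = 0;  // a failed CAS overwrote `expected` with the current value
      std::this_thread::yield();
    }
  }
  ~ScopedSpinSection() {
    assert(_lock.load(std::memory_order_relaxed) == 1 && "section must be held");
    _lock.store(0, std::memory_order_release);
  }
  ScopedSpinSection(const ScopedSpinSection&) = delete;
  ScopedSpinSection& operator=(const ScopedSpinSection&) = delete;
};

// The function-template alternative: the section lasts for exactly one call.
template <typename Body>
void with_spin_section(std::atomic<int>& lock, Body&& body) {
  ScopedSpinSection guard(lock);
  body();
}

// Returns true if the lock was held while the body ran and is free afterwards.
bool section_is_scoped() {
  std::atomic<int> lock{0};
  int value = 0;
  bool held_inside = false;
  // [&] captures `lock` and `held_inside` by reference because the body names
  // them; `target` is an ordinary parameter, supplied when the closure is called.
  auto body = [&](int& target) {
    held_inside = (lock.load() == 1);
    target = 42;
  };
  with_spin_section(lock, [&] { body(value); });
  return held_inside && value == 42 && lock.load() == 0;
}
```

In this sketch the template function is just a thin layer over the scoped object, so the two approaches are not mutually exclusive; which one reads better at the call sites in `objectMonitor.cpp` is a judgment for the reviewers.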
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2533778511 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2533783383 From duke at openjdk.org Mon Nov 17 12:14:07 2025 From: duke at openjdk.org (Ruben) Date: Mon, 17 Nov 2025 12:14:07 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 In-Reply-To: <-h6G9ajUWQwDRcUMOtyI_YCUCkXz3pzRggJk_UaxM-0=.a8c772aa-2f09-48c0-9cfb-17e624393eb0@github.com> References: <-h6G9ajUWQwDRcUMOtyI_YCUCkXz3pzRggJk_UaxM-0=.a8c772aa-2f09-48c0-9cfb-17e624393eb0@github.com> Message-ID: On Thu, 13 Nov 2025 07:54:32 GMT, Ruben wrote: > However, I still have not identified a way to ensure the deopt handler stub ends at a page boundary in a unit test. The latest update implements an alternative way to detect the failure early during testing - via the newly added assertion in the `emit_deopt_handler`. @adinn, @dean-long, @TheRealMDoerr, would it be possible for you to review the latest version of the PR? Is there any additional testing you would recommend to perform before this can be integrated? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28192#issuecomment-3541484205 From eastigeevich at openjdk.org Mon Nov 17 12:29:08 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 17 Nov 2025 12:29:08 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 17:32:22 GMT, Chad Rakoczy wrote: > [JDK-8371046](https://bugs.openjdk.org/browse/JDK-8371046) > > This pull request fixes two crashes (see below) and adds `InvalidationReason::RELOCATED` to better describe why an nmethod is marked not entrant during relocation. > > --- > > #### 1. Test Bug > > It's possible for an `nmethod` to be unloaded without its `_state` being explicitly set to `not_entrant`.
Checking only `is_in_use()` isn't sufficient, since the `nmethod` may already be in the process of unloading and therefore may not have a lock (as with ZGC, where `nmethods` are locked individually). > > The fix adds an additional `is_unloading()` check in WhiteBox before acquiring the lock. > > This issue was reproducible fairly consistently (every few runs) by executing `compiler/whitebox/StressNMethodRelocation.java` with `-XX:+UseZGC -XX:ReservedCodeCacheSize=32m` > > > After applying this patch, the original crash stopped occurring, though a more infrequent crash was still observed. > > --- > > #### 2. Implementation Bug > > `nmethod::relocate` works by copying the instructions of an `nmethod` and then adjusting the call sites to account for new PC-relative offsets. > > Previously, this fix-up happened *after* calling `post_init()`, which registers the `nmethod` and makes it visible to the GC. This introduced a race condition where the GC might attempt to resolve a call site before it had been fixed. > > The fix ensures that all call sites are patched **before** the `nmethod` is registered. > > In testing, the crash previously occurred roughly 60 times in 5,000 runs (~1.2%). With this patch, no crashes were observed in the same number of runs. LGTM ------------- Marked as reviewed by eastigeevich (Committer). PR Review: https://git.openjdk.org/jdk/pull/28241#pullrequestreview-3472469212 From fandreuzzi at openjdk.org Mon Nov 17 13:11:32 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Mon, 17 Nov 2025 13:11:32 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v7] In-Reply-To: References: Message-ID: > In this PR I introduce a new JFR event: `jdk.StringDeduplication` > > The new event is emitted every time a deduplication cycle happens. > > Passes tier1 and tier2 (fastdebug). Francesco Andreuzzi has updated the pull request with a new target base due to a merge or a rebase.
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision: - remove elapsed. remove idle - Merge branch 'master' into JDK-8037914 - rename. start/end time - no start - enable - bytes to size - disable - revert - one event - trailing - ... and 5 more: https://git.openjdk.org/jdk/compare/84b50801...fc47a64e ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28015/files - new: https://git.openjdk.org/jdk/pull/28015/files/090c02bc..fc47a64e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=05-06 Stats: 242359 lines in 1920 files changed: 155343 ins; 52311 del; 34705 mod Patch: https://git.openjdk.org/jdk/pull/28015.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28015/head:pull/28015 PR: https://git.openjdk.org/jdk/pull/28015 From fandreuzzi at openjdk.org Mon Nov 17 13:11:32 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Mon, 17 Nov 2025 13:11:32 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v6] In-Reply-To: References: Message-ID: On Thu, 6 Nov 2025 01:59:41 GMT, Francesco Andreuzzi wrote: >> In this PR I introduce a new JFR event: `jdk.StringDeduplication` >> >> The new event is emitted every time a deduplication cycle happens. >> >> Passes tier1 and tier2 (fastdebug). > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > rename. start/end time Makes sense, thanks. I applied all your feedback in fc47a64e39712024c55542439bd775497a6d70ed. > An argument can be made that the phases should be separate events, similar to CompilerPhase and GCPausePhase, where you have a name for each phase (String Processing, Table Resize and Table Cleanup), but it may be over-engineering if we don't believe these phases will change in the future? 
Yeah I think there's little chance for changes there, I'd keep it as it is. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3541742817 From stefank at openjdk.org Mon Nov 17 13:23:41 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 17 Nov 2025 13:23:41 GMT Subject: RFR: 8371990: Remove two second delayed OOME after GC shutdown Message-ID: In [JDK-8366865](https://bugs.openjdk.org/browse/JDK-8366865) the shutdown code was tweaked so that allocating code would try to block for two seconds and if the JVM didn't shut down within that time, an OOME was thrown from the allocating thread. One of the reasons this code was introduced was to deal with a shutdown problem where the thread that was shutting down the JVM would first initiate the shutdown of the GC and *after* that the thread would call the JVMTI shutdown events and callbacks. The JVMTI callbacks could call arbitrary Java code that could try to allocate memory, and if the heap was filled up, it would have to wait for a GC to do its thing and hand back memory. But the GC had initiated its termination protocol and could be unresponsive to that request, which in turn would lead to a hanging JVM process. The problem described above was finally fixed with [JDK-8367902](https://bugs.openjdk.org/browse/JDK-8367902). So, I propose that we get rid of the workaround put into place with [JDK-8366865](https://bugs.openjdk.org/browse/JDK-8366865). The proposed patch restructures the GC shutdown a little bit. The idea is that all threads that want to schedule a GC VM Operation already take the Heap_lock, and while holding that lock they check the `_is_shutting_down` variable. If the JVM is indeed shutting down, the threads refuse to schedule the GC operation. Depending on the type of thread that is trying to schedule the GC operation we do one of two things: 1) If it is a Java thread, we simply block the thread from running.
The thread is either a daemon thread and the blocking of the thread will not hinder the shutdown. Or, the thread is a non-daemon thread but the Java code called System.halt, which doesn't wait for non-daemon threads. 2) If it is a Concurrent GC thread, then we let the thread proceed but with the order to skip the GC operation. This is done because the current shutdown code calls "stop" on the Concurrent GC threads and then wait for them to signal back when they have stopped running their code. So, we need to let them run to completion. There are some G1 specific details to look at: 1) I've reverted the G1 `concurrent_mark_is_terminating` checks. 2) `try_collect_concurrently` queries the `_is_shutting_down` while holding the lock, and then uses that queried value after the lock is released. 3) I've left some breadcrumbs in `should_clear_region`. Any suggestions on what to do with the comment and assert? This has been tested by running Oracle's tier1-tier8 tests. ------------- Commit messages: - Filter out ConcurrentGCThreads from GC operation shutdown - Revert VM_G1PauseConcurrent::doit_prologue - Sleep from concurrent thread - Move log_cpu_time back to Universe - Block for shutdown in GC safepoint prologue Changes: https://git.openjdk.org/jdk/pull/28349/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28349&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371990 Stats: 139 lines in 13 files changed: 45 ins; 61 del; 33 mod Patch: https://git.openjdk.org/jdk/pull/28349.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28349/head:pull/28349 PR: https://git.openjdk.org/jdk/pull/28349 From mdoerr at openjdk.org Mon Nov 17 13:32:21 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 17 Nov 2025 13:32:21 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 [v2] In-Reply-To: References: Message-ID: On Thu, 13 Nov 2025 23:49:28 GMT, Ruben wrote: >> The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was 
backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. >> >> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. >> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > > Ruben has updated the pull request incrementally with one additional commit since the last revision: > > Add an assertion to detect out of bounds access in post-call NOP checks I think assertions would be sufficient in C1 instead of guarantee. But, ok. I'll put it into our test queue. ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28192#pullrequestreview-3472719238 From mli at openjdk.org Mon Nov 17 13:35:15 2025 From: mli at openjdk.org (Hamlin Li) Date: Mon, 17 Nov 2025 13:35:15 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v3] In-Reply-To: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> Message-ID: > Hi, > > This pr add CMoveF/D on riscv, which enable vectorization of statement like: `op_1 bop op_2 ? res_f_d_1 : res_f_d_2 in a loop`. > > This pr is also a preparation for further vectorization in https://github.com/openjdk/jdk/pull/28231. > > Previously it's https://github.com/openjdk/jdk/pull/25341, but at that time, C2 SLP has some issue with unsigned comparison, which is now fixed, so it's good to continue the work. > > # Test > ## Jtreg > > in progress... 
> > ## Performance > > Column name meanings: > * p: with patch > * p+v: with patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on > * m: without patch > * m+v: without patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on > > #### Average improvement > > NOTE: With only this PR, there is a performance benefit in the cases of `CMoveF+CmpF`, `CMoveD+CmpD`, `CMoveF+CmpI`, `CMoveD+CmpL`. The data below is based on fully implementing the vectorization of `CMoveI/L/F/D+CmpI/L/F/D`, which will be achieved by https://github.com/openjdk/jdk/pull/28231. > > For details, check the performance data in https://github.com/openjdk/jdk/pull/25341 on riscv. > >
> Opt (m/p) | Opt (m+v/p+v) | Opt (p/p+v) | Opt (m/p+v)
> -- | -- | -- | --
> 1.022782609 | 2.198717391 | 2.162673913 | 2.199
> > Hamlin Li has updated the pull request incrementally with four additional commits since the last revision: - remove TestScalarConditionalMove.java - merge scalar and vector tests - rename to TestConditionalMove.java - add CMP_N ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28309/files - new: https://git.openjdk.org/jdk/pull/28309/files/5c0d645d..51451ab5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28309&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28309&range=01-02 Stats: 10114 lines in 4 files changed: 3824 ins; 6290 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28309/head:pull/28309 PR: https://git.openjdk.org/jdk/pull/28309 From mli at openjdk.org Mon Nov 17 13:35:17 2025 From: mli at openjdk.org (Hamlin Li) Date: Mon, 17 Nov 2025 13:35:17 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v3] In-Reply-To: <46DWeMiCRNMC58wGr4T52KXbtRjU0PxQ4L6LuVFMZEo=.867fcc86-edd1-4492-9c1a-58f83d135969@github.com> References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com>
<5PzMJntiu2waMvciTLvXaUH15Fm3dXZPsDVvkuqWPI0=.68c6456a-e5d3-413e-bef8-d8da95de40bd@github.com> <46DWeMiCRNMC58wGr4T52KXbtRjU0PxQ4L6LuVFMZEo=.867fcc86-edd1-4492-9c1a-58f83d135969@github.com> Message-ID: On Mon, 17 Nov 2025 07:27:30 GMT, Emanuel Peter wrote: >> Make sense. I filed https://bugs.openjdk.org/browse/JDK-8371920 to track the task, will do it later after this pr. > > I would suggest that you already make the move from `@requires` to IR rule level restrictions. But we can look at adding `x64` and `aarch64` in the separate RFE. Merge of scalar and vector tests is done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2534089875 From mli at openjdk.org Mon Nov 17 13:42:18 2025 From: mli at openjdk.org (Hamlin Li) Date: Mon, 17 Nov 2025 13:42:18 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v2] In-Reply-To: References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> <8Y3gUUVCNU1ZpfRkZeJqgIUomP6NCDIQqqgN-lRgk5A=.60177ffe-52ba-46de-a099-57d73f096a49@github.com> Message-ID: On Mon, 17 Nov 2025 06:37:15 GMT, Fei Yang wrote: >> Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: >> >> - add CMove+CmpP/N tests >> - fix cmovF/D_cmpP > > src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2133: > >> 2131: break; >> 2132: case BoolTest::ge: >> 2133: assert(false, "Should go to BoolTest::le case"); > > I am not sure if it's safe to have these assertions for `ge` and `gt`. It seems to me that we should handle all possible condition codes here. Check this bug: https://bugs.openjdk.org/browse/JDK-8358892. We have added handling for `ge` and `gt` in `C2_MacroAssembler::enc_cmove_cmp_fp` to fix it. Make sense! Thanks! I'll add the implementation for these condition codes. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2534113207 From mli at openjdk.org Mon Nov 17 13:42:20 2025 From: mli at openjdk.org (Hamlin Li) Date: Mon, 17 Nov 2025 13:42:20 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v3] In-Reply-To: References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> <5PzMJntiu2waMvciTLvXaUH15Fm3dXZPsDVvkuqWPI0=.68c6456a-e5d3-413e-bef8-d8da95de40bd@github.com> Message-ID: <0aUmOv2i1H2WJDoQV1Uirgof7C42vvPSyY73giIsKcs=.ad6b18f6-36c4-46fc-b26d-dec8d519c535@github.com> On Mon, 17 Nov 2025 07:43:02 GMT, Emanuel Peter wrote: >> Good idea! >> I can do it. What do you think about the name of the merged tests? `TestConditionalMove.java` or `TestScalarAndVectorConditionalMove.java` > > `TestConditionalMove.java` sounds good :) > > It would also be nice if we could move it out of the `irTests` directory, we would like to eventually move all tests away from it, and rather sort the tests by what they test and not by how we test them. Though now it's a little tricky because we check for both vector and scalar things. Still, I would propose that you move it under `c2/vectorization` or `c2/loopopts/superword`, since they do include vectorization tests. An alternative could also be in a new `c2/cmove` directory. I can do the move for this specific file in the last commit of this PR. Or we can move a bunch of tests (some other tests under irTests) in a separate PR, as there are `Asserts` in other tests under `irTests`. I prefer the latter, as it puts related changes in one specific PR. Please let me know what you think about it. 
:) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2534108304 From egahlin at openjdk.org Mon Nov 17 14:30:10 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Mon, 17 Nov 2025 14:30:10 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v7] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 13:11:32 GMT, Francesco Andreuzzi wrote: >> In this PR I introduce a new JFR event: `jdk.StringDeduplication` >> >> The new event is emitted every time a deduplication cycle happens. >> >> Passes tier1 and tier2 (fastdebug). > > Francesco Andreuzzi has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision: > > - remove elapsed. remove idle > - Merge branch 'master' into JDK-8037914 > - rename. start/end time > - no start > - enable > - bytes to size > - disable > - revert > - one event > - trailing > - ... and 5 more: https://git.openjdk.org/jdk/compare/5fa84676...fc47a64e From a JFR perspective, this looks good. Ideally, the values of the event should be sanity-checked, but I understand this might be tricky to do in a reliable manner. Hunting down false positives would just be a waste of time. The copyright year of the test should be 2025. It would be good if someone on the GC team could take a look at the GC-related code. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3542131222 From fjiang at openjdk.org Mon Nov 17 14:38:11 2025 From: fjiang at openjdk.org (Feilong Jiang) Date: Mon, 17 Nov 2025 14:38:11 GMT Subject: RFR: 8371966: RISC-V: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry In-Reply-To: References: Message-ID: <3DDmaCAiN5vez4G07I3U1lDWrAUK8WrKUa1EZVQEJn4=.9f0ee8cf-2ba0-4862-a6c0-bed71b68f347@github.com> On Mon, 17 Nov 2025 02:49:52 GMT, Anjian Wen wrote: > Do the same fix as aarch64 in `TemplateInterpreterGenerator::generate_native_entry()` [JDK-8371918](https://bugs.openjdk.org/browse/JDK-8371918) Looks good! ------------- Marked as reviewed by fjiang (Committer). PR Review: https://git.openjdk.org/jdk/pull/28343#pullrequestreview-3473020631 From fandreuzzi at openjdk.org Mon Nov 17 14:45:29 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Mon, 17 Nov 2025 14:45:29 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v8] In-Reply-To: References: Message-ID: > In this PR I introduce a new JFR event: `jdk.StringDeduplication` > > The new event is emitted every time a deduplication cycle happens. > > Passes tier1 and tier2 (fastdebug). 
Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: fix year ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28015/files - new: https://git.openjdk.org/jdk/pull/28015/files/fc47a64e..40829ead Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28015.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28015/head:pull/28015 PR: https://git.openjdk.org/jdk/pull/28015 From fandreuzzi at openjdk.org Mon Nov 17 14:45:30 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Mon, 17 Nov 2025 14:45:30 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v7] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 14:27:24 GMT, Erik Gahlin wrote: > It would be good if someone on the GC team could take a look at the GC-related code. @albertnetymk could you have a look? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3542207837 From mli at openjdk.org Mon Nov 17 16:40:53 2025 From: mli at openjdk.org (Hamlin Li) Date: Mon, 17 Nov 2025 16:40:53 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v4] In-Reply-To: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> Message-ID: <0VUf9cNuKR6nc_V-Z2ylwW5YpmO13QUEBoDuQcctdCg=.3041a5fa-baf1-41f5-a271-854d68720fd8@github.com> > Hi, > > This pr add CMoveF/D on riscv, which enable vectorization of statement like: `op_1 bop op_2 ? res_f_d_1 : res_f_d_2 in a loop`. > > This pr is also a preparation for further vectorization in https://github.com/openjdk/jdk/pull/28231. 
> > Previously it's https://github.com/openjdk/jdk/pull/25341, but at that time, C2 SLP has some issue with unsigned comparison, which is now fixed, so it's good to continue the work. > > # Test > ## Jtreg > > in progress... > > ## Performance > > Column name meanings: > * p: with patch > * p+v: with patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on > * m: without patch > * m+v: without patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on > > #### Average improvement > > NOTE: With only this PR, it brings performance benefit in case of `CMoveF+CmpF`, `CMoveD+CmpD`, `CMoveF+CmpI`, `CMoveD+CmpL`. The data below is based on fully implementing the vectorization of `CMoveI/L/F/D+CmpI/L/F/D`, which will be achieved by https://github.com/openjdk/jdk/pull/28231. > > For details, check the performance data in https://github.com/openjdk/jdk/pull/25341 on riscv. > > Opt (m/p) | Opt (m+v/p+v) | Opt (p/p+v) | Opt (m/p+v) > -- | -- | -- | -- > 1.022782609 | 2.198717391 | 2.162673913 | 2.199 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: add BoolTest::ge/gt code and tests ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28309/files - new: https://git.openjdk.org/jdk/pull/28309/files/51451ab5..cf9168a2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28309&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28309&range=02-03 Stats: 1159 lines in 4 files changed: 968 ins; 4 del; 187 mod Patch: https://git.openjdk.org/jdk/pull/28309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28309/head:pull/28309 PR: https://git.openjdk.org/jdk/pull/28309 From mli at openjdk.org Mon Nov 17 16:47:05 2025 From: mli at openjdk.org (Hamlin Li) Date: Mon, 17 Nov 2025 16:47:05 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v2] In-Reply-To: References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> 
<8Y3gUUVCNU1ZpfRkZeJqgIUomP6NCDIQqqgN-lRgk5A=.60177ffe-52ba-46de-a099-57d73f096a49@github.com> Message-ID: <4K5xXIGM3anJGkUHGJ75fs6X-zfM_aDNI6Bi9yifK4c=.bb898013-6dbc-4e9a-8666-e8858f87d93f@github.com> On Mon, 17 Nov 2025 13:38:40 GMT, Hamlin Li wrote: >> src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2133: >> >>> 2131: break; >>> 2132: case BoolTest::ge: >>> 2133: assert(false, "Should go to BoolTest::le case"); >> >> I am not sure if it's safe to have these assertions for `ge` and `gt`. It seems to me that we should handle all possible condition codes here. Check this bug: https://bugs.openjdk.org/browse/JDK-8358892. We have added handling for `ge` and `gt` in `C2_MacroAssembler::enc_cmove_cmp_fp` to fix it. > > Make sense! Thanks! > I'll add the implementation for these condition codes. I added some code and tests. But the code path for `ge`/`gt` cannot be triggered (I added some new tests based on the tests previously added in https://bugs.openjdk.org/browse/JDK-8358892). So for now, I think it's safer for us to keep the `assert`; that way, when some code triggers it in the future, we can compose a jtreg test and fix it. What do you think? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2534803387 From sgehwolf at openjdk.org Mon Nov 17 17:34:04 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 17 Nov 2025 17:34:04 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> <9uVKpiWCXvxcxhyg6V1seeSxyLm14lHEdOL_I07uVQs=.de1b0ed8-d006-49d5-a982-27556759415e@github.com> Message-ID: On Mon, 27 Oct 2025 09:17:02 GMT, Andrew Haley wrote: >>> > I agree. Its not pretty, but consistent with what we did elsewhere. Nobody wants to do that discussion again. 
>>> >>> Sorry, I was unaware of any previous discussion. I was suggesting a less impactful way to make the change, taking advantage of the recent adoption of C++17, which allows for cleaner code. But I won't stand in the way of consensus. >> >> FWIW, I'd be interested in seeing a small example of what that would look like with C++17. There were a lot of discussion about the style, but it wasn't because we wanted to figure out the color of the bike shed but rather how to write safer code that makes it less likely to accidentally introduce bugs because of type conflation. > >> it wasn't because we wanted to figure out the color of the bike shed but rather how to write safer code that makes it less likely to accidentally introduce bugs because of type conflation. > > This. A function that returns its value as a side effect on a reference parameter is (at best) a code smell. @theRealAph @stefank OK to integrate this? ------------- PR Comment: https://git.openjdk.org/jdk/pull/27743#issuecomment-3543074389 From liach at openjdk.org Mon Nov 17 18:02:13 2025 From: liach at openjdk.org (Chen Liang) Date: Mon, 17 Nov 2025 18:02:13 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v14] In-Reply-To: <33vXUyBAxy-_mh1VPp7hwz3K5GAur0YpkuzltVztiFU=.e2705104-44f7-4fdb-958c-aec66654ad7e@github.com> References: <33vXUyBAxy-_mh1VPp7hwz3K5GAur0YpkuzltVztiFU=.e2705104-44f7-4fdb-958c-aec66654ad7e@github.com> Message-ID: On Mon, 17 Nov 2025 10:48:38 GMT, Alan Bateman wrote: >> Implementation changes for [JEP 500: Prepare to Make Final Mean Final](https://openjdk.org/jeps/500). >> >> Field.set (and Lookup.unreflectSetter) are changed to allow/warn/debug/deny when mutating a final instance field. JFR event recorded if final field mutated. Spec updates to Field.set, Field.setAccessible and Module.addOpens to align with the proposal in the JEP. >> >> HotSpot is updated to add support for the new command line options. 
To aid diagnosability, -Xcheck:jni reports a warning and -Xlog:jni=debug logs a message to help identify JNI code that mutates finals. For now, JNI code is allowed to set the "write-protected" fields System.in/out/err without a warning, we can re-visit once we change the System.setIn/setOut/setErr methods to not use JNI (I prefer to keep this separate to this PR because there is a small startup regression to address when changing System.setXXX). >> >> There are many new tests. A small number of existing tests are changed to run /othervm as reflectively opening a package isn't sufficient. Changing the tests to /othervm means that jtreg will launch the agent with the command line options to open the package. >> >> Testing: tier1-6 > > Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 63 commits: > > - Merge branch 'master' into JDK-8353835 > - Spurious italics > - More wordsmithing > - Improve IAE exception message > - Merge branch 'master' into JDK-8353835 > - Cleanup > - More cleanup of Field.set API docs, including some restructure from Alex > - Cleanup > - Merge branch 'master' into JDK-8353835 > - Update mutateFinals/modules test to exercise exports and opens cases > - ... and 53 more: https://git.openjdk.org/jdk/compare/8690d263...c3c3cfff The core library changes look good to me. ------------- Marked as reviewed by liach (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25115#pullrequestreview-3473937951 From xuelei at openjdk.org Mon Nov 17 18:27:26 2025 From: xuelei at openjdk.org (Xue-Lei Andrew Fan) Date: Mon, 17 Nov 2025 18:27:26 GMT Subject: RFR: 8365932: Implementation of JEP 516: Ahead-of-Time Object Caching with Any GC [v18] In-Reply-To: <4O9v08uY1viSeMh_w821RNfKj67p74y2PqDrB8GdZCs=.e21a3d53-4a00-4f4a-99dc-589b1044d7bd@github.com> References: <4O9v08uY1viSeMh_w821RNfKj67p74y2PqDrB8GdZCs=.e21a3d53-4a00-4f4a-99dc-589b1044d7bd@github.com> Message-ID: On Fri, 7 Nov 2025 15:25:49 GMT, Erik Österlund wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove -server in test for static GHA build > > Thank you for the reviews everyone! @fisk Is there any chance to backport this update to 25? ------------- PR Comment: https://git.openjdk.org/jdk/pull/27732#issuecomment-3543306783 From ayang at openjdk.org Mon Nov 17 18:55:08 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 17 Nov 2025 18:55:08 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v8] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 14:45:29 GMT, Francesco Andreuzzi wrote: >> In this PR I introduce a new JFR event: `jdk.StringDeduplication` >> >> The new event is emitted every time a deduplication cycle happens. >> >> Passes tier1 and tier2 (fastdebug). > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > fix year Marked as reviewed by ayang (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/28015#pullrequestreview-3474115864 From egahlin at openjdk.org Mon Nov 17 18:55:09 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Mon, 17 Nov 2025 18:55:09 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v8] In-Reply-To: References: Message-ID: <_StovLFzWCTY3tLarfVFee0vcsPLgcYjrzL0Xq-7n2A=.304d6035-496f-43d8-9a8e-06f159c78e8c@github.com> On Mon, 17 Nov 2025 14:45:29 GMT, Francesco Andreuzzi wrote: >> In this PR I introduce a new JFR event: `jdk.StringDeduplication` >> >> The new event is emitted every time a deduplication cycle happens. >> >> Passes tier1 and tier2 (fastdebug). > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > fix year Marked as reviewed by egahlin (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28015#pullrequestreview-3474120613 From eosterlund at openjdk.org Mon Nov 17 19:02:27 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 17 Nov 2025 19:02:27 GMT Subject: RFR: 8365932: Implementation of JEP 516: Ahead-of-Time Object Caching with Any GC [v18] In-Reply-To: <4O9v08uY1viSeMh_w821RNfKj67p74y2PqDrB8GdZCs=.e21a3d53-4a00-4f4a-99dc-589b1044d7bd@github.com> References: <4O9v08uY1viSeMh_w821RNfKj67p74y2PqDrB8GdZCs=.e21a3d53-4a00-4f4a-99dc-589b1044d7bd@github.com> Message-ID: On Fri, 7 Nov 2025 15:25:49 GMT, Erik Österlund wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove -server in test for static GHA build > > Thank you for the reviews everyone! > @fisk Is there any chance to backport this update to 25? Unfortunately, we generally do not backport JEPs. But the next LTS is just around the corner, in the grand scheme of things. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/27732#issuecomment-3543420953 From stefank at openjdk.org Mon Nov 17 20:03:15 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 17 Nov 2025 20:03:15 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: <9uVKpiWCXvxcxhyg6V1seeSxyLm14lHEdOL_I07uVQs=.de1b0ed8-d006-49d5-a982-27556759415e@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> <9uVKpiWCXvxcxhyg6V1seeSxyLm14lHEdOL_I07uVQs=.de1b0ed8-d006-49d5-a982-27556759415e@github.com> Message-ID: On Mon, 27 Oct 2025 08:33:22 GMT, Stefan Karlsson wrote: >>> >>> I agree. Its not pretty, but consistent with what we did elsewhere. Nobody wants to do that discussion again. >> >> Sorry, I was unaware of any previous discussion. I was suggesting a less impactful way to make the change, taking advantage of the recent adoption of C++17, which allows for cleaner code. But I won't stand in the way of consensus. > >> > I agree. Its not pretty, but consistent with what we did elsewhere. Nobody wants to do that discussion again. >> >> Sorry, I was unaware of any previous discussion. I was suggesting a less impactful way to make the change, taking advantage of the recent adoption of C++17, which allows for cleaner code. But I won't stand in the way of consensus. > > FWIW, I'd be interested in seeing a small example of what that would look like with C++17. There were a lot of discussion about the style, but it wasn't because we wanted to figure out the color of the bike shed but rather how to write safer code that makes it less likely to accidentally introduce bugs because of type conflation. > @stefank OK to integrate this? Yes. I took a quick glance at the changes and it looks like the previous style that was made for the other os:: memory APIs. 
I'm deferring the responsibility to do a full Review to Casper and Thomas. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27743#issuecomment-3543621156 From kurt at openjdk.org Mon Nov 17 20:35:11 2025 From: kurt at openjdk.org (Kurt Miller) Date: Mon, 17 Nov 2025 20:35:11 GMT Subject: RFR: 8371918: aarch64: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry In-Reply-To: References: Message-ID: On Sun, 16 Nov 2025 11:39:10 GMT, Andrew Haley wrote: >> …rGenerator::generate_native_entry >> >> I believe there's an incorrect pointer dereference in `TemplateInterpreterGenerator::generate_native_entry()` in this part of the code: >> >> >> // get native function entry point in r10 >> { >> Label L; >> __ ldr(r10, Address(rmethod, Method::native_function_offset())); >> ExternalAddress unsatisfied(SharedRuntime::native_method_throw_unsatisfied_link_error_entry()); >> __ lea(rscratch2, unsatisfied); >> __ ldr(rscratch2, rscratch2); >> __ cmp(r10, rscratch2); >> __ br(Assembler::NE, L); >> __ call_VM(noreg, >> CAST_FROM_FN_PTR(address, >> InterpreterRuntime::prepare_native_call), >> rmethod); >> __ get_method(rmethod); >> __ ldr(r10, Address(rmethod, Method::native_function_offset())); >> __ bind(L); >> } >> >> >> If I understand this correctly, the entry point for unsatisfied link error is loaded into `rscratch2`. The next instruction, `ldr(rscratch2, rscratch2)`, dereferences that pointer and reads from the text segment the initial instructions at the entry point into `rscratch2`. It then compares the native method entry point in `r10` with the initial instructions loaded into `rscratch2` which will never match. I believe the intent here was to compare the native method entry point with the unsatisfied link error entry point and the `ldr(rscratch2, rscratch2)` instruction should be removed. >> >> This was found on OpenBSD/aarch64. 
OpenBSD has a security feature where the text segments are marked execute only and do not allow reads independent of execution. The `ldr(rscratch2, rscratch2)` instruction causes a segfault because it is reading the text segment. While this bug was found on OpenBSD I believe it applies to all OSes on aarch64. >> >> This change removes the errant aarch64 hotspot assembly instruction that was reading from libjvm.so .text segment. >> >> Updated comment with markdown for code. > > Ah yes, I see. > > The mistake was mine: I thought `__ cmpptr(rax, unsatisfied.addr(), rscratch1)` meant > `__ cmpptr(rax, unsatisfied, rscratch1)`. In other words, I missed the significance of `.addr()`. @theRealAph Thank you for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28327#issuecomment-3543720726 From duke at openjdk.org Mon Nov 17 20:35:12 2025 From: duke at openjdk.org (duke) Date: Mon, 17 Nov 2025 20:35:12 GMT Subject: RFR: 8371918: aarch64: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry In-Reply-To: References: Message-ID: <_Jpmzy-4iPdWd6V5W4tf7ymu1VPVxcbtdIcD_RcOxEo=.0d5688fb-5a92-4236-9cc9-faa0d9811e27@github.com> On Fri, 14 Nov 2025 17:41:56 GMT, Kurt Miller wrote: > …rGenerator::generate_native_entry > > I believe there's an incorrect pointer dereference in `TemplateInterpreterGenerator::generate_native_entry()` in this part of the code: > > > // get native function entry point in r10 > { > Label L; > __ ldr(r10, Address(rmethod, Method::native_function_offset())); > ExternalAddress unsatisfied(SharedRuntime::native_method_throw_unsatisfied_link_error_entry()); > __ lea(rscratch2, unsatisfied); > __ ldr(rscratch2, rscratch2); > __ cmp(r10, rscratch2); > __ br(Assembler::NE, L); > __ call_VM(noreg, > CAST_FROM_FN_PTR(address, > InterpreterRuntime::prepare_native_call), > rmethod); > __ get_method(rmethod); > __ ldr(r10, Address(rmethod, Method::native_function_offset())); > __ bind(L); > } > > > If I understand this 
correctly, the entry point for unsatisfied link error is loaded into `rscratch2`. The next instruction, `ldr(rscratch2, rscratch2)`, dereferences that pointer and reads from the text segment the initial instructions at the entry point into `rscratch2`. It then compares the native method entry point in `r10` with the initial instructions loaded into `rscratch2` which will never match. I believe the intent here was to compare the native method entry point with the unsatisfied link error entry point and the `ldr(rscratch2, rscratch2)` instruction should be removed. > > This was found on OpenBSD/aarch64. OpenBSD has a security feature where the text segments are marked execute only and do not allow reads independent of execution. The `ldr(rscratch2, rscratch2)` instruction causes a segfault because it is reading the text segment. While this bug was found on OpenBSD I believe it applies to all OSes on aarch64. > > This change removes the errant aarch64 hotspot assembly instruction that was reading from libjvm.so .text segment. > > Updated comment with markdown for code. @bsdkurt Your change (at version e5c2e609436b7ebb8a05143e92cfe4dfe820c2ae) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28327#issuecomment-3543723473 From erikj at openjdk.org Mon Nov 17 20:45:04 2025 From: erikj at openjdk.org (Erik Joelsson) Date: Mon, 17 Nov 2025 20:45:04 GMT Subject: RFR: 8369187: Add wrapper for <new> that forbids use of global allocation and deallocation functions [v3] In-Reply-To: References: Message-ID: On Sun, 16 Nov 2025 01:10:40 GMT, Kim Barrett wrote: >> 8369187: Add wrapper for <new> that forbids use of global allocation and deallocation functions >> >> Please review this change that adds `cppstdlib/new.hpp` as a wrapper for >> including `<new>`. All existing inclusions of `<new>` are changed to include >> the new wrapper. 
>> >> In addition to including `<new>`, this wrapper also provides deprecation >> declarations to prevent the use of some facilities by HotSpot code. >> >> However, those deprecations need to be conditionalized to not apply to gtests, >> so this change also adds a macro definition provided by the build system for >> use in detecting that a header is being included by a gtest. >> >> Testing: mach5 tier1 > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge branch 'master' into wrap-stdlib-new > - poison implicit alloc/dealloc in globalDefinitions > - Merge branch 'master' into wrap-stdlib-new > - further conditionalize deprecation of hardare interference sizes > - add wrapper for <new> Build change looks ok. ------------- Marked as reviewed by erikj (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28250#pullrequestreview-3474488290 From jkratochvil at openjdk.org Mon Nov 17 21:22:09 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Mon, 17 Nov 2025 21:22:09 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v13] In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 06:32:09 GMT, Kim Barrett wrote: >> Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: >> >> Add Ioi Lam's comment > > src/hotspot/share/oops/resolvedFieldEntry.cpp line 29: > >> 27: #include "oops/resolvedFieldEntry.hpp" >> 28: >> 29: STATIC_ASSERT(std::is_trivially_copyable_v == true); > > Style nit: `STATIC_ASSERT` shouldn't be used anymore. C++17 gives us 1-arg `static_assert`. > Also, explicit comparison to `true` is weird. 
This is a copy-paste from @iklam's: https://github.com/openjdk/jdk/pull/28172/files Last `STATIC_ASSERT` was checked in less than 2 months ago: https://github.com/openjdk/jdk/pull/27152/files#diff-d8b70800fb68e0478dd0936c7f4a08b1bb59ce7a276ad1140355933c246372caR285 It should be rather rejected by CI (but yes, I could also do that). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26098#discussion_r2535542261 From jkratochvil at openjdk.org Mon Nov 17 21:28:58 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Mon, 17 Nov 2025 21:28:58 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v14] In-Reply-To: References: Message-ID: > With clang-20 using --with-toolchain-type=clang resolveFieldEntry.cpp and resolveMethodEntry.cpp break the build with similar warnings like: > > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] > 49 | memset(this, 0, sizeof(*this)); > | ^ > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning > 49 | memset(this, 0, sizeof(*this)); > | ^ > | (void*) > > The patch follows the suggested fix. 
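For readers unfamiliar with the warning, here is a minimal sketch of the pattern under discussion. It assumes a hypothetical `Entry` struct (not the real `ResolvedFieldEntry`) whose user-provided copy assignment makes it non-trivially copyable, which is the condition that `-Wnontrivial-memcall` reports. The `void*` cast mirrors the compiler's own note; whether zeroing the bytes is appropriate still depends on the members of the actual type.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a metadata entry; field names are illustrative only.
struct Entry {
  uint32_t _offset;
  uint16_t _index;
  uint8_t  _flags;

  Entry() : _offset(0), _index(0), _flags(0) {}

  // A user-provided copy assignment makes the type non-trivially copyable.
  Entry& operator=(const Entry& other) {
    _offset = other._offset;
    _index  = other._index;
    _flags  = other._flags;
    return *this;
  }

  // Zero all bytes of the object. The explicit void* cast is what the
  // compiler note suggests to silence -Wnontrivial-memcall.
  void clear() {
    std::memset(static_cast<void*>(this), 0, sizeof(*this));
  }
};

// One-argument static_assert (C++17), without a comparison to true.
static_assert(!std::is_trivially_copyable_v<Entry>);
```

The sketch only contains scalar members, so an all-zero byte pattern is a valid state for it; a type with pointers to owned resources or virtual functions would need a different approach.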
Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: update STATIC_ASSERT->static_assert ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26098/files - new: https://git.openjdk.org/jdk/pull/26098/files/ef3673ca..f85f1066 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26098&range=12-13 Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/26098.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26098/head:pull/26098 PR: https://git.openjdk.org/jdk/pull/26098 From pchilanomate at openjdk.org Mon Nov 17 21:53:04 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 17 Nov 2025 21:53:04 GMT Subject: RFR: 8364343: ThreadSnapshotFactory::get_thread_snapshot() crashes without JVMTI agent Message-ID: When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. 
With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. An alternative that avoids the extra fence would be to place extra overhead on the thread requesting to disable transitions (e.g. by using a safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so I believe this approach is simpler. - Ending the transition doesn't need to check if transitions are disabled (equivalent to not need to poll when transitioning from unsafe to safe safepoint state). But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. - The code was structured in terms of mount and unmount cases, and a variable was used to differentiate between start or end of the transition. With the changes to make the mechanism independent of JVMTI it becomes simpler to invert this and structure the code in terms of start transition and end transition, and use a variable to differentiate between mount and unmount cases. - All JVMTI code required during start/end transitions has been encapsulated in classes `JVMTIStartTransition` and `JVMTIEndTransition`. I kept the ordering of event posting as it is today. 
- Global variables `_sync_protocol_enabled_count` and `_sync_protocol_enabled_permanently` were removed. Variable `_VTMS_transition_disable_for_all_count` was renamed to `_global_start_transition_disable_count`, `_SR_mode` to `_exclusive_operation_ongoing` and `_VTMS_notify_jvmti_events` to `_notify_jvmti_events`. New global variable `_active_disablers` replaces the functionality of `_VTMS_transition_disable_for_one_count`. - Now, when the first agent attaches we not only set `_notify_jvmti_events` but we also increase global counter `_global_start_transition_disable_count`. This has the effect of always forcing the slow path when starting and ending a transition as we do today when `_VTMS_notify_jvmti_events` is set. A new `Handshake::execute` variant to handshake a virtual thread is introduced with this patch, which makes use of the new `MountUnmountDisabler` class. Method `ThreadSnapshotFactory::get_thread_snapshot` has been simplified to use this handshake variant to capture the snapshot of a virtual thread. The changes include new test `DumpThreadsWhenParking.java` from @AlanBateman which reliably reproduces the issue. I also verified the changes in Mach5 tiers1-7. 
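The Dekker-style handshake described in the summary above can be illustrated with a small standalone sketch. This is not the HotSpot code: `TransitionState`, `try_start_transition`, `try_disable_transitions` and the other names are hypothetical, a single flag stands in for the carrier and vthread "in transition" bits, a single counter stands in for the per-vthread and global counters, and `std::atomic` with sequentially consistent ordering stands in for whatever fences the patch actually uses. The only point is the ordering: each side publishes its own state before reading the other side's state, so at least one of the two must observe the other.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the state shared between a virtual thread
// and a thread that wants to disable mount/unmount transitions.
struct TransitionState {
  std::atomic<bool> in_transition{false};  // written by the virtual thread
  std::atomic<int>  disable_count{0};      // written by the disabler
};

// Virtual-thread side: publish "in transition" first, then read the counter.
// Returns true if the fast path may proceed; false means the thread backed
// out and would have to take a slow path (not modelled here).
inline bool try_start_transition(TransitionState& s) {
  s.in_transition.store(true, std::memory_order_seq_cst);
  if (s.disable_count.load(std::memory_order_seq_cst) > 0) {
    s.in_transition.store(false, std::memory_order_seq_cst);
    return false;
  }
  return true;
}

// Ending a transition does not consult the counter.
inline void end_transition(TransitionState& s) {
  s.in_transition.store(false, std::memory_order_release);
}

// Disabler side: publish the request first, then read the flag.
// Returns true if no transition is in flight; on false the caller would
// have to wait for the transition to finish (not modelled here).
inline bool try_disable_transitions(TransitionState& s) {
  s.disable_count.fetch_add(1, std::memory_order_seq_cst);
  return !s.in_transition.load(std::memory_order_seq_cst);
}

inline void enable_transitions(TransitionState& s) {
  s.disable_count.fetch_sub(1, std::memory_order_seq_cst);
}
```

With both store-then-load sequences sequentially consistent, it cannot happen that the virtual thread sees a zero counter while the disabler also sees a clear flag, which is the property the description relies on.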
Thanks, Patricio ------------- Commit messages: - v1 Changes: https://git.openjdk.org/jdk/pull/28361/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8364343 Stats: 1848 lines in 40 files changed: 844 ins; 824 del; 180 mod Patch: https://git.openjdk.org/jdk/pull/28361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361 PR: https://git.openjdk.org/jdk/pull/28361 From kbarrett at openjdk.org Mon Nov 17 22:52:18 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 17 Nov 2025 22:52:18 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v6] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 11:59:29 GMT, Anton Artemov wrote: >> Hi, >> >> please consider the following changes: >> >> In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. >> >> Additionally, `SpinSingleSection` class is added, which allows to execute a payload code inside of a lambda function by only one thread. >> >> Tested in tiers 1 - 5. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8366671: Removed redundant include. Changes requested by kbarrett (Reviewer). src/hotspot/share/runtime/park.cpp line 66: > 64: { > 65: SpinCriticalSection scs(&ListLock); > 66: { This extra level of scoping doesn't seem needed. src/hotspot/share/utilities/spinCriticalSection.cpp line 38: > 36: } > 37: > 38: // Slow-path : We've encountered contention -- Spin/Yield/Block strategy. Use "utilities/spinYield.hpp"? 
src/hotspot/share/utilities/spinCriticalSection.hpp line 30: > 28: #include "runtime/javaThread.hpp" > 29: > 30: class SpinCriticalSectionHelper { Derive from `AllStatic` ("memory/allStatic.hpp"). src/hotspot/share/utilities/spinCriticalSection.hpp line 42: > 40: > 41: // Short critical section. To be used when having a > 42: // mutex is considered to be expensive. I think this is a really poor description, as I think it will encourage the use of these facilities in inappropriate places. Spin-lock usage ought to be pretty rare! Note that the existing mechanism is described as "Not for general synchronization use." I think better motivation is needed. Note I'm not suggesting that doesn't exist, rather that motivation and usage guidelines should be documented here. The comment for SpinCriticalSectionHelper in the .cpp file is more the kind of thing I'm looking for. I shouldn't have to look at that internal helper's implementation to find such guidance. src/hotspot/share/utilities/spinCriticalSection.hpp line 45: > 43: class SpinCriticalSection { > 44: private: > 45: volatile int* const _lock; Use new `Atomic` rather than introducing new direct uses of `AtomicAccess`. src/hotspot/share/utilities/spinCriticalSection.hpp line 53: > 51: SpinCriticalSectionHelper::spin_release(_lock); > 52: } > 53: }; Should be noncopyable. src/hotspot/share/utilities/spinCriticalSection.hpp line 55: > 53: }; > 54: > 55: template I'd prefer not to name the first argument "Lambda", since it might not be one. I would prefer `F` or `Fn` or something like that. And there should be some documentation for this class, including a description of the template parameters and their requirements. src/hotspot/share/utilities/spinCriticalSection.hpp line 56: > 54: > 55: template > 56: class SpinSingleSection { Consider giving this class template a deduction guide.
That will likely make uses _much_ simpler, removing the explicit template parameters in variable declarations and just letting them be deduced from the constructor argument types. src/hotspot/share/utilities/spinCriticalSection.hpp line 56: > 54: > 55: template > 56: class SpinSingleSection { Although I've made some suggestions for possibly improving SpinSingleSection, I'm not sure it's a good idea as a concept. It seems to be attempting to provide a conditional critical section, but is doing so in what seems to me to be a weird way. As provided, it first conditionally executes a funarg under the lock, if it can acquire the lock. It then permits an external body (the scope of the section) to execute either under or not under the lock (depending on whether it was successfully acquired), with no way to know which state we're in. I think an API more similar to `std::unique_lock` for SpinCriticalSection would be better. `std::unique_lock` provides a `owns_lock()` function and a constructor overload taking a `std::try_to_lock_t` value. This controls whether the locking should be conditional or not, and a way for the using code to detect success/failure to lock in the conditional case. This doesn't have to be in one class though. There could be two critical section classes, one unconditional and one conditional, with only the latter providing the success/failure info. src/hotspot/share/utilities/spinCriticalSection.hpp line 58: > 56: class SpinSingleSection { > 57: private: > 58: volatile int* const _lock; Why an `int`-type value for the lock, rather than `bool`? I know why, but it should probably be stated explicitly, else someone might be tempted to change it in the future. src/hotspot/share/utilities/spinCriticalSection.hpp line 61: > 59: Thread* _lock_owner; > 60: public: > 61: SpinSingleSection(volatile int* lock, Lambda& F, Args&... args) : _lock(lock), _lock_owner(nullptr) { `F` => `f` - variables have lower-case names. 
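The `std::unique_lock`-style alternative suggested above might look roughly like the following. This is only a sketch under stated assumptions: `SpinGuard` and `try_to_lock_tag` are hypothetical names, the lock word is a plain `std::atomic<int>` rather than HotSpot's `Atomic` or `AtomicAccess` facilities, and the unconditional path is a bare spin loop with none of the Spin/Yield/Block strategy of the real helper. It shows one class with a tag-selected conditional constructor; the two-class split mentioned in the review would work equally well.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical tag selecting the conditional (single-attempt) constructor,
// modelled on std::try_to_lock_t.
struct try_to_lock_tag {};

// Hypothetical RAII guard over a 0/1 spin-lock word.
class SpinGuard {
  std::atomic<int>& _lock;
  bool _owns;

 public:
  // Unconditional: spin until the lock word changes from 0 to 1.
  explicit SpinGuard(std::atomic<int>& lock) : _lock(lock), _owns(true) {
    int expected = 0;
    while (!_lock.compare_exchange_weak(expected, 1,
                                        std::memory_order_acquire,
                                        std::memory_order_relaxed)) {
      expected = 0;  // a failed CAS overwrote expected with the current value
    }
  }

  // Conditional: one attempt only; the caller checks owns_lock().
  SpinGuard(std::atomic<int>& lock, try_to_lock_tag) : _lock(lock), _owns(false) {
    int expected = 0;
    _owns = _lock.compare_exchange_strong(expected, 1,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed);
  }

  // Release only if this guard actually acquired the lock.
  ~SpinGuard() {
    if (_owns) {
      _lock.store(0, std::memory_order_release);
    }
  }

  // Noncopyable: a copy would release the lock twice.
  SpinGuard(const SpinGuard&) = delete;
  SpinGuard& operator=(const SpinGuard&) = delete;

  bool owns_lock() const { return _owns; }
};
```

A caller in the conditional case would write `SpinGuard g(lock, try_to_lock_tag{}); if (g.owns_lock()) { ... }`, so the code in the guarded scope always knows whether it is running under the lock.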
src/hotspot/share/utilities/spinCriticalSection.hpp line 61: > 59: Thread* _lock_owner; > 60: public: > 61: SpinSingleSection(volatile int* lock, Lambda& F, Args&... args) : _lock(lock), _lock_owner(nullptr) { I really need to get on the ball and update the style guide regarding at least forwarding references and `std::forward`. src/hotspot/share/utilities/spinCriticalSection.hpp line 61: > 59: Thread* _lock_owner; > 60: public: > 61: SpinSingleSection(volatile int* lock, Lambda& F, Args&... args) : _lock(lock), _lock_owner(nullptr) { Taking the function argument by reference prevents certain common use-cases, e.g. I think this prevents passing an anonymous lambda. src/hotspot/share/utilities/spinCriticalSection.hpp line 63: > 61: SpinSingleSection(volatile int* lock, Lambda& F, Args&... args) : _lock(lock), _lock_owner(nullptr) { > 62: if (SpinCriticalSectionHelper::try_spin_acquire(_lock)) { > 63: _lock_owner = Thread::current(); Why do we need the owning thread here? It seems like a bool "lock acquired" value would be sufficient. src/hotspot/share/utilities/spinCriticalSection.hpp line 75: > 73: SpinCriticalSectionHelper::spin_release(_lock); > 74: } > 75: } Should be noncopyable. src/hotspot/share/utilities/spinCriticalSection.hpp line 77: > 75: } > 76: }; > 77: #endif //SHARE_UTILITIES_SPINCRITICALSECTION_HPP We usually put a blank line before the `#endif` of the include guard. Also a space after `//`. 
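The remarks above about taking the callable by lvalue reference can be illustrated with a forwarding-reference version. The sketch below is an assumption-laden illustration, not the proposed HotSpot API: `run_if_acquired` is a hypothetical free function, the lock word is a `std::atomic<int>`, and the callable is assumed not to throw (HotSpot is built without exceptions), so a plain store releases the lock. Because `F&&` and `Args&&...` are forwarding references, both named callables and anonymous lambdas are accepted, and template arguments are deduced at the call site.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Hypothetical helper: makes a single attempt to take the 0/1 lock word and,
// if that succeeds, invokes f(args...) under the lock and then releases it.
// Returns whether the callable was run.
template <typename F, typename... Args>
bool run_if_acquired(std::atomic<int>& lock, F&& f, Args&&... args) {
  int expected = 0;
  if (!lock.compare_exchange_strong(expected, 1,
                                    std::memory_order_acquire,
                                    std::memory_order_relaxed)) {
    return false;  // someone else holds the lock; nothing was executed
  }
  // Forward the callable and its arguments with their original value category.
  std::forward<F>(f)(std::forward<Args>(args)...);
  lock.store(0, std::memory_order_release);
  return true;
}
```

A call such as `run_if_acquired(lock, [&](int n) { total += n; }, 5)` compiles here, whereas a `Lambda&` parameter would reject the temporary lambda. The boolean result also gives the caller the success/failure information that the review asks for.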
------------- PR Review: https://git.openjdk.org/jdk/pull/28264#pullrequestreview-3474550894 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535649044 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535596087 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535672539 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535572951 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535511031 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535682627 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535516604 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535520250 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535744012 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535609870 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535528890 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535535461 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535634603 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535741931 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535683931 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535668579 From kbarrett at openjdk.org Mon Nov 17 22:52:21 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 17 Nov 2025 22:52:21 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v4] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 11:56:19 GMT, Anton Artemov wrote: >> test/hotspot/gtest/jfr/test_adaptiveSampler.cpp line 43: >> >>> 41: #include "runtime/atomicAccess.hpp" >>> 42: #include "utilities/globalDefinitions.hpp" >>> 43: #include "utilities/spinCriticalSection.hpp" >> >> Why is this include needed here? 
> > For the same reason why `jfrSpinlockHelper.hpp` was included. > > It looks like the two includes above that are redundant and can be removed. This one cannot, it breaks builds. Include of "atomicAccess.hpp" seems unnecessary, as there are no (direct) uses here. "globalDefinitions.hpp" should not be removed, under the "Include What You Use" guidance (which hasn't yet made it into the Style Guide - https://bugs.openjdk.org/browse/JDK-8252896). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535663096 From vpaprotski at openjdk.org Mon Nov 17 23:35:44 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Mon, 17 Nov 2025 23:35:44 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v2] In-Reply-To: References: Message-ID: > - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline > - `SignatureBench.MLDSA` is 1.2x-2.2x faster > - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) > - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version > - `SignatureBench.MLDSA` is upto 5% faster, never slower > > Note on intrinsic: > - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. 
> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 > > Tests and benchmarks: > - Added a fuzz test to ensure Java and intrinsic produces exactly same result > - Added benchmark to measure the performance of intrinsic itself > > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" Volodymyr Paprotski has updated the pull request incrementally with two additional commits since the last revision: - whitespace - address first comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28136/files - new: https://git.openjdk.org/jdk/pull/28136/files/6d3f7794..e9133401 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28136&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28136&range=00-01 Stats: 42 lines in 5 files changed: 17 ins; 15 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/28136.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28136/head:pull/28136 PR: https://git.openjdk.org/jdk/pull/28136 From vpaprotski at openjdk.org Mon Nov 17 23:35:45 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Mon, 17 Nov 2025 23:35:45 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v2] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 06:44:39 GMT, Jatin Bhateja wrote: >> Volodymyr Paprotski has updated the pull request incrementally with two additional commits since the last 
revision: >> >> - whitespace >> - address first comments > > src/hotspot/cpu/x86/assembler_x86.cpp line 3867: > >> 3865: (vector_len == AVX_256bit ? VM_Version::supports_avx2() : >> 3866: (vector_len == AVX_512bit ? VM_Version::supports_evex() : false)), ""); >> 3867: InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true); > > When you check for AVX512-VL you allow accessing 128/256 bit registers from the higher register bank [X/Y]MM(16-31) > > But your assertions are nowhere checking this. I believe those asserts are in `vex_prefix_and_encode` (https://github.com/openjdk/jdk/blob/6d3f7794ee6658d48eb2120c7bfe66ac412c6d14/src/hotspot/cpu/x86/assembler_x86.cpp#L13164) and `vex_prefix` (https://github.com/openjdk/jdk/blob/6d3f7794ee6658d48eb2120c7bfe66ac412c6d14/src/hotspot/cpu/x86/assembler_x86.cpp#L13047) I also haven't found any other instruction that does this check so I could emulate the style. > src/hotspot/cpu/x86/assembler_x86.cpp line 3882: > >> 3880: >> 3881: void Assembler::evmovsldup(XMMRegister dst, KRegister mask, XMMRegister src, bool merge, int vector_len) { >> 3882: assert(VM_Version::supports_evex(), ""); > > Suggestion: > > assert(vector_len == AVX_512 || VM_Version::supports_avx512vl), ""); Took the patch, but also kept the supports_evex() assert > test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 114: > >> 112: rnd.setSeed(seed); >> 113: //Note: it might be useful to increase this number during development of new intrinsics >> 114: final int repeat = 10000000; > > Instead of high repetition count can you try tuning the tiered compilation threshold. The purpose of the test is to test various (pseudo-random) values and compare the results to the java implementation of same code. A single run-through of the test doesn't always prove that there are no bugs. A bit philosophical..
as is well known, when writing crypto, branches (conditional on secret) are disallowed; but e.g. carry propagation has the same 'conditional execution' effect. (Instead of "have you tested every branch direction" it's "have you tested every carry") Besides a very careful range/overflow analysis (which I also did.. ntt functions skate very close to the int limit), exhaustive fuzz testing is the best method to find conditions that manual (range/overflow) analysis hasn't found; fuzz testing has very little math built in, so it's also good at finding 'blind spots' I (and whoever has to review) might not have thought of.. > test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 145: > >> 143: coeffs1[j] = rnd.nextInt(); >> 144: coeffs2[j] = rnd.nextInt(); >> 145: } > > You can uses generators for randome initialization of array I think you meant this? coeffs1 = rnd.ints(ML_DSA_N).toArray(); coeffs2 = rnd.ints(ML_DSA_N).toArray(); Didn't know about this, thanks. It does work.. But the original purpose (perhaps misguided, but it's done) was to 'factor out' the allocations; the outer loop runs many million times (I've left it running for 6+ hours during development) and so I wanted a 'somewhat efficient' test.
In hindsight, these (1k) arrays could probably be stack allocated, but I did not want to depend on an optimization when I could just write it without allocations in the mainline ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2535460279 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2535804056 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2535373444 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2535199249 From vpaprotski at openjdk.org Mon Nov 17 23:35:45 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Mon, 17 Nov 2025 23:35:45 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v2] In-Reply-To: <_TeZd3joeNkWYg7ZOgYRwzRJJjwMcUVOfe-pdXzJTv4=.d413a241-c8de-4267-8b98-0b41c7629371@github.com> References: <_TeZd3joeNkWYg7ZOgYRwzRJJjwMcUVOfe-pdXzJTv4=.d413a241-c8de-4267-8b98-0b41c7629371@github.com> Message-ID: On Sun, 16 Nov 2025 16:47:29 GMT, Mark Powers wrote: >> Volodymyr Paprotski has updated the pull request incrementally with two additional commits since the last revision: >> >> - whitespace >> - address first comments > > test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 123: > >> 121: try { >> 122: for (int i = 0; i < repeat; i++) { >> 123: // seed = rnd.nextLong(); > > 2 lines commented out This was useful during development and might be useful hint for debugging; instead of deleting, added a comment. Let me know if that works > test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 517: > >> 515: }; >> 516: } >> 517: // java --add-opens java.base/sun.security.provider=ALL-UNNAMED -XX:+UseDilithiumIntrinsics test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java > > This is line is useful. Not sure I would hide it at the bottom of the file. I actually meant to delete it, but will move it to the top. 
> test/micro/org/openjdk/bench/javax/crypto/full/MLDSABench.java line 2: > >> 1: /* >> 2: * Copyright (c) 2015, 2018, Oracle and/or its affiliates. All rights reserved. > > Copyright date. That was some copy-paste! Thanks ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2535377021 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2535082275 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2535078538 From kbarrett at openjdk.org Tue Nov 18 00:24:12 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 18 Nov 2025 00:24:12 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v6] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 22:49:04 GMT, Kim Barrett wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8366671: Removed redundant include. > > src/hotspot/share/utilities/spinCriticalSection.hpp line 56: > >> 54: >> 55: template >> 56: class SpinSingleSection { > > Although I've made some suggestions for possibly improving SpinSingleSection, > I'm not sure it's a good idea as a concept. It seems to be attempting to > provide a conditional critical section, but is doing so in what seems to me to > be a weird way. > > As provided, it first conditionally executes a funarg under the > lock, if it can acquire the lock. It then permits an external body (the scope > of the section) to execute either under or not under the lock (depending on > whether it was successfully acquired), with no way to know which state we're > in. > > I think an API more similar to `std::unique_lock` for SpinCriticalSection > would be better. `std::unique_lock` provides a `owns_lock()` function and a > constructor overload taking a `std::try_to_lock_t` value. This controls > whether the locking should be conditional or not, and a way for the using code > to detect success/failure to lock in the conditional case. 
This doesn't have > to be in one class though. There could be two critical section classes, one > unconditional and one conditional, with only the latter providing the > success/failure info. Or maybe just not bother with special help for the currently one(?) use-case for this, and instead have that use-case directly use `try_acquire` with a local RAII object to ensure release in the acquired case. Or a local bespoke helper class, or something along those lines. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2535907269 From wenanjian at openjdk.org Tue Nov 18 03:36:11 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Tue, 18 Nov 2025 03:36:11 GMT Subject: RFR: 8371966: RISC-V: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry In-Reply-To: <5XIH1aOd84FnYAYPrzlByJH2iYmGnqSJLUDsS1C4RE0=.84a76596-bdab-44fd-8674-efe6e7c987f6@github.com> References: <5XIH1aOd84FnYAYPrzlByJH2iYmGnqSJLUDsS1C4RE0=.84a76596-bdab-44fd-8674-efe6e7c987f6@github.com> Message-ID: <52SzynDL9ZN5Ap1PIecQoV1I8wu_j3nFRkAFIfa6oNc=.2f0374c1-2d78-46a7-a018-943be2d9417e@github.com> On Mon, 17 Nov 2025 03:02:54 GMT, Fei Yang wrote: >> Do the same fix as aarch64 in `TemplateInterpreterGenerator::generate_native_entry()` [JDK-8371918](https://bugs.openjdk.org/browse/JDK-8371918) > > Thanks! 
@RealFYang @feilongjiang Thanks for your review and approve ------------- PR Comment: https://git.openjdk.org/jdk/pull/28343#issuecomment-3544871319 From duke at openjdk.org Tue Nov 18 03:36:12 2025 From: duke at openjdk.org (duke) Date: Tue, 18 Nov 2025 03:36:12 GMT Subject: RFR: 8371966: RISC-V: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 02:49:52 GMT, Anjian Wen wrote: > Do the same fix as aarch64 in `TemplateInterpreterGenerator::generate_native_entry()` [JDK-8371918](https://bugs.openjdk.org/browse/JDK-8371918) @Anjian-Wen Your change (at version 33bb43491d7dba9a285231fb67580fc388882778) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28343#issuecomment-3544874242 From wenanjian at openjdk.org Tue Nov 18 03:40:24 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Tue, 18 Nov 2025 03:40:24 GMT Subject: Integrated: 8371966: RISC-V: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 02:49:52 GMT, Anjian Wen wrote: > Do the same fix as aarch64 in `TemplateInterpreterGenerator::generate_native_entry()` [JDK-8371918](https://bugs.openjdk.org/browse/JDK-8371918) This pull request has now been integrated. 
Changeset: 695a4abd Author: Anjian Wen Committer: Fei Yang URL: https://git.openjdk.org/jdk/commit/695a4abd5f7e9edcea9f1a724a9ceb87340a8f25 Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod 8371966: RISC-V: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry Reviewed-by: fyang, fjiang ------------- PR: https://git.openjdk.org/jdk/pull/28343 From iklam at openjdk.org Tue Nov 18 05:10:42 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 18 Nov 2025 05:10:42 GMT Subject: RFR: 8372045: AOT assembly phase asserts with old class if AOT class linking is disabled Message-ID: Old classes should be stored in the AOT cache only if `CDSConfig::is_preserving_verification_constraints() == true`. However, we miss this check in the AOT assembly phase: the `this` class is loaded from the AOT configuration file, which is a special type of AOT cache, so `AOTMetaspace::in_aot_cache(this)` returns true: bool InstanceKlass::can_be_verified_at_dumptime() const { if (AOTMetaspace::in_aot_cache(this)) { // This is a class that was dumped into the base archive, so we know // it was verified at dump time. return true; } The fix is ``` bool InstanceKlass::can_be_verified_at_dumptime() const { if (CDSConfig::is_dumping_dynamic_archive() && AOTMetaspace::in_aot_cache(this)) { as this check is intended to be used only when dumping the dynamic archive. This bug was found when running a complex application (specJBB) but I created a simple reproducer (OldClassSupport2.java). 
------------- Commit messages: - 8372045: AOT assembly phase asserts with old class if AOT class linking is disabled Changes: https://git.openjdk.org/jdk/pull/28365/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28365&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372045 Stats: 105 lines in 2 files changed: 104 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28365.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28365/head:pull/28365 PR: https://git.openjdk.org/jdk/pull/28365 From dholmes at openjdk.org Tue Nov 18 06:55:05 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 18 Nov 2025 06:55:05 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v6] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 11:59:29 GMT, Anton Artemov wrote: >> Hi, >> >> please consider the following changes: >> >> In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. >> >> Additionally, `SpinSingleSection` class is added, which allows to execute a payload code inside of a lambda function by only one thread. >> >> Tested in tiers 1 - 5. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8366671: Removed redundant include. I'm really not convinced about this one. First the "motivation" is mis-placed as (as Kim noted from the existing code) we do not want this to be a general purpose synchronization utility that encourages everyone to use it through the codebase. 
So really this is about creating a basic wrapper RAII class - as JFR did - but make it and the underlying API more generic by extracting from the JFR and Threads code. A RAII helper is nice to have but not essential for very short critical sections where we can't really miss a return path, but I'm almost swayed by that. I'm not at all convinced we need the template variant though. src/hotspot/share/runtime/objectMonitor.cpp line 320: > 318: check_object_context(); > 319: if (_object_strong.is_empty()) { > 320: auto setObjectStrongLambda = [&](OopHandle& object_strong, const WeakHandle& object) { I don't understand why we need the complexity of the `SpinSingleSection` and use of lambda's/functors. This seems like try-lock usage, though I'm not at all sure why (i.e. if we don't get the lock who is taking care of making a strong reference?) ------------- PR Review: https://git.openjdk.org/jdk/pull/28264#pullrequestreview-3475442011 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2536305734 From shade at openjdk.org Tue Nov 18 07:47:09 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 18 Nov 2025 07:47:09 GMT Subject: RFR: 8372045: AOT assembly phase asserts with old class if AOT class linking is disabled In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 05:05:09 GMT, Ioi Lam wrote: > Old classes should be stored in the AOT cache only if `CDSConfig::is_preserving_verification_constraints() == true`. However, we miss this check in the AOT assembly phase: the `this` class is loaded from the AOT configuration file, which is a special type of AOT cache, so `AOTMetaspace::in_aot_cache(this)` returns true: > > > bool InstanceKlass::can_be_verified_at_dumptime() const { > if (AOTMetaspace::in_aot_cache(this)) { > // This is a class that was dumped into the base archive, so we know > // it was verified at dump time. 
> return true; > } > > > The fix is > > ``` > bool InstanceKlass::can_be_verified_at_dumptime() const { > if (CDSConfig::is_dumping_dynamic_archive() && AOTMetaspace::in_aot_cache(this)) { > > > as this check is intended to be used only when dumping the dynamic archive. > > This bug was found when running a complex application (specJBB) but I created a simple reproducer (OldClassSupport2.java). Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28365#pullrequestreview-3475947563 From alanb at openjdk.org Tue Nov 18 08:04:13 2025 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 18 Nov 2025 08:04:13 GMT Subject: RFR: 8353835: Implement JEP 500: Prepare to Make Final Mean Final [v14] In-Reply-To: <33vXUyBAxy-_mh1VPp7hwz3K5GAur0YpkuzltVztiFU=.e2705104-44f7-4fdb-958c-aec66654ad7e@github.com> References: <33vXUyBAxy-_mh1VPp7hwz3K5GAur0YpkuzltVztiFU=.e2705104-44f7-4fdb-958c-aec66654ad7e@github.com> Message-ID: On Mon, 17 Nov 2025 10:48:38 GMT, Alan Bateman wrote: >> Implementation changes for [JEP 500: Prepare to Make Final Mean Final](https://openjdk.org/jeps/500). >> >> Field.set (and Lookup.unreflectSetter) are changed to allow/warn/debug/deny when mutating a final instance field. JFR event recorded if final field mutated. Spec updates to Field.set, Field.setAccessible and Module.addOpens to align with the proposal in the JEP. >> >> HotSpot is updated to add support for the new command line options. To aid diagnosability, -Xcheck:jni reports a warning and -Xlog:jni=debug logs a message to help identity JNI code that mutates finals. For now, JNI code is allowed to set the "write-protected" fields System.in/out/err without a warning, we can re-visit once we change the System.setIn/setOut/setErr methods to not use JNI (I prefer to keep this separate to this PR because there is a small startup regression to address when changing System.setXXX). >> >> There are many new tests. 
A small number of existing tests are changed to run /othervm as reflectively opening a package isn't sufficient. Changing the tests to /othervm means that jtreg will launch the agent with the command line options to open the package. >> >> Testing: tier1-6 > > Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 63 commits: > > - Merge branch 'master' into JDK-8353835 > - Spurious italics > - More wordsmithing > - Improve IAE exception message > - Merge branch 'master' into JDK-8353835 > - Cleanup > - More cleanup of Field.set API docs, including some restructure from Alex > - Cleanup > - Merge branch 'master' into JDK-8353835 > - Update mutateFinals/modules test to exercise exports and opens cases > - ... and 53 more: https://git.openjdk.org/jdk/compare/8690d263...c3c3cfff Thanks for the comments and detailed reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25115#issuecomment-3546036382 From alanb at openjdk.org Tue Nov 18 08:10:21 2025 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 18 Nov 2025 08:10:21 GMT Subject: Integrated: 8353835: Implement JEP 500: Prepare to Make Final Mean Final In-Reply-To: References: Message-ID: On Thu, 8 May 2025 11:22:30 GMT, Alan Bateman wrote: > Implementation changes for [JEP 500: Prepare to Make Final Mean Final](https://openjdk.org/jeps/500). > > Field.set (and Lookup.unreflectSetter) are changed to allow/warn/debug/deny when mutating a final instance field. JFR event recorded if final field mutated. Spec updates to Field.set, Field.setAccessible and Module.addOpens to align with the proposal in the JEP. > > HotSpot is updated to add support for the new command line options. To aid diagnosability, -Xcheck:jni reports a warning and -Xlog:jni=debug logs a message to help identity JNI code that mutates finals. 
For now, JNI code is allowed to set the "write-protected" fields System.in/out/err without a warning; we can revisit once we change the System.setIn/setOut/setErr methods to not use JNI (I prefer to keep this separate to this PR because there is a small startup regression to address when changing System.setXXX). > > There are many new tests. A small number of existing tests are changed to run /othervm as reflectively opening a package isn't sufficient. Changing the tests to /othervm means that jtreg will launch the agent with the command line options to open the package. > > Testing: tier1-6 This pull request has now been integrated. Changeset: 26460b6f Author: Alan Bateman URL: https://git.openjdk.org/jdk/commit/26460b6f12ce0763b79acfd98fca260b509a82c5 Stats: 5365 lines in 76 files changed: 5170 ins; 55 del; 140 mod 8353835: Implement JEP 500: Prepare to Make Final Mean Final Reviewed-by: liach, vlivanov, dholmes, vyazici ------------- PR: https://git.openjdk.org/jdk/pull/25115 From aph at openjdk.org Tue Nov 18 08:56:27 2025 From: aph at openjdk.org (Andrew Haley) Date: Tue, 18 Nov 2025 08:56:27 GMT Subject: RFR: 8371649: ZGC: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation In-Reply-To: References: Message-ID: <2XfGFJ9H0rbFJ-1Z9I1pKw6Uiv8UvfCFdwIaH1qOMf0=.ebb13a09-0fed-4dd3-b0e5-287709c2faed@github.com> On Thu, 13 Nov 2025 19:35:53 GMT, Erik Österlund wrote: >>> Hi Erik (@fisk), >>> >>> Could you also please take a look, just in case the fence was intentionally put there? >> >> The way I look at it, the fence was there for hardware that is unsophisticated enough to require manual cache flushing instead of having cache coherency that understands instruction edits, and at the same time has unsophisticated enough fences that are not speculated across such that the buffered store hits the cache before invalidating the cache, and not after, which would be awkward.
>> >> It is certainly possible that in practice the cache invalidation facilities also do the right level of fencing. So this is mostly just defensive programming. >> >> If I flip the question around - how confident do you feel on a scale from 1 to 10 that the cache invalidation mechanism guarantees across all implementations, that the preceding store is flushed out to the caches before the cache is flushed? This is an area of the code where I don't want to take chances and slip unless we feel a high level of confidence. > >> @fisk, I'm assuming that no other thread is executing the target instructions while we're patching them. > > Indeed; no concurrent thread is executing the instructions being modified. > > > @fisk, I'm assuming that no other thread is executing the target instructions while we're patching them. > > > > > > Indeed; no concurrent thread is executing the instructions being modified. > > So, this confirms the redundancy of the `fence`, doesn't it? Not really, no; I was just checking. I'm pretty sure that the fence is redundant, though.
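For readers following the thread, the sequence being debated (patching store, optional defensive fence, then cache maintenance) can be sketched in portable C++. This is only an illustration under stated assumptions: it needs GCC or Clang for `__builtin___clear_cache`, `patch_word` and its `defensive_fence` parameter are hypothetical names rather than the HotSpot routine, and whether the explicit fence is actually required on every AArch64 implementation is exactly the open question in the exchange above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not HotSpot code): overwrite one 32-bit instruction
// slot, optionally fence, then perform instruction-cache maintenance.
inline void patch_word(uint32_t* slot, uint32_t new_insn, bool defensive_fence) {
  assert(slot != nullptr);

  // 1. The patching store itself.
  std::memcpy(slot, &new_insn, sizeof(new_insn));

  // 2. Optional full barrier between the store and the cache maintenance.
  //    This plays the role of the fence whose redundancy is being discussed.
  if (defensive_fence) {
    std::atomic_thread_fence(std::memory_order_seq_cst);
  }

  // 3. GCC/Clang builtin that makes [begin, end) coherent for instruction
  //    fetch on targets that need explicit maintenance.
  char* begin = reinterpret_cast<char*>(slot);
  __builtin___clear_cache(begin, begin + sizeof(new_insn));
}
```

The sketch assumes, as stated in the thread, that no other thread executes the slot while it is being patched; it says nothing about concurrent execution of the modified code.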
------------- PR Comment: https://git.openjdk.org/jdk/pull/28244#issuecomment-3546270587 From shade at openjdk.org Tue Nov 18 09:13:45 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 18 Nov 2025 09:13:45 GMT Subject: RFR: 8371918: aarch64: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 17:41:56 GMT, Kurt Miller wrote: > …rGenerator::generate_native_entry > > I believe there's an incorrect pointer dereference in `TemplateInterpreterGenerator::generate_native_entry()` in this part of the code: > > > // get native function entry point in r10 > { > Label L; > __ ldr(r10, Address(rmethod, Method::native_function_offset())); > ExternalAddress unsatisfied(SharedRuntime::native_method_throw_unsatisfied_link_error_entry()); > __ lea(rscratch2, unsatisfied); > __ ldr(rscratch2, rscratch2); > __ cmp(r10, rscratch2); > __ br(Assembler::NE, L); > __ call_VM(noreg, > CAST_FROM_FN_PTR(address, > InterpreterRuntime::prepare_native_call), > rmethod); > __ get_method(rmethod); > __ ldr(r10, Address(rmethod, Method::native_function_offset())); > __ bind(L); > } > > > If I understand this correctly, the entry point for unsatisfied link error is loaded into `rscratch2`. The next instruction, `ldr(rscratch2, rscratch2)`, dereferences that pointer and reads from the text segment the initial instructions at the entry point into `rscratch2`. It then compares the native method entry point in `r10` with the initial instructions loaded into `rscratch2`, which will never match. I believe the intent here was to compare the native method entry point with the unsatisfied link error entry point, and the `ldr(rscratch2, rscratch2)` instruction should be removed. > > This was found on OpenBSD/aarch64. OpenBSD has a security feature where the text segments are marked execute only and do not allow reads independent of execution.
The `ldr(rscratch2, rscratch2)` instruction causes a segfault because it is reading the text segment. While this bug was found on OpenBSD, I believe it applies to all OSes on aarch64. > > This change removes the errant aarch64 hotspot assembly instruction that was reading from the libjvm.so .text segment. > > Updated comment with markdown for code. Looks reasonable. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28327#pullrequestreview-3476362592 From iwalulya at openjdk.org Tue Nov 18 09:20:44 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Tue, 18 Nov 2025 09:20:44 GMT Subject: RFR: 8371923: Update LockFreeStack for Atomic In-Reply-To: <8T9_DwoQj-__PXY0iLOJTyWMbXp3sqSz-CXGxZVbrRk=.f61223e7-fb77-41ae-bb5d-f32244505b5b@github.com> References: <8T9_DwoQj-__PXY0iLOJTyWMbXp3sqSz-CXGxZVbrRk=.f61223e7-fb77-41ae-bb5d-f32244505b5b@github.com> Message-ID: <4odmbYm6mGDKMhMwYhwVXXFFnkgeJPOuqA5iCf_avv8=.6241d56d-d16d-4dd7-a8ce-eb9410cbb110@github.com> On Fri, 14 Nov 2025 18:35:10 GMT, Kim Barrett wrote: > Please review this change to the `LockFreeStack` utility to allow clients to > use `Atomic` as the type of the "next" member used in the linked-list > representation of the stack. It also continues to allow clients to use the old > (pre-`Atomic`) form where the "next" member is volatile. This allows > clients to be updated incrementally after this change, rather than requiring > all clients to be updated in conjunction with the update of this class. Once > all clients have been updated, support for the old form can be removed. > > The associated gtests have been updated to use `Atomic`, with testing of > the old form no longer being done. The non-updated uses provide some > testing, and that's all expected to go away soon. So parameterizing the gtests > for both forms seems like a bunch of work that will just be deleted soon, with > very little benefit. > > Testing: mach5 tier1 Nit!
src/hotspot/share/utilities/lockFreeStack.hpp line 59: > 57: // \tparam T is the class of the elements in the stack. > 58: // > 59: // \tparam next_access is a function pointer. Applying this function to Maybe `next_accessor`? I found reading `next_access` difficult because it's unclear whether it's meant as a verb (i.e., "access next") or a noun (i.e., "next access"). ------------- Marked as reviewed by iwalulya (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28329#pullrequestreview-3476385407 PR Review Comment: https://git.openjdk.org/jdk/pull/28329#discussion_r2537029019 From fandreuzzi at openjdk.org Tue Nov 18 09:23:38 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Tue, 18 Nov 2025 09:23:38 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v7] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 14:27:24 GMT, Erik Gahlin wrote: >> Francesco Andreuzzi has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision: >> >> - remove elapsed. remove idle >> - Merge branch 'master' into JDK-8037914 >> - rename. start/end time >> - no start >> - enable >> - bytes to size >> - disable >> - revert >> - one event >> - trailing >> - ... and 5 more: https://git.openjdk.org/jdk/compare/c7ce9f21...fc47a64e > > From a JFR perspective, this looks good. Ideally, the values of the event should be sanity-checked, but I understand this might be tricky to do in a reliable manner. Hunting down false positives would just be a waste of time. The copyright year of the test should be 2025. > > It would be good if someone on the GC team could take a look at the GC-related code. Thanks for the review @egahlin and @albertnetymk.
------------- PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3546398933 From duke at openjdk.org Tue Nov 18 09:23:39 2025 From: duke at openjdk.org (duke) Date: Tue, 18 Nov 2025 09:23:39 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v8] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 14:45:29 GMT, Francesco Andreuzzi wrote: >> In this PR I introduce a new JFR event: `jdk.StringDeduplication` >> >> The new event is emitted every time a deduplication cycle happens. >> >> Passes tier1 and tier2 (fastdebug). > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > fix year @fandreuz Your change (at version 40829ead2e80bab673d4852914eabfdee72dc7ce) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3546402253 From mli at openjdk.org Tue Nov 18 09:27:44 2025 From: mli at openjdk.org (Hamlin Li) Date: Tue, 18 Nov 2025 09:27:44 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v5] In-Reply-To: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> Message-ID: > Hi, > > This pr add CMoveF/D on riscv, which enable vectorization of statement like: `op_1 bop op_2 ? res_f_d_1 : res_f_d_2 in a loop`. > > This pr is also a preparation for further vectorization in https://github.com/openjdk/jdk/pull/28231. > > Previously it's https://github.com/openjdk/jdk/pull/25341, but at that time, C2 SLP has some issue with unsigned comparison, which is now fixed, so it's good to continue the work. > > # Test > ## Jtreg > > in progress... 
> > ## Performance > > Column name meanings: > * p: with patch > * p+v: with patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on > * m: without patch > * m+v: without patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on > > #### Average improvement > > NOTE: On its own, this PR brings a performance benefit for `CMoveF+CmpF`, `CMoveD+CmpD`, `CMoveF+CmpI`, `CMoveD+CmpL`. The data below is based on fully implementing the vectorization of `CMoveI/L/F/D+CmpI/L/F/D`, which will be achieved by https://github.com/openjdk/jdk/pull/28231. > > For details, check the performance data in https://github.com/openjdk/jdk/pull/25341 on riscv. > > Opt (m/p) | Opt (m+v/p+v) | Opt (p/p+v) | Opt (m/p+v) > -- | -- | -- | -- > 1.022782609 | 2.198717391 | 2.162673913 | 2.199 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: replace assert with log_warning ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28309/files - new: https://git.openjdk.org/jdk/pull/28309/files/cf9168a2..572a7b74 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28309&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28309&range=03-04 Stats: 4 lines in 1 file changed: 0 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28309/head:pull/28309 PR: https://git.openjdk.org/jdk/pull/28309 From ayang at openjdk.org Tue Nov 18 09:40:37 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 18 Nov 2025 09:40:37 GMT Subject: RFR: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch [v4] In-Reply-To: References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: On Mon, 17 Nov 2025 09:35:49 GMT, Albert Mingkun Yang wrote: >> Trivial removing obsoleted code for unsupported arch.
>> >> Test: tier1 > > Albert Mingkun Yang has updated the pull request incrementally with two additional commits since the last revision: > > - review > - patch Thanks for review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28240#issuecomment-3546476614 From ayang at openjdk.org Tue Nov 18 09:40:38 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 18 Nov 2025 09:40:38 GMT Subject: Integrated: 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch In-Reply-To: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> References: <73HZdkIlDts3as9Xfechu5Sj4RhDGpUx-HVxj6B9m5o=.d147e8ff-7488-4358-af54-956b966d499d@github.com> Message-ID: On Tue, 11 Nov 2025 15:42:21 GMT, Albert Mingkun Yang wrote: > Trivial removing obsoleted code for unsupported arch. > > Test: tier1 This pull request has now been integrated. Changeset: 50a30497 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/50a30497370799e8f377a11914562a15b0a48fbb Stats: 69 lines in 7 files changed: 0 ins; 66 del; 3 mod 8371643: Remove ThreadLocalAllocBuffer::_reserve_for_allocation_prefetch Reviewed-by: mdoerr, kvn, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/28240 From sgehwolf at openjdk.org Tue Nov 18 09:42:26 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Tue, 18 Nov 2025 09:42:26 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v6] In-Reply-To: <7K-YvWpUSK96IGcBjhrcsYRqLz-xsdq_FrzSvOi4d68=.15a535d5-dca0-4aef-9714-91747a6b4fad@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <7K-YvWpUSK96IGcBjhrcsYRqLz-xsdq_FrzSvOi4d68=.15a535d5-dca0-4aef-9714-91747a6b4fad@github.com> Message-ID: On Fri, 14 Nov 2025 15:11:02 GMT, Severin Gehwolf wrote: >> Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. 
The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. >> >> It gets rid of the three-fold use of the return value: 1) error, 2) unlimited values, 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that makes sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. >> >> All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. >> >> All usages of the API have been changed to use the revised API. There are no more usages of `OSCONTAINER_ERROR` (`-2`) in HotSpot code. >> >> While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. >> >> Testing (looking good): >> - [x] GHA >> - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. >> - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. >> >> Thoughts? Opinions?
> > Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits: > > - Merge branch 'master' into jdk-8365606-jlong-julong-refactor > - Add space in trace log > - Merge branch 'master' into jdk-8365606-jlong-julong-refactor > - One more comment fix > - Extract OSContainer::available_swap_in_bytes() > - Simplify os::used_memory() > - Fix os::active_processor_count() > - os::free_memory => use 'value' directly > - os::available_memory() => use 'value' directly > - Fix pids_max printing in VM.info > - ... and 15 more: https://git.openjdk.org/jdk/compare/5d65c23c...9a5f3eb5 OK. Here it goes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27743#issuecomment-3546491920 From sgehwolf at openjdk.org Tue Nov 18 09:46:45 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Tue, 18 Nov 2025 09:46:45 GMT Subject: Integrated: 8365606: Container code should not be using jlong/julong In-Reply-To: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: On Fri, 10 Oct 2025 13:09:48 GMT, Severin Gehwolf wrote: > Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. > > It gets rid of the trifold-use of return value: 1.) error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that make sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. 
For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. > > All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. > > All usages of the API have been changed to use the revised API. There are no more usages of `OSCONTAINER_ERROR` (`-2`) in HotSpot code. > > While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. > > Testing (looking good): > - [x] GHA > - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. > - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. > > Thoughts? Opinions? This pull request has now been integrated.
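The convention described in the summary above (return `bool`, deliver the result through a reference, leave the reference untouched on error, and report "no limit" with a dedicated sentinel) could look roughly like the sketch below. The type and constant names are borrowed from the PR text, but their definitions here are assumptions, and `read_memory_limit` is a hypothetical function that parses an already-trimmed string (cgroup v2 uses the literal `max` for "no limit"); none of this is the actual HotSpot implementation.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <limits>

// Assumed definitions for illustration; the real HotSpot ones may differ.
using physical_memory_size_type = uint64_t;
constexpr physical_memory_size_type value_unlimited =
    std::numeric_limits<physical_memory_size_type>::max();

// Hypothetical reader: returns false on any error and does not touch 'value'.
inline bool read_memory_limit(const char* text, physical_memory_size_type& value) {
  if (text == nullptr || *text == '\0') {
    return false;                       // nothing to parse
  }
  if (std::strcmp(text, "max") == 0) {
    value = value_unlimited;            // "no limit" is a value, not an error
    return true;
  }
  if (*text < '0' || *text > '9') {
    return false;                       // reject signs, whitespace, garbage
  }
  errno = 0;
  char* end = nullptr;
  unsigned long long parsed = std::strtoull(text, &end, 10);
  assert(end != nullptr);               // strtoull always sets the end pointer
  if (errno != 0 || *end != '\0') {
    return false;                       // overflow or trailing characters
  }
  value = static_cast<physical_memory_size_type>(parsed);
  return true;
}
```

A caller then checks the `bool` first and only afterwards compares the value against `value_unlimited`, so error, unlimited, and real limits are three clearly separate cases rather than overloaded negative numbers.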
Changeset: 72ebca8a Author: Severin Gehwolf URL: https://git.openjdk.org/jdk/commit/72ebca8a0b19fac8a9483e5a3a98b454176fc342 Stats: 1308 lines in 16 files changed: 514 ins; 106 del; 688 mod 8365606: Container code should not be using jlong/julong Reviewed-by: stuefe, cnorrbin, fitzsim ------------- PR: https://git.openjdk.org/jdk/pull/27743 From fandreuzzi at openjdk.org Tue Nov 18 09:46:47 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Tue, 18 Nov 2025 09:46:47 GMT Subject: Integrated: 8037914: Add JFR event for string deduplication In-Reply-To: References: Message-ID: On Tue, 28 Oct 2025 10:09:58 GMT, Francesco Andreuzzi wrote: > In this PR I introduce a new JFR event: `jdk.StringDeduplication` > > The new event is emitted every time a deduplication cycle happens. > > Passes tier1 and tier2 (fastdebug). This pull request has now been integrated. Changeset: 3a2845f3 Author: Francesco Andreuzzi Committer: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/3a2845f334a59670d54699919073f0e908c038c4 Stats: 261 lines in 9 files changed: 244 ins; 10 del; 7 mod 8037914: Add JFR event for string deduplication Reviewed-by: ayang, egahlin ------------- PR: https://git.openjdk.org/jdk/pull/28015 From mdoerr at openjdk.org Tue Nov 18 10:09:11 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 18 Nov 2025 10:09:11 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 [v2] In-Reply-To: References: Message-ID: On Thu, 13 Nov 2025 23:49:28 GMT, Ruben wrote: >> The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. >> >> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. 
>> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > > Ruben has updated the pull request incrementally with one additional commit since the last revision: > > Add an assertion to detect out of bounds access in post-call NOP checks Our tests haven't revealed any new issues related to this PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28192#issuecomment-3546631812 From stuefe at openjdk.org Tue Nov 18 10:44:15 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 18 Nov 2025 10:44:15 GMT Subject: RFR: 8340297: Use-after-free recognition for metaspace and class space [v10] In-Reply-To: References: Message-ID: On Fri, 24 Oct 2025 06:37:43 GMT, Thomas Stuefe wrote: >> This patch will give us use-after-free recognition for Metaspace and Class space. >> >> Currently, checks for Klass validity typically only perform a variation of `Metaspace::contains` and some other basic tests. These checks won't find cases where the Klass had been prematurely freed (e.g., after class redefinition), nor cases of unloaded classes if the underlying metaspace chunks have not been uncommitted, which is quite common. >> >> The patch also provides us with improved analysis methods in case we encounter problems. E.g., answering whether the Klass had been redefined or unloaded. >> >> The implementation aims to be simple, fast, and safe against false positives. There is a small but non-null chance that we could get false negatives, but that cannot be avoided. >> >> How this works: >> >> - In `class Metadata`, we introduce a 32-bit token that holds the type of the object (1). It replaces the old "is_valid" field of the same size. That one was of limited use since any non-null garbage in those four bytes would be read as valid. >> - To check a Metadata for validity, the token is checked. 
Checks are done with SafeFetch, so they can be done with questionable pointers (e.g. into uncommitted metaspace after class unloading) >> - When metaspace is freed (bulk free after class unloading), the released chunks are zapped, destroying all tokens in the area. >> - When metaspace is freed (prematurely, e.g., after class redefinition), the released blocks are zapped. >> - The new checks replace Metadata::is_valid and supplement some other metadata checks done in GCs >> >> Testing: The patch has been extensively tested manually, at Oracle, and SAP. Tests were thorough to not only catch errors in the patch, but also to see if the patch would uncover a lot of existing sleeper bugs. So far, we only found a single bug in Shenandoah. >> >> Note: I did not yet hook up the new test to c1/c2 compiled code (there are already unimplemented functions for that). That is possible, but left for a later RFE. > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > includes sorted Ping? @shipilev maybe? This has been your original request, remember :-) ? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25891#issuecomment-3546843491 From kbarrett at openjdk.org Tue Nov 18 10:48:59 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 18 Nov 2025 10:48:59 GMT Subject: RFR: 8371923: Update LockFreeStack for Atomic [v2] In-Reply-To: <4odmbYm6mGDKMhMwYhwVXXFFnkgeJPOuqA5iCf_avv8=.6241d56d-d16d-4dd7-a8ce-eb9410cbb110@github.com> References: <8T9_DwoQj-__PXY0iLOJTyWMbXp3sqSz-CXGxZVbrRk=.f61223e7-fb77-41ae-bb5d-f32244505b5b@github.com> <4odmbYm6mGDKMhMwYhwVXXFFnkgeJPOuqA5iCf_avv8=.6241d56d-d16d-4dd7-a8ce-eb9410cbb110@github.com> Message-ID: <8oa8aTJGSpDpni9qt7YgydWTPhuYzP_cXyL5HDPEZsk=.ccb7ab7f-95fd-40fe-b8e9-930aff7d6992@github.com> On Tue, 18 Nov 2025 09:15:48 GMT, Ivan Walulya wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into lock-free-stack-allows-new-atomic >> - rename next_access to next_accessor >> - LockFreeStack supports Atomic > > src/hotspot/share/utilities/lockFreeStack.hpp line 59: > >> 57: // \tparam T is the class of the elements in the stack. >> 58: // >> 59: // \tparam next_access is a function pointer. Applying this function to > > Maybe `next_accessor`? I found reading `next_access` difficult because it's unclear whether it's meant as a verb (i.e., "access next") or a noun (i.e., "next access"). That's a fair point. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28329#discussion_r2537486877 From kbarrett at openjdk.org Tue Nov 18 10:48:57 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 18 Nov 2025 10:48:57 GMT Subject: RFR: 8371923: Update LockFreeStack for Atomic [v2] In-Reply-To: <8T9_DwoQj-__PXY0iLOJTyWMbXp3sqSz-CXGxZVbrRk=.f61223e7-fb77-41ae-bb5d-f32244505b5b@github.com> References: <8T9_DwoQj-__PXY0iLOJTyWMbXp3sqSz-CXGxZVbrRk=.f61223e7-fb77-41ae-bb5d-f32244505b5b@github.com> Message-ID: > Please review this change to the `LockFreeStack` utility to allow clients to > use `Atomic` as the type of the "next" member used in the linked-list > representation of the stack. It also continues to allow clients to use the old > (pre-`Atomic`) form where the "next" member is volatile. This allows > clients to be updated incrementally after this change, rather than requiring > all clients to be updated in conjunction with the update of this class. Once > all clients have been updated, support for the old form can be removed. > > The associated gtests have been updated to use `Atomic`, with testing of > the old form no longer being done. The non-updated uses provide some > testing, and that's all expected to go away soon.
So parameterizing the gtests > for both forms seems like a bunch of work that will just be deleted soon, with > very little benefit. > > Testing: mach5 tier1 Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Merge branch 'master' into lock-free-stack-allows-new-atomic - rename next_access to next_accessor - LockFreeStack supports Atomic ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28329/files - new: https://git.openjdk.org/jdk/pull/28329/files/28b4d2a2..58c4ee09 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28329&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28329&range=00-01 Stats: 11620 lines in 216 files changed: 8511 ins; 1590 del; 1519 mod Patch: https://git.openjdk.org/jdk/pull/28329.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28329/head:pull/28329 PR: https://git.openjdk.org/jdk/pull/28329 From iwalulya at openjdk.org Tue Nov 18 10:50:33 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Tue, 18 Nov 2025 10:50:33 GMT Subject: RFR: 8371923: Update LockFreeStack for Atomic [v2] In-Reply-To: References: <8T9_DwoQj-__PXY0iLOJTyWMbXp3sqSz-CXGxZVbrRk=.f61223e7-fb77-41ae-bb5d-f32244505b5b@github.com> Message-ID: On Tue, 18 Nov 2025 10:48:57 GMT, Kim Barrett wrote: >> Please review this change to the `LockFreeStack` utility to allow clients to >> use `Atomic` as the type of the "next" member used in the linked-list >> representation of the stack. It also continues to allow clients to use the old >> (pre-`Atomic`) form where the "next" member is volatile. This allows >> clients to be updated incrementally after this change, rather than requiring >> all clients to be updated in conjunction with the update of this class. Once >> all clients have been updated, support for the old form can be removed. 
>> >> The associated gtests have been updated to use `Atomic`, with testing of >> the old form is no longer being done. The non-updated uses provide some >> testing, and that's all expected to go away soon. So parameterizing the gtests >> for both forms seems like a bunch of work that will just be deleted soon, with >> very little benefit. >> >> Testing: mach5 tier1 > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into lock-free-stack-allows-new-atomic > - rename next_access to next_accessor > - LockFreeStack supports Atomic Marked as reviewed by iwalulya (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28329#pullrequestreview-3476979616 From fjiang at openjdk.org Tue Nov 18 13:26:12 2025 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 18 Nov 2025 13:26:12 GMT Subject: RFR: 8371869: RISC-V: too many warnings when build on BPI-F3 SBC In-Reply-To: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com> References: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com> Message-ID: <1IswLlUhLzPVM9l3wWGs-ckKQdi2BhXblweLvRDsIE8=.d1008183-9759-49de-a638-d7848b8183a4@github.com> On Sun, 16 Nov 2025 15:24:04 GMT, Fei Yang wrote: > Hi, please consider this riscv-specific change. > > I witnessed 400+ warning messages when doing a native build on BPI-F3 SBC running kernel 6.6.63: > > `OpenJDK 64-Bit Server VM warning: Cannot enable UseZvfh, it's missing dependent extension(s) v (disabled), Zfh (enabled)` > > The warning messages indicate that we won't auto-enable extensions like `Zvfh` due to lack of vector support on old kernels. > I think these warning messages could be confusing to people. 
It might be more reasonable to just log these messages. > This also unifies the logging, preferring `log_info`. It doesn't seem necessary to me to use `log_debug` in this case. > > After this change, the log on BPI-F3 SBC looks like: > > $ java -Xlog:all -version > > ...... > [0.011s][info][os ] Linux kernels before 6.8.5 (current 6.6.63) have a known bug when using Vector and signals. > [0.011s][info][os ] Vector not enabled automatically via hwprobe, but can be turned on with -XX:+UseRVV. > [0.011s][info][os,cpu ] Enabled RV64 feature "a" > [0.011s][info][os,cpu ] Enabled RV64 feature "c" > [0.011s][info][os,cpu ] Enabled RV64 feature "d" > [0.011s][info][os,cpu ] Enabled RV64 feature "f" > [0.011s][info][os,cpu ] Enabled RV64 feature "i" > [0.011s][info][os,cpu ] Enabled RV64 feature "m" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zba" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zbb" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zbs" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zfh" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zfhmin" > [0.011s][info][os,cpu ] Disabled RV64 feature "Zvfh" (missing dependent extension(s): v (disabled), Zfh (enabled)) > [0.011s][info][os,cpu ] Enabled RV64 feature "marchid" (-9223372035378380799) > [0.011s][info][os,cpu ] Enabled RV64 feature "mimpid" (1152921505839391232) > [0.011s][info][os,cpu ] Enabled RV64 feature "mvendorid" (1808) > [0.011s][info][os,cpu ] Enabled RV64 feature "satp_mode" (39) > [0.011s][info][os,cpu ] Enabled RV64 feature "unaligned_scalar" (3) > [0.011s][info][os,cpu ] Enabled RV64 feature "zicboz_block_size" (64) > [0.011s][info][os,cpu ] Zifencei not found, required by Linux, enabling. > [0.012s][info][os,cpu ] CPU: total 8 (initial active 8) spacemit,x60 rv64 rva rvc rvd rvf rvi rvm zba zbb zbs zfh zfhmin > ...... Thanks! ------------- Marked as reviewed by fjiang (Committer).
PR Review: https://git.openjdk.org/jdk/pull/28340#pullrequestreview-3477857007 From coleenp at openjdk.org Tue Nov 18 13:46:08 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 18 Nov 2025 13:46:08 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v6] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 11:59:29 GMT, Anton Artemov wrote: >> Hi, >> >> please consider the following changes: >> >> In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. >> >> Additionally, `SpinSingleSection` class is added, which allows to execute a payload code inside of a lambda function by only one thread. >> >> Tested in tiers 1 - 5. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8366671: Removed redundant include. I'd rather this be a general purpose synchronization utility that's used with care rather than something buried in thread code that nobody finds and creates one of their own. That there's only three uses of this is a good thing. It's preferable to use Mutex. The commentary that Kim suggested would be very good. 
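For readers following along, a scoped spin critical section of the general kind discussed here can be sketched as below. This is a standalone illustration only: `ToySpinLock` and `ToySpinGuard` are invented names, the code uses `std::atomic` rather than HotSpot's own primitives, and it does not reproduce `Thread::SpinAcquire`, `Thread::SpinRelease`, or the proposed `SpinCriticalSection`.

```cpp
#include <atomic>

// Hypothetical lock word wrapper; not HotSpot code.
class ToySpinLock {
  std::atomic<bool> _held{false};
 public:
  void acquire() {
    // Busy-wait until this thread is the one that flips the flag to true.
    while (_held.exchange(true, std::memory_order_acquire)) {
      // spin; a real implementation would back off or yield here
    }
  }
  void release() { _held.store(false, std::memory_order_release); }
  bool is_held() const { return _held.load(std::memory_order_relaxed); }
};

// Scoped guard: acquires in the constructor, releases in the destructor,
// so the critical section ends on every exit path from the scope.
class ToySpinGuard {
  ToySpinLock& _lock;
 public:
  explicit ToySpinGuard(ToySpinLock& lock) : _lock(lock) { _lock.acquire(); }
  ~ToySpinGuard() { _lock.release(); }
  ToySpinGuard(const ToySpinGuard&) = delete;
  ToySpinGuard& operator=(const ToySpinGuard&) = delete;
};

// Small self-check: the lock is held only while the guard is alive.
bool guard_holds_lock_only_in_scope() {
  ToySpinLock lock;
  bool before = lock.is_held();
  bool during = false;
  {
    ToySpinGuard guard(lock);
    during = lock.is_held();
  }
  bool after = lock.is_held();
  return !before && during && !after;
}
```

As the message above says, a spin lock like this is something to use with care and only for very short sections; a Mutex remains the preferred tool.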
------------- PR Comment: https://git.openjdk.org/jdk/pull/28264#issuecomment-3547701344 From coleenp at openjdk.org Tue Nov 18 13:46:10 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 18 Nov 2025 13:46:10 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v6] In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 04:58:10 GMT, David Holmes wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8366671: Removed redundant include. > > src/hotspot/share/runtime/objectMonitor.cpp line 320: > >> 318: check_object_context(); >> 319: if (_object_strong.is_empty()) { >> 320: auto setObjectStrongLambda = [&](OopHandle& object_strong, const WeakHandle& object) { > > I don't understand why we need the complexity of the `SpinSingleSection` and use of lambda's/functors. > > This seems like try-lock usage, though I'm not at all sure why (i.e. if we don't get the lock who is taking care of making a strong reference?) I think we agree about SpinSingleSection that it's sort of overkill we shouldn't add it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2538280821 From erikj at openjdk.org Tue Nov 18 14:03:07 2025 From: erikj at openjdk.org (Erik Joelsson) Date: Tue, 18 Nov 2025 14:03:07 GMT Subject: RFR: 8371626: [linux] use icf=all for linking libraries [v3] In-Reply-To: References: <6VJIZnsd8K7SI36fEz884BPJi7dctBZxgnBJDOSElgc=.01ec687a-5b8c-4d09-b826-0f155ef71db7@github.com> Message-ID: On Tue, 18 Nov 2025 08:13:52 GMT, Matthias Baesken wrote: > > Did you consider using `--icf=safe` instead? Using `--icf==all` seems to be risky. > > I tried this too, it works but the binaries are a bit larger with this setting, but I have no detailed values. Regarding 'risky' I did not see any test issues with the flag added in our CI . 
I would like input from someone in HotSpot regarding implications on debugging and if the changes to memory layout could cause any issues. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28236#issuecomment-3547771623 From duke at openjdk.org Tue Nov 18 14:18:11 2025 From: duke at openjdk.org (Ruben) Date: Tue, 18 Nov 2025 14:18:11 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 [v2] In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 10:05:48 GMT, Martin Doerr wrote: >> Ruben has updated the pull request incrementally with one additional commit since the last revision: >> >> Add an assertion to detect out of bounds access in post-call NOP checks > > Our tests haven't revealed any new issues related to this PR. Thank you, @TheRealMDoerr, > I think assertions would be sufficient in C1 instead of guarantee. Sure, I will change these to assertions. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28192#issuecomment-3547827280 From mgronlun at openjdk.org Tue Nov 18 14:31:34 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 18 Nov 2025 14:31:34 GMT Subject: RFR: 8372045: AOT assembly phase asserts with old class if AOT class linking is disabled In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 05:05:09 GMT, Ioi Lam wrote: > Old classes should be stored in the AOT cache only if `CDSConfig::is_preserving_verification_constraints() == true`. However, we miss this check in the AOT assembly phase: the `this` class is loaded from the AOT configuration file, which is a special type of AOT cache, so `AOTMetaspace::in_aot_cache(this)` returns true: > > > bool InstanceKlass::can_be_verified_at_dumptime() const { > if (AOTMetaspace::in_aot_cache(this)) { > // This is a class that was dumped into the base archive, so we know > // it was verified at dump time.
> return true; > } > > > The fix is > > ``` > bool InstanceKlass::can_be_verified_at_dumptime() const { > if (CDSConfig::is_dumping_dynamic_archive() && AOTMetaspace::in_aot_cache(this)) { > > > as this check is intended to be used only when dumping the dynamic archive. > > This bug was found when running a complex application (specJBB) but I created a simple reproducer (OldClassSupport2.java). Thanks for fixing this, Ioi. ------------- Marked as reviewed by mgronlun (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28365#pullrequestreview-3478108336 From pchilanomate at openjdk.org Tue Nov 18 14:56:51 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 18 Nov 2025 14:56:51 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v6] In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 13:42:33 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/objectMonitor.cpp line 320: >> >>> 318: check_object_context(); >>> 319: if (_object_strong.is_empty()) { >>> 320: auto setObjectStrongLambda = [&](OopHandle& object_strong, const WeakHandle& object) { >> >> I don't understand why we need the complexity of the `SpinSingleSection` and use of lambda's/functors. >> >> This seems like try-lock usage, though I'm not at all sure why (i.e. if we don't get the lock who is taking care of making a strong reference?) > > I think we agree about SpinSingleSection that it's sort of overkill we shouldn't add it. +1 on leaving the code as it was. This makes it harder to read IMO. Also as David points out there is no spinning here so using a SpinSingleSection class would add extra confusion. 
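The "try-lock" reading of that code can be sketched as follows. This is a toy model under stated assumptions, not the `ObjectMonitor` code: `ToyMonitor`, `_lock`, `_strong` and `ensure_strong` are invented names, and an `int` stands in for the strong reference. The point it illustrates is that a thread which fails the try-lock simply leaves, because whichever thread holds the lock is the one creating the value.

```cpp
#include <atomic>

// Hypothetical model of "create once under a try-lock"; not HotSpot code.
struct ToyMonitor {
  std::atomic<bool> _lock{false};  // try-lock word
  std::atomic<int> _strong{0};     // 0 means "no strong value yet"

  // Returns true only if the lock was free and is now owned by the caller.
  bool try_lock() { return !_lock.exchange(true, std::memory_order_acquire); }
  void unlock() { _lock.store(false, std::memory_order_release); }

  // Returns true if this call created the strong value.
  bool ensure_strong(int value) {
    if (_strong.load(std::memory_order_acquire) != 0) {
      return false;  // already created by someone
    }
    if (!try_lock()) {
      return false;  // lock holder is responsible for creating the value
    }
    bool created = false;
    // Re-check under the lock: another thread may have finished just before.
    if (_strong.load(std::memory_order_relaxed) == 0) {
      _strong.store(value, std::memory_order_release);
      created = true;
    }
    unlock();
    return created;
  }
};
```

Note that there is no waiting loop anywhere in this pattern, which is why calling it a "spin" section would be misleading.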
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2538511301 From pchilanomate at openjdk.org Tue Nov 18 14:56:51 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 18 Nov 2025 14:56:51 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v6] In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 14:52:55 GMT, Patricio Chilano Mateo wrote: >> I think we agree about SpinSingleSection that it's sort of overkill we shouldn't add it. > > +1 on leaving the code as it was. This makes it harder to read IMO. Also as David points out there is no spinning here so using a SpinSingleSection class would add extra confusion. >This seems like try-lock usage, though I'm not at all sure why (i.e. if we don't get the lock who is taking care of making a strong reference?) > This lock is only used here, so if the try-lock fails somebody else already grabbed it and that thread will create the strong reference (if not created already). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2538514233 From mli at openjdk.org Tue Nov 18 15:03:54 2025 From: mli at openjdk.org (Hamlin Li) Date: Tue, 18 Nov 2025 15:03:54 GMT Subject: RFR: 8371869: RISC-V: too many warnings when build on BPI-F3 SBC In-Reply-To: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com> References: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com> Message-ID: On Sun, 16 Nov 2025 15:24:04 GMT, Fei Yang wrote: > Hi, please consider this riscv-specific change. > > I witnessed 400+ warning messages when doing a native build on BPI-F3 SBC running kernel 6.6.63: > > `OpenJDK 64-Bit Server VM warning: Cannot enable UseZvfh, it's missing dependent extension(s) v (disabled), Zfh (enabled)` > > The warning messages indicate that we won't auto-enable extensions like `Zvfh` due to lack of vector support on old kernels. 
> I think these warning messages could be confusing to people. It might be more reasonable to just log these messages. > This also unifies the way of logging prefering `log_info`. It doesn't seem necessary to me to use `log_debug` in this case. > > After this change, the log on BPI-F3 SBC looks like: > > $ java -Xlog:all -version > > ...... > [0.011s][info][os ] Linux kernels before 6.8.5 (current 6.6.63) have a known bug when using Vector and signals. > [0.011s][info][os ] Vector not enabled automatically via hwprobe, but can be turned on with -XX:+UseRVV. > [0.011s][info][os,cpu ] Enabled RV64 feature "a" > [0.011s][info][os,cpu ] Enabled RV64 feature "c" > [0.011s][info][os,cpu ] Enabled RV64 feature "d" > [0.011s][info][os,cpu ] Enabled RV64 feature "f" > [0.011s][info][os,cpu ] Enabled RV64 feature "i" > [0.011s][info][os,cpu ] Enabled RV64 feature "m" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zba" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zbb" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zbs" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zfh" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zfhmin" > [0.011s][info][os,cpu ] Disabled RV64 feature "Zvfh" (missing dependent extension(s): v (disabled), Zfh (enabled)) > [0.011s][info][os,cpu ] Enabled RV64 feature "marchid" (-9223372035378380799) > [0.011s][info][os,cpu ] Enabled RV64 feature "mimpid" (1152921505839391232) > [0.011s][info][os,cpu ] Enabled RV64 feature "mvendorid" (1808) > [0.011s][info][os,cpu ] Enabled RV64 feature "satp_mode" (39) > [0.011s][info][os,cpu ] Enabled RV64 feature "unaligned_scalar" (3) > [0.011s][info][os,cpu ] Enabled RV64 feature "zicboz_block_size" (64) > [0.011s][info][os,cpu ] Zifencei not found, required by Linux, enabling. > [0.012s][info][os,cpu ] CPU: total 8 (initial active 8) spacemit,x60 rv64 rva rvc rvd rvf rvi rvm zba zbb zbs zfh zfhmin > ...... 
src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp line 122: > 120: } > 121: > 122: void VM_Version::RVNonExtFeatureValue::log_disabled(const char* reason) { Is `log_disabled` in RVNonExtFeatureValue invoked in some condition? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28340#discussion_r2538539377 From mli at openjdk.org Tue Nov 18 15:08:03 2025 From: mli at openjdk.org (Hamlin Li) Date: Tue, 18 Nov 2025 15:08:03 GMT Subject: RFR: 8371869: RISC-V: too many warnings when build on BPI-F3 SBC In-Reply-To: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com> References: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com> Message-ID: <-qb8rWPJSXl2dI-tbY1W35-w-cj18RFmYYyC5PaPdF4=.90c7c9fc-5d5d-4fae-b0ee-ea9ce487e50f@github.com> On Sun, 16 Nov 2025 15:24:04 GMT, Fei Yang wrote: > Hi, please consider this riscv-specific change. > > I witnessed 400+ warning messages when doing a native build on BPI-F3 SBC running kernel 6.6.63: > > `OpenJDK 64-Bit Server VM warning: Cannot enable UseZvfh, it's missing dependent extension(s) v (disabled), Zfh (enabled)` > > The warning messages indicate that we won't auto-enable extensions like `Zvfh` due to lack of vector support on old kernels. > I think these warning messages could be confusing to people. It might be more reasonable to just log these messages. > This also unifies the way of logging prefering `log_info`. It doesn't seem necessary to me to use `log_debug` in this case. > > After this change, the log on BPI-F3 SBC looks like: > > $ java -Xlog:all -version > > ...... > [0.011s][info][os ] Linux kernels before 6.8.5 (current 6.6.63) have a known bug when using Vector and signals. > [0.011s][info][os ] Vector not enabled automatically via hwprobe, but can be turned on with -XX:+UseRVV. 
> [0.011s][info][os,cpu ] Enabled RV64 feature "a" > [0.011s][info][os,cpu ] Enabled RV64 feature "c" > [0.011s][info][os,cpu ] Enabled RV64 feature "d" > [0.011s][info][os,cpu ] Enabled RV64 feature "f" > [0.011s][info][os,cpu ] Enabled RV64 feature "i" > [0.011s][info][os,cpu ] Enabled RV64 feature "m" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zba" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zbb" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zbs" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zfh" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zfhmin" > [0.011s][info][os,cpu ] Disabled RV64 feature "Zvfh" (missing dependent extension(s): v (disabled), Zfh (enabled)) > [0.011s][info][os,cpu ] Enabled RV64 feature "marchid" (-9223372035378380799) > [0.011s][info][os,cpu ] Enabled RV64 feature "mimpid" (1152921505839391232) > [0.011s][info][os,cpu ] Enabled RV64 feature "mvendorid" (1808) > [0.011s][info][os,cpu ] Enabled RV64 feature "satp_mode" (39) > [0.011s][info][os,cpu ] Enabled RV64 feature "unaligned_scalar" (3) > [0.011s][info][os,cpu ] Enabled RV64 feature "zicboz_block_size" (64) > [0.011s][info][os,cpu ] Zifencei not found, required by Linux, enabling. > [0.012s][info][os,cpu ] CPU: total 8 (initial active 8) spacemit,x60 rv64 rva rvc rvd rvf rvi rvm zba zbb zbs zfh zfhmin > ...... Thanks for fixing this. I have some minor comments. src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp line 110: > 108: } > 109: > 110: void VM_Version::RVExtFeatureValue::log_disabled(const char* reason) { It seems we don't need to distinguish whether `reason` is nullptr or not, as `log_disabled` is called only when the feature is disabled because of a dependency check failure.
------------- PR Review: https://git.openjdk.org/jdk/pull/28340#pullrequestreview-3478282171 PR Review Comment: https://git.openjdk.org/jdk/pull/28340#discussion_r2538551213 From duke at openjdk.org Tue Nov 18 15:13:16 2025 From: duke at openjdk.org (Zihao Lin) Date: Tue, 18 Nov 2025 15:13:16 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v12] In-Reply-To: References: Message-ID: > This patch removes the slice parameter from LoadNode::make > > I have done more work which removes the slice parameter from StoreNode::make. > > Mentioned in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 > > Hi team, I am new, I'd appreciate any guidance. Thanks a lot! Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 14 commits: - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - fix conflict - Merge master - remove C2AccessValuePtr - fix assert - add more assert - rid of access.addr().type() - Merge branch 'openjdk:master' into 8344116 - Merge branch 'openjdk:master' into 8344116 - ... and 4 more: https://git.openjdk.org/jdk/compare/dcba014a...329e290a ------------- Changes: https://git.openjdk.org/jdk/pull/24258/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=11 Stats: 230 lines in 18 files changed: 33 ins; 55 del; 142 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From coleenp at openjdk.org Tue Nov 18 15:43:00 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 18 Nov 2025 15:43:00 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass Message-ID: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. Fixed SA and jvmci.
@iwanowww Can you check that I changed C2 correctly (we talked about this in August). Tested with tier1-4. 5-7 in progress. ------------- Commit messages: - Fix C2 to test for array first. - Move AccessFlags to InstanceKlass - array classes don't set access flags so don't look for them there. Changes: https://git.openjdk.org/jdk/pull/28371/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28371&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372098 Stats: 155 lines in 29 files changed: 62 ins; 54 del; 39 mod Patch: https://git.openjdk.org/jdk/pull/28371.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28371/head:pull/28371 PR: https://git.openjdk.org/jdk/pull/28371 From liach at openjdk.org Tue Nov 18 15:43:03 2025 From: liach at openjdk.org (Chen Liang) Date: Tue, 18 Nov 2025 15:43:03 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass In-Reply-To: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: On Tue, 18 Nov 2025 13:27:06 GMT, Coleen Phillimore wrote: > ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. > Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August). > Tested with tier1-4. 5-7 in progress. src/hotspot/share/oops/constantPool.cpp line 1228: > 1226: > 1227: // Check constant pool method consistency > 1228: InstanceKlass* callee = InstanceKlass::cast(k); I know a MethodRef can be `[I`, `clone`, `()Ljava/lang/Object;` for `intArray.clone()` Java calls translated by javac. I wonder if this new code would break for such an array callee class. 
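To illustrate the concern about casting a class that may turn out to be an array class, here is a small standalone sketch. The `Toy*` types are invented stand-ins, not the HotSpot `Klass` hierarchy, and the asserting `cast` only mimics the general shape of a checked downcast; how the real `InstanceKlass::cast` behaves in product builds is not shown in this thread.

```cpp
#include <cassert>

// Hypothetical miniature class hierarchy; not HotSpot code.
struct ToyKlass {
  virtual ~ToyKlass() = default;
  virtual bool is_instance_klass() const { return false; }
  virtual bool is_array_klass() const { return false; }
};

struct ToyInstanceKlass : ToyKlass {
  bool _is_interface;
  explicit ToyInstanceKlass(bool is_interface) : _is_interface(is_interface) {}
  bool is_instance_klass() const override { return true; }
  bool is_interface() const { return _is_interface; }

  // Checked downcast: only valid when the caller knows k is an instance class.
  static ToyInstanceKlass* cast(ToyKlass* k) {
    assert(k != nullptr && k->is_instance_klass());
    return static_cast<ToyInstanceKlass*>(k);
  }
};

struct ToyArrayKlass : ToyKlass {
  bool is_array_klass() const override { return true; }
};

// A query that tolerates array classes: test the kind first, then cast.
bool is_interface_or_false(ToyKlass* k) {
  if (!k->is_instance_klass()) {
    return false;  // arrays are never interfaces
  }
  return ToyInstanceKlass::cast(k)->is_interface();
}
```

Calling `ToyInstanceKlass::cast` directly on a `ToyArrayKlass` would trip the assertion, which is the failure mode the review comment asks about for a `[I` callee.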
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2538494116 From mbaesken at openjdk.org Tue Nov 18 15:50:34 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 18 Nov 2025 15:50:34 GMT Subject: RFR: 8371626: [linux] use icf=all for linking libraries [v3] In-Reply-To: References: <6VJIZnsd8K7SI36fEz884BPJi7dctBZxgnBJDOSElgc=.01ec687a-5b8c-4d09-b826-0f155ef71db7@github.com> Message-ID: On Tue, 18 Nov 2025 14:00:33 GMT, Erik Joelsson wrote: > I would like input from someone in Hotspot regarding implications on debugging and if the changes to memory layout could cause any issues. Makes of course sense to ask the hotspot developers too. Implications on debugging would mean that you potentially see the 'folded' method/function name in the debugger. But that should occur too with the opt:icf,8 setting on Windows that is used for some years. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28236#issuecomment-3548301308 From coleenp at openjdk.org Tue Nov 18 15:58:15 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 18 Nov 2025 15:58:15 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass In-Reply-To: References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: On Tue, 18 Nov 2025 14:48:37 GMT, Chen Liang wrote: >> ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. >> Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August). >> Tested with tier1-4. 5-7 in progress. > > src/hotspot/share/oops/constantPool.cpp line 1228: > >> 1226: >> 1227: // Check constant pool method consistency >> 1228: InstanceKlass* callee = InstanceKlass::cast(k); > > I know a MethodRef can be `[I`, `clone`, `()Ljava/lang/Object;` for `intArray.clone()` Java calls translated by javac. I wonder if this new code would break for such an array callee class. 
At one point, I removed is_interface() from class Klass, but then restored it because dependencies uses this a lot and has many Klass parameter types, instead of InstanceKlass. I'll revert this change, but I'm curious why none of the tests failed with this change. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2538747727 From iklam at openjdk.org Tue Nov 18 18:15:58 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 18 Nov 2025 18:15:58 GMT Subject: RFR: 8372045: AOT assembly phase asserts with old class if AOT class linking is disabled In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 07:44:39 GMT, Aleksey Shipilev wrote: >> Old classes should be stored in the AOT cache only if `CDSConfig::is_preserving_verification_constraints() == true`. However, we miss this check in the AOT assembly phase: the `this` class is loaded from the AOT configuration file, which is a special type of AOT cache, so `AOTMetaspace::in_aot_cache(this)` returns true: >> >> >> bool InstanceKlass::can_be_verified_at_dumptime() const { >> if (AOTMetaspace::in_aot_cache(this)) { >> // This is a class that was dumped into the base archive, so we know >> // it was verified at dump time. >> return true; >> } >> >> >> The fix is >> >> ``` >> bool InstanceKlass::can_be_verified_at_dumptime() const { >> if (CDSConfig::is_dumping_dynamic_archive() && AOTMetaspace::in_aot_cache(this)) { >> >> >> as this check is intended to be used only when dumping the dynamic archive. >> >> This bug was found when running a complex application (specJBB) but I created a simple reproducer (OldClassSupport2.java). > > Marked as reviewed by shade (Reviewer). 
Thanks @shipilev @mgronlun for the review ------------- PR Comment: https://git.openjdk.org/jdk/pull/28365#issuecomment-3548946436 From iklam at openjdk.org Tue Nov 18 18:15:59 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 18 Nov 2025 18:15:59 GMT Subject: Integrated: 8372045: AOT assembly phase asserts with old class if AOT class linking is disabled In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 05:05:09 GMT, Ioi Lam wrote: > Old classes should be stored in the AOT cache only if `CDSConfig::is_preserving_verification_constraints() == true`. However, we miss this check in the AOT assembly phase: the `this` class is loaded from the AOT configuration file, which is a special type of AOT cache, so `AOTMetaspace::in_aot_cache(this)` returns true: > > > bool InstanceKlass::can_be_verified_at_dumptime() const { > if (AOTMetaspace::in_aot_cache(this)) { > // This is a class that was dumped into the base archive, so we know > // it was verified at dump time. > return true; > } > > > The fix is > > ``` > bool InstanceKlass::can_be_verified_at_dumptime() const { > if (CDSConfig::is_dumping_dynamic_archive() && AOTMetaspace::in_aot_cache(this)) { > > > as this check is intended to be used only when dumping the dynamic archive. > > This bug was found when running a complex application (specJBB) but I created a simple reproducer (OldClassSupport2.java). This pull request has now been integrated. 
Changeset: b3e408c0 Author: Ioi Lam URL: https://git.openjdk.org/jdk/commit/b3e408c07891b58a312a58ffd756d6a1d18c0f6d Stats: 105 lines in 2 files changed: 104 ins; 0 del; 1 mod 8372045: AOT assembly phase asserts with old class if AOT class linking is disabled Reviewed-by: shade, mgronlun ------------- PR: https://git.openjdk.org/jdk/pull/28365 From iklam at openjdk.org Tue Nov 18 18:17:28 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 18 Nov 2025 18:17:28 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v14] In-Reply-To: References: Message-ID: <0nOY5HF3KuwaVmZeezHv2Davwb9-qau2atRsDdrVGVk=.76916bde-e9ed-4b88-810f-a7d10816ac75@github.com> On Mon, 17 Nov 2025 21:28:58 GMT, Jan Kratochvil wrote: >> With clang-20 using --with-toolchain-type=clang resolveFieldEntry.cpp and resolveMethodEntry.cpp break the build with similar warnings like: >> >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> | (void*) >> >> The patch follows the suggested fix. > > Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: > > update STATIC_ASSERT->static_assert Marked as reviewed by iklam (Reviewer). 
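The shape of the fix suggested by the compiler note can be sketched in isolation. `ToyEntry` is an invented type, and the two `static_assert` conditions are this sketch's own choice of guard rails; the thread only says that `STATIC_ASSERT` was changed to `static_assert`, not what the PR actually checks.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical entry type; not HotSpot's ResolvedFieldEntry. The user-provided
// copy constructor makes it non-trivially copyable, which is the property the
// clang diagnostic complains about.
struct ToyEntry {
  int _index;
  unsigned char _flags;

  ToyEntry() : _index(0), _flags(0) {}
  ToyEntry(const ToyEntry& other) : _index(other._index), _flags(other._flags) {}

  void reset() {
    // Assumed guard rails for this sketch: zero-filling is only reasonable
    // if the layout is plain and nothing needs destruction.
    static_assert(std::is_standard_layout<ToyEntry>::value,
                  "zero-fill assumes a plain layout");
    static_assert(std::is_trivially_destructible<ToyEntry>::value,
                  "zero-fill assumes no destructor work");
    // The explicit cast to void* is what the compiler note recommends.
    std::memset(static_cast<void*>(this), 0, sizeof(*this));
  }
};
```

The cast does not make the operation any safer; it only records that the author has decided a byte-wise clear is acceptable for this type.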
------------- PR Review: https://git.openjdk.org/jdk/pull/26098#pullrequestreview-3479133931 From duke at openjdk.org Tue Nov 18 18:17:29 2025 From: duke at openjdk.org (duke) Date: Tue, 18 Nov 2025 18:17:29 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v14] In-Reply-To: References: Message-ID: <9-d--ESCpf-XW_THFvTahwiEe0lKTgg9c2_F2vX_aRI=.cd0ef9fb-1d63-4593-bd59-790aeef5f7f3@github.com> On Mon, 17 Nov 2025 21:28:58 GMT, Jan Kratochvil wrote: >> With clang-20 using --with-toolchain-type=clang resolveFieldEntry.cpp and resolveMethodEntry.cpp break the build with similar warnings like: >> >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> | (void*) >> >> The patch follows the suggested fix. > > Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: > > update STATIC_ASSERT->static_assert @jankratochvil Your change (at version f85f1066239d3a5a36f9220385245607da0add75) is now ready to be sponsored by a Committer. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26098#issuecomment-3548967397 From duke at openjdk.org Tue Nov 18 18:21:35 2025 From: duke at openjdk.org (ExE Boss) Date: Tue, 18 Nov 2025 18:21:35 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass In-Reply-To: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: On Tue, 18 Nov 2025 13:27:06 GMT, Coleen Phillimore wrote: > ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. > Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August). > Tested with tier1-4. 5-7 in progress. src/hotspot/share/classfile/classFileParser.cpp line 815: > 813: interface_index, CHECK); > 814: if (cp->tag_at(interface_index).is_klass()) { > 815: interf = InstanceKlass::cast(cp->resolved_klass_at(interface_index)); Note that a resolved `CONSTANT_Class` can refer to an array type, so this cast is incorrect. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2539226107 From coleenp at openjdk.org Tue Nov 18 18:55:03 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 18 Nov 2025 18:55:03 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass In-Reply-To: References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: On Tue, 18 Nov 2025 18:15:40 GMT, ExE Boss wrote: >> ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. >> Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August). >> Tested with tier1-4. 5-7 in progress.
> > src/hotspot/share/classfile/classFileParser.cpp line 815: > >> 813: interface_index, CHECK); >> 814: if (cp->tag_at(interface_index).is_klass()) { >> 815: interf = InstanceKlass::cast(cp->resolved_klass_at(interface_index)); > > Note that a resolved `CONSTANT_Class` can refer to an array type, so this cast is incorrect. There are a bunch of tests that we don't have. This would be an error since Interfaces are never arrays, but that's checked later. I'll revert some of these casts (as well as try to write a test for this). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2539318553 From aph at openjdk.org Tue Nov 18 19:07:35 2025 From: aph at openjdk.org (Andrew Haley) Date: Tue, 18 Nov 2025 19:07:35 GMT Subject: RFR: 8308776: [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 In-Reply-To: References: Message-ID: On Thu, 13 Nov 2025 20:41:56 GMT, Dhamoder Nalla wrote: > This PR introduces an optimized AArch64 intrinsic for Math.log using reciprocal refinement and a table-driven polynomial. > Improves throughput for double logarithms while preserving IEEE-754 corner case behavior (±0, subnormals, negatives, NaN). src/hotspot/cpu/aarch64/macroAssembler_aarch64_log.cpp line 271: > 269: fmovd(rscratch1, v0); // rscratch1 = AS_LONG_BITS(X) > 270: lea(rscratch2, ExternalAddress((address)_L_tbl)); > 271: movz(tmp5, 0x7F); Comments are needed here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28306#discussion_r2533464197 From dhanalla at openjdk.org Tue Nov 18 19:07:34 2025 From: dhanalla at openjdk.org (Dhamoder Nalla) Date: Tue, 18 Nov 2025 19:07:34 GMT Subject: RFR: 8308776: [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 Message-ID: This PR introduces an optimized AArch64 intrinsic for Math.log using reciprocal refinement and a table-driven polynomial. Improves throughput for double logarithms while preserving IEEE-754 corner case behavior (±0, subnormals, negatives, NaN).
------------- Commit messages: - [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 - [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 - [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 - [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 - [AArch64] Math.log is 10% slower than StrictMath.log on macosx-aarch64 - [AArch64] Math.log is 10% slower than StrictMath.log on macosx-aarch64 Changes: https://git.openjdk.org/jdk/pull/28306/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28306&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308776 Stats: 544 lines in 7 files changed: 541 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/28306.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28306/head:pull/28306 PR: https://git.openjdk.org/jdk/pull/28306 From liach at openjdk.org Tue Nov 18 20:05:14 2025 From: liach at openjdk.org (Chen Liang) Date: Tue, 18 Nov 2025 20:05:14 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass In-Reply-To: References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: On Tue, 18 Nov 2025 18:50:55 GMT, Coleen Phillimore wrote: >> src/hotspot/share/classfile/classFileParser.cpp line 815: >> >>> 813: interface_index, CHECK); >>> 814: if (cp->tag_at(interface_index).is_klass()) { >>> 815: interf = InstanceKlass::cast(cp->resolved_klass_at(interface_index)); >> >> Note that a resolved `CONSTANT_Class` can refer to an array type, so this cast is incorrect. > > There are a bunch of tests that we don't have. This would be an error since Interfaces are never arrays, but that's checked later. I'll revert some of these casts (as well as try to write a test for this). I thought the cast at line 839 would have handled this. Turns out it has a `!interf->is_interface()` check before so this cast is problematic.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2539538759 From matsaave at openjdk.org Tue Nov 18 20:20:56 2025 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 18 Nov 2025 20:20:56 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters Message-ID: The method size_of_parameters() is sometimes used as if it represents the number of arguments rather than the size in bytes of the arguments. This patch changes some of these instances to the correct result or renames some of the variables to the desired result. Verified with tier 1-5 tests. ------------- Commit messages: - 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters Changes: https://git.openjdk.org/jdk/pull/28380/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28380&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347248 Stats: 14 lines in 6 files changed: 9 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/28380.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28380/head:pull/28380 PR: https://git.openjdk.org/jdk/pull/28380 From iklam at openjdk.org Tue Nov 18 21:43:31 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 18 Nov 2025 21:43:31 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 20:12:20 GMT, Matias Saavedra Silva wrote: > The method size_of_parameters() is sometimes used as if it represents the number of arguments rather than the size in bytes of the arguments. This patch changes some of these instances to the correct result or renames some of the variables to the desired result. Verified with tier 1-5 tests. 
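The difference between "how many parameters" and "how much argument space" is easy to see on a JVM method descriptor, where `long` and `double` each take two slots. The sketch below is a standalone illustration with invented names (`ParamInfo`, `describe_params`); it is not the `Fingerprinter` code, it assumes a well-formed descriptor, and its treatment of the receiver (one extra slot for non-static methods, not counted as a parameter) is an assumption of the sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type: declared parameter count versus occupied slots.
struct ParamInfo {
  int count;
  int slots;
};

// Walks the parameter part of a descriptor such as "(IJ[DLjava/lang/String;)V".
ParamInfo describe_params(const std::string& descriptor, bool is_static) {
  ParamInfo info{0, 0};
  if (!is_static) {
    info.slots = 1;  // assumed: the receiver occupies one slot
  }
  std::size_t i = 1;  // skip the opening '('
  while (i < descriptor.size() && descriptor[i] != ')') {
    bool is_array = false;
    while (i < descriptor.size() && descriptor[i] == '[') {
      is_array = true;  // any array is a single reference
      i++;
    }
    if (i >= descriptor.size()) break;  // malformed input: stop
    char kind = descriptor[i];
    if (kind == 'L') {
      std::size_t end = descriptor.find(';', i);
      if (end == std::string::npos) break;  // malformed input: stop
      i = end;  // skip over the class name
    }
    i++;
    info.count++;
    // Only non-array long and double values take two slots.
    info.slots += (!is_array && (kind == 'J' || kind == 'D')) ? 2 : 1;
  }
  return info;
}
```

For a static method with descriptor `(IJ[DLjava/lang/String;)V` this reports four parameters but five slots, so using one number where the other is meant goes unnoticed only while no `long` or `double` arguments (and no receiver) are involved.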
src/hotspot/share/jvmci/jvmciCodeInstaller.cpp line 857: > 855: void CodeInstaller::initialize_fields(HotSpotCompiledCodeStream* stream, u1 code_flags, methodHandle& method, CodeBuffer& buffer, JVMCI_TRAPS) { > 856: if (!method.is_null()) { > 857: _parameter_count = method->number_of_parameters(); _parameter_count is used to create an OopMap, so it must be the number of words occupied by all the arguments, not the number of parameters. https://github.com/openjdk/jdk/blob/27a38d9093958ae4851bc61b8d3f0d71dc780823/src/hotspot/share/jvmci/jvmciCodeInstaller.cpp#L263 src/hotspot/share/prims/jni.cpp line 871: > 869: ResourceMark rm(THREAD); > 870: int size_of_parameters = method->size_of_parameters(); > 871: JavaCallArguments java_args(size_of_parameters); The original code was harmless. It creates the JavaCallArguments with more space than necessary, but doesn't affect the actual number of parameters that are passed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2539653885 PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2539631698 From epeter at openjdk.org Tue Nov 18 21:49:40 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 18 Nov 2025 21:49:40 GMT Subject: RFR: 8308776: [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 In-Reply-To: References: Message-ID: <9s7p49-bYB_amcD0q2XuEpeNPy4Ud1p38pE-UHEyB7c=.b9ef99f0-2381-4a83-856a-bc0dd273f4f1@github.com> On Thu, 13 Nov 2025 20:41:56 GMT, Dhamoder Nalla wrote: > This PR introduces an optimized AArch64 intrinsic for Math.log using reciprocal refinement and a table-driven polynomial. > Improves throughput for double logarithms while preserving IEEE-754 corner case behavior (±0, subnormals, negatives, NaN). Drive-by, cannot promise a full review. But I'm interested ;) Mostly, I have questions about testing. Are there already tests for accuracy somewhere? Do you have any benchmark results to support this PR?
It would be good if we had a way to prove that performance is good for all sorts of inputs. I suppose we don't have any loops here, so we should just make sure to benchmark cases so that all possible paths of the intrinsic are covered, right? src/hotspot/cpu/aarch64/c1_LIRGenerator_aarch64.cpp line 835: > 833: break; > 834: case vmIntrinsics::_dlog: > 835: // Math.log intrinsic is not implemented on AArch64 (see JDK-8210858), Drive-by comment, and since you are removing this comment: What is the state of https://bugs.openjdk.org/browse/JDK-8210858 ? src/hotspot/cpu/aarch64/macroAssembler_aarch64_log.cpp line 3: > 1: /* Copyright (c) 2018, Cavium. All rights reserved. (By BELLSOFT) > 2: * Copyright (c) 2016, 2021, Intel Corporation. All rights reserved. > 3: * Intel Math Library (LIBM) Source Code Are the dates supposed to be updated? Maybe not, just asking. test/jdk/java/lang/Math/TestLogMinValue.java line 28: > 26: * @bug 8308776 > 27: * @build Tests > 28: * @summary Compare Math.log and StrictMath.log for Double.MIN_VALUE (denormal smallest positive) to ensure consistency. Are there tests that check for consistency of the other values? test/jdk/java/lang/Math/TestLogMonotonicity.java line 53: > 51: double nv = v * 2.0; > 52: if (nv == v) > 53: break; Suggestion: if (nv == v) { break; } test/jdk/java/lang/Math/TestLogMonotonicity.java line 61: > 59: // Powers of two 2^1 .. 2^16 > 60: for (int i = 1; i <= 16; i++) { > 61: list.add(Math.pow(2.0, i)); It seems you now only cover powers of 2, right? Is this sufficient? I don't know what other tests already exist, so maybe this is already covered elsewhere? 
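The coverage question above can be made concrete. What follows is a minimal sketch, assuming a hypothetical helper class `LogMonotonicitySketch` that is not part of the PR or of the existing test suite: it walks adjacent doubles with `Math.nextUp` and reports the first input at which `Math.log` decreases, which is a stricter check than stepping through powers of two. The start values and step count are illustrative choices only.

```java
public final class LogMonotonicitySketch {
    /**
     * Walks 'steps' adjacent doubles upward from 'start' and returns the first
     * value whose Math.log is smaller than the log of its predecessor.
     * Returns NaN if no such value is found.
     */
    public static double firstViolation(double start, int steps) {
        double prev = start;
        double prevLog = Math.log(prev);
        for (int i = 0; i < steps; i++) {
            double next = Math.nextUp(prev);   // the very next representable double
            double nextLog = Math.log(next);
            if (nextLog < prevLog) {
                return next;                   // log decreased: monotonicity violated
            }
            prev = next;
            prevLog = nextLog;
        }
        return Double.NaN;
    }

    public static void main(String[] args) {
        // Illustrative start points: subnormal, smallest normal, around 1, large.
        double[] starts = {Double.MIN_VALUE, Double.MIN_NORMAL, 0.5, 1.0, Math.E, 1e300};
        for (double s : starts) {
            double bad = firstViolation(s, 100_000);
            if (!Double.isNaN(bad)) {
                throw new AssertionError("Math.log decreased at " + bad);
            }
        }
        System.out.println("no monotonicity violation found");
    }
}
```

A walk like this only samples short runs of neighbours, so it complements rather than replaces broader accuracy tests.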
------------- PR Review: https://git.openjdk.org/jdk/pull/28306#pullrequestreview-3479700917 PR Review Comment: https://git.openjdk.org/jdk/pull/28306#discussion_r2539647684 PR Review Comment: https://git.openjdk.org/jdk/pull/28306#discussion_r2539650297 PR Review Comment: https://git.openjdk.org/jdk/pull/28306#discussion_r2539667697 PR Review Comment: https://git.openjdk.org/jdk/pull/28306#discussion_r2539657565 PR Review Comment: https://git.openjdk.org/jdk/pull/28306#discussion_r2539669991 From epeter at openjdk.org Tue Nov 18 21:49:41 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 18 Nov 2025 21:49:41 GMT Subject: RFR: 8308776: [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 In-Reply-To: <9s7p49-bYB_amcD0q2XuEpeNPy4Ud1p38pE-UHEyB7c=.b9ef99f0-2381-4a83-856a-bc0dd273f4f1@github.com> References: <9s7p49-bYB_amcD0q2XuEpeNPy4Ud1p38pE-UHEyB7c=.b9ef99f0-2381-4a83-856a-bc0dd273f4f1@github.com> Message-ID: On Tue, 18 Nov 2025 20:49:33 GMT, Emanuel Peter wrote: >> This PR introduces an optimized AArch64 intrinsic for Math.log using reciprocal refinement and a table-driven polynomial. >> Improves throughput for double logarithms while preserving IEEE-754 corner case behavior (±0, subnormals, negatives, NaN). > > test/jdk/java/lang/Math/TestLogMinValue.java line 28: > >> 26: * @bug 8308776 >> 27: * @build Tests >> 28: * @summary Compare Math.log and StrictMath.log for Double.MIN_VALUE (denormal smallest positive) to ensure consistency. > > Are there tests that check for consistency of the other values? Do we have tests that already check for sufficient accuracy?
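On the accuracy question: the `Math.log` specification allows a result within 1 ulp of the exact value, so a comparison against `StrictMath.log` would normally use a tolerance rather than `==`. Below is a minimal sketch, assuming a hypothetical helper `withinUlps` and an illustrative bound of 2 ulps (on the reasoning that each of the two results may be up to 1 ulp away from the exact value). Neither the helper nor the bound is taken from the PR.

```java
public final class LogUlpSketch {
    /** True if a and b agree to within 'ulps' units in the last place. */
    public static boolean withinUlps(double a, double b, int ulps) {
        if (Double.isNaN(a) || Double.isNaN(b)) {
            return Double.isNaN(a) && Double.isNaN(b);   // NaN only matches NaN
        }
        if (Double.isInfinite(a) || Double.isInfinite(b)) {
            return a == b;                               // same signed infinity
        }
        return Math.abs(a - b) <= ulps * Math.max(Math.ulp(a), Math.ulp(b));
    }

    /** Compares Math.log and StrictMath.log at x with the illustrative 2-ulp bound. */
    public static boolean consistent(double x) {
        return withinUlps(Math.log(x), StrictMath.log(x), 2);
    }

    public static void main(String[] args) {
        // Corner cases named in the PR description plus a few ordinary values.
        double[] inputs = {
            0.0, -0.0, Double.MIN_VALUE, Double.MIN_NORMAL, 1.0, 2.0,
            Double.MAX_VALUE, -1.0, Double.NaN, Double.POSITIVE_INFINITY
        };
        for (double x : inputs) {
            if (!consistent(x)) {
                throw new AssertionError("Math.log and StrictMath.log disagree at " + x);
            }
        }
        System.out.println("all sampled inputs agree within tolerance");
    }
}
```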
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28306#discussion_r2539670255 From darcy at openjdk.org Tue Nov 18 21:49:45 2025 From: darcy at openjdk.org (Joe Darcy) Date: Tue, 18 Nov 2025 21:49:45 GMT Subject: RFR: 8308776: [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 In-Reply-To: References: Message-ID: On Thu, 13 Nov 2025 20:41:56 GMT, Dhamoder Nalla wrote: > This PR introduces an optimized AArch64 intrinsic for Math.log using reciprocal refinement and a table-driven polynomial. > Improves throughput for double logarithms while preserving IEEE-754 corner case behavior (±0, subnormals, negatives, NaN). test/jdk/java/lang/Math/TestLogMinValue.java line 40: > 38: } > 39: if (mathLog != strictLog) { > 40: throw new AssertionError("Mismatch: Math.log=" + mathLog + " StrictMath.log=" + strictLog); Is this assertion justified by the Math.log specification? test/jdk/java/lang/Math/TestLogMonotonicity.java line 29: > 27: * @run main TestLogMonotonicity > 28: */ > 29: public class TestLogMonotonicity { So the test is checking for monotonicity over values that are 2X the previous value? That is a very weak test. In other math library regression tests we test for monotonicity on successive values. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28306#discussion_r2539695976 PR Review Comment: https://git.openjdk.org/jdk/pull/28306#discussion_r2539698410 From coleenp at openjdk.org Tue Nov 18 21:50:35 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 18 Nov 2025 21:50:35 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v2] In-Reply-To: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: > ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. > Fixed SA and jvmci.
@iwanowww Can you check that I changed C2 correctly (we talked about this in August). > Tested with tier1-4. 5-7 in progress. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Revert two InstanceKlass::cast() calls that might not be InstanceKlass. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28371/files - new: https://git.openjdk.org/jdk/pull/28371/files/80079012..e8973f59 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28371&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28371&range=00-01 Stats: 5 lines in 2 files changed: 0 ins; 1 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/28371.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28371/head:pull/28371 PR: https://git.openjdk.org/jdk/pull/28371 From coleenp at openjdk.org Tue Nov 18 21:50:38 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 18 Nov 2025 21:50:38 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v2] In-Reply-To: References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: On Tue, 18 Nov 2025 20:02:09 GMT, Chen Liang wrote: >> There are a bunch of tests that we don't have. This would be an error since Interfaces are never arrays, but that's checked later. I'll revert some of these casts (as well as try to write a test for this). > > I thought the cast at line 839 would have handled this. Turns out it has a `!interf->is_interface()` check before so this cast is problematic. There's a reason this isn't tested. The constant pool reference for JVM_CONSTANT_Class is tested to be resolved at line 814 and because it's an interface, it's not easy to be resolved by this point (unless it's a duplicate class in which case it will be an InstanceKlass). The array case goes through the else part of this and throws a ClassFormatError. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2539645039 From iklam at openjdk.org Tue Nov 18 21:54:44 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 18 Nov 2025 21:54:44 GMT Subject: RFR: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type [v14] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 21:28:58 GMT, Jan Kratochvil wrote: >> With clang-20 using --with-toolchain-type=clang resolveFieldEntry.cpp and resolveMethodEntry.cpp break the build with similar warnings like: >> >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning >> 49 | memset(this, 0, sizeof(*this)); >> | ^ >> | (void*) >> >> The patch follows the suggested fix. > > Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision: > > update STATIC_ASSERT->static_assert I tested the latest change in our CI with tiers 1, 2 and build-tier-5. No regression. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26098#issuecomment-3549573028 From jkratochvil at openjdk.org Tue Nov 18 21:54:45 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Tue, 18 Nov 2025 21:54:45 GMT Subject: Integrated: 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type In-Reply-To: References: Message-ID: On Wed, 2 Jul 2025 16:29:27 GMT, Jan Kratochvil wrote: > With clang-20 using --with-toolchain-type=clang resolveFieldEntry.cpp and resolveMethodEntry.cpp break the build with similar warnings like: > > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: error: first argument in call to 'memset' is a pointer to non-trivially copyable type 'ResolvedFieldEntry' [-Werror,-Wnontrivial-memcall] > 49 | memset(this, 0, sizeof(*this)); > | ^ > src/hotspot/share/oops/resolvedFieldEntry.cpp:49:10: note: explicitly cast the pointer to silence this warning > 49 | memset(this, 0, sizeof(*this)); > | ^ > | (void*) > > The patch follows the suggested fix. This pull request has now been integrated. Changeset: 66fb0152 Author: Jan Kratochvil Committer: Ioi Lam URL: https://git.openjdk.org/jdk/commit/66fb015267058f9b5e6788eaeaa758be56ba553e Stats: 129 lines in 4 files changed: 40 ins; 77 del; 12 mod 8357579: Compilation error: first argument in call to 'memset' is a pointer to non-trivially copyable type Co-authored-by: Ioi Lam Reviewed-by: iklam, asmehra ------------- PR: https://git.openjdk.org/jdk/pull/26098 From dholmes at openjdk.org Tue Nov 18 22:11:03 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 18 Nov 2025 22:11:03 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 20:12:20 GMT, Matias Saavedra Silva wrote: > The method size_of_parameters() is sometimes used as if it represents the number of arguments rather than the size in bytes of the arguments. 
This patch changes some of these instances to the correct result or renames some of the variables to the desired result. Verified with tier 1-5 tests. src/hotspot/share/oops/method.cpp line 736: > 734: for (SignatureStream ss(signature()); !ss.at_return_type(); ss.next()) { > 735: count++; > 736: } Shouldn't we do this at construction time and store into a field and return that here? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2539756839 From iklam at openjdk.org Tue Nov 18 22:19:06 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 18 Nov 2025 22:19:06 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 22:01:21 GMT, David Holmes wrote: >> The method size_of_parameters() is sometimes used as if it represents the number of arguments rather than the size in bytes of the arguments. This patch changes some of these instances to the correct result or renames some of the variables to the desired result. Verified with tier 1-5 tests. > > src/hotspot/share/oops/method.cpp line 736: > >> 734: for (SignatureStream ss(signature()); !ss.at_return_type(); ss.next()) { >> 735: count++; >> 736: } > > Shouldn't we do this at construction time and store into a field and return that here? That would increase footprint. It looks like we don't need this function after all. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2539793227 From iklam at openjdk.org Tue Nov 18 22:19:08 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 18 Nov 2025 22:19:08 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 20:28:38 GMT, Ioi Lam wrote: >> The method size_of_parameters() is sometimes used as if it represents the number of arguments rather than the size in bytes of the arguments. 
This patch changes some of these instances to the correct result or renames some of the variables to the desired result. Verified with tier 1-5 tests. > > src/hotspot/share/prims/jni.cpp line 871: > >> 869: ResourceMark rm(THREAD); >> 870: int size_of_parameters = method->size_of_parameters(); >> 871: JavaCallArguments java_args(size_of_parameters); > > The original code was harmless. It creates the JavaCallArguments with more space than necessary, but doesn't affect the actual number of parameters that are passed. My previous comment is wrong. The original code is correct. `JavaCallArguments::_max_size` is number of words. A `long` argument counts as two words. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2539791878 From matsaave at openjdk.org Tue Nov 18 22:24:11 2025 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 18 Nov 2025 22:24:11 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 22:16:18 GMT, Ioi Lam wrote: >> src/hotspot/share/oops/method.cpp line 736: >> >>> 734: for (SignatureStream ss(signature()); !ss.at_return_type(); ss.next()) { >>> 735: count++; >>> 736: } >> >> Shouldn't we do this at construction time and store into a field and return that here? > > That would increase footprint. It looks like we don't need this function after all. 
Oops I forgot to push the commit that caches this value ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2539805680 From duke at openjdk.org Tue Nov 18 22:35:38 2025 From: duke at openjdk.org (Ruben) Date: Tue, 18 Nov 2025 22:35:38 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 [v3] In-Reply-To: References: Message-ID: > The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. > > The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. > > According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. Ruben has updated the pull request incrementally with one additional commit since the last revision: Replace `guarantee` with `assert` in the C1 `emit_deopt_handler` ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28192/files - new: https://git.openjdk.org/jdk/pull/28192/files/20cc58a3..3a014376 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28192&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28192&range=01-02 Stats: 10 lines in 5 files changed: 0 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/28192.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28192/head:pull/28192 PR: https://git.openjdk.org/jdk/pull/28192 From eosterlund at openjdk.org Wed Nov 19 00:11:49 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Wed, 19 Nov 2025 00:11:49 GMT Subject: RFR: 8371649: ZGC: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation In-Reply-To: References: Message-ID: On Thu, 13 Nov 2025 19:35:53 GMT, Erik 
Österlund wrote: >>> Hi Erik (@fisk), >>> >>> Could you also please take a look, just in case the fence was intentionally put there? >> >> The way I look at it, the fence was there for hardware that is unsophisticated enough to require manual cache flushing instead of having cache coherency that understands instruction edits, and at the same time has unsophisticated enough fences that are not speculated across such that the buffered store hits the cache before invalidating the cache, and not after, which would be awkward. >> >> It is certainly possible that in practice the cache invalidation facilities also do the right level of fencing. So this is mostly just defensive programming. >> >> If I flip the question around - how confident do you feel on a scale from 1 to 10 that the cache invalidation mechanism guarantees across all implementations, that the preceding store is flushed out to the caches before the cache is flushed? This is an area of the code where I don't want to take chances and slip unless we feel a high level of confidence. > >> @fisk , I'm assuming that no other thread is executing the target instructions while we're patching them. > > Indeed; no concurrent thread is executing the instructions being modified. > > > > @fisk , I'm assuming that no other thread is executing the target instructions while we're patching them. > > > > > > > > > > > > Indeed; no concurrent thread is executing the instructions being modified. > > > > > > So, this confirms the redundancy of the `fence`, doesn't it? > > > > Not really, no, I was just checking. I'm pretty sure that the fence is redundant, though. Right. This has mostly been an extra seat belt that seemed harmless when driving normally and potentially useful should there be a dangerous situation. If we think this seat belt is redundant because we already got an air bag, and it makes us feel uncomfortable to wear seat belts unnecessarily, then sure we can remove it. I don't mind.
------------- PR Comment: https://git.openjdk.org/jdk/pull/28244#issuecomment-3549977757 From dlong at openjdk.org Wed Nov 19 00:47:18 2025 From: dlong at openjdk.org (Dean Long) Date: Wed, 19 Nov 2025 00:47:18 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 20:12:20 GMT, Matias Saavedra Silva wrote: > The method size_of_parameters() is sometimes used as if it represents the number of arguments rather than the size in bytes of the arguments. This patch changes some of these instances to the correct result or renames some of the variables to the desired result. Verified with tier 1-5 tests. src/hotspot/share/ci/ciMethodData.cpp line 557: > 555: mdo->set_arg_stack(_arg_stack); > 556: mdo->set_arg_returned(_arg_returned); > 557: int arg_count = mdo->method()->number_of_parameters(); The actual size allocated seems to be based on the argument size in slots, not count. https://github.com/openjdk/jdk/blob/902aa4dcd297fef34cb302e468b030c48665ec84/src/hotspot/share/oops/methodData.cpp#L1359-L1360 To avoid any confusion, consider using the limit from ArgInfoData::number_of_args(), which would be better named size_of_args(). src/hotspot/share/prims/whitebox.cpp line 1272: > 1270: mdo->init(); > 1271: ResourceMark rm(THREAD); > 1272: int arg_count = mdo->method()->number_of_parameters(); See my comment for ciMethodData::update_escape_info(). Same issue. 
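The count-versus-size distinction discussed in this thread can be illustrated outside HotSpot. The following is a minimal sketch, assuming a hypothetical class `ParameterSizeSketch` that parses a JVM method descriptor: it returns both the number of declared parameters and the number of slots (words) they occupy, where `long` and `double` take two slots each. Counting one extra slot for the receiver of a non-static method is an assumption of this sketch, and the receiver is deliberately left out of the parameter count.

```java
public final class ParameterSizeSketch {
    /**
     * Returns {parameterCount, slotCount} for a JVM method descriptor such as
     * "(IJLjava/lang/String;[D)V". Assumes a well-formed descriptor.
     */
    public static int[] countAndSlots(String descriptor, boolean isStatic) {
        int count = 0;
        int slots = isStatic ? 0 : 1;          // assumption: receiver takes one slot
        int i = descriptor.indexOf('(') + 1;
        while (descriptor.charAt(i) != ')') {
            boolean isArray = false;
            while (descriptor.charAt(i) == '[') {   // skip array dimensions
                isArray = true;
                i++;
            }
            char c = descriptor.charAt(i);
            if (c == 'L') {
                i = descriptor.indexOf(';', i);     // jump to the end of the class name
            }
            i++;                                    // step past this parameter
            count++;
            // Only non-array long and double are two slots; references are one.
            slots += (!isArray && (c == 'J' || c == 'D')) ? 2 : 1;
        }
        return new int[] {count, slots};
    }

    public static void main(String[] args) {
        int[] r = countAndSlots("(IJLjava/lang/String;[D)V", true);
        System.out.println("parameters=" + r[0] + " slots=" + r[1]);
    }
}
```

For the descriptor in `main`, the four parameters occupy five slots, so sizing a buffer by the parameter count alone would fall one word short.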
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2540064259 PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2540067055 From dlong at openjdk.org Wed Nov 19 01:42:03 2025 From: dlong at openjdk.org (Dean Long) Date: Wed, 19 Nov 2025 01:42:03 GMT Subject: RFR: 8308776: [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 In-Reply-To: References: Message-ID: <5Z9vlKxZLXWCQR6MmvxJF16gqSblYGxMeRZPQjVeNOw=.3fc4f163-533e-4bed-8666-b82de2c3f7e4@github.com> On Thu, 13 Nov 2025 20:41:56 GMT, Dhamoder Nalla wrote: > This PR introduces an optimized AArch64 intrinsic for Math.log using reciprocal refinement and a table-driven polynomial. > Improves throughput for double logarithms while preserving IEEE-754 corner case behavior (±0, subnormals, negatives, NaN). If it was explained somewhere why Math.log is 10% slower than StrictMath.log, I missed it. I would naively assume that Math.log can use the StrictMath.log implementation and get the same performance. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28306#issuecomment-3550193135 From darcy at openjdk.org Wed Nov 19 02:10:22 2025 From: darcy at openjdk.org (Joe Darcy) Date: Wed, 19 Nov 2025 02:10:22 GMT Subject: RFR: 8308776: [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 In-Reply-To: <5Z9vlKxZLXWCQR6MmvxJF16gqSblYGxMeRZPQjVeNOw=.3fc4f163-533e-4bed-8666-b82de2c3f7e4@github.com> References: <5Z9vlKxZLXWCQR6MmvxJF16gqSblYGxMeRZPQjVeNOw=.3fc4f163-533e-4bed-8666-b82de2c3f7e4@github.com> Message-ID: On Wed, 19 Nov 2025 01:38:40 GMT, Dean Long wrote: > If it was explained somewhere why Math.log is 10% slower than StrictMath.log, I missed it. I would naively assume that Math.log can use the StrictMath.log implementation and get the same performance. Yes, the methods in StrictMath are a legal implementation of the corresponding methods in Math.
------------- PR Comment: https://git.openjdk.org/jdk/pull/28306#issuecomment-3550343344 From fyang at openjdk.org Wed Nov 19 04:05:17 2025 From: fyang at openjdk.org (Fei Yang) Date: Wed, 19 Nov 2025 04:05:17 GMT Subject: RFR: 8371869: RISC-V: too many warnings when build on BPI-F3 SBC [v2] In-Reply-To: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com> References: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com> Message-ID: > Hi, please consider this riscv-specific change. > > I witnessed 400+ warning messages when doing a native build on BPI-F3 SBC running kernel 6.6.63: > > `OpenJDK 64-Bit Server VM warning: Cannot enable UseZvfh, it's missing dependent extension(s) v (disabled), Zfh (enabled)` > > The warning messages indicate that we won't auto-enable extensions like `Zvfh` due to lack of vector support on old kernels. > I think these warning messages could be confusing to people. It might be more reasonable to just log these messages. > This also unifies the logging by preferring `log_info`. It doesn't seem necessary to me to use `log_debug` in this case. > > After this change, the log on BPI-F3 SBC looks like: > > $ java -Xlog:all -version > > ...... > [0.011s][info][os ] Linux kernels before 6.8.5 (current 6.6.63) have a known bug when using Vector and signals. > [0.011s][info][os ] Vector not enabled automatically via hwprobe, but can be turned on with -XX:+UseRVV.
> [0.011s][info][os,cpu ] Enabled RV64 feature "a" > [0.011s][info][os,cpu ] Enabled RV64 feature "c" > [0.011s][info][os,cpu ] Enabled RV64 feature "d" > [0.011s][info][os,cpu ] Enabled RV64 feature "f" > [0.011s][info][os,cpu ] Enabled RV64 feature "i" > [0.011s][info][os,cpu ] Enabled RV64 feature "m" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zba" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zbb" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zbs" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zfh" > [0.011s][info][os,cpu ] Enabled RV64 feature "Zfhmin" > [0.011s][info][os,cpu ] Disabled RV64 feature "Zvfh" (missing dependent extension(s): v (disabled), Zfh (enabled)) > [0.011s][info][os,cpu ] Enabled RV64 feature "marchid" (-9223372035378380799) > [0.011s][info][os,cpu ] Enabled RV64 feature "mimpid" (1152921505839391232) > [0.011s][info][os,cpu ] Enabled RV64 feature "mvendorid" (1808) > [0.011s][info][os,cpu ] Enabled RV64 feature "satp_mode" (39) > [0.011s][info][os,cpu ] Enabled RV64 feature "unaligned_scalar" (3) > [0.011s][info][os,cpu ] Enabled RV64 feature "zicboz_block_size" (64) > [0.011s][info][os,cpu ] Zifencei not found, required by Linux, enabling. > [0.012s][info][os,cpu ] CPU: total 8 (initial active 8) spacemit,x60 rv64 rva rvc rvd rvf rvi rvm zba zbb zbs zfh zfhmin > ...... 
Fei Yang has updated the pull request incrementally with one additional commit since the last revision: Review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28340/files - new: https://git.openjdk.org/jdk/pull/28340/files/75d0d60c..5c7c1255 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28340&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28340&range=00-01 Stats: 13 lines in 2 files changed: 0 ins; 8 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/28340.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28340/head:pull/28340 PR: https://git.openjdk.org/jdk/pull/28340 From fyang at openjdk.org Wed Nov 19 04:05:19 2025 From: fyang at openjdk.org (Fei Yang) Date: Wed, 19 Nov 2025 04:05:19 GMT Subject: RFR: 8371869: RISC-V: too many warnings when build on BPI-F3 SBC [v2] In-Reply-To: <-qb8rWPJSXl2dI-tbY1W35-w-cj18RFmYYyC5PaPdF4=.90c7c9fc-5d5d-4fae-b0ee-ea9ce487e50f@github.com> References: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com> <-qb8rWPJSXl2dI-tbY1W35-w-cj18RFmYYyC5PaPdF4=.90c7c9fc-5d5d-4fae-b0ee-ea9ce487e50f@github.com> Message-ID: On Tue, 18 Nov 2025 15:03:52 GMT, Hamlin Li wrote: >> Fei Yang has updated the pull request incrementally with one additional commit since the last revision: >> >> Review > > src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp line 110: > >> 108: } >> 109: >> 110: void VM_Version::RVExtFeatureValue::log_disabled(const char* reason) { > > Seems we don't need to distinguish whether `reason` is nullptr or not, as `log_disabled` is called only when it's disabled because of a dependency check failure. Makes sense. Removed the nullptr check. > src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp line 122: > >> 120: } >> 121: >> 122: void VM_Version::RVNonExtFeatureValue::log_disabled(const char* reason) { > > Is `log_disabled` in RVNonExtFeatureValue invoked in some condition? No.
But like `virtual void log_enabled() = 0;`, `log_disabled` is also a pure virtual function in the base class. So it has to be implemented in all the subclasses. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28340#discussion_r2540438067 PR Review Comment: https://git.openjdk.org/jdk/pull/28340#discussion_r2540440309 From dlong at openjdk.org Wed Nov 19 04:20:44 2025 From: dlong at openjdk.org (Dean Long) Date: Wed, 19 Nov 2025 04:20:44 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 [v3] In-Reply-To: References: Message-ID: <2VkzhmbwgWzKma52fXgAHF-zi2Asd54tED_c57K8WOk=.6ee5549b-2136-489d-a71b-ba468b1e1de0@github.com> On Tue, 18 Nov 2025 22:35:38 GMT, Ruben wrote: >> The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. >> >> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. >> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > > Ruben has updated the pull request incrementally with one additional commit since the last revision: > > Replace `guarantee` with `assert` in the C1 `emit_deopt_handler` src/hotspot/cpu/aarch64/nativeInst_aarch64.hpp line 539: > 537: // the return address points to the deopt handler stub code entry point which could be > 538: // at the end of page. 
> 539: first_check_size = 4 Suggestion: first_check_size = instruction_size ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28192#discussion_r2540463385 From dlong at openjdk.org Wed Nov 19 04:27:09 2025 From: dlong at openjdk.org (Dean Long) Date: Wed, 19 Nov 2025 04:27:09 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 [v3] In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 22:35:38 GMT, Ruben wrote: >> The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. >> >> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. >> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > > Ruben has updated the pull request incrementally with one additional commit since the last revision: > > Replace `guarantee` with `assert` in the C1 `emit_deopt_handler` src/hotspot/cpu/s390/nativeInst_s390.hpp line 657: > 655: // code entry point, then it has to happen in two stages - to prevent out of bounds access > 656: // in case the return address points to the entry point which could be at the end of page. 
> 657: first_check_size = 0 Suggestion: first_check_size = 0 // check is unimplemented ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28192#discussion_r2540471158 From dlong at openjdk.org Wed Nov 19 04:31:41 2025 From: dlong at openjdk.org (Dean Long) Date: Wed, 19 Nov 2025 04:31:41 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 In-Reply-To: References: <-h6G9ajUWQwDRcUMOtyI_YCUCkXz3pzRggJk_UaxM-0=.a8c772aa-2f09-48c0-9cfb-17e624393eb0@github.com> Message-ID: On Mon, 17 Nov 2025 12:10:45 GMT, Ruben wrote: > Is there any additional testing you would recommend to perform before this can be integrated? Oracle likes to make sure the final version passes in our CI. I got burned last time testing an earlier version and not the final version. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28192#issuecomment-3550665267 From dholmes at openjdk.org Wed Nov 19 05:29:34 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 19 Nov 2025 05:29:34 GMT Subject: RFR: 8371923: Update LockFreeStack for Atomic [v2] In-Reply-To: References: <8T9_DwoQj-__PXY0iLOJTyWMbXp3sqSz-CXGxZVbrRk=.f61223e7-fb77-41ae-bb5d-f32244505b5b@github.com> Message-ID: On Tue, 18 Nov 2025 10:48:57 GMT, Kim Barrett wrote: >> Please review this change to the `LockFreeStack` utility to allow clients to >> use `Atomic` as the type of the "next" member used in the linked-list >> representation of the stack. It also continues to allow clients to use the old >> (pre-`Atomic`) form where the "next" member is volatile. This allows >> clients to be updated incrementally after this change, rather than requiring >> all clients to be updated in conjunction with the update of this class. Once >> all clients have been updated, support for the old form can be removed. >> >> The associated gtests have been updated to use `Atomic`, with testing of >> the old form is no longer being done. 
The non-updated uses provide some >> testing, and that's all expected to go away soon. So parameterizing the gtests >> for both forms seems like a bunch of work that will just be deleted soon, with >> very little benefit. >> >> Testing: mach5 tier1 > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into lock-free-stack-allows-new-atomic > - rename next_access to next_accessor > - LockFreeStack supports Atomic Overall looks good. Thanks. > This allows clients to be updated incrementally after this change, rather than requiring all clients to be updated in conjunction with the update of this class. @kimbarrett I only see four clients of this code: ./share/compiler/oopMap.cpp: typedef LockFreeStack List; ./share/gc/g1/g1MonotonicArena.hpp: using SegmentStack = LockFreeStack; ./share/gc/shared/freeListAllocator.hpp: typedef LockFreeStack Stack; ./share/gc/shared/bufferNode.hpp: typedef LockFreeStack Stack; is the required update so disruptive that you can't just do it here? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28329#issuecomment-3550868212 From kbarrett at openjdk.org Wed Nov 19 07:24:17 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 19 Nov 2025 07:24:17 GMT Subject: RFR: 8371923: Update LockFreeStack for Atomic [v2] In-Reply-To: References: <8T9_DwoQj-__PXY0iLOJTyWMbXp3sqSz-CXGxZVbrRk=.f61223e7-fb77-41ae-bb5d-f32244505b5b@github.com> Message-ID: On Tue, 18 Nov 2025 10:48:57 GMT, Kim Barrett wrote: >> Please review this change to the `LockFreeStack` utility to allow clients to >> use `Atomic` as the type of the "next" member used in the linked-list >> representation of the stack. It also continues to allow clients to use the old >> (pre-`Atomic`) form where the "next" member is volatile. 
This allows >> clients to be updated incrementally after this change, rather than requiring >> all clients to be updated in conjunction with the update of this class. Once >> all clients have been updated, support for the old form can be removed. >> >> The associated gtests have been updated to use `Atomic`, with testing of >> the old form is no longer being done. The non-updated uses provide some >> testing, and that's all expected to go away soon. So parameterizing the gtests >> for both forms seems like a bunch of work that will just be deleted soon, with >> very little benefit. >> >> Testing: mach5 tier1 > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into lock-free-stack-allows-new-atomic > - rename next_access to next_accessor > - LockFreeStack supports Atomic > Overall looks good. Thanks. > > > This allows clients to be updated incrementally after this change, rather than requiring all clients to be updated in conjunction with the update of this class. > > @kimbarrett I only see four clients of this code: > [...] > is the required update so disruptive that you can't just do it here? There are fewer clients than I remembered. I think there were at least one or two more in the recently replaced G1 post-barrier code. But at least some of the current clients have a number of other atomic uses. I'd rather not partially update them to just deal with their use of this class. And I'd rather not completely do all of them in one change set. The code to do the conditionalization in LockFreeStack is pretty small, simple, and localized. 
(There was an earlier version that was also supporting NonblockingQueue, but that class was recently removed because it is no longer used (again by the replacement of the G1 post-barrier code), rather than updating its atomic usage. That version of the conditionalization involved somewhat more code.) ------------- PR Comment: https://git.openjdk.org/jdk/pull/28329#issuecomment-3551169359 From dholmes at openjdk.org Wed Nov 19 09:08:34 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 19 Nov 2025 09:08:34 GMT Subject: RFR: 8371923: Update LockFreeStack for Atomic [v2] In-Reply-To: References: <8T9_DwoQj-__PXY0iLOJTyWMbXp3sqSz-CXGxZVbrRk=.f61223e7-fb77-41ae-bb5d-f32244505b5b@github.com> Message-ID: On Tue, 18 Nov 2025 10:48:57 GMT, Kim Barrett wrote: >> Please review this change to the `LockFreeStack` utility to allow clients to >> use `Atomic` as the type of the "next" member used in the linked-list >> representation of the stack. It also continues to allow clients to use the old >> (pre-`Atomic`) form where the "next" member is volatile. This allows >> clients to be updated incrementally after this change, rather than requiring >> all clients to be updated in conjunction with the update of this class. Once >> all clients have been updated, support for the old form can be removed. >> >> The associated gtests have been updated to use `Atomic`, with testing of >> the old form is no longer being done. The non-updated uses provide some >> testing, and that's all expected to go away soon. So parameterizing the gtests >> for both forms seems like a bunch of work that will just be deleted soon, with >> very little benefit. >> >> Testing: mach5 tier1 > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into lock-free-stack-allows-new-atomic > - rename next_access to next_accessor > - LockFreeStack supports Atomic Looks good. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28329#pullrequestreview-3481585739 From mdoerr at openjdk.org Wed Nov 19 09:15:51 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 19 Nov 2025 09:15:51 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 [v3] In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 22:35:38 GMT, Ruben wrote: >> The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. >> >> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. >> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > > Ruben has updated the pull request incrementally with one additional commit since the last revision: > > Replace `guarantee` with `assert` in the C1 `emit_deopt_handler` Marked as reviewed by mdoerr (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28192#pullrequestreview-3481619005 From adinn at openjdk.org Wed Nov 19 09:33:52 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Wed, 19 Nov 2025 09:33:52 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 [v2] In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 14:15:58 GMT, Ruben wrote: >> Our tests haven't revealed any new issues related to this PR. 
> > Thank you, @TheRealMDoerr, > >> I think assertions would be sufficient in C1 instead of guarantee. > > Sure, I will change these to assertions. @ruben-arm I'm ok with this version assuming it passes tests. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28192#issuecomment-3551720864 From mli at openjdk.org Wed Nov 19 09:49:13 2025 From: mli at openjdk.org (Hamlin Li) Date: Wed, 19 Nov 2025 09:49:13 GMT Subject: RFR: 8371869: RISC-V: too many warnings when build on BPI-F3 SBC [v2] In-Reply-To: References: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com> <-qb8rWPJSXl2dI-tbY1W35-w-cj18RFmYYyC5PaPdF4=.90c7c9fc-5d5d-4fae-b0ee-ea9ce487e50f@github.com> Message-ID: On Wed, 19 Nov 2025 04:00:28 GMT, Fei Yang wrote: >> src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp line 122: >> >>> 120: } >>> 121: >>> 122: void VM_Version::RVNonExtFeatureValue::log_disabled(const char* reason) { >> >> Is `log_disabled` in RVNonExtFeatureValue invoked in some condition? > > No. But like `virtual void log_enabled() = 0;`, `log_disabled` is also a pure virtual function in the base class. So it has to be implemented in all the subclasses. As `log_disabled` is only called in `UPDATE_DEFAULT_DEP`, which is only called in subclasses of `RVExtFeatureValue`, so I think it's OK to put `log_disabled` under `RVExtFeatureValue`? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28340#discussion_r2541284249 From shade at openjdk.org Wed Nov 19 12:14:48 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 19 Nov 2025 12:14:48 GMT Subject: RFR: 8371918: aarch64: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry In-Reply-To: References: Message-ID: On Sat, 15 Nov 2025 02:33:46 GMT, Dean Long wrote: >> …rGenerator::generate_native_entry >> >> I believe there's an incorrect pointer dereference in `TemplateInterpreterGenerator::generate_native_entry()` in this part of the code: >> >> >> // get native function entry point in r10 >> { >> Label L; >> __ ldr(r10, Address(rmethod, Method::native_function_offset())); >> ExternalAddress unsatisfied(SharedRuntime::native_method_throw_unsatisfied_link_error_entry()); >> __ lea(rscratch2, unsatisfied); >> __ ldr(rscratch2, rscratch2); >> __ cmp(r10, rscratch2); >> __ br(Assembler::NE, L); >> __ call_VM(noreg, >> CAST_FROM_FN_PTR(address, >> InterpreterRuntime::prepare_native_call), >> rmethod); >> __ get_method(rmethod); >> __ ldr(r10, Address(rmethod, Method::native_function_offset())); >> __ bind(L); >> } >> >> >> If I understand this correctly, the entry point for unsatisfied link error is loaded into `rscratch2`. The next instruction, `ldr(rscratch2, rscratch2)`, dereferences that pointer and reads from the text segment the initial instructions at the entry point into `rscratch2`. It then compares the native method entry point in `r10` with the initial instructions loaded into `rscratch2` which will never match. I believe the intent here was to compare the native method entry point with the unsatisfied link error entry point and the `ldr(rscratch2, rscratch2)` instruction should be removed. >> >> This was found on OpenBSD/aarch64. OpenBSD has a security feature where the text segments are marked execute only and do not allow reads independent of execution.
The `ldr(rscratch2, rscratch2)` instruction causes a segfault because it is reading the text segment. While this bug was found on OpenBSD I believe it applies to all OS on aarch64. >> >> This change removes the errant aarch64 hotspot assembly instruction that was reading from libjvm.so .text segment. >> >> Updated comment with markdown for code. > > According to InterpreterRuntime::prepare_native_call(), if there is a signal handler, which is checked first, then there should be a native function. So I wonder if we can remove the check for the native function from all CPU ports. I did also wonder how it is not breaking now with uninitialized native entries. But I see we are doing this init as part of signature handler resolution: https://github.com/openjdk/jdk/blob/54893dc5c2a4702896029b1844bc9496325c8f26/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp#L1323-L1334 -- before we hit this block. So, as @dean-long says, maybe we do not need this native method entry check at all. But this fix is fine to unbreak BSD/AArch64 alone. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28327#issuecomment-3552366023 From eastigeevich at openjdk.org Wed Nov 19 12:15:23 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Wed, 19 Nov 2025 12:15:23 GMT Subject: Integrated: 8371649: ZGC: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 21:35:42 GMT, Evgeny Astigeevich wrote: > The instruction cache maintenance function internally handles any required barriers. > This means we don't need any barriers before calling it. > This PR removes a redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation. This pull request has now been integrated.
Changeset: d2926dfd Author: Evgeny Astigeevich URL: https://git.openjdk.org/jdk/commit/d2926dfd9a242928877d0b1e40eac498073975bd Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod 8371649: ZGC: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation Reviewed-by: aph ------------- PR: https://git.openjdk.org/jdk/pull/28244 From kurt at openjdk.org Wed Nov 19 12:17:34 2025 From: kurt at openjdk.org (Kurt Miller) Date: Wed, 19 Nov 2025 12:17:34 GMT Subject: Integrated: 8371918: aarch64: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry In-Reply-To: References: Message-ID: On Fri, 14 Nov 2025 17:41:56 GMT, Kurt Miller wrote: > …rGenerator::generate_native_entry > > I believe there's an incorrect pointer dereference in `TemplateInterpreterGenerator::generate_native_entry()` in this part of the code: > > > // get native function entry point in r10 > { > Label L; > __ ldr(r10, Address(rmethod, Method::native_function_offset())); > ExternalAddress unsatisfied(SharedRuntime::native_method_throw_unsatisfied_link_error_entry()); > __ lea(rscratch2, unsatisfied); > __ ldr(rscratch2, rscratch2); > __ cmp(r10, rscratch2); > __ br(Assembler::NE, L); > __ call_VM(noreg, > CAST_FROM_FN_PTR(address, > InterpreterRuntime::prepare_native_call), > rmethod); > __ get_method(rmethod); > __ ldr(r10, Address(rmethod, Method::native_function_offset())); > __ bind(L); > } > > > If I understand this correctly, the entry point for unsatisfied link error is loaded into `rscratch2`. The next instruction, `ldr(rscratch2, rscratch2)`, dereferences that pointer and reads from the text segment the initial instructions at the entry point into `rscratch2`. It then compares the native method entry point in `r10` with the initial instructions loaded into `rscratch2` which will never match.
I believe the intent here was to compare the native method entry point with the unsatisfied link error entry point and the `ldr(rscratch2, rscratch2)` instruction should be removed. > > This was found on OpenBSD/aarch64. OpenBSD has a security feature where the text segments are marked execute only and do not allow reads independent of execution. The `ldr(rscratch2, rscratch2)` instruction causes a segfault because it is reading the text segment. While this bug was found on OpenBSD I believe it applies to all OS on aarch64. > > This change removes the errant aarch64 hotspot assembly instruction that was reading from libjvm.so .text segment. > > Updated comment with markdown for code. This pull request has now been integrated. Changeset: ae4d9c2e Author: Kurt Miller Committer: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/ae4d9c2e6af0b899481c98742f4976c7769f39e5 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod 8371918: aarch64: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry Reviewed-by: aph, shade ------------- PR: https://git.openjdk.org/jdk/pull/28327 From coleenp at openjdk.org Wed Nov 19 12:34:30 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 19 Nov 2025 12:34:30 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v3] In-Reply-To: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: > ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. > Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August). > Tested with tier1-4. 5-7 in progress. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Revert a couple more InstanceKlass::casts also to get GHA to restart.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/28371/files - new: https://git.openjdk.org/jdk/pull/28371/files/e8973f59..1060463b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28371&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28371&range=01-02 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28371.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28371/head:pull/28371 PR: https://git.openjdk.org/jdk/pull/28371 From duke at openjdk.org Wed Nov 19 12:47:18 2025 From: duke at openjdk.org (Samuel Chee) Date: Wed, 19 Nov 2025 12:47:18 GMT Subject: RFR: 8365147: AArch64: Replace DMB + LD + DMB with LDAR for C1 volatile field loads [v4] In-Reply-To: References: Message-ID: > Replaces the DMB ISH + LD + DMB ISHLD sequence with LDAR for volatile field loads - for example, AtomicLong::get. > > This is valid, as originally the DMBs were necessary due to the case described here - https://bugs.openjdk.org/browse/JDK-8179954. As in the rare case where the LD can be reordered with an LDAR or STLR from the C2 implementation for stores and loads, these DMBs are required. > However, acquire/release operations use a sequentially consistent model which does not allow reordering between them. Hence, the LD can be replaced with an LDAR to disallow reordering with a STLR/LDAR and the first DMB can be removed. > > The LDAR has acquire semantics, so it's impossible for memory accesses after to be reordered before; the DMB ISHLD is not required. Therefore, a singular LDAR is sufficient. 
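
The fence-versus-load-acquire argument above can be loosely mirrored in portable C++. The following is only a sketch under stated assumptions: it is not the C1 code generator, the function names are hypothetical, and the instruction mapping in the comments (DMB ISH, DMB ISHLD, LDAR) is what AArch64 compilers typically emit for these C++ operations, not something this thread specifies.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical analogue of the old sequence: full fence, plain load, load fence.
// On AArch64 a compiler will typically emit something like DMB ISH; LDR; DMB ISHLD.
long load_with_fences(const std::atomic<long>& field) {
  std::atomic_thread_fence(std::memory_order_seq_cst);  // roughly DMB ISH
  long value = field.load(std::memory_order_relaxed);   // plain load
  std::atomic_thread_fence(std::memory_order_acquire);  // roughly DMB ISHLD
  return value;
}

// Hypothetical analogue of the new sequence: a single sequentially consistent
// load, which AArch64 compilers typically implement as one LDAR.
long load_sequentially_consistent(const std::atomic<long>& field) {
  return field.load(std::memory_order_seq_cst);
}
```

Both functions return the same value; they differ only in how ordering against neighbouring accesses is enforced, which is the point of the change described above.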
Samuel Chee has updated the pull request incrementally with one additional commit since the last revision: Rename load_generic -> load_relaxed ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26748/files - new: https://git.openjdk.org/jdk/pull/26748/files/88b8a2d3..15b6bf35 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26748&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26748&range=02-03 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/26748.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26748/head:pull/26748 PR: https://git.openjdk.org/jdk/pull/26748 From duke at openjdk.org Wed Nov 19 12:47:23 2025 From: duke at openjdk.org (Ruben) Date: Wed, 19 Nov 2025 12:47:23 GMT Subject: RFR: 8365147: AArch64: Replace DMB + LD + DMB with LDAR for C1 volatile field loads [v3] In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 11:27:08 GMT, Andrew Haley wrote: >> Samuel Chee has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: >> >> - Address review comments. Refine. >> >> Change-Id: I9cc0308300548c1892d39791e00b41ef13c95e63 >> - Merge from the main branch >> - Address review comments >> >> Change-Id: Ica13be8094ac0f057066042ef0a5ec5927b98dfd >> - Refine code generation for mem2reg_volatile >> >> The patch is contributed by @theRealAph. >> >> Change-Id: I7ab1854dd238cdce72a4ab218b5b4ee84ad39586 >> - 8365147: AArch64: Replace DMB + LD + DMB with LDAR for C1 volatile loads >> >> Replaces the DMB ISH + LD + DMB ISHLD sequence with LDAR >> for volatile field loads - for example, AtomicLong::get. >> >> This is valid, as originally the DMBs were necessary due to >> the case described here - https://bugs.openjdk.org/browse/JDK-8179954. >> As in the rare case where the LD can be reordered with an LDAR >> or STLR from the C2 implementation for stores and loads, these >> DMBs are required. 
>> However, acquire/release operations use a sequentially consistent model >> which does not allow reordering between them. Hence, the LD can be >> replaced with an LDAR to disallow reordering with a STLR/LDAR >> and the first DMB can be removed. >> >> The LDAR has acquire semantics, so it's impossible for >> memory accesses after to be reordered before; the DMB ISHLD is >> not required. Therefore, a singular LDAR is sufficient. >> >> This excludes floats and doubles, as they do not have >> equivalent load-acquire instructions. >> >> Change-Id: Ia93607f8bb20c2d974fe6b2e586dd3239bb2728c > > src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp line 948: > >> 946: } >> 947: >> 948: void LIR_Assembler::load_generic(LIR_Address *from_addr, LIR_Opr dest, > > Suggestion: > > void LIR_Assembler::load_relaxed(LIR_Address *from_addr, LIR_Opr dest, > > Standard terminology. Thanks @theRealAph. Updated. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26748#discussion_r2541853251 From azafari at openjdk.org Wed Nov 19 13:31:15 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Wed, 19 Nov 2025 13:31:15 GMT Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer [v11] In-Reply-To: <1p_wEewR-A5FFkJasTnjbE4brFCQIUNp7hmP8WfhV6g=.9ab47c1c-b749-484c-b282-9ad678a06d13@github.com> References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> <1p_wEewR-A5FFkJasTnjbE4brFCQIUNp7hmP8WfhV6g=.9ab47c1c-b749-484c-b282-9ad678a06d13@github.com> Message-ID: On Mon, 17 Nov 2025 01:16:31 GMT, David Holmes wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> fix arguments.cpp for HeapMinBaseAddress type. 
> > src/hotspot/share/memory/memoryReserver.cpp line 549: > >> 547: const size_t attach_point_alignment = lcm(alignment, os_attach_point_alignment); >> 548: >> 549: uintptr_t aligned_heap_base_min_address = align_up(MAX2(HeapBaseMinAddress, alignment), alignment); > > Just to be clear, this is the crux of the fix, where we ensure the min-address is now never zero - right? Right. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26955#discussion_r2542017385 From azafari at openjdk.org Wed Nov 19 13:48:21 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Wed, 19 Nov 2025 13:48:21 GMT Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer [v11] In-Reply-To: <1p_wEewR-A5FFkJasTnjbE4brFCQIUNp7hmP8WfhV6g=.9ab47c1c-b749-484c-b282-9ad678a06d13@github.com> References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> <1p_wEewR-A5FFkJasTnjbE4brFCQIUNp7hmP8WfhV6g=.9ab47c1c-b749-484c-b282-9ad678a06d13@github.com> Message-ID: On Mon, 17 Nov 2025 01:18:11 GMT, David Holmes wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> fix arguments.cpp for HeapMinBaseAddress type. > > src/hotspot/share/memory/memoryReserver.cpp line 586: > >> 584: lowest_start, highest_start); >> 585: reserved = try_reserve_range((char*)highest_start, (char*)lowest_start, attach_point_alignment, >> 586: (char*)aligned_heap_base_min_address, (char*)UnscaledOopHeapMax, size, alignment, page_size); > > Not obvious to me this actually improves anything - what is it fixing? First, the pointer arithmetics are done on `uintptr_t` types to avoid UB. Second, it is checked that `lowest` and `highest` are still valid after becoming larger or smaller, respectively. 
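
As a rough illustration of the two review answers above (clamping the minimum address so it is never zero, and doing the arithmetic on `uintptr_t` rather than on pointers), here is a minimal sketch. The helper names `align_up_sketch`, `aligned_min_address` and `to_attach_point` are hypothetical stand-ins, not the HotSpot functions, and the alignment is assumed to be a power of two.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an align_up helper; alignment must be a power of two.
inline std::uintptr_t align_up_sketch(std::uintptr_t value, std::uintptr_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}

// Clamp the requested minimum to at least one alignment unit before aligning,
// so the result can never be zero (no offset is ever applied to a null pointer).
inline std::uintptr_t aligned_min_address(std::uintptr_t heap_base_min,
                                          std::uintptr_t alignment) {
  return align_up_sketch(std::max(heap_base_min, alignment), alignment);
}

// Convert to a pointer only at the point where a pointer is actually required.
inline char* to_attach_point(std::uintptr_t address) {
  return reinterpret_cast<char*>(address);
}
```

Keeping the values numeric until the final cast means the additions and comparisons are ordinary unsigned arithmetic, which is the property the ubsan report was about.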
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26955#discussion_r2542076686 From alanb at openjdk.org Wed Nov 19 13:54:03 2025 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 19 Nov 2025 13:54:03 GMT Subject: RFR: 8364343: ThreadSnapshotFactory::get_thread_snapshot() crashes without JVMTI agent In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 20:19:58 GMT, Patricio Chilano Mateo wrote: > When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. > > This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: > > - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters.
The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. An alternative that avoids the extra fence would be to place extra overhead on the thread requesting to disable transitions (e.g. by using a safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so I believe this approach is simpler. > > - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. > > - The code was previously structured in terms of mount and unmount cases, and a... src/java.base/share/classes/java/lang/VirtualThread.java line 1390: > 1388: } > 1389: > 1390: // -- JVM TI support -- We'll need to update this comment as it is no longer only for JVMTI. This might be a good place for a block comment to define "transitions" covering the changing of thread identity, the continuation mount/unmount, and how the notification to the VM supports JVMTI and handshakes. Maybe I could contribute a block comment to include here? src/java.base/share/native/libjava/VirtualThread.c line 38: > 36: { "startFinalTransition", "()V", (void *)&JVM_VirtualThreadEnd }, > 37: { "startTransition", "(Z)V", (void *)&JVM_VirtualThreadStartTransition }, > 38: { "endTransition", "(Z)V", (void *)&JVM_VirtualThreadEndTransition }, I wonder if JVM_VirtualThreadStart and JVM_VirtualThreadEnd should be renamed to have EndFirstTransition and StartFinalTransition in the names so it's easy to follow through from the Java code down to MountUnmountDisabler::start_transition/end_transition.
test/jdk/com/sun/management/HotSpotDiagnosticMXBean/DumpThreadsWhenParking.java line 94: > 92: }); > 93: } > 94: // wait for all virtual threads to start so all have a non-empty stack This reminds me the loom repo has a small update to the DumpThreadsWithEliminatedLock.java test to ensure that the virtual thread starts execution before doing the thread dump. This was noticed with test-repeat runs of the new test to ensure it was stable. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2542097138 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2542016761 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2542034248 From azafari at openjdk.org Wed Nov 19 13:56:04 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Wed, 19 Nov 2025 13:56:04 GMT Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer [v11] In-Reply-To: <1p_wEewR-A5FFkJasTnjbE4brFCQIUNp7hmP8WfhV6g=.9ab47c1c-b749-484c-b282-9ad678a06d13@github.com> References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> <1p_wEewR-A5FFkJasTnjbE4brFCQIUNp7hmP8WfhV6g=.9ab47c1c-b749-484c-b282-9ad678a06d13@github.com> Message-ID: On Mon, 17 Nov 2025 01:21:52 GMT, David Holmes wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> fix arguments.cpp for HeapMinBaseAddress type. > > src/hotspot/share/memory/memoryReserver.cpp line 590: > >> 588: >> 589: // zerobased: Attempt to allocate in the lower 32G. >> 590: size_t zerobased_max = OopEncodingHeapMax; > > Again not obvious what this improves. We obviously have very inconsistent use of types here in that we loosely use `char*`, `uint64_t` and `size_t` to all mean a 64-bit unsigned value, and no matter what types we use in the declarations we have to cast something somewhere.
According to reviewers' suggestions, the pointers used in arithmetic are typed as numeric like `size_t` or `uintptr_t`. And only when they are going to be passed as pointers to other functions, they will be cast to the desired pointers. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26955#discussion_r2542089700 From aartemov at openjdk.org Wed Nov 19 14:09:53 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Wed, 19 Nov 2025 14:09:53 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v7] In-Reply-To: References: Message-ID: > Hi, > > please consider the following changes: > > In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. > > Additionally, `SpinSingleSection` class is added, which allows to execute a payload code inside of a lambda function by only one thread. > > Tested in tiers 1 - 5.
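
The scoped acquire/release idea behind `SpinCriticalSection` described above can be sketched in standard C++ roughly as follows. This is an illustrative assumption, not the code from the PR: the class name `SpinCriticalSectionSketch`, the use of `std::atomic<int>` and the yield-on-contention strategy are all hypothetical choices made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical RAII guard: acquires a spin lock in the constructor and
// releases it in the destructor. The lock word is 0 when free, 1 when held.
class SpinCriticalSectionSketch {
  std::atomic<int>& _lock;

 public:
  explicit SpinCriticalSectionSketch(std::atomic<int>& lock) : _lock(lock) {
    int expected = 0;
    // Try to claim the lock; on failure reset the expected value and yield.
    while (!_lock.compare_exchange_weak(expected, 1,
                                        std::memory_order_acquire,
                                        std::memory_order_relaxed)) {
      expected = 0;
      std::this_thread::yield();
    }
  }

  ~SpinCriticalSectionSketch() {
    assert(_lock.load(std::memory_order_relaxed) == 1);
    _lock.store(0, std::memory_order_release);
  }

  // Copying a guard would release the lock twice, so forbid it.
  SpinCriticalSectionSketch(const SpinCriticalSectionSketch&) = delete;
  SpinCriticalSectionSketch& operator=(const SpinCriticalSectionSketch&) = delete;
};

// Example use: the lock is held only for the duration of this short scope.
inline int locked_increment(std::atomic<int>& lock, int& counter) {
  SpinCriticalSectionSketch guard(lock);
  return ++counter;
}
```

As the review comments in this thread stress, a spin lock of this kind is meant for very short critical sections and is not a general-purpose replacement for a mutex.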
Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8366671: Addressed reviewer's comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28264/files - new: https://git.openjdk.org/jdk/pull/28264/files/e9866cdf..dcc0a9b3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=05-06 Stats: 72 lines in 5 files changed: 8 ins; 52 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/28264.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28264/head:pull/28264 PR: https://git.openjdk.org/jdk/pull/28264 From aartemov at openjdk.org Wed Nov 19 14:09:55 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Wed, 19 Nov 2025 14:09:55 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v6] In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 14:53:42 GMT, Patricio Chilano Mateo wrote: >> +1 on leaving the code as it was. This makes it harder to read IMO. Also as David points out there is no spinning here so using a SpinSingleSection class would add extra confusion. > >>This seems like try-lock usage, though I'm not at all sure why (i.e. if we don't get the lock who is taking care of making a strong reference?) >> > This lock is only used here, so if the try-lock fails somebody else already grabbed it and that thread will create the strong reference (if not created already). Agreed. That SpinSingleSection is removed, the code in the ObjectMonitor is restored. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542156074 From aartemov at openjdk.org Wed Nov 19 14:10:10 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Wed, 19 Nov 2025 14:10:10 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v6] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 22:04:30 GMT, Kim Barrett wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8366671: Removed redundant include. > > src/hotspot/share/runtime/park.cpp line 66: > >> 64: { >> 65: SpinCriticalSection scs(&ListLock); >> 66: { > > This extra level of scoping doesn't seem needed. Addressed. > src/hotspot/share/utilities/spinCriticalSection.cpp line 38: > >> 36: } >> 37: >> 38: // Slow-path : We've encountered contention -- Spin/Yield/Block strategy. > > Use "utilities/spinYield.hpp"? This is a good suggestion. `SpinYield` has slightly different behavior: it does not return to `SpinPause()` after exceeding the spin limit, whereas the current code does. It looks like the behavior of `SpinYield` is what one needs: if a thread failed to grab a lock while spinning, why spin again? Performance testing is being done right now. > src/hotspot/share/utilities/spinCriticalSection.hpp line 30: > >> 28: #include "runtime/javaThread.hpp" >> 29: >> 30: class SpinCriticalSectionHelper { > > Derive from `AllStatic` ("memory/allStatic.hpp"). Done. > src/hotspot/share/utilities/spinCriticalSection.hpp line 42: > >> 40: >> 41: // Short critical section. To be used when having a >> 42: // mutex is considered to be expensive. > > I think this is a really poor description, as I think it will encourage the use of these facilities in > inappropriate places. Spin-lock usage ought to be pretty rare! Note that the existing mechanism > is described as "Not for general synchronization use." I think better motivation is needed.
Note > I'm not suggesting that doesn't exist, rather that motivation and usage guidelines should be > documented here. The comment for SpinCriticalSectionHelper in the .cpp file is more the kind > of thing I'm looking for. I shouldn't have to look at that internal helper's implementation to find > such guidance. Right. I moved that comment over to the header file and modified it a bit. > src/hotspot/share/utilities/spinCriticalSection.hpp line 45: > >> 43: class SpinCriticalSection { >> 44: private: >> 45: volatile int* const _lock; > > Use new `Atomic` rather than introducing new direct uses of `AtomicAccess`. This will require somewhat extensive changes in JFR, as they are using the same thing for their JfrTryLock. > src/hotspot/share/utilities/spinCriticalSection.hpp line 53: > >> 51: SpinCriticalSectionHelper::spin_release(_lock); >> 52: } >> 53: }; > > Should be noncopyable. I made `SpinCriticalSection` non-copyable. > src/hotspot/share/utilities/spinCriticalSection.hpp line 55: > >> 53: }; >> 54: >> 55: template > > I'd prefer not to name the first argument "Lambda", since it might not be one. > I would prefer `F` or `Fn` or something like that. And there should be some documentation > for this class, including a description of the template parameters and their requirements. I removed SpinSingleSection completely; it is overkill. > src/hotspot/share/utilities/spinCriticalSection.hpp line 56: > >> 54: >> 55: template >> 56: class SpinSingleSection { > > Consider giving this class template a deduction guide. That will likely make uses _much_ simpler, > removing the explicit template parameters in variable declarations and just letting them be > deduced from the constructor argument types. I removed SpinSingleSection completely. > src/hotspot/share/utilities/spinCriticalSection.hpp line 58: > >> 56: class SpinSingleSection { >> 57: private: >> 58: volatile int* const _lock; > > Why an `int`-type value for the lock, rather than `bool`?
I know why, but it should probably be > stated explicitly, else someone might be tempted to change it in the future. I believe it is for historical reasons? > src/hotspot/share/utilities/spinCriticalSection.hpp line 61: > >> 59: Thread* _lock_owner; >> 60: public: >> 61: SpinSingleSection(volatile int* lock, Lambda& F, Args&... args) : _lock(lock), _lock_owner(nullptr) { > > `F` => `f` - variables have lower-case names. I removed SpinSingleSection completely. > src/hotspot/share/utilities/spinCriticalSection.hpp line 61: > >> 59: Thread* _lock_owner; >> 60: public: >> 61: SpinSingleSection(volatile int* lock, Lambda& F, Args&... args) : _lock(lock), _lock_owner(nullptr) { > > I really need to get on the ball and update the style guide regarding at least forwarding references > and `std::forward`. I removed SpinSingleSection completely. > src/hotspot/share/utilities/spinCriticalSection.hpp line 61: > >> 59: Thread* _lock_owner; >> 60: public: >> 61: SpinSingleSection(volatile int* lock, Lambda& F, Args&... args) : _lock(lock), _lock_owner(nullptr) { > > Taking the function argument by reference prevents certain common use-cases, e.g. I think this > prevents passing an anonymous lambda. I removed SpinSingleSection completely. > src/hotspot/share/utilities/spinCriticalSection.hpp line 63: > >> 61: SpinSingleSection(volatile int* lock, Lambda& F, Args&... args) : _lock(lock), _lock_owner(nullptr) { >> 62: if (SpinCriticalSectionHelper::try_spin_acquire(_lock)) { >> 63: _lock_owner = Thread::current(); > > Why do we need the owning thread here? It seems like a bool "lock acquired" value > would be sufficient. I removed SpinSingleSection. > src/hotspot/share/utilities/spinCriticalSection.hpp line 75: > >> 73: SpinCriticalSectionHelper::spin_release(_lock); >> 74: } >> 75: } > > Should be noncopyable. Done. 
> src/hotspot/share/utilities/spinCriticalSection.hpp line 77: > >> 75: } >> 76: }; >> 77: #endif //SHARE_UTILITIES_SPINCRITICALSECTION_HPP > > We usually put a blank line before the `#endif` of the include guard. Also a space after `//`. Addressed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542153334 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542152204 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542154027 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542150954 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542148147 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542154310 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542149515 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542149897 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542152590 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542150246 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542150580 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542152983 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542154871 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542154548 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542153781 From aartemov at openjdk.org Wed Nov 19 14:10:12 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Wed, 19 Nov 2025 14:10:12 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v6] In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 00:21:28 GMT, Kim Barrett wrote: >> src/hotspot/share/utilities/spinCriticalSection.hpp line 56: >> >>> 54: >>> 55: template >>> 56: class SpinSingleSection { >> >> Although I've made some suggestions for possibly improving SpinSingleSection, >> 
I'm not sure it's a good idea as a concept. It seems to be attempting to >> provide a conditional critical section, but is doing so in what seems to me to >> be a weird way. >> >> As provided, it first conditionally executes a funarg under the >> lock, if it can acquire the lock. It then permits an external body (the scope >> of the section) to execute either under or not under the lock (depending on >> whether it was successfully acquired), with no way to know which state we're >> in. >> >> I think an API more similar to `std::unique_lock` for SpinCriticalSection >> would be better. `std::unique_lock` provides an `owns_lock()` function and a >> constructor overload taking a `std::try_to_lock_t` value. This controls >> whether the locking should be conditional or not, and provides a way for the using code >> to detect success/failure to lock in the conditional case. This doesn't have >> to be in one class though. There could be two critical section classes, one >> unconditional and one conditional, with only the latter providing the >> success/failure info. > > Or maybe just not bother with special help for the currently one(?) use-case for this, and instead > have that use-case directly use `try_acquire` with a local RAII object to ensure release in the > acquired case. Or a local bespoke helper class, or something along those lines. I agree that this single critical section is overkill and that it has only one use-case. The implementation could also be better. The agreement is that we don't need it, so I have removed it.
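For readers following the `std::unique_lock` comparison above, here is a minimal sketch of what a guard with both an unconditional and a conditional (try-lock) constructor might look like. It is not the code from the PR: the class name `SpinLockGuard`, the use of `std::atomic<int>` and the yield-based spin loop are all assumptions made for illustration, and HotSpot would use its own atomics and spin/yield utilities instead.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>   // std::try_to_lock_t, std::try_to_lock
#include <thread>  // std::this_thread::yield

// Hypothetical stand-in for a spin critical section; not HotSpot code.
class SpinLockGuard {
  std::atomic<int>& _lock;
  bool _owns;

  // Single attempt to move the lock word from 0 (free) to 1 (held).
  static bool try_acquire(std::atomic<int>& lock) {
    int expected = 0;
    return lock.compare_exchange_strong(expected, 1, std::memory_order_acquire);
  }

public:
  // Unconditional form: spin (yielding) until the lock is acquired.
  explicit SpinLockGuard(std::atomic<int>& lock) : _lock(lock), _owns(false) {
    while (!try_acquire(_lock)) {
      std::this_thread::yield();
    }
    _owns = true;
  }

  // Conditional form: one attempt only; the caller checks owns_lock().
  SpinLockGuard(std::atomic<int>& lock, std::try_to_lock_t)
    : _lock(lock), _owns(try_acquire(lock)) {}

  // Release only if this guard actually holds the lock.
  ~SpinLockGuard() {
    if (_owns) {
      _lock.store(0, std::memory_order_release);
    }
  }

  // Noncopyable, as requested for the real class in the review.
  SpinLockGuard(const SpinLockGuard&) = delete;
  SpinLockGuard& operator=(const SpinLockGuard&) = delete;

  bool owns_lock() const { return _owns; }
};
```

With this shape, the try-lock use-case reads as `SpinLockGuard g(lock, std::try_to_lock); if (g.owns_lock()) { ... }`, so the calling code always knows whether it is running under the lock.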
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542155560 From aartemov at openjdk.org Wed Nov 19 14:10:14 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Wed, 19 Nov 2025 14:10:14 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v4] In-Reply-To: References: Message-ID: <3uYk42EGfDwWRifEst9LSlnwyEnImnsllMQQTiD2X-s=.3daa5a0c-08c7-4fd5-8d77-1bee3a29d617@github.com> On Mon, 17 Nov 2025 22:10:53 GMT, Kim Barrett wrote: >> For the same reason why `jfrSpinlockHelper.hpp` was included. >> >> It looks like the two includes above that are redundant and can be removed. This one cannot, it breaks builds. > > Include of "atomicAccess.hpp" seems unnecessary, as there are no (direct) uses here. > "globalDefinitions.hpp" should not be removed, under the "Include What You Use" guidance > (which hasn't yet made it into the Style Guide - https://bugs.openjdk.org/browse/JDK-8252896). I removed "atomicAccess.hpp". ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2542147468 From kurt at openjdk.org Wed Nov 19 14:22:37 2025 From: kurt at openjdk.org (Kurt Miller) Date: Wed, 19 Nov 2025 14:22:37 GMT Subject: RFR: 8371918: aarch64: Incorrect pointer dereference in TemplateInterpreterGenerator::generate_native_entry In-Reply-To: References: Message-ID: On Wed, 19 Nov 2025 12:12:37 GMT, Aleksey Shipilev wrote: >> According to InterpreterRuntime::prepare_native_call(), if there is a signal handler, which is checked first, then there should be a native function. So I wonder if we can remove the check for the native function from all CPU ports. > > I did also wonder how it is not breaking now with uninitialized native entries. But I see we are doing this init as part of signature handler resolution: https://github.com/openjdk/jdk/blob/54893dc5c2a4702896029b1844bc9496325c8f26/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp#L1323-L1334 -- before we hit this block. 
So, as @dean-long says, maybe we do not need this native method entry check at all. But this fix is fine to unbreak BSD/AArch64 alone. @shipilev Thank you for the review and sponsor. I was wondering why this issue had not been discovered previously. It does appear that the check against unsatisfied link may be unnecessary. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28327#issuecomment-3552963583 From jsjolen at openjdk.org Wed Nov 19 14:28:37 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 19 Nov 2025 14:28:37 GMT Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v16] In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 12:42:01 GMT, Coleen Phillimore wrote: >> Johan Sj?len has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 31 additional commits since the last revision: >> >> - Merge remote-tracking branch 'openjdk/master' into operands-again >> - It's fine to initialize the iterator with null, it's not fine to reserve an entry if it's null >> - Fix naming >> - Serguei comments >> - Revert change >> - Some nits >> - Fix copyright >> - Move BSMAttribute BSMAttributeEntries to own header file >> - Merge remote-tracking branch 'origin/operands-again' into operands-again >> - Apply suggestions from code review >> >> Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> >> - ... and 21 more: https://git.openjdk.org/jdk/compare/c1908450...57f0093e > > src/hotspot/share/oops/bsmAttribute.hpp line 28: > >> 26: #define SHARE_OOPS_BSMATTRIBUTE_HPP >> 27: >> 28: #include "classfile/classLoaderData.hpp" > > I think you can forward declare ClassLoaderData rather than include the whole file here. Ooh, you're right about that! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27198#discussion_r2542245468 From jsjolen at openjdk.org Wed Nov 19 14:33:19 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 19 Nov 2025 14:33:19 GMT Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v17] In-Reply-To: References: Message-ID: <6CQmVFTykatS4KDf9yWPGhLS8a_acPcW9Ds4Y6Utbuw=.4d3d889f-46e9-4766-9680-26b8b162746f@github.com> > Hi, > > This is a refactoring of the way that we store the Bootstrap method attribute in the ConstantPool class. We used to have a single `Array` which was divided into a section of `u4` offsets and a section which was the actual data. In this refactoring we make this split more clear, by actually allocating an `Array` to store the offsets in and an `Array` to store the data in. These arrays are then put into a `BSMAttributeEntries` class, which allows us to separate out the API from that of the rest of the `ConstantPool`. > > We had multiple instances of the code knowing the layout of the operands array and using this to do 'clever' ways of copying and inserting data into it. See `ConstantPool::copy_operands` and `ConstantPool::resize_operands`. I felt like we could do things in a simpler way, so I added the `start_/end_extension` protocol and added the `InsertionIterator` for this. See `ClassFileParser::parse_classfile_bootstrap_methods_attribute` for how this works. I put several relevant definitions into the inline file in hopes of encouraging the compiler to optimize these appropriately. > > For the Java SA code, I had to add a `U4Array` class. I also had to fix the vmstructs definitions, etc. > > On the whole, while this code is a bit less terse, I think it's a good API improvement and the underlying implementation of splitting up the operands array is also an improvement. > > Testing: Oracle Tier1-Tier5 has been run succesfully multiple times. 
Before integration, I will merge with master and run these tiers again. Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: Comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27198/files - new: https://git.openjdk.org/jdk/pull/27198/files/57f0093e..3686fc34 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27198&range=16 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27198&range=15-16 Stats: 8 lines in 3 files changed: 2 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/27198.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27198/head:pull/27198 PR: https://git.openjdk.org/jdk/pull/27198 From jsjolen at openjdk.org Wed Nov 19 14:33:21 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 19 Nov 2025 14:33:21 GMT Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v12] In-Reply-To: <8WVVrT5cKKUY1wGnTvxzj-8FFM-dZnYtuActIQRXZUQ=.0f11587c-e06f-47ee-93e4-bd7a5e7fc16f@github.com> References: <_0HzhdWbRBZNJvB33qf8VXRnc70eYXm7NCmb6oSEllw=.482f6b91-c612-4be7-a007-29954f0f5080@github.com> <8WVVrT5cKKUY1wGnTvxzj-8FFM-dZnYtuActIQRXZUQ=.0f11587c-e06f-47ee-93e4-bd7a5e7fc16f@github.com> Message-ID: On Wed, 5 Nov 2025 12:46:04 GMT, Coleen Phillimore wrote: >> src/hotspot/share/oops/bsmAttribute.inline.hpp line 34: >> >>> 32: _cur_array + BSMAttributeEntry::u2s_required(argc) > insert_into->bootstrap_methods()->length()) { >>> 33: return nullptr; >>> 34: } >> >> Nit: This check needs a comment. Also, I'd suggest to add a guarantee here instead of returning `nullptr`. > > I agree with this comment - is returning null going to crash somewhere down the line? Is this an overflow? Returning null generally shouldn't happen. It indicates that we have supplied too little space for the entries. The reason I don't `guarantee` here is because this function doesn't have enough context to output a proper error message. 
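As a rough illustration of the split described in this PR (one array of offsets, one array of data, and an insertion step that returns null when too little space was reserved), the following sketch may help. It is a simplified model under stated assumptions, not the HotSpot implementation: `BsmEntries`, `reserve` and `entry` are hypothetical names, `std::vector` stands in for HotSpot's `Array`, and the entry layout (bootstrap method index, argument count, arguments) is assumed from the class-file format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of "offsets array + data array"; not HotSpot code.
class BsmEntries {
  std::vector<std::uint32_t> _offsets;  // offsets[i] = start of entry i in _data
  std::vector<std::uint16_t> _data;     // packed entries
  std::size_t _cur_offset = 0;          // next free slot in _offsets
  std::size_t _cur_data = 0;            // next free slot in _data

public:
  BsmEntries(std::size_t max_entries, std::size_t max_u2s)
    : _offsets(max_entries), _data(max_u2s) {}

  // Assumed layout per entry: [bsm_index, argc, arg0, arg1, ...].
  static std::size_t u2s_required(std::uint16_t argc) {
    return 2 + static_cast<std::size_t>(argc);
  }

  // Reserve room for one entry. Returns nullptr if the preallocated
  // arrays are too small, mirroring the check discussed in the review.
  std::uint16_t* reserve(std::uint16_t bsm_index, std::uint16_t argc) {
    if (_cur_offset + 1 > _offsets.size() ||
        _cur_data + u2s_required(argc) > _data.size()) {
      return nullptr;
    }
    _offsets[_cur_offset++] = static_cast<std::uint32_t>(_cur_data);
    std::uint16_t* e = &_data[_cur_data];
    e[0] = bsm_index;
    e[1] = argc;
    _cur_data += u2s_required(argc);
    return e;  // caller fills in e[2] .. e[2 + argc - 1]
  }

  const std::uint16_t* entry(std::size_t i) const {
    return &_data[_offsets[i]];
  }
};
```

In this model the caller, which knows what it was parsing, is the one that turns a null result into a meaningful error, which matches the reasoning given above for not using `guarantee` inside the low-level function.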
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27198#discussion_r2542253926 From jsjolen at openjdk.org Wed Nov 19 14:38:56 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 19 Nov 2025 14:38:56 GMT Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v17] In-Reply-To: <6CQmVFTykatS4KDf9yWPGhLS8a_acPcW9Ds4Y6Utbuw=.4d3d889f-46e9-4766-9680-26b8b162746f@github.com> References: <6CQmVFTykatS4KDf9yWPGhLS8a_acPcW9Ds4Y6Utbuw=.4d3d889f-46e9-4766-9680-26b8b162746f@github.com> Message-ID: On Wed, 19 Nov 2025 14:33:19 GMT, Johan Sj?len wrote: >> Hi, >> >> This is a refactoring of the way that we store the Bootstrap method attribute in the ConstantPool class. We used to have a single `Array` which was divided into a section of `u4` offsets and a section which was the actual data. In this refactoring we make this split more clear, by actually allocating an `Array` to store the offsets in and an `Array` to store the data in. These arrays are then put into a `BSMAttributeEntries` class, which allows us to separate out the API from that of the rest of the `ConstantPool`. >> >> We had multiple instances of the code knowing the layout of the operands array and using this to do 'clever' ways of copying and inserting data into it. See `ConstantPool::copy_operands` and `ConstantPool::resize_operands`. I felt like we could do things in a simpler way, so I added the `start_/end_extension` protocol and added the `InsertionIterator` for this. See `ClassFileParser::parse_classfile_bootstrap_methods_attribute` for how this works. I put several relevant definitions into the inline file in hopes of encouraging the compiler to optimize these appropriately. >> >> For the Java SA code, I had to add a `U4Array` class. I also had to fix the vmstructs definitions, etc. 
>> >> On the whole, while this code is a bit less terse, I think it's a good API improvement and the underlying implementation of splitting up the operands array is also an improvement. >> >> Testing: Oracle Tier1-Tier5 has been run succesfully multiple times. Before integration, I will merge with master and run these tiers again. > > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > Comments Hi! I'm back. I've done some benchmarking (DaCapo, etc) and found no discernible differences. I've also run Tier1-tier6 with the changes before 'Comments'. A total of 4 test failures, all seems highly unrelated to my changes so I'm not worried. I'd apprecaite a re-approval so I can integrate this :-). Thank you all for your efforts in making this PR happen. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27198#issuecomment-3553042638 From bulasevich at openjdk.org Wed Nov 19 14:44:56 2025 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Wed, 19 Nov 2025 14:44:56 GMT Subject: RFR: 8363943: ARM32: Represent Registers as values [v2] In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 22:09:24 GMT, Ivan wrote: >> Migrate away from pointer-based representation of Register values. >> >> It improves compile-time checking by forbidding implicit conversions between integrals and pointers. >> >> [JDK-8363943](https://bugs.openjdk.org/browse/JDK-8363943) > > Ivan has updated the pull request incrementally with one additional commit since the last revision: > > Proposed review changes were applied Cool! We were relying on incidental layout :( I totally agree with the change - thanks for fixing this! ------------- Marked as reviewed by bulasevich (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/26525#pullrequestreview-3483117942 From matsaave at openjdk.org Wed Nov 19 15:53:23 2025 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 19 Nov 2025 15:53:23 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: References: Message-ID: <7Fcu3CgMjNmMCmAsDggYG_ArF-VvTvyyWS4iNJSn4yo=.cdb91fc1-e72c-4c19-8a00-96062f511284@github.com> On Wed, 19 Nov 2025 00:42:54 GMT, Dean Long wrote: >> The method size_of_parameters() is sometimes used as if it represents the number of arguments rather than the size in bytes of the arguments. This patch changes some of these instances to the correct result or renames some of the variables to the desired result. Verified with tier 1-5 tests. > > src/hotspot/share/ci/ciMethodData.cpp line 557: > >> 555: mdo->set_arg_stack(_arg_stack); >> 556: mdo->set_arg_returned(_arg_returned); >> 557: int arg_count = mdo->method()->number_of_parameters(); > > The actual size allocated seems to be based on the argument size in slots, not count. > https://github.com/openjdk/jdk/blob/902aa4dcd297fef34cb302e468b030c48665ec84/src/hotspot/share/oops/methodData.cpp#L1359-L1360 > To avoid any confusion, consider using the limit from ArgInfoData::number_of_args(), which would be better named size_of_args(). So it looks like the use of "arg" here refers to "arg slot" meaning these variables and methods could be renamed to be more clear. What do you think of renaming methods like `arg_modified` to `arg_slot_modified`? 
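To make the count-versus-size distinction in this thread concrete, here is a small self-contained sketch. It is not HotSpot code: `analyze_descriptor` and `ParamInfo` are hypothetical helpers, the descriptor is assumed to be well formed, and the receiver is deliberately left out of the slot total. The only JVM fact relied on is that `long` (`J`) and `double` (`D`) parameters occupy two slots while every other parameter, including arrays, occupies one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct ParamInfo {
  int count;  // number of declared parameters
  int slots;  // number of slots those parameters occupy
};

// Walks a method descriptor such as "(IJLjava/lang/String;[D)V".
// Assumes a well-formed descriptor; no validation is attempted.
ParamInfo analyze_descriptor(const std::string& desc) {
  ParamInfo info{0, 0};
  std::size_t i = 1;  // skip the opening '('
  while (i < desc.size() && desc[i] != ')') {
    bool is_array = false;
    while (desc[i] == '[') {  // any array is a single reference
      is_array = true;
      ++i;
    }
    const char c = desc[i];
    if (c == 'L') {
      i = desc.find(';', i);  // jump to the end of the class name
    }
    ++i;
    info.count += 1;
    info.slots += (!is_array && (c == 'J' || c == 'D')) ? 2 : 1;
  }
  return info;
}
```

For `(IJLjava/lang/String;[D)V` this yields four parameters but five slots, which is the kind of mismatch that makes names like `arg_count` and `arg_size` worth distinguishing.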
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2542604276 From jsikstro at openjdk.org Wed Nov 19 16:21:47 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Wed, 19 Nov 2025 16:21:47 GMT Subject: RFR: 8372150: Parallel: Tighten requirements around MinHeapSize with NUMA and Large Pages Message-ID: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com> Hello, Today, Parallel decides to opt out of using Large pages if the initial heap size does not cover enough Large pages for all spaces. Additionally, if we don't get enough initial heap size for at least one OS page per MutableNUMASpace (one per NUMA-node), Parallel decides to run in a NUMA-degraded mode, where it skips allocating memory locally for some NUMA-nodes. Both of these issues are problematic if we want to start the JVM with a default initial heap size that is equal to the minimum heap size (see [JDK-8371986](https://bugs.openjdk.org/browse/JDK-8371986)). To solve this, we should consider making sure that the minimum heap size is always enough to cover precisely one page per space, where the page size may be Large or not. By making sure the minimum heap size covers this, we never have to disable Large pages or run in a NUMA-degraded mode based on the setting of the initial heap size. For completeness, when user-provided settings for UseNUMA, UseLargePages and InitialHeapSize can't be satisfied at the same time, one must be prioritised over others. Today, we prioritise InitialHeapSize over both UseNUMA and UseLargePages. This change suggests shifting the priority to UseNUMA and UseLargePages, by bumping MinHeapSize to an adequate number. Bumping MinHeapSize directly affects InitialHeapSize, since InitialHeapSize must be equal to or greater than MinHeapSize.
Min and Initial heap sizes before/after:

Before changes. We always get the Min and Initial 2MB that we request:

  java -XX:+UseParallelGC -Xms2M -Xmx1G
  Alignments: Space 512K, Heap 2M
  Heap Min Capacity: 2M
  Heap Initial Capacity: 2M

  java -XX:+UseParallelGC -XX:+UseLargePages -Xms2M -Xmx1G
  MinHeapSize (2097152) must be large enough for 4 * page-size; Disabling UseLargePages for heap
  Alignments: Space 512K, Heap 2M
  Heap Min Capacity: 2M
  Heap Initial Capacity: 2M

  java -XX:+UseParallelGC -XX:+UseNUMA -Xms2M -Xmx1G
  Alignments: Space 512K, Heap 2M
  Heap Min Capacity: 2M
  Heap Initial Capacity: 2M

  java -XX:+UseParallelGC -XX:+UseLargePages -XX:+UseNUMA -Xms2M -Xmx1G
  MinHeapSize (2097152) must be large enough for 4 * page-size; Disabling UseLargePages for heap
  Alignments: Space 512K, Heap 2M
  Heap Min Capacity: 2M
  Heap Initial Capacity: 2M

After changes. We bump Min, and in turn also Initial, to accommodate enough Large Pages for all spaces. This is run on a NUMA machine with two NUMA nodes, so we get an extra 2MB when NUMA is enabled for the additional eden space.

  java -XX:+UseParallelGC -Xms2M -Xmx1G
  Alignments: Space 512K, Heap 2M
  Heap Min Capacity: 2M
  Heap Initial Capacity: 2M

  java -XX:+UseParallelGC -XX:+UseLargePages -Xms2M -Xmx1G
  Alignments: Space 2M, Heap 2M
  Heap Min Capacity: 8M
  Heap Initial Capacity: 8M

  java -XX:+UseParallelGC -XX:+UseNUMA -Xms2M -Xmx1G
  Alignments: Space 512K, Heap 2M
  Heap Min Capacity: 4M
  Heap Initial Capacity: 4M

  java -XX:+UseParallelGC -XX:+UseLargePages -XX:+UseNUMA -Xms2M -Xmx1G
  Alignments: Space 2M, Heap 2M
  Heap Min Capacity: 10M
  Heap Initial Capacity: 10M
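One way to read the "after" numbers above is sketched below. This is an inference from the logged output, not the computation in `parallelArguments.cpp`: the function `min_heap_capacity`, its parameters, and the assumption of three fixed spaces (old, from, to) plus one eden space per NUMA node are all hypothetical. Under that model, the minimum is one space-alignment unit per space, rounded up to the heap alignment, and never below what was requested with `-Xms`.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t K = 1024;
const std::size_t M = 1024 * K;

// Round value up to the next multiple of alignment.
std::size_t align_up(std::size_t value, std::size_t alignment) {
  return (value + alignment - 1) / alignment * alignment;
}

// Hypothetical model of the bumped MinHeapSize; names are assumptions.
std::size_t min_heap_capacity(std::size_t requested,
                              std::size_t space_alignment,
                              std::size_t heap_alignment,
                              bool use_numa,
                              std::size_t numa_nodes) {
  const std::size_t num_eden_spaces = use_numa ? numa_nodes : 1;
  const std::size_t num_spaces = 3 + num_eden_spaces;  // old + from + to + eden(s)
  const std::size_t required = align_up(num_spaces * space_alignment, heap_alignment);
  return requested > required ? requested : required;
}
```

With a 2M request on a two-node machine, this model gives 2M (no flags), 8M (large pages, 4 spaces of 2M), 4M (NUMA, 5 spaces of 512K rounded up to the 2M heap alignment) and 10M (both), which matches the four "after" cases in the log.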
    Since the usage of `UseLargePages` may bump the Min and Initial heap sizes, I've opted to not run `TestParallelHeapSizeFlags` with `UseLargePages` enabled. Testing: * Oracle's tier1-4 * tier1-3 with the flags `-XX:+UseParallelGC -XX:+UseLargePages -XX:+UseNUMA` ------------- Commit messages: - 8372150: Parallel: Tighten requirements around MinHeapSize with NUMA and Large Pages Changes: https://git.openjdk.org/jdk/pull/28394/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28394&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372150 Stats: 83 lines in 9 files changed: 20 ins; 52 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/28394.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28394/head:pull/28394 PR: https://git.openjdk.org/jdk/pull/28394 From ayang at openjdk.org Wed Nov 19 16:21:48 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 19 Nov 2025 16:21:48 GMT Subject: RFR: 8372150: Parallel: Tighten requirements around MinHeapSize with NUMA and Large Pages In-Reply-To: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com> References: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com> Message-ID: <4xbcyipwXdImnJlSPxg6wLfggFPAsk8RXxTmPRiATaA=.6ffac8da-5727-488e-9648-9a4d111cf1a4@github.com> On Wed, 19 Nov 2025 15:44:57 GMT, Joel Sikstr?m wrote: > Hello, > > Today, Parallel decides to opt out of using Large pages if the initial heap size does not cover enough Large pages for all spaces. Additionally, if we don't get enough initial heap size for at least one OS page per MutableNUMASpace (one per NUMA-node), Parallel decides to run in a NUMA-degraded mode, where it skips allocating memory locally for some NUMA-nodes. Both of these issues are problematic if we want to start the JVM with a default initial heap size that is equal to the minimum heap size (see [JDK-8371986](https://bugs.openjdk.org/browse/JDK-8371986)). 
To solve this, we should consider making sure that the minimum heap size is always enough to cover precisely one page per space, where the page size may be Large or not. By making sure the minimum heap size covers this, we never have to disable Large pages or run in a NUMA-degraded mode based on the setting of the initial heap size. > > For completeness, when user-provided settings for UseNUMA, UseLargePages and InitialHeapSize can't be satisfied at the same time, one must be prioritised over others. Today, we prioritise InitialHeapSize over both UseNUMA and UseLargePages. This change suggests shifting the priority to UseNUMA and UseLargePages, by bumping MinHeapSize to an adequate number. Bumping MinHeapSize directly affects InitialHeapSize, since InitialHeapSize must be equal to or greater than MinHeapSize. > >
    > > Min and Initial heap sizes before/after (expandable section) > > Before changes. We always get Min&Initial 2MB that we request: > > java -XX:+UseParallelGC -Xms2M -Xmx1G > Alignments: Space 512K, Heap 2M > Heap Min Capacity: 2M > Heap Initial Capacity: 2M > > java -XX:+UseParallelGC -XX:+UseLargePages -Xms2M -Xmx1G > MinHeapSize (2097152) must be large enough for 4 * page-size; Disabling UseLargePages for heap > Alignments: Space 512K, Heap 2M > Heap Min Capacity: 2M > Heap Initial Capacity: 2M > > java -XX:+UseParallelGC -XX:+UseNUMA -Xms2M -Xmx1G > Alignments: Space 512K, Heap 2M > Heap Min Capacity: 2M > Heap Initial Capacity: 2M > > java -XX:+UseParallelGC -XX:+UseLargePages -XX:+UseNUMA -Xms2M -Xmx1G > MinHeapSize (2097152) must be large enough for 4 * page-size; Disabling UseLargePages for heap > Alignments: Space 512K, Heap 2M > Heap Min Capacity: 2M > Heap Initial Capacity: 2M > > > After changes. We bump Min, and in turn also Initial, to accommodate enough Large Pages for all spaces. This is run on a NUMA machine with two NUMA nodes, so we get an extra 2MB when NUMA is enabled for the additi... Marked as reviewed by ayang (Reviewer). src/hotspot/share/gc/parallel/parallelArguments.cpp line 110: > 108: size_t ParallelArguments::young_gen_size_lower_bound() { > 109: const size_t num_eden_spaces = UseNUMA > 110: ? os::numa_get_groups_num() I suggest vertically aligning `?` and `:` with `=`. 
------------- PR Review: https://git.openjdk.org/jdk/pull/28394#pullrequestreview-3483485893 PR Review Comment: https://git.openjdk.org/jdk/pull/28394#discussion_r2542594153 From iklam at openjdk.org Wed Nov 19 18:09:35 2025 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 19 Nov 2025 18:09:35 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: <7Fcu3CgMjNmMCmAsDggYG_ArF-VvTvyyWS4iNJSn4yo=.cdb91fc1-e72c-4c19-8a00-96062f511284@github.com> References: <7Fcu3CgMjNmMCmAsDggYG_ArF-VvTvyyWS4iNJSn4yo=.cdb91fc1-e72c-4c19-8a00-96062f511284@github.com> Message-ID: On Wed, 19 Nov 2025 15:50:52 GMT, Matias Saavedra Silva wrote: >> src/hotspot/share/ci/ciMethodData.cpp line 557: >> >>> 555: mdo->set_arg_stack(_arg_stack); >>> 556: mdo->set_arg_returned(_arg_returned); >>> 557: int arg_count = mdo->method()->number_of_parameters(); >> >> The actual size allocated seems to be based on the argument size in slots, not count. >> https://github.com/openjdk/jdk/blob/902aa4dcd297fef34cb302e468b030c48665ec84/src/hotspot/share/oops/methodData.cpp#L1359-L1360 >> To avoid any confusion, consider using the limit from ArgInfoData::number_of_args(), which would be better named size_of_args(). > > So it looks like the use of "arg" here refers to "arg slot" meaning these variables and methods could be renamed to be more clear. What do you think of renaming methods like `arg_modified` to `arg_slot_modified`? I think it's OK to rename `arg_count` to `arg_size`. There's quite a lot of existing code that does this. `arg_size` is understood to be the "number of slots". https://github.com/openjdk/jdk/blob/9ea8201b7494fe9107d4abd78c02ac765a5751d4/src/hotspot/share/opto/graphKit.cpp#L2365-L2366 https://github.com/openjdk/jdk/blob/9ea8201b7494fe9107d4abd78c02ac765a5751d4/src/hotspot/share/ci/bcEscapeAnalyzer.cpp#L192-L196 I am not sure about adding "slot" to "arg_modified". 
While there are some uses of the word "slot" in the compiler APIs, it's not common. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2543024462 From pchilanomate at openjdk.org Wed Nov 19 19:06:34 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 19 Nov 2025 19:06:34 GMT Subject: RFR: 8364343: ThreadSnapshotFactory::get_thread_snapshot() crashes without JVMTI agent [v2] In-Reply-To: References: Message-ID: > When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses the `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean`, which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. > > This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: > > - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition"
bits for the carrier and vthread *before* reading the ?transition disabled? counters. The thread requesting to disable transitions increments the ?transition disabled? counter *before* reading the ?in transition? bits. An alternative that avoids the extra fence would be to place extra overhead on the thread requesting to disable transitions (e.g. by usi ng a safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis show no difference with current mainline so I believe this approach is simpler. > > - Ending the transition doesn?t need to check if transitions are disabled (equivalent to not need to poll when transitioning from unsafe to safe safepoint state). But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. > > - The code was previously structured in terms of mount and unmount cases, and a... 
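
The Dekker-style handshake described in the first bullet above can be sketched as follows. This is a minimal illustration using standard C++ `std::atomic` with sequentially consistent operations, not the HotSpot implementation: the names `in_transition`, `transition_disable_count`, `try_start_transition`, `end_transition`, `disable_transitions` and `enable_transitions` are all hypothetical, and the per-vthread and global counters are collapsed into one.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-ins for the "in transition" bit and the
// "transition disabled" counter; not HotSpot's actual fields.
std::atomic<bool> in_transition{false};
std::atomic<int>  transition_disable_count{0};

// Virtual-thread side: publish "in transition" BEFORE reading the counter.
// Returns true if the fast path may proceed, false if the caller must back
// off to a slow path because a disabler is active.
bool try_start_transition() {
  in_transition.store(true, std::memory_order_seq_cst);
  if (transition_disable_count.load(std::memory_order_seq_cst) > 0) {
    in_transition.store(false, std::memory_order_seq_cst);  // back off
    return false;
  }
  return true;
}

void end_transition() {
  in_transition.store(false, std::memory_order_seq_cst);
}

// Disabler side: publish the request BEFORE reading the "in transition" bit,
// then wait for any transition that was already in flight.
void disable_transitions() {
  transition_disable_count.fetch_add(1, std::memory_order_seq_cst);
  while (in_transition.load(std::memory_order_seq_cst)) {
    std::this_thread::yield();
  }
}

void enable_transitions() {
  transition_disable_count.fetch_sub(1, std::memory_order_seq_cst);
}
```

Because each side writes its own flag before reading the other's, and all four accesses are sequentially consistent, at least one side is guaranteed to observe the other: either the virtual thread sees a non-zero counter and backs off, or the disabler sees the "in transition" bit and waits.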
Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: Add fixes to DumpThreadsWithEliminatedLock.java ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28361/files - new: https://git.openjdk.org/jdk/pull/28361/files/70f96a7d..976486cd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=00-01 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361 PR: https://git.openjdk.org/jdk/pull/28361 From pchilanomate at openjdk.org Wed Nov 19 19:06:37 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 19 Nov 2025 19:06:37 GMT Subject: RFR: 8364343: ThreadSnapshotFactory::get_thread_snapshot() crashes without JVMTI agent [v2] In-Reply-To: References: Message-ID: On Wed, 19 Nov 2025 13:50:44 GMT, Alan Bateman wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Add fixes to DumpThreadsWithEliminatedLock.java > > src/java.base/share/classes/java/lang/VirtualThread.java line 1390: > >> 1388: } >> 1389: >> 1390: // -- JVM TI support -- > > We'll need to update is comment as it no longer only for JVMTI. > > This might be a good place for a block comment to define "transitions" covering the changing of thread identity the continuation mount/unmount, and how the notification to the VM support JVMTI and handshakes. Maybe I could contribute a block comment to include here? That would be great. 
> src/java.base/share/native/libjava/VirtualThread.c line 38: > >> 36: { "startFinalTransition", "()V", (void *)&JVM_VirtualThreadEnd }, >> 37: { "startTransition", "(Z)V", (void *)&JVM_VirtualThreadStartTransition }, >> 38: { "endTransition", "(Z)V", (void *)&JVM_VirtualThreadEndTransition }, > > I wonder if JVM_VirtualThreadStart and JVM_VirtualThreadEnd should be renamed to have EndFirstTransition and StartFinalTransaction in the names so it's easy to follow through from the Java code down to MountUnmountDisabler::start_transition/end_transition. How about removing these methods and just have an extra boolean parameter in `start/endTransition`? https://github.com/pchilano/jdk/compare/JDK-8364343...pchilano:jdk:startEndTransitionsOnly > test/jdk/com/sun/management/HotSpotDiagnosticMXBean/DumpThreadsWhenParking.java line 94: > >> 92: }); >> 93: } >> 94: // wait for all virtual threads to start so all have a non-empty stack > > This reminds me the loom repo has a small update to to the DumpThreadsWithEliminatedLock.java test to ensure that the virtual thread starts execution before doing the thread dump. This was noticed with test-repeat runs of the new test to ensure it was stable. Added the fixes to `DumpThreadsWithEliminatedLock.java` from the loom repo. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2543217770 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2543212884 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2543215701 From kbarrett at openjdk.org Wed Nov 19 20:07:55 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 19 Nov 2025 20:07:55 GMT Subject: RFR: 8371923: Update LockFreeStack for Atomic [v2] In-Reply-To: References: <8T9_DwoQj-__PXY0iLOJTyWMbXp3sqSz-CXGxZVbrRk=.f61223e7-fb77-41ae-bb5d-f32244505b5b@github.com> Message-ID: On Tue, 18 Nov 2025 10:48:26 GMT, Ivan Walulya wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into lock-free-stack-allows-new-atomic >> - rename next_access to next_accessor >> - LockFreeStack supports Atomic > > Marked as reviewed by iwalulya (Reviewer). Thanks for reviews @walulyai and @dholmes-ora ------------- PR Comment: https://git.openjdk.org/jdk/pull/28329#issuecomment-3554431204 From kbarrett at openjdk.org Wed Nov 19 20:07:57 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 19 Nov 2025 20:07:57 GMT Subject: Integrated: 8371923: Update LockFreeStack for Atomic In-Reply-To: <8T9_DwoQj-__PXY0iLOJTyWMbXp3sqSz-CXGxZVbrRk=.f61223e7-fb77-41ae-bb5d-f32244505b5b@github.com> References: <8T9_DwoQj-__PXY0iLOJTyWMbXp3sqSz-CXGxZVbrRk=.f61223e7-fb77-41ae-bb5d-f32244505b5b@github.com> Message-ID: On Fri, 14 Nov 2025 18:35:10 GMT, Kim Barrett wrote: > Please review this change to the `LockFreeStack` utility to allow clients to > use `Atomic` as the type of the "next" member used in the linked-list > representation of the stack. It also continues to allow clients to use the old > (pre-`Atomic`) form where the "next" member is volatile. 
This allows > clients to be updated incrementally after this change, rather than requiring > all clients to be updated in conjunction with the update of this class. Once > all clients have been updated, support for the old form can be removed. > > The associated gtests have been updated to use `Atomic`, with testing of > the old form is no longer being done. The non-updated uses provide some > testing, and that's all expected to go away soon. So parameterizing the gtests > for both forms seems like a bunch of work that will just be deleted soon, with > very little benefit. > > Testing: mach5 tier1 This pull request has now been integrated. Changeset: 6f1c5733 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/6f1c5733ed4a1d1a1e099681f1f292acf827d9dc Stats: 53 lines in 2 files changed: 21 ins; 0 del; 32 mod 8371923: Update LockFreeStack for Atomic Reviewed-by: iwalulya, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/28329 From kbarrett at openjdk.org Wed Nov 19 20:21:38 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 19 Nov 2025 20:21:38 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v6] In-Reply-To: References: Message-ID: On Wed, 19 Nov 2025 14:03:32 GMT, Anton Artemov wrote: >> src/hotspot/share/utilities/spinCriticalSection.hpp line 45: >> >>> 43: class SpinCriticalSection { >>> 44: private: >>> 45: volatile int* const _lock; >> >> Use new `Atomic` rather than introducing new direct uses of `AtomicAccess`. > > This will require somewhat extensive changes in JFR, as the are using the same thing for their JfrTryLock. OK, leave it to the "upgrade all the world to use `Atomic`" task with it's many subtasks. (There isn't such a thing in JBS; I've just been linking new RFEs to JDK-8367013.) 
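
For readers following the `_lock` discussion, here is a minimal sketch of a scoped spin critical section over an `int`-sized lock word. It is an assumption-laden illustration, not the code under review: `ScopedSpinLock` is a hypothetical name, and it uses standard `std::atomic<int>` rather than HotSpot's `Atomic` or `AtomicAccess`.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical RAII guard: acquires the lock word in the constructor and
// releases it in the destructor. 0 means free, 1 means held.
class ScopedSpinLock {
  std::atomic<int>& _lock;

 public:
  explicit ScopedSpinLock(std::atomic<int>& lock) : _lock(lock) {
    int expected = 0;
    // Spin until we change the word from 0 to 1. compare_exchange_weak
    // overwrites `expected` on failure, so reset it each time round.
    while (!_lock.compare_exchange_weak(expected, 1,
                                        std::memory_order_acquire,
                                        std::memory_order_relaxed)) {
      expected = 0;
      std::this_thread::yield();
    }
  }

  ~ScopedSpinLock() { _lock.store(0, std::memory_order_release); }

  ScopedSpinLock(const ScopedSpinLock&) = delete;
  ScopedSpinLock& operator=(const ScopedSpinLock&) = delete;
};
```

A caller would declare a `std::atomic<int>` lock word and wrap the short critical section in a block containing a `ScopedSpinLock` local, so the release happens on every exit path.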
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2543432729

From kbarrett at openjdk.org  Wed Nov 19 20:29:22 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Wed, 19 Nov 2025 20:29:22 GMT
Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v6]
In-Reply-To: 
References: 
Message-ID: 

On Wed, 19 Nov 2025 14:04:41 GMT, Anton Artemov wrote:

>> src/hotspot/share/utilities/spinCriticalSection.hpp line 58:
>> 
>>> 56: class SpinSingleSection {
>>> 57: private:
>>> 58: volatile int* const _lock;
>> 
>> Why an `int`-type value for the lock, rather than `bool`? I know why, but it should probably be
>> stated explicitly, else someone might be tempted to change it in the future.
> 
> I believe it is for historical reasons?

I think it's because `bool` is (typically? always? I don't recall) of size 1, and size 1 cmpxchg isn't supported by a lot of hardware, so we emulate it for those. Mostly we don't worry about that, but this is supposed to be low-level and performance-critical (else why aren't we just using a mutex?).

Or maybe it is historical, and the starting point for some of this code is so old that it predates our emulation of 1-byte cmpxchg for some platforms. :)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2543455013

From jrose at openjdk.org  Wed Nov 19 20:35:01 2025
From: jrose at openjdk.org (John R Rose)
Date: Wed, 19 Nov 2025 20:35:01 GMT
Subject: RFR: 8369187: Add wrapper for that forbids use of global allocation and deallocation functions [v3]
In-Reply-To: 
References: 
Message-ID: 

On Sun, 16 Nov 2025 01:10:40 GMT, Kim Barrett wrote:

>> 8369187: Add wrapper for that forbids use of global allocation and deallocation functions
>> 
>> Please review this change that adds `cppstdlib/new.hpp` as a wrapper for
>> including ``. All existing inclusions of `` are changed to include
>> the new wrapper.
>> 
>> In addition to including ``, this wrapper also provides deprecation
>> declarations to prevent the use of some facilities by HotSpot code.
>> 
>> However, those deprecations need to be conditionalized to not apply to gtests,
>> so this change also adds a macro definition provided by the build system for
>> use in detecting that a header is being included by a gtest.
>> 
>> Testing: mach5 tier1
> 
> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits:
> 
> - Merge branch 'master' into wrap-stdlib-new
> - poison implicit alloc/dealloc in globalDefinitions
> - Merge branch 'master' into wrap-stdlib-new
> - further conditionalize deprecation of hardare interference sizes
> - add wrapper for

Does what it says on the tin; good.

(The stuff about hardware interference sizes seems a distraction, but I accept it's part of the modern API for dynamic allocation, so away it goes. We can and should make our own platform-level queries as needed for this sort of thing.)

-------------

Marked as reviewed by jrose (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/28250#pullrequestreview-3484631257

From kbarrett at openjdk.org  Wed Nov 19 20:39:09 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Wed, 19 Nov 2025 20:39:09 GMT
Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v7]
In-Reply-To: 
References: 
Message-ID: 

On Wed, 19 Nov 2025 14:09:53 GMT, Anton Artemov wrote:

>> Hi,
>> 
>> please consider the following changes:
>> 
>> In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code.
>> >> Additionally, `SpinSingleSection` class is added, which allows to execute a payload code inside of a lambda function by only one thread. >> >> Tested in tiers 1 - 5. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8366671: Addressed reviewer's comments Changes requested by kbarrett (Reviewer). src/hotspot/share/utilities/spinCriticalSection.hpp line 30: > 28: #include "runtime/javaThread.hpp" > 29: > 30: class SpinCriticalSectionHelper : AllStatic { SpinCriticalSectionHelper is only used (and only usable by) SpinCriticalSection. There doesn't seem to be any benefit to the separation now. src/hotspot/share/utilities/spinCriticalSection.hpp line 43: > 41: // which employs a spin lock. > 42: // > 43: // We use this critical section _only for low-contention code, and Either s/_only/_only_/ or s/_only/only/ ------------- PR Review: https://git.openjdk.org/jdk/pull/28264#pullrequestreview-3484643323 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2543485977 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2543489482 From kbarrett at openjdk.org Wed Nov 19 20:59:01 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 19 Nov 2025 20:59:01 GMT Subject: Integrated: 8369187: Add wrapper for that forbids use of global allocation and deallocation functions In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 05:49:27 GMT, Kim Barrett wrote: > 8369187: Add wrapper for that forbids use of global allocation and deallocation functions > > Please review this change that adds `cppstdlib/new.hpp` as a wrapper for > including ``. All existing inclusions of `` are changed to include > the new wrapper. > > In additional to including ``, this wrapper also provides deprecation > declarations to prevent the use of some facilities by HotSpot code. 
> > However, those deprecations need to be conditionalized to not apply to gtests, > so this change also adds a macro definition provided by the build system for > use in detecting that a header is being included by a gtest. > > Testing: mach5 tier1 This pull request has now been integrated. Changeset: f5bc6ee9 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/f5bc6ee90d73da00cab5cad283b9517c692bc895 Stats: 207 lines in 15 files changed: 187 ins; 20 del; 0 mod 8369187: Add wrapper for that forbids use of global allocation and deallocation functions Reviewed-by: stefank, erikj, jrose ------------- PR: https://git.openjdk.org/jdk/pull/28250 From kbarrett at openjdk.org Wed Nov 19 20:58:58 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 19 Nov 2025 20:58:58 GMT Subject: RFR: 8369187: Add wrapper for that forbids use of global allocation and deallocation functions [v3] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 09:10:46 GMT, Stefan Karlsson wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: >> >> - Merge branch 'master' into wrap-stdlib-new >> - poison implicit alloc/dealloc in globalDefinitions >> - Merge branch 'master' into wrap-stdlib-new >> - further conditionalize deprecation of hardare interference sizes >> - add wrapper for > > The last change seems reasonable to me. 
Thanks for reviews @stefank , @erikj79 , and @rose00 ------------- PR Comment: https://git.openjdk.org/jdk/pull/28250#issuecomment-3554597814 From skuksenko at openjdk.org Wed Nov 19 21:54:40 2025 From: skuksenko at openjdk.org (Sergey Kuksenko) Date: Wed, 19 Nov 2025 21:54:40 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v2] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 23:35:44 GMT, Volodymyr Paprotski wrote: >> - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline >> - `SignatureBench.MLDSA` is 1.2x-2.2x faster >> - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) >> - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version >> - `SignatureBench.MLDSA` is upto 5% faster, never slower >> >> Note on intrinsic: >> - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. 
>> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 >> >> Tests and benchmarks: >> - Added a fuzz test to ensure Java and intrinsic produces exactly same result >> - Added benchmark to measure the performance of intrinsic itself >> >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" > > Volodymyr Paprotski has updated the pull request incrementally with two additional commits since the last revision: > > - whitespace > - address first comments What is the reason to add a new microbenchmark? 
We already have enough micros covering MLDSA:

org.openjdk.bench.javax.crypto.full.KeyPairGeneratorBench.MLDSA.generateKeyPair
org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA.sign
org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA.verify
org.openjdk.bench.javax.crypto.small.KeyPairGeneratorBench.MLDSA.generateKeyPair
org.openjdk.bench.javax.crypto.small.SignatureBench.MLDSA.sign
org.openjdk.bench.javax.crypto.small.SignatureBench.MLDSA.verify

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3554770437

From vpaprotski at openjdk.org  Wed Nov 19 22:03:09 2025
From: vpaprotski at openjdk.org (Volodymyr Paprotski)
Date: Wed, 19 Nov 2025 22:03:09 GMT
Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v2]
In-Reply-To: 
References: 
Message-ID: 

On Wed, 19 Nov 2025 21:51:32 GMT, Sergey Kuksenko wrote:

> What is the reason to add a new microbenchmark? We already have enough micros covering MLDSA:
> 
> org.openjdk.bench.javax.crypto.full.KeyPairGeneratorBench.MLDSA.generateKeyPair
> org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA.sign
> org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA.verify
> org.openjdk.bench.javax.crypto.small.KeyPairGeneratorBench.MLDSA.generateKeyPair
> org.openjdk.bench.javax.crypto.small.SignatureBench.MLDSA.sign
> org.openjdk.bench.javax.crypto.small.SignatureBench.MLDSA.verify

I can definitely remove it, got no strong attachment to it.. I did find it useful during development and thought it might be useful during review to verify performance.. but the usefulness of it beyond that is indeed debatable.

You might notice it's a lot more 'granular'; it measures the performance of the intrinsics themselves, not the ("10-level deep") "wrappers". That said, those "wrappers" are what the actual user will see and what we should be measuring.

This new benchmark is only useful to another intrinsic developer.. (but it should already be usable by other platforms not just Intel?)
------------- PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3554793447 From macarte at openjdk.org Wed Nov 19 22:40:54 2025 From: macarte at openjdk.org (Mat Carter) Date: Wed, 19 Nov 2025 22:40:54 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v8] In-Reply-To: <8XzXNt3iOeijZtZWB_zdMoWLPadJkgEbmaoqZQjEH1A=.a9fca7c1-9e42-4209-b21f-08af5554d344@github.com> References: <3DZMFG5pUixBip4O18gylfQpcCOTFxcwwVTWahRMBYo=.c9cb089e-6031-4b77-bb4a-775ed6cac818@github.com> <8XzXNt3iOeijZtZWB_zdMoWLPadJkgEbmaoqZQjEH1A=.a9fca7c1-9e42-4209-b21f-08af5554d344@github.com> Message-ID: On Mon, 3 Nov 2025 11:01:56 GMT, Alan Bateman wrote: >> I also removed the nested {@code ..} from within the as that also caused an issue > > Good. You can move the example to a snippet too and that will allow the `
    ` tags to go away.
    
    In the latest CSR the code snippet was removed
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28010#discussion_r2543782660
    
    From skuksenko at openjdk.org  Wed Nov 19 22:46:00 2025
    From: skuksenko at openjdk.org (Sergey Kuksenko)
    Date: Wed, 19 Nov 2025 22:46:00 GMT
    Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements
     [v2]
    In-Reply-To: 
    References: 
     
     
     
    Message-ID: 
    
    On Wed, 19 Nov 2025 21:59:03 GMT, Volodymyr Paprotski  wrote:
    
    > > What is the reason to add a new microbenchmark? We already have enough micros covering MLDSA:
    > > org.openjdk.bench.javax.crypto.full.KeyPairGeneratorBench.MLDSA.generateKeyPair org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA.sign org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA.verify org.openjdk.bench.javax.crypto.small.KeyPairGeneratorBench.MLDSA.generateKeyPair org.openjdk.bench.javax.crypto.small.SignatureBench.MLDSA.sign org.openjdk.bench.javax.crypto.small.SignatureBench.MLDSA.verify
    > 
    > I can definitely remove it, got no strong attachment to it.. I did find it useful during development and thought it might be useful during review to verify performance.. but the usefulness of it beyond that is indeed debatable.
    > 
    > You might notice it's a lot more 'granular'; it measures the performance of the intrinsics themselves, not the ("10-level deep") "wrappers". That said, those "wrappers" are what the actual user will see and what we should be measuring.
    > 
    > This new benchmark is only useful to another intrinsic developer.. (but it should already be usable by other platforms not just Intel?)
    
    I understand your reasons. The question is whether you'll need the microbenchmark in the future. If not (or probably not), please remove the micro.
    If it is needed, please move it from the "org.openjdk.bench.javax.crypto.full" package to "org.openjdk.bench.javax.crypto". The "small" and "full" packages are supposed to contain only public API micros.
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3554914771
    
    From sspitsyn at openjdk.org  Thu Nov 20 02:31:46 2025
    From: sspitsyn at openjdk.org (Serguei Spitsyn)
    Date: Thu, 20 Nov 2025 02:31:46 GMT
    Subject: RFR: 6960970: Debugger very slow during stepping
    Message-ID: 
    
    This change fixes a long-standing performance issue in debugger single stepping, which uses JVMTI `FramePop` events as part of step-over handling. The issue is that the target thread continues executing in the very slow `interp-only` mode while in the context of a frame marked for `FramePop` notification with JVMTI `NotifyFramePop`. This includes, recursively, all other method calls made until the frame returns.
    
    The fix avoids enabling the `interp-only` mode for threads when `FramePop` events are enabled with JVMTI `SetEventNotificationMode`. Instead, the target frame is deoptimized and kept interpreted by disabling `OSR` in the function `InterpreterRuntime::frequency_counter_overflow_inner()`. Additionally, some tweaks are applied in several places where `java_thread->is_interp_only_mode()` is checked.
    Further details will be provided in the first PR comment.
    A CSR is being considered for this update, as `FramePop` events no longer enforce the `interp-only` mode for a target thread, which might break some expectations (the behavior has changed).
    
    Testing:
     - test `serviceability/jvmti/vthread/ThreadStateTest` was updated to provide some extra test coverage
     - submitted mach5 tiers 1-6
    
    -------------
    
    Commit messages:
     - 6960970: Debugger very slow during stepping
    
    Changes: https://git.openjdk.org/jdk/pull/28407/files
      Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28407&range=00
      Issue: https://bugs.openjdk.org/browse/JDK-6960970
      Stats: 282 lines in 21 files changed: 200 ins; 57 del; 25 mod
      Patch: https://git.openjdk.org/jdk/pull/28407.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28407/head:pull/28407
    
    PR: https://git.openjdk.org/jdk/pull/28407
    
    From sspitsyn at openjdk.org  Thu Nov 20 04:45:24 2025
    From: sspitsyn at openjdk.org (Serguei Spitsyn)
    Date: Thu, 20 Nov 2025 04:45:24 GMT
    Subject: RFR: 6960970: Debugger very slow during stepping [v2]
    In-Reply-To: 
    References: 
    Message-ID: 
    
    > This change fixes a long-standing performance issue in debugger single stepping, which uses JVMTI `FramePop` events as part of step-over handling. The issue is that the target thread continues executing in the very slow `interp-only` mode while in the context of a frame marked for `FramePop` notification with JVMTI `NotifyFramePop`. This includes, recursively, all other method calls made until the frame returns.
    > 
    > The fix avoids enabling the `interp-only` mode for threads when `FramePop` events are enabled with JVMTI `SetEventNotificationMode`. Instead, the target frame is deoptimized and kept interpreted by disabling `OSR` in the function `InterpreterRuntime::frequency_counter_overflow_inner()`. Additionally, some tweaks are applied in several places where `java_thread->is_interp_only_mode()` is checked.
    > Further details will be provided in the first PR comment.
    > A CSR is being considered for this update, as `FramePop` events no longer enforce the `interp-only` mode for a target thread, which might break some expectations (the behavior has changed).
    > 
    > Testing:
    >  - test `serviceability/jvmti/vthread/ThreadStateTest` was updated to provide some extra test coverage
    >  - submitted mach5 tiers 1-6
    
    Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision:
    
      cleanup: removed an old code fragment in frame.cpp
    
    -------------
    
    Changes:
      - all: https://git.openjdk.org/jdk/pull/28407/files
      - new: https://git.openjdk.org/jdk/pull/28407/files/d31aefb6..b3cffe5a
    
    Webrevs:
     - full: https://webrevs.openjdk.org/?repo=jdk&pr=28407&range=01
     - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28407&range=00-01
    
      Stats: 14 lines in 1 file changed: 0 ins; 14 del; 0 mod
      Patch: https://git.openjdk.org/jdk/pull/28407.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28407/head:pull/28407
    
    PR: https://git.openjdk.org/jdk/pull/28407
    
    From sspitsyn at openjdk.org  Thu Nov 20 04:53:52 2025
    From: sspitsyn at openjdk.org (Serguei Spitsyn)
    Date: Thu, 20 Nov 2025 04:53:52 GMT
    Subject: RFR: 6960970: Debugger very slow during stepping [v2]
    In-Reply-To: 
    References: 
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 04:45:24 GMT, Serguei Spitsyn  wrote:
    
    >> This change fixes a long-standing performance issue in debugger single stepping, which uses JVMTI `FramePop` events as part of step-over handling. The issue is that the target thread continues executing in the very slow `interp-only` mode while in the context of a frame marked for `FramePop` notification with JVMTI `NotifyFramePop`. This includes, recursively, all other method calls made until the frame returns.
    >> 
    >> The fix avoids enabling the `interp-only` mode for threads when `FramePop` events are enabled with JVMTI `SetEventNotificationMode`. Instead, the target frame is deoptimized and kept interpreted by disabling `OSR` in the function `InterpreterRuntime::frequency_counter_overflow_inner()`. Additionally, some tweaks are applied in several places where `java_thread->is_interp_only_mode()` is checked.
    >> Further details will be provided in the first PR comment.
    >> A CSR is being considered for this update, as `FramePop` events no longer enforce the `interp-only` mode for a target thread, which might break some expectations (the behavior has changed).
    >> 
    >> Testing:
    >>  - test `serviceability/jvmti/vthread/ThreadStateTest` was updated to provide some extra test coverage
    >>  - submitted mach5 tiers 1-6
    >
    > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision:
    > 
    >   cleanup: removed an old code fragment in frame.cpp
    
    Change details:
     - platform-dependent `interp_masm*.cpp` and `zero/bytecodeInterpreter.cpp`:
        removed `interp-only` mode check before call to `InterpreterRuntime::post_method_exit()`
     - `interpreterRuntime.cpp`: disallow OSR for frames marked for `FramePop` event notification
     - `jvmtiEnvBase.cpp`:
       - added frame deoptimization logic to JVMTI `NotifyFramePop` implementation
       - added `check_and_clear_vthread_pending_deopts()` calls to JVMTI `ClearAllFramePops` implementation
     - `jvmtiEnvThreadState.cpp`: removed undesired check `!jvmti_thread_state()->is_interp_only_mode()` from `JvmtiEnvThreadState::is_frame_pop()` function
     - `jvmtiEventController.cpp`:
        - `JvmtiEventControllerPrivate::recompute_thread_enabled()`: removed `ets->has_frame_pops()` check when the `interp-only` mode is enabled
     - `jvmtiExport.?pp`: 
       - introduced new function `has_frame_pop_for_top_frame()` used to disallow `OSR` for target frame
       - `JvmtiExport::post_method_exit()`: replaced `state->is_interp_only_mode()` check with `current_frame.is_interpreted_frame()`
     - `jvmtiThreadState.?pp`:
       - implementation of `process_vthread_pending_deopts()` logic
       - `JvmtiThreadState::process_pending_interp_only()`: added `MutexLocker mu(JvmtiThreadState_lock)`
       - `JvmtiThreadState::update_for_pop_top_frame()`: removed `is_interp_only_mode()` check
     - `frame.cpp`:
        - fix provided by Patricio (Thank you!): an assert specific to continuation heap frames is fixed
     - `serviceability/jvmti/events/NotifyFramePopStressTest`: minor tweak to help with debugging
     - `serviceability/jvmti/vthread/ThreadStateTest`: added code to provide extra test coverage from JVMTI `FramePop` events
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28407#issuecomment-3555778224
    
    From dholmes at openjdk.org  Thu Nov 20 05:13:52 2025
    From: dholmes at openjdk.org (David Holmes)
    Date: Thu, 20 Nov 2025 05:13:52 GMT
    Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error:
     applying non-zero offset 1073741824 to null pointer [v11]
    In-Reply-To: 
    References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com>
     
     <1p_wEewR-A5FFkJasTnjbE4brFCQIUNp7hmP8WfhV6g=.9ab47c1c-b749-484c-b282-9ad678a06d13@github.com>
     
    Message-ID: 
    
    On Wed, 19 Nov 2025 13:45:11 GMT, Afshin Zafari  wrote:
    
    >> src/hotspot/share/memory/memoryReserver.cpp line 586:
    >> 
    >>> 584:                                           lowest_start, highest_start);
    >>> 585:       reserved = try_reserve_range((char*)highest_start, (char*)lowest_start, attach_point_alignment,
    >>> 586:                                    (char*)aligned_heap_base_min_address, (char*)UnscaledOopHeapMax, size, alignment, page_size);
    >> 
    >> Not obvious to me this actually improves anything - what is it fixing?
    >
    > First, the pointer arithmetic is done on `uintptr_t` values to avoid UB.
    > Second, it is checked that `lowest` and `highest` are still valid after becoming larger or smaller, respectively.
    
    I think it would be more accurate to say we are avoiding actual pointer arithmetic here.
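
To make the distinction concrete, here is a minimal sketch of doing the address computation on integers and converting to a pointer only at the boundary. It is an illustration under assumptions, not the code from the PR: `checked_add` is a hypothetical helper, and the offset value is taken from the UBSan report in the subject line.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: add `offset` to `base` as plain unsigned integers.
// `base` may legitimately be 0 here, whereas `(char*)nullptr + offset`
// is the undefined behavior that UBSan reports.
// Returns false, leaving *out untouched, if the sum would wrap around.
bool checked_add(uintptr_t base, size_t offset, uintptr_t* out) {
  if (offset > UINTPTR_MAX - base) {
    return false;  // would overflow the address space
  }
  *out = base + offset;
  return true;
}

// Convert to a pointer only when an API needs one; no arithmetic is
// performed on the pointer itself.
inline char* as_address(uintptr_t value) {
  return reinterpret_cast<char*>(value);
}
```

The range check also covers the second point quoted above: a candidate address that would wrap is rejected before it is ever turned into a pointer.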
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/26955#discussion_r2544355021
    
    From dholmes at openjdk.org  Thu Nov 20 05:18:57 2025
    From: dholmes at openjdk.org (David Holmes)
    Date: Thu, 20 Nov 2025 05:18:57 GMT
    Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error:
     applying non-zero offset 1073741824 to null pointer [v11]
    In-Reply-To: 
    References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com>
     
     <1p_wEewR-A5FFkJasTnjbE4brFCQIUNp7hmP8WfhV6g=.9ab47c1c-b749-484c-b282-9ad678a06d13@github.com>
     
    Message-ID: 
    
    On Wed, 19 Nov 2025 13:48:41 GMT, Afshin Zafari  wrote:
    
    >> src/hotspot/share/memory/memoryReserver.cpp line 590:
    >> 
    >>> 588: 
    >>> 589:     // zerobased: Attempt to allocate in the lower 32G.
    >>> 590:     size_t zerobased_max = OopEncodingHeapMax;
    >> 
    >> Again not obvious what this improves. We obviously have very inconsistent use of types here in that we loosely use `char*`, `uint64_t` and `size_t` to all mean a 64-bit unsigned value, and no matter what types we use in the declarations we have to cast something somewhere.
    >
    > Following the reviewers' suggestions, the pointers used in arithmetic are typed as integers such as `size_t` or `uintptr_t`. They are cast to the desired pointer types only when they are passed as pointers to other functions.
    
    Okay but why `size_t` in places and `uintptr_t` in others? In this case `zerobased_max` seems an address not a size - similar to `highest_start` and `lowest_start` in the other part of the change. But then `OopEncodingHeapMax` is `uint64_t` so why not use that?
    
    I'm just not seeing the rules that are being applied here.
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/26955#discussion_r2544365753
    
    From dholmes at openjdk.org  Thu Nov 20 06:10:57 2025
    From: dholmes at openjdk.org (David Holmes)
    Date: Thu, 20 Nov 2025 06:10:57 GMT
    Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease
     [v6]
    In-Reply-To: 
    References: 
     
     
     
     
    Message-ID: 
    
    On Wed, 19 Nov 2025 20:26:02 GMT, Kim Barrett  wrote:
    
    >> I believe it is for historical reasons?
    >
    > I think it's because `bool` is (typically? always? I don't recall) of size 1,
    > and size 1 cmpxchg isn't supported by a lot of hardware, so we emulate it for
    > those. Mostly we don't worry about that, but this is supposed to be low-level
    > and performance-critical (else why aren't we just using a mutex?). Or maybe it
    > is historical, and the starting point for some of this code is so old that it
    > predates our emulation of 1-byte cmpxchg for some platforms. :)
    
    I think it is because a 32-bit atomic op is most likely the most performant variant even if larger and smaller entities are supported by an architecture.
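To make the word-size point concrete, here is a rough sketch of a scoped spin lock built on a 32-bit lock word rather than a `bool`. It is an illustration only: `SpinGuard` is a hypothetical name, it uses `std::atomic` instead of HotSpot's own atomic primitives, and the back-off (a plain yield) is an assumption, not what `Thread::SpinAcquire` does.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical sketch, not the HotSpot SpinCriticalSection.
// The lock word is a 32-bit integer: 0 means free, 1 means held.
class SpinGuard {
  std::atomic<int32_t>* _lock;

 public:
  explicit SpinGuard(std::atomic<int32_t>* lock) : _lock(lock) {
    assert(_lock != nullptr);
    int32_t expected = 0;
    // 32-bit compare-and-exchange; retry until the word flips from 0 to 1.
    while (!_lock->compare_exchange_weak(expected, 1,
                                         std::memory_order_acquire,
                                         std::memory_order_relaxed)) {
      expected = 0;               // a failed exchange overwrote 'expected'
      std::this_thread::yield();  // assumed back-off policy
    }
  }

  // Releasing the lock is a plain store with release ordering.
  ~SpinGuard() { _lock->store(0, std::memory_order_release); }

  SpinGuard(const SpinGuard&) = delete;
  SpinGuard& operator=(const SpinGuard&) = delete;
};
```

A caller would write `{ SpinGuard guard(&lock_word); /* short critical section */ }`, so the lock is released on every exit path from the scope.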
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2544469222
    
    From dholmes at openjdk.org  Thu Nov 20 06:20:12 2025
    From: dholmes at openjdk.org (David Holmes)
    Date: Thu, 20 Nov 2025 06:20:12 GMT
    Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease
     [v7]
    In-Reply-To: 
    References: 
     
    Message-ID: 
    
    On Wed, 19 Nov 2025 14:09:53 GMT, Anton Artemov  wrote:
    
    >> Hi, 
    >> 
    >> please consider the following changes:
    >> 
    >> In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. 
    >> 
    >> Additionally, a `SpinSingleSection` class is added, which allows payload code inside a lambda function to be executed by only one thread. 
    >> 
    >> Tested in tiers 1 - 5.
    >
    > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision:
    > 
    >   8366671: Addressed reviewer's comments
    
    I am happier with this reduced form but as per comments don't change the implementation now please.
    
    Thanks
    
    src/hotspot/share/utilities/spinCriticalSection.cpp line 34:
    
    > 32:   }
    > 33: 
    > 34:   SpinYield sy(4096, 5, 1000000);
    
    This is supposed to be a refactor. If you want to change the implementation then I suggest a separate PR with full performance analysis.
    
    src/hotspot/share/utilities/spinCriticalSection.hpp line 33:
    
    > 31:   friend class SpinCriticalSection;
    > 32:   template
    > 33:   friend class SpinSingleSection;
    
    Shouldn't this be removed?
    
    -------------
    
    Changes requested by dholmes (Reviewer).
    
    PR Review: https://git.openjdk.org/jdk/pull/28264#pullrequestreview-3485868831
    PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2544483791
    PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2544486968
    
    From dholmes at openjdk.org  Thu Nov 20 06:20:14 2025
    From: dholmes at openjdk.org (David Holmes)
    Date: Thu, 20 Nov 2025 06:20:14 GMT
    Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease
     [v7]
    In-Reply-To: 
    References: 
     
     
    Message-ID: 
    
    On Wed, 19 Nov 2025 20:34:18 GMT, Kim Barrett  wrote:
    
    >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision:
    >> 
    >>   8366671: Addressed reviewer's comments
    >
    > src/hotspot/share/utilities/spinCriticalSection.hpp line 30:
    > 
    >> 28: #include "runtime/javaThread.hpp"
    >> 29: 
    >> 30: class SpinCriticalSectionHelper : AllStatic {
    > 
    > SpinCriticalSectionHelper is only used (and only usable by)
    > SpinCriticalSection. There doesn't seem to be any benefit to the separation
    > now.
    
    I agree.
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2544485815
    
    From aboldtch at openjdk.org  Thu Nov 20 06:20:45 2025
    From: aboldtch at openjdk.org (Axel Boldt-Christmas)
    Date: Thu, 20 Nov 2025 06:20:45 GMT
    Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error:
     applying non-zero offset 1073741824 to null pointer [v11]
    In-Reply-To: 
    References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com>
     
     <1p_wEewR-A5FFkJasTnjbE4brFCQIUNp7hmP8WfhV6g=.9ab47c1c-b749-484c-b282-9ad678a06d13@github.com>
     
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 05:15:48 GMT, David Holmes  wrote:
    
    >> Following the reviewers' suggestions, the pointers used in arithmetic are typed as integers such as `size_t` or `uintptr_t`. They are cast to the desired pointer types only when they are passed as pointers to other functions.
    >
    > Okay but why `size_t` in places and `uintptr_t` in others? In this case `zerobased_max` seems an address not a size - similar to `highest_start` and `lowest_start` in the other part of the change. But then `OopEncodingHeapMax` is `uint64_t` so why not use that?
    > 
    > I'm just not seeing the rules that are being applied here.
    
    Not using `uint64_t` was, I think, pragmatic, because it is a different type than `uintptr_t` (on macOS, iirc): one is `unsigned long long` and the other is `unsigned long`. That causes issues with overload resolution for templated functions. (Maybe that was just the issue with the similarly typed `UnscaledOopHeapMax`.)
    
    I think `OopEncodingHeapMax` is unfortunately typed, though that might be intentional, because we use it in two ways:
    either as the `Maximal size of heap`, or as the `zero based address: 0 + OopEncodingHeapMax` (the max end address of a heap of that maximal size). In the first case the more natural type is `size_t`, and in the second it is `uintptr_t`. 
    
    Right here though I agree type should be `uintptr_t`. We are using it as the max address our heap can end in.
    
    I would much rather we had 
    ```c++
      const uintptr_t zerobased_max = OopEncodingHeapMax;
    ```
    
    In my opinion `UnscaledOopHeapMax` and `OopEncodingHeapMax` should be typed as `size_t`, better named (to reflect their compressed oop nature and that they relate to the `Maximal size of heap`) and only be available in a 64-bit VM (as using these in a 32-bit VM smells buggy).
    And when we want to use it as the max end address we put it in a `uintptr_t` typed variable.
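A small sketch of the typing issue, under stated assumptions: `HypotheticalHeapMax` stands in for a 32G limit like the one discussed, `min_of` stands in for any template that requires both arguments to have the same type, and the `static_assert` encodes the "64-bit only" suggestion. None of these names exist in HotSpot.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Assumption for this sketch: addresses are 64 bits wide.
static_assert(sizeof(uintptr_t) == 8, "sketch assumes a 64-bit target");

// Whether these two typedefs name the same type depends on the platform:
// both may be 'unsigned long', or one may be 'unsigned long long'.
constexpr bool same_underlying_type =
    std::is_same<uint64_t, uintptr_t>::value;

// Hypothetical stand-in for a templated helper with a single type parameter.
// Calling min_of(a_uintptr, a_uint64) fails to deduce T on platforms where
// the two typedefs differ, which is the overload/deduction problem above.
template <typename T>
constexpr T min_of(T a, T b) { return b < a ? b : a; }

// Hypothetical limit expressed as a size: 32G.
constexpr uint64_t HypotheticalHeapMax = uint64_t(32) * 1024 * 1024 * 1024;

// When the limit is used as a maximum end address, copy it into a
// uintptr_t-typed variable first; every later use then has one type.
inline uintptr_t clamp_end(uintptr_t candidate_end) {
  const uintptr_t zerobased_max = static_cast<uintptr_t>(HypotheticalHeapMax);
  return min_of(candidate_end, zerobased_max);
}
```

The conversion happens once, at the point where the size is reinterpreted as an address, instead of at every comparison.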
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/26955#discussion_r2544488536
    
    From fyang at openjdk.org  Thu Nov 20 07:46:54 2025
    From: fyang at openjdk.org (Fei Yang)
    Date: Thu, 20 Nov 2025 07:46:54 GMT
    Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v5]
    In-Reply-To: 
    References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com>
     
    Message-ID: 
    
    On Tue, 18 Nov 2025 09:27:44 GMT, Hamlin Li  wrote:
    
    >> Hi,
    >> 
    >> This PR adds CMoveF/D on riscv, which enables vectorization of statements like `op_1 bop op_2 ? res_f_d_1 : res_f_d_2` in a loop.
    >> 
    >> This pr is also a preparation for further vectorization in https://github.com/openjdk/jdk/pull/28231.
    >> 
    >> Previously this was https://github.com/openjdk/jdk/pull/25341, but at that time C2 SLP had an issue with unsigned comparison, which is now fixed, so it's good to continue the work.
    >> 
    >> # Test
    >> ## Jtreg
    >> 
    >> in progress...
    >> 
    >> ## Performance
    >> 
    >> Column names meanings:
    >> * p: with patch
    >> * p+v: with patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on
    >> * m: without patch
    >> * m+v: without patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on
    >> 
    >> #### Average improvement
    >> 
    >> NOTE: On its own, this PR brings a performance benefit for `CMoveF+CmpF`, `CMoveD+CmpD`, `CMoveF+CmpI` and `CMoveD+CmpL`. The data below is based on fully implementing the vectorization of `CMoveI/L/F/D+CmpI/L/F/D`, which will be achieved by https://github.com/openjdk/jdk/pull/28231.
    >> 
    >> For details, check the performance data in https://github.com/openjdk/jdk/pull/25341 on riscv.
    >> 
    >> Opt (m/p) | Opt (m+v/p+v) | Opt (p/p+v) | Opt (m/p+v)
    >> -- | -- | -- | --
    >> 1.022782609 | 2.198717391 | 2.162673913 | 2.199
    >> 
    >> 
    >
    > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
    > 
    >   replace assert with log_warning
    
    @Hamlin-Li : Thanks for the update. I am having another look.
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28309#issuecomment-3556397859
    
    From jsjolen at openjdk.org  Thu Nov 20 08:36:02 2025
    From: jsjolen at openjdk.org (Johan Sjölen)
    Date: Thu, 20 Nov 2025 08:36:02 GMT
    Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v18]
    In-Reply-To: 
    References: 
    Message-ID: 
    
    > Hi,
    > 
    > This is a refactoring of the way that we store the Bootstrap method attribute in the ConstantPool class. We used to have a single `Array` which was divided into a section of `u4` offsets and a section which was the actual data. In this refactoring we make this split more clear, by actually allocating an `Array` to store the offsets in and an `Array` to store the data in. These arrays are then put into a `BSMAttributeEntries` class, which allows us to separate out the API from that of the rest of the `ConstantPool`.
    > 
    > We had multiple instances of the code knowing the layout of the operands array and using this to do 'clever' ways of copying and inserting data into it. See `ConstantPool::copy_operands` and `ConstantPool::resize_operands`. I felt like we could do things in a simpler way, so I added the `start_/end_extension` protocol and added the `InsertionIterator` for this. See `ClassFileParser::parse_classfile_bootstrap_methods_attribute` for how this works. I put several relevant definitions into the inline file in hopes of encouraging the compiler to optimize these appropriately.
    > 
    > For the Java SA code, I had to add a `U4Array` class. I also had to fix the vmstructs definitions, etc.
    > 
    > On the whole, while this code is a bit less terse, I think it's a good API improvement and the underlying implementation of splitting up the operands array is also an improvement.
    > 
    > Testing: Oracle Tier1-Tier5 has been run successfully multiple times. Before integration, I will merge with master and run these tiers again.
    
    Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
    
      IDE doesn't help you with VM structs!
    
    -------------
    
    Changes:
      - all: https://git.openjdk.org/jdk/pull/27198/files
      - new: https://git.openjdk.org/jdk/pull/27198/files/3686fc34..b6fe0bd7
    
    Webrevs:
     - full: https://webrevs.openjdk.org/?repo=jdk&pr=27198&range=17
     - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27198&range=16-17
    
      Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod
      Patch: https://git.openjdk.org/jdk/pull/27198.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/27198/head:pull/27198
    
    PR: https://git.openjdk.org/jdk/pull/27198
    
    From aboldtch at openjdk.org  Thu Nov 20 08:44:44 2025
    From: aboldtch at openjdk.org (Axel Boldt-Christmas)
    Date: Thu, 20 Nov 2025 08:44:44 GMT
    Subject: RFR: 8372241: Add GTestCheckers
    Message-ID: 
    
    Each GTest test case is intended to be able to run on its own (this is the design intent of the framework).
    
    Hotspot also adds some extra flavours of GTests, those that run with no created VM (`TEST`/`TEST_F`), those that run with a shared created VM (`TEST_VM`/`TEST_VM_F`) and those that run in a private created VM (`TEST_OTHER_VM`/`TEST_VM_ASSERT`/`TEST_VM_ASSERT_MSG`/`TEST_VM_FATAL_ERROR_MSG`/`TEST_VM_CRASH_SIGNAL`).
    
    The way this is implemented is by having the first shared VM test that runs create a VM. But this leads to all subsequent tests also having access to a shared VM, even if they are tests that are supposed to be able to run without a created VM.
    
    Combining this with the fact that almost all our automated GTest testing always runs all tests in the same order makes it hard to discover whether we have dependencies between tests.
    
    I propose we add three types of GTest invocation tests used to find these incorrect dependencies.
    
    1. A test which runs only our no-created-VM tests, to find whether we have any VM dependency.
    2. A test which runs only one test at a time, to see whether any tests depend on other tests having been run.
    3. A test which shuffles the order of our tests, to see whether there are any dependencies on the order of tests.
    
    Added 1. and 3. to tier1 as they are just as cheap or cheaper than the normal `GTestWrapper.java`. 2. is only added to our complement test groups, so `tier4` and `hotspot_misc`.  Also added a new test group `:hotspot_validate_gtest` which can be used to more easily run all three of these tests.
    
    3. will have some randomness, so failures may start popping up. It is quite easy to exclude tests via the filter. But the shuffle dependencies are probably the scariest if we find them: they might be just a test-running bug, like the one that is excluded here, but they can also expose broken assumptions or bring things we have missed to light.
    
    These tests and the chain of follow-up fixes are not of high priority. So far they have found nothing but bad test assumptions and incorrectly configured tests. So I have no problem letting this PR stay open until after the fork, so these tests can bake in the JDK 27 branch rather than the JDK 26 branch.
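The "one test at a time" and "shuffled order" checks can be sketched with a toy harness. This is an illustration only, not GoogleTest and not the proposed jtreg wrappers: `NamedTest`, `SharedState`, `run_isolated` and `run_shuffled` are hypothetical names, and the two demo tests are invented to show one test silently relying on another.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <random>
#include <string>
#include <vector>

// Hypothetical mini harness for illustration; this is not GoogleTest.
struct NamedTest {
  std::string name;
  std::function<bool()> body;  // returns true on success
};

// State shared between tests: the source of hidden dependencies.
struct SharedState {
  bool initialized = false;
};

// Two demo tests: "demo.coverage" silently relies on "demo.basic" having run.
inline std::vector<NamedTest> make_demo_tests(SharedState* state) {
  return {
    {"demo.basic",    [state] { state->initialized = true; return true; }},
    {"demo.coverage", [state] { return state->initialized; }},
  };
}

// Check 2: run every test alone, resetting shared state before each one.
inline std::vector<std::string> run_isolated(
    const std::vector<NamedTest>& tests, const std::function<void()>& reset) {
  assert(reset && "a reset hook is required");
  std::vector<std::string> failed;
  for (const NamedTest& t : tests) {
    reset();
    if (!t.body()) failed.push_back(t.name);
  }
  return failed;
}

// Check 3: run all tests once, in an order derived from the seed.
inline std::vector<std::string> run_shuffled(std::vector<NamedTest> tests,
                                             unsigned seed) {
  std::mt19937 rng(seed);
  std::shuffle(tests.begin(), tests.end(), rng);
  std::vector<std::string> failed;
  for (const NamedTest& t : tests) {
    if (!t.body()) failed.push_back(t.name);
  }
  return failed;
}
```

Running in isolation always exposes the dependent test, while a shuffled run exposes it only for orders that place it first, which is why the shuffled check is inherently random. For real GTest binaries, GoogleTest's own `--gtest_filter`, `--gtest_shuffle` and `--gtest_random_seed` flags provide the equivalent controls.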
    
    -------------
    
    Commit messages:
     - Add GTestCheckers
    
    Changes: https://git.openjdk.org/jdk/pull/28409/files
      Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28409&range=00
      Issue: https://bugs.openjdk.org/browse/JDK-8372241
      Stats: 323 lines in 6 files changed: 318 ins; 3 del; 2 mod
      Patch: https://git.openjdk.org/jdk/pull/28409.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28409/head:pull/28409
    
    PR: https://git.openjdk.org/jdk/pull/28409
    
    From aboldtch at openjdk.org  Thu Nov 20 09:34:34 2025
    From: aboldtch at openjdk.org (Axel Boldt-Christmas)
    Date: Thu, 20 Nov 2025 09:34:34 GMT
    Subject: RFR: 8372245: gtest globalDefinitions.format_specifiers cannot run
     without VM
    Message-ID: 
    
    globalDefinitions.format_specifiers uses `ResourceMark` which requires a created VM.
    Either we stop using resource allocations, or we run the test in VM.
    
    I propose we let this test simply use `stringStream::base` rather than `stringStream::as_string` which is an already managed string and the stream is in scope for as long as the string is used. The string is guaranteed to be valid as long as we do not write to the stream.
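The lifetime argument can be illustrated with a stand-in class. This is only a sketch: `MiniStream` is hypothetical and backed by `std::string`; it is not HotSpot's `stringStream`, and its copying accessor returns a `std::string` rather than a resource-area allocation.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for a string-building stream; not HotSpot code.
class MiniStream {
  std::string _buffer;

 public:
  void print(const char* s) {
    assert(s != nullptr);
    _buffer += s;
  }

  // Borrowed view of the internal buffer. It stays valid only while the
  // stream is alive and nothing further is written to it.
  const char* base() const { return _buffer.c_str(); }

  // Independent copy: costs an allocation, but outlives later writes.
  std::string as_copy() const { return _buffer; }
};

// Compares the formatted output against an expected string using only the
// borrowed view: the stream is in scope for the whole comparison and is not
// written to after base() is called, so no copy is needed.
inline bool formats_as(const char* expected, const char* a, const char* b) {
  MiniStream ss;
  ss.print(a);
  ss.print(b);
  return std::strcmp(ss.base(), expected) == 0;
}
```

The pattern to avoid is holding the pointer returned by `base()` across another `print()` call, since the buffer may be reallocated.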
    
    -------------
    
    Depends on: https://git.openjdk.org/jdk/pull/28409
    
    Commit messages:
     - gtest globalDefinitions.format_specifiers cannot run without VM
    
    Changes: https://git.openjdk.org/jdk/pull/28415/files
      Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28415&range=00
      Issue: https://bugs.openjdk.org/browse/JDK-8372245
      Stats: 4 lines in 2 files changed: 0 ins; 3 del; 1 mod
      Patch: https://git.openjdk.org/jdk/pull/28415.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28415/head:pull/28415
    
    PR: https://git.openjdk.org/jdk/pull/28415
    
    From aboldtch at openjdk.org  Thu Nov 20 09:39:44 2025
    From: aboldtch at openjdk.org (Axel Boldt-Christmas)
    Date: Thu, 20 Nov 2025 09:39:44 GMT
    Subject: RFR: 8372248: gtest istream.coverage depends on istream.basic
    Message-ID: 
    
    These two GTests have a strong dependency: `istream.coverage` must be run after `istream.basic`. This goes against the intended design of GTests (they should be independent).
    
    As such I propose we merge this into one test.
    
    I kept the two `VERBOSE` variables separate. However I am not sure I understand their purpose.
    Currently changing `VERBOSE_TEST` to `true` will cause the test to fail (not all cases are covered), and the value of `VERBOSE_COVERAGE` has no effect. What is observed:
     * `(VERBOSE_TEST: false, VERBOSE_COVERAGE: false) -> Success`
     * `(VERBOSE_TEST: false, VERBOSE_COVERAGE:  true) -> Success`
     * `(VERBOSE_TEST:  true, VERBOSE_COVERAGE: false) -> Failure`
     * `(VERBOSE_TEST:  true, VERBOSE_COVERAGE:  true) -> Failure`
    
    But I kept the original behaviour, just merged into one test case rather than two.
    
    -------------
    
    Depends on: https://git.openjdk.org/jdk/pull/28409
    
    Commit messages:
     - gtest istream.coverage depends on istream.basic
    
    Changes: https://git.openjdk.org/jdk/pull/28418/files
      Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28418&range=00
      Issue: https://bugs.openjdk.org/browse/JDK-8372248
      Stats: 8 lines in 2 files changed: 0 ins; 4 del; 4 mod
      Patch: https://git.openjdk.org/jdk/pull/28418.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28418/head:pull/28418
    
    PR: https://git.openjdk.org/jdk/pull/28418
    
    From aartemov at openjdk.org  Thu Nov 20 10:37:21 2025
    From: aartemov at openjdk.org (Anton Artemov)
    Date: Thu, 20 Nov 2025 10:37:21 GMT
    Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease
     [v8]
    In-Reply-To: 
    References: 
    Message-ID: 
    
    > Hi, 
    > 
    > please consider the following changes:
    > 
    > In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. 
    > 
    > Tested in tiers 1 - 5.
    
    Anton Artemov has updated the pull request incrementally with one additional commit since the last revision:
    
      8366671: Addressed reviewers' comments.
    
    -------------
    
    Changes:
      - all: https://git.openjdk.org/jdk/pull/28264/files
      - new: https://git.openjdk.org/jdk/pull/28264/files/dcc0a9b3..8ad139ab
    
    Webrevs:
     - full: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=07
     - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=06-07
    
      Stats: 40 lines in 2 files changed: 21 ins; 13 del; 6 mod
      Patch: https://git.openjdk.org/jdk/pull/28264.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28264/head:pull/28264
    
    PR: https://git.openjdk.org/jdk/pull/28264
    
    From aartemov at openjdk.org  Thu Nov 20 10:37:25 2025
    From: aartemov at openjdk.org (Anton Artemov)
    Date: Thu, 20 Nov 2025 10:37:25 GMT
    Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease
     [v7]
    In-Reply-To: 
    References: 
     
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 06:13:44 GMT, David Holmes  wrote:
    
    >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision:
    >> 
    >>   8366671: Addressed reviewer's comments
    >
    > src/hotspot/share/utilities/spinCriticalSection.cpp line 34:
    > 
    >> 32:   }
    >> 33: 
    >> 34:   SpinYield sy(4096, 5, 1000000);
    > 
    > This is supposed to be a refactor. If you want to change the implementation then I suggest a separate PR with full performance analysis.
    
    Well, I was not actually aware that `SpinYield` existed before @kimbarrett pointed it out. Agreed, let's do it in a separate PR. Rolled back.
    
    > src/hotspot/share/utilities/spinCriticalSection.hpp line 33:
    > 
    >> 31:   friend class SpinCriticalSection;
    >> 32:   template
    >> 33:   friend class SpinSingleSection;
    > 
    > Shouldn't this be removed?
    
    Removed with the whole helper class.
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2545364628
    PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2545365262
    
    From aartemov at openjdk.org  Thu Nov 20 10:37:26 2025
    From: aartemov at openjdk.org (Anton Artemov)
    Date: Thu, 20 Nov 2025 10:37:26 GMT
    Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease
     [v7]
    In-Reply-To: 
    References: 
     
     
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 06:14:51 GMT, David Holmes  wrote:
    
    >> src/hotspot/share/utilities/spinCriticalSection.hpp line 30:
    >> 
    >>> 28: #include "runtime/javaThread.hpp"
    >>> 29: 
    >>> 30: class SpinCriticalSectionHelper : AllStatic {
    >> 
    >> SpinCriticalSectionHelper is only used (and only usable by)
    >> SpinCriticalSection. There doesn't seem to be any benefit to the separation
    >> now.
    >
    > I agree.
    
    Removed in the latest commit.
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2545362459
    
    From aartemov at openjdk.org  Thu Nov 20 10:37:29 2025
    From: aartemov at openjdk.org (Anton Artemov)
    Date: Thu, 20 Nov 2025 10:37:29 GMT
    Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease
     [v7]
    In-Reply-To: 
    References: 
     
     
    Message-ID: 
    
    On Wed, 19 Nov 2025 20:34:56 GMT, Kim Barrett  wrote:
    
    >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision:
    >> 
    >>   8366671: Addressed reviewer's comments
    >
    > src/hotspot/share/utilities/spinCriticalSection.hpp line 43:
    > 
    >> 41: // which employs a spin lock.
    >> 42: //
    >> 43: // We use this critical section _only for low-contention code, and
    > 
    > Either s/_only/_only_/ or s/_only/only/
    
    Fixed.
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2545363319
    
    From aartemov at openjdk.org  Thu Nov 20 10:37:29 2025
    From: aartemov at openjdk.org (Anton Artemov)
    Date: Thu, 20 Nov 2025 10:37:29 GMT
    Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease
     [v6]
    In-Reply-To: 
    References: 
     
     
     
     
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 06:08:03 GMT, David Holmes  wrote:
    
    >> I think it's because `bool` is (typically? always? I don't recall) of size 1,
    >> and size 1 cmpxchg isn't supported by a lot of hardware, so we emulate it for
    >> those. Mostly we don't worry about that, but this is supposed to be low-level
    >> and performance-critical (else why aren't we just using a mutex?). Or maybe it
    >> is historical, and the starting point for some of this code is so old that it
    >> predates our emulation of 1-byte cmpxchg for some platforms. :)
    >
    > I think it is because a 32-bit atomic op is most likely the most performant variant even if larger and smaller entities are supported by an architecture.
    
    I added a comment about that.
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2545361857
    
    From fyang at openjdk.org  Thu Nov 20 11:19:33 2025
    From: fyang at openjdk.org (Fei Yang)
    Date: Thu, 20 Nov 2025 11:19:33 GMT
    Subject: RFR: 8371869: RISC-V: too many warnings when build on BPI-F3 SBC
     [v3]
    In-Reply-To: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com>
    References: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com>
    Message-ID: 
    
    > Hi, please consider this riscv-specific change.
    > 
    > I witnessed 400+ warning messages when doing a native build on BPI-F3 SBC running kernel 6.6.63:
    > 
    > `OpenJDK 64-Bit Server VM warning: Cannot enable UseZvfh, it's missing dependent extension(s) v (disabled), Zfh (enabled)`
    > 
    > The warning messages indicate that we won't auto-enable extensions like `Zvfh` due to lack of vector support on old kernels.
    > I think these warning messages could be confusing to people. It might be more reasonable to just log these messages.
    > This also unifies the logging by preferring `log_info`. It doesn't seem necessary to me to use `log_debug` in this case.
    > 
    > After this change, the log on BPI-F3 SBC looks like:
    > 
    > $ java -Xlog:all -version
    > 
    > ......
    > [0.011s][info][os         ] Linux kernels before 6.8.5 (current 6.6.63) have a known bug when using Vector and signals.
    > [0.011s][info][os         ] Vector not enabled automatically via hwprobe, but can be turned on with -XX:+UseRVV.
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "a"
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "c"
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "d"
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "f"
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "i"
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "m"
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "Zba"
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "Zbb"
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "Zbs"
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "Zfh"
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "Zfhmin"
    > [0.011s][info][os,cpu     ] Disabled RV64 feature "Zvfh" (missing dependent extension(s): v (disabled), Zfh (enabled))
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "marchid" (-9223372035378380799)
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "mimpid" (1152921505839391232)
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "mvendorid" (1808)
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "satp_mode" (39)
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "unaligned_scalar" (3)
    > [0.011s][info][os,cpu     ] Enabled RV64 feature "zicboz_block_size" (64)
    > [0.011s][info][os,cpu     ] Zifencei not found, required by Linux, enabling.
    > [0.012s][info][os,cpu     ] CPU: total 8 (initial active 8) spacemit,x60 rv64 rva rvc rvd rvf rvi rvm zba zbb zbs zfh zfhmin
    > ......
    
    Fei Yang has updated the pull request incrementally with one additional commit since the last revision:
    
      Review
    
    -------------
    
    Changes:
      - all: https://git.openjdk.org/jdk/pull/28340/files
      - new: https://git.openjdk.org/jdk/pull/28340/files/5c7c1255..32372096
    
    Webrevs:
     - full: https://webrevs.openjdk.org/?repo=jdk&pr=28340&range=02
     - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28340&range=01-02
    
      Stats: 6 lines in 2 files changed: 0 ins; 6 del; 0 mod
      Patch: https://git.openjdk.org/jdk/pull/28340.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28340/head:pull/28340
    
    PR: https://git.openjdk.org/jdk/pull/28340
    
    From fyang at openjdk.org  Thu Nov 20 11:19:33 2025
    From: fyang at openjdk.org (Fei Yang)
    Date: Thu, 20 Nov 2025 11:19:33 GMT
    Subject: RFR: 8371869: RISC-V: too many warnings when build on BPI-F3 SBC
     [v3]
    In-Reply-To: 
    References: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com>
     <-qb8rWPJSXl2dI-tbY1W35-w-cj18RFmYYyC5PaPdF4=.90c7c9fc-5d5d-4fae-b0ee-ea9ce487e50f@github.com>
     
     
    Message-ID: <8DNRKbDSeKDG0_b1Mle_S9S03JPRBuxtCKR32I1RKMM=.775650fa-cb6c-4118-addf-16a9bcaff5e1@github.com>
    
    On Wed, 19 Nov 2025 09:47:11 GMT, Hamlin Li  wrote:
    
    >> No. But like `virtual void log_enabled() = 0;`, `log_disabled` is also a pure virtual function in the base class. So it has to be implemented in all the subclasses.
    >
    > As `log_disabled` is only called in `UPDATE_DEFAULT_DEP`, which is only called in subclasses of `RVExtFeatureValue`, so I think it's OK to put `log_disabled` under `RVExtFeatureValue`?
    
    That also works for me. Check the latest version.
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28340#discussion_r2545563008
    
    From dholmes at openjdk.org  Thu Nov 20 11:52:56 2025
    From: dholmes at openjdk.org (David Holmes)
    Date: Thu, 20 Nov 2025 11:52:56 GMT
    Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease
     [v8]
    In-Reply-To: 
    References: 
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 10:37:21 GMT, Anton Artemov  wrote:
    
    >> Hi, 
    >> 
    >> please consider the following changes:
    >> 
    >> In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. 
    >> 
    >> Tested in tiers 1 - 5.
    >
    > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision:
    > 
    >   8366671: Addressed reviewers' comments.
    
    I think this is okay now. Thanks
    
    For some reason git now considers the new file to be a rename of the old file - which messes up the view.
    
    -------------
    
    Marked as reviewed by dholmes (Reviewer).
    
    PR Review: https://git.openjdk.org/jdk/pull/28264#pullrequestreview-3487403904
    
    From dfuchs at openjdk.org  Thu Nov 20 12:26:21 2025
    From: dfuchs at openjdk.org (Daniel Fuchs)
    Date: Thu, 20 Nov 2025 12:26:21 GMT
    Subject: RFR: 8372198: Avoid closing PlainHttpConnection while holding a
     lock [v2]
    In-Reply-To: 
    References: 
    Message-ID: 
    
    > An experimental change to SelectorManager::shutdown unveiled a potential deadlock between the SelectorManager thread trying to stop the HttpClientImpl, and an executor thread concurrently trying to return a connection to the pool.
    > 
    > The issue seems to be caused by the ConnectionPool::returnToPool trying to close the returned connection when stopping, while holding the ConnectionPool state lock, and the SelectorManager thread trying to close a pooled connection, holding the connection stateLock and trying to close the channel, which caused the CleanupTrigger to fire and attempt to remove the connection from the pool.
    > 
    > This problem was observed once with the java/net/httpclient/ThrowingSubscribersAsLimitingAsync.java test.
    > 
    > To avoid the problem, the proposed fix is to wait until the ConnectionPool::stateLock has been released before actually closing the connection, and to wait until the PlainHttpConnection::stateLock has been released before actually closing the channel. Indeed, there should be no need to close those while holding the lock.
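
The pattern behind the proposed fix, deciding under the lock but closing only after the lock has been released, can be sketched as follows. The actual HttpClient code is Java; this is a C++ illustration only, and `ConnectionPoolSketch`, `Connection`, and all member names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical connection; in real code close() may call back into the
// pool (e.g. via a cleanup trigger), which is why it must not run while
// the pool lock is held.
struct Connection {
  bool closed = false;
  void close() { closed = true; }
};

class ConnectionPoolSketch {
  std::mutex _state_lock;
  bool _stopping = false;
  std::vector<std::shared_ptr<Connection>> _idle;

 public:
  void stop() {
    std::lock_guard<std::mutex> guard(_state_lock);
    _stopping = true;
  }

  // Returns true if the connection was pooled, false if it was closed.
  bool return_to_pool(const std::shared_ptr<Connection>& c) {
    std::shared_ptr<Connection> to_close;
    {
      std::lock_guard<std::mutex> guard(_state_lock);
      if (_stopping) {
        to_close = c;        // only record the decision under the lock
      } else {
        _idle.push_back(c);
      }
    }                        // lock released here
    if (to_close) {
      to_close->close();     // close with no pool lock held
      return false;
    }
    return true;
  }

  std::size_t idle_count() {
    std::lock_guard<std::mutex> guard(_state_lock);
    return _idle.size();
  }
};
```

Because `close()` runs with no pool lock held, a callback that re-enters the pool can take the lock without creating a lock-order cycle.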
    
    Daniel Fuchs has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision:
    
     - Merge branch 'SelectorManagerVT-8372159' into ConnectionCloseLock-8372198
     - 8372198: Avoid closing PlainHttpConnection while holding a lock
    
    -------------
    
    Changes:
      - all: https://git.openjdk.org/jdk/pull/28421/files
      - new: https://git.openjdk.org/jdk/pull/28421/files/177e7ee3..97ce3737
    
    Webrevs:
     - full: https://webrevs.openjdk.org/?repo=jdk&pr=28421&range=01
     - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28421&range=00-01
    
      Stats: 19751 lines in 348 files changed: 10987 ins; 5712 del; 3052 mod
      Patch: https://git.openjdk.org/jdk/pull/28421.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28421/head:pull/28421
    
    PR: https://git.openjdk.org/jdk/pull/28421
    
    From dfuchs at openjdk.org  Thu Nov 20 12:26:22 2025
    From: dfuchs at openjdk.org (Daniel Fuchs)
    Date: Thu, 20 Nov 2025 12:26:22 GMT
    Subject: RFR: 8372198: Avoid closing PlainHttpConnection while holding a
     lock
    In-Reply-To: 
    References: 
    Message-ID: 
    
    On Thu, 20 Nov 2025 10:38:08 GMT, Daniel Fuchs  wrote:
    
    > An experimental change to SelectorManager::shutdown unveiled a potential deadlock between the SelectorManager thread trying to stop the HttpClientImpl, and an executor thread concurrently trying to return a connection to the pool.
    > 
    > The issue seems to be caused by the ConnectionPool::returnToPool trying to close the returned connection when stopping, while holding the ConnectionPool state lock, and the SelectorManager thread trying to close a pooled connection, holding the connection stateLock and trying to close the channel, which caused the CleanupTrigger to fire and attempt to remove the connection from the pool.
    > 
    > This problem was observed once with the java/net/httpclient/ThrowingSubscribersAsLimitingAsync.java test.
    > 
    > To avoid the problem, the proposed fix is to wait until the ConnectionPool::stateLock has been released before actually closing the connection, and to wait until the PlainHttpConnection::stateLock has been released before actually closing the channel. Indeed, there should be no need to close those while holding the lock.
    
    Something went wrong when I tried to merge the main PR branch in the dependent PR branch. I'm going to withdraw this PR and start again.
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28421#issuecomment-3557752299
    
    From dfuchs at openjdk.org  Thu Nov 20 12:26:23 2025
    From: dfuchs at openjdk.org (Daniel Fuchs)
    Date: Thu, 20 Nov 2025 12:26:23 GMT
    Subject: Withdrawn: 8372198: Avoid closing PlainHttpConnection while holding a
     lock
    In-Reply-To: 
    References: 
    Message-ID: 
    
    On Thu, 20 Nov 2025 10:38:08 GMT, Daniel Fuchs  wrote:
    
    > An experimental change to SelectorManager::shutdown unveiled a potential deadlock between the SelectorManager thread trying to stop the HttpClientImpl, and an executor thread concurrently trying to return a connection to the pool.
    > 
    > The issue seems to be caused by the ConnectionPool::returnToPool trying to close the returned connection when stopping, while holding the ConnectionPool state lock, and the SelectorManager thread trying to close a pooled connection, holding the connection stateLock and trying to close the channel, which caused the CleanupTrigger to fire and attempt to remove the connection from the pool.
    > 
    > This problem was observed once with the java/net/httpclient/ThrowingSubscribersAsLimitingAsync.java test.
    > 
    > To avoid the problem, the proposed fix is to wait until the ConnectionPool::stateLock has been released before actually closing the connection, and to wait until the PlainHttpConnection::stateLock has been released before actually closing the channel. Indeed, there should be no need to close those while holding the lock.
    
    This pull request has been closed without being integrated.
    
    -------------
    
    PR: https://git.openjdk.org/jdk/pull/28421
    
    From aboldtch at openjdk.org  Thu Nov 20 13:06:22 2025
    From: aboldtch at openjdk.org (Axel Boldt-Christmas)
    Date: Thu, 20 Nov 2025 13:06:22 GMT
    Subject: RFR: 8372241: Add GTestCheckers [v2]
    In-Reply-To: 
    References: 
    Message-ID: <_mFc-ewMwLN4eQ6fHLj7zXfZHG0EQwR1KPyNNY2keGE=.9ac274c8-6e78-4c26-885f-7acd61283ef8@github.com>
    
    > Each GTest test case is intended to be able to run on its own (this is the design intent of the framework).
    > 
    > Hotspot also adds some extra flavours of GTests, those that run with no created VM (`TEST`/`TEST_F`), those that run with a shared created VM (`TEST_VM`/`TEST_VM_F`) and those that run in a private created VM (`TEST_OTHER_VM`/`TEST_VM_ASSERT`/`TEST_VM_ASSERT_MSG`/`TEST_VM_FATAL_ERROR_MSG`/`TEST_VM_CRASH_SIGNAL`).
    > 
    > The way this is implemented is by having the first shared VM test that runs create a VM. But this means all subsequent tests also have access to a shared VM, even tests that are supposed to be able to run without a created VM.
    > 
    > Combined with the fact that almost all our automated GTest testing always runs all tests in the same order, this makes it hard to discover whether we have dependencies between tests.
    > 
    > I propose we add three types of GTest invocation tests used to find these incorrect dependencies.
    > 
    > 1. A test which runs only our no-created-VM tests, to find if we have any VM dependency.
    > 2. A test which runs only one test at a time, to see if there are any tests which depend on other tests having been run.
    > 3. A test which shuffles the order of our tests to see if there are any dependencies on the order of tests.
    > 
    > Added 1. and 3. to tier1 as they are just as cheap or cheaper than the normal `GTestWrapper.java`. 2. is only added to our complement test groups, so `tier4` and `hotspot_misc`.  Also added a new test group `:hotspot_validate_gtest` which can be used to more easily run all three of these tests.
    > 
    > 3. will have some randomness, so failures may start popping up. It is quite easy to exclude tests via the filter. But the shuffle dependencies are probably the scariest if we find them, as they might be just a test bug, like the one that is excluded here. But they can also expose broken assumptions, or bring things we have missed to light.
    > 
    > These tests and the chain of followup fixes are not of high priority. So far they have not found anything but bad test assumptions and incorrectly configured tests. So I have no problem letting this PR stay open until after the fork, so these tests can bake in the JDK 27 branch rather than the JDK 26 branch.
    
    Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision:
    
      Add comments with the associated bug ids for TEST_FILTERS
    
    -------------
    
    Changes:
      - all: https://git.openjdk.org/jdk/pull/28409/files
      - new: https://git.openjdk.org/jdk/pull/28409/files/2c22648e..83042978
    
    Webrevs:
     - full: https://webrevs.openjdk.org/?repo=jdk&pr=28409&range=01
     - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28409&range=00-01
    
      Stats: 8 lines in 3 files changed: 8 ins; 0 del; 0 mod
      Patch: https://git.openjdk.org/jdk/pull/28409.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28409/head:pull/28409
    
    PR: https://git.openjdk.org/jdk/pull/28409
    
    From aboldtch at openjdk.org  Thu Nov 20 13:17:25 2025
    From: aboldtch at openjdk.org (Axel Boldt-Christmas)
    Date: Thu, 20 Nov 2025 13:17:25 GMT
    Subject: RFR: 8372245: GTest globalDefinitions.format_specifiers cannot run
     without VM [v2]
    In-Reply-To: 
    References: 
    Message-ID: 
    
    > globalDefinitions.format_specifiers uses `ResourceMark` which requires a created VM.
    > Either we stop using resource allocations, or we run the test in VM.
    > 
    > I propose we let this test simply use `stringStream::base` rather than `stringStream::as_string`. The former returns an already managed string, and the stream is in scope for as long as the string is used. The string is guaranteed to be valid as long as we do not write to the stream.
    
    Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits:
    
     - Merge branch 'JDK-8372241' into JDK-8372245
     - gtest globalDefinitions.format_specifiers cannot run without VM
    
    -------------
    
    Changes: https://git.openjdk.org/jdk/pull/28415/files
      Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28415&range=01
      Stats: 5 lines in 2 files changed: 0 ins; 4 del; 1 mod
      Patch: https://git.openjdk.org/jdk/pull/28415.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28415/head:pull/28415
    
    PR: https://git.openjdk.org/jdk/pull/28415
    
    From aboldtch at openjdk.org  Thu Nov 20 13:17:29 2025
    From: aboldtch at openjdk.org (Axel Boldt-Christmas)
    Date: Thu, 20 Nov 2025 13:17:29 GMT
    Subject: RFR: 8372248: GTest istream.coverage depends on istream.basic [v2]
    In-Reply-To: 
    References: 
    Message-ID: 
    
    > These two GTests have a strong dependency: `istream.coverage` must be run after `istream.basic`. This goes against the intended design of GTests. (They should be independent.)
    > 
    > As such I propose we merge this into one test.
    > 
    > I kept the two `VERBOSE` variables separate. However I am not sure I understand their purpose.
    > Currently changing `VERBOSE_TEST` to `true` will cause the test to fail (not all cases are covered). And the value of `VERBOSE_COVERAGE` has no effect. What is observed:
    >  * `(VERBOSE_TEST: false, VERBOSE_COVERAGE: false) -> Success`
    >  * `(VERBOSE_TEST: false, VERBOSE_COVERAGE:  true) -> Success`
    >  * `(VERBOSE_TEST:  true, VERBOSE_COVERAGE: false) -> Failure`
    >  * `(VERBOSE_TEST:  true, VERBOSE_COVERAGE:  true) -> Failure`
    > 
    > But I kept the original behaviour, just merged into one test case rather than two.
    
    Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits:
    
     - Merge branch 'JDK-8372241' into JDK-8372248
     - gtest istream.coverage depends on istream.basic
    
    -------------
    
    Changes: https://git.openjdk.org/jdk/pull/28418/files
      Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28418&range=01
      Stats: 9 lines in 2 files changed: 0 ins; 5 del; 4 mod
      Patch: https://git.openjdk.org/jdk/pull/28418.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28418/head:pull/28418
    
    PR: https://git.openjdk.org/jdk/pull/28418
    
    From cnorrbin at openjdk.org  Thu Nov 20 13:21:07 2025
    From: cnorrbin at openjdk.org (Casper Norrbin)
    Date: Thu, 20 Nov 2025 13:21:07 GMT
    Subject: RFR: 8367319: Add os interfaces to get machine and container
     values separately [v4]
    In-Reply-To: 
    References: 
    Message-ID: 
    
    > Hi everyone,
    > 
    > The current `os::` layer on Linux hides whether the JVM is running inside a container or not. When running inside a container, we replace machine values with container values where applicable, without telling the user of these methods. For most use cases this is fine: users only care about the returned value. For other use cases, however, where the value originated is important. Two examples:
    > 
    > - A user might need the physical cpu count of the machine, but `os::active_processor_count()` only returns the limited container value, which also represents something slightly different.
    > - A user might want the container memory limit and the physical RAM size, but `os::physical_memory()` only gives one number.
    > 
    > To solve this, every function that mixed container/machine values now has two explicit variants, prefixed with `machine_` and `container_`. These use the bool return + out-parameter interface, with the container functions only working on Linux. The original methods remain and continue to return the same mixed values.
    > 
    > In addition, container-specific accessors for the memory soft limit and the memory throttle limit have been added, as these values matter when running in a containerized environment.
    > 
    > `OSContainer::active_processor_count()` has also been changed to return `double` instead of `int`. The previous implementation rounded the quota/period ratio up to produce an integer for `os::active_processor_count()`. Now, when the value is requested directly from the new container API it makes more sense to preserve this fraction rather than rounding it up. We can thus keep the exact value for those that want it, then round it up to keep the same behavior in `os::active_processor_count()`.
    > 
    > Testing:
    > - Oracle tiers 1-5
    > - Container tests on cgroup v1 and v2 hosts.
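
The "bool return + out-parameter" interface and the fractional processor count described in the quoted text can be sketched as below. This is a minimal illustration, not the actual `os::` or `OSContainer` code: `CpuLimitSketch`, the function signatures, and the cap at the machine CPU count are all assumptions.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical cgroup CPU settings; quota_us <= 0 means "no limit set".
struct CpuLimitSketch {
  long quota_us;
  long period_us;
};

// bool return + out-parameter: returns false (leaving `value` untouched)
// when no container limit applies, otherwise stores the exact
// quota/period ratio without rounding.
bool container_active_processor_count(const CpuLimitSketch& limit, double& value) {
  if (limit.quota_us <= 0 || limit.period_us <= 0) {
    return false;
  }
  value = static_cast<double>(limit.quota_us) / static_cast<double>(limit.period_us);
  return true;
}

// Mixed-value accessor in the style of the original method: prefer the
// container value, rounded up to a whole CPU, else the machine value.
// Capping at machine_cpus is an assumption of this sketch.
int active_processor_count(const CpuLimitSketch& limit, int machine_cpus) {
  double fractional = 0.0;
  if (container_active_processor_count(limit, fractional)) {
    const int rounded = static_cast<int>(std::ceil(fractional));
    return rounded < machine_cpus ? rounded : machine_cpus;
  }
  return machine_cpus;
}
```

With a quota of 150000 and a period of 100000, the container accessor reports 1.5 while the mixed accessor still reports 2, which matches the "keep the exact value, then round it up" behavior the description outlines.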
    
    Casper Norrbin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits:
    
     - Merge branch 'master' into separate-container-machine-values
     - Merge branch 'master' into separate-container-machine-values
     - Move methods to Machine/Container inner classes + clarifying documentation
     - Merge branch 'master' into separate-container-machine-values
     - Fixed print type
     - separate-machine-container-functions
    
    -------------
    
    Changes: https://git.openjdk.org/jdk/pull/27646/files
      Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27646&range=03
      Stats: 392 lines in 20 files changed: 249 ins; 54 del; 89 mod
      Patch: https://git.openjdk.org/jdk/pull/27646.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/27646/head:pull/27646
    
    PR: https://git.openjdk.org/jdk/pull/27646
    
    From cnorrbin at openjdk.org  Thu Nov 20 13:21:10 2025
    From: cnorrbin at openjdk.org (Casper Norrbin)
    Date: Thu, 20 Nov 2025 13:21:10 GMT
    Subject: RFR: 8367319: Add os interfaces to get machine and container
     values separately [v3]
    In-Reply-To: <90BsIFGnC7wfP7mO7kOcDArByL17pNbTokjZiTs_7qQ=.e67dbb82-faf4-4364-9301-67e1e2344eb0@github.com>
    References: 
     <90BsIFGnC7wfP7mO7kOcDArByL17pNbTokjZiTs_7qQ=.e67dbb82-faf4-4364-9301-67e1e2344eb0@github.com>
    Message-ID: 
    
    On Wed, 12 Nov 2025 09:50:37 GMT, Casper Norrbin  wrote:
    
    >> Hi everyone,
    >> 
    >> The current `os::` layer on Linux hides whether the JVM is running inside a container or not. When running inside a container, we replace machine values with container values where applicable, without telling the user of these methods. For most use cases this is fine: users only care about the returned value. For other use cases, however, where the value originated is important. Two examples:
    >> 
    >> - A user might need the physical cpu count of the machine, but `os::active_processor_count()` only returns the limited container value, which also represents something slightly different.
    >> - A user might want the container memory limit and the physical RAM size, but `os::physical_memory()` only gives one number.
    >> 
    >> To solve this, every function that mixed container/machine values now has two explicit variants, prefixed with `machine_` and `container_`. These use the bool return + out-parameter interface, with the container functions only working on Linux. The original methods remain and continue to return the same mixed values.
    >> 
    >> In addition, container-specific accessors for the memory soft limit and the memory throttle limit have been added, as these values matter when running in a containerized environment.
    >> 
    >> `OSContainer::active_processor_count()` has also been changed to return `double` instead of `int`. The previous implementation rounded the quota/period ratio up to produce an integer for `os::active_processor_count()`. Now, when the value is requested directly from the new container API it makes more sense to preserve this fraction rather than rounding it up. We can thus keep the exact value for those that want it, then round it up to keep the same behavior in `os::active_processor_count()`.
    >> 
    >> Testing:
    >> - Oracle tiers 1-5
    >> - Container tests on cgroup v1 and v2 hosts.
    >
    > Casper Norrbin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits:
    > 
    >  - Move methods to Machine/Container inner classes + clarifying documentation
    >  - Merge branch 'master' into separate-container-machine-values
    >  - Fixed print type
    >  - separate-machine-container-functions
    
    Something happened with the merge, sorry for the labels.
    
    Sorry for the spam, something must have happened to the original merge while I was solving all the conflicts. Should all be fixed now.
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/27646#issuecomment-3557915723
    PR Comment: https://git.openjdk.org/jdk/pull/27646#issuecomment-3558019594
    
    From mbaesken at openjdk.org  Thu Nov 20 13:22:26 2025
    From: mbaesken at openjdk.org (Matthias Baesken)
    Date: Thu, 20 Nov 2025 13:22:26 GMT
    Subject: RFR: 8371626: [linux] use icf=all for linking libraries [v3]
    In-Reply-To: 
    References: 
     
    Message-ID: 
    
    On Wed, 12 Nov 2025 15:46:09 GMT, Matthias Baesken  wrote:
    
    >> Identical code folding can reduce the size of some libs, especially libjvm. However not all linkers support the flag/feature so we have to limit it to some environments.
    >> See for example this manpage :
    >> https://manpages.debian.org/testing/lld-7/ld.lld-7.1
    >> 
    >> 
    >> sizes of libjvm.so with / without -icf=all
    >> linux aarch64 : 25888 / 27112 K
    >> linux x86_64 : 27952 / 29072 K
    >> 
    >> 
    >> (for most other native libs the identical code folding has no effect, because there is nothing to fold)
    >
    > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision:
    > 
    >   Limit icf to release builds
    
    Any comments / reviews from HS developers?
    @dholmes-ora, any comments on this one?
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28236#issuecomment-3558018238
    
    From eastigeevich at openjdk.org  Thu Nov 20 14:58:07 2025
    From: eastigeevich at openjdk.org (Evgeny Astigeevich)
    Date: Thu, 20 Nov 2025 14:58:07 GMT
    Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on
     GenZGC performance
    Message-ID: 
    
    Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1.
     
    Neoverse-N1 implementations mitigate erratum 1542419 with a workaround:
    - Disable coherent icache.
    - Trap IC IVAU instructions.
    - Execute:
       - `tlbi vae3is, xzr`
       - `dsb sy`
     
     `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address).  It waits for all memory accesses using in-scope old translation information to complete before it is considered complete.
     
    As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests:
    
    "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround."
    
    This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions.
    
    Changes include:
    
    * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization.
    * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address.
    * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures.
    * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk.
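
The batching idea in the list above can be sketched as a scoped context that records that patching happened and performs a single invalidation when it goes out of scope. This is an illustration under stated assumptions, not the PR's `ICacheInvalidationContext`: the class name, `note_patch()`, and the counter that stands in for the real (trapped) invalidation are all hypothetical.

```cpp
#include <cassert>

// Counts simulated invalidations; stands in for the expensive IC IVAU path.
static int g_invalidation_count = 0;

static void invalidate_icache(const void* /*addr*/) {
  ++g_invalidation_count;
}

// Scoped context (illustrative only): patches are noted while the context
// is alive, and one invalidation is issued in the destructor. The address
// used is the code start, since the exact address is not relevant here.
class DeferredICacheInvalidationSketch {
  const void* _code_start;
  bool _pending = false;

 public:
  explicit DeferredICacheInvalidationSketch(const void* code_start)
      : _code_start(code_start) {}

  void note_patch() { _pending = true; }

  ~DeferredICacheInvalidationSketch() {
    if (_pending) {
      invalidate_icache(_code_start);
    }
  }
};

// Patches `site_count` bytes and returns how many invalidations were issued.
int patch_sites(unsigned char* code, int site_count, bool defer) {
  const int before = g_invalidation_count;
  {
    DeferredICacheInvalidationSketch ctx(code);
    for (int i = 0; i < site_count; i++) {
      code[i] = static_cast<unsigned char>(~code[i]);  // simulated patch
      if (defer) {
        ctx.note_patch();              // remember, invalidate once later
      } else {
        invalidate_icache(code + i);   // eager: one invalidation per patch
      }
    }
  }  // deferred invalidation (if any) happens here
  return g_invalidation_count - before;
}
```

In this sketch, patching eight sites eagerly issues eight invalidations, while the deferred variant issues one, which is the reduction the errata notice asks for.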
    
    Benchmarking results: Neoverse-N1 r3p1 (Graviton 2)
    
    - Baseline
    
    $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGCThreads=1 -jar benchmarks.jar org.openjdk.bench.vm.gc.GCPatchingNmethodCost
    
    Benchmark                       (accessedFieldCount)  (methodCount)  Mode  Cnt     Score     Error  Units
    GCPatchingNmethodCost.fullGC                       0           5000  avgt    3    73.937 ±  17.764  ms/op
    GCPatchingNmethodCost.fullGC                       2           5000  avgt    3   648.331 ±  85.773  ms/op
    GCPatchingNmethodCost.fullGC                       4           5000  avgt    3  1221.186 ±  72.401  ms/op
    GCPatchingNmethodCost.fullGC                       8           5000  avgt    3  2336.644 ± 446.816  ms/op
    GCPatchingNmethodCost.systemGC                     0           5000  avgt    3    77.495 ±  11.963  ms/op
    GCPatchingNmethodCost.systemGC                     2           5000  avgt    3   662.447 ± 231.244  ms/op
    GCPatchingNmethodCost.systemGC                     4           5000  avgt    3  1217.174 ± 232.325  ms/op
    GCPatchingNmethodCost.systemGC                     8           5000  avgt    3  2339.458 ± 271.820  ms/op
    GCPatchingNmethodCost.youngGC                      0           5000  avgt    3     9.955 ±   1.649  ms/op
    GCPatchingNmethodCost.youngGC                      2           5000  avgt    3   163.623 ±  42.342  ms/op
    GCPatchingNmethodCost.youngGC                      4           5000  avgt    3   318.399 ±  87.674  ms/op
    GCPatchingNmethodCost.youngGC                      8           5000  avgt    3   618.169 ± 191.474  ms/op
    
    
    - Fix
    
    $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:+NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGCThreads=1 -jar benchmarks.jar org.openjdk.bench.vm.gc.GCPatchingNmethodCost
    
    Benchmark                       (accessedFieldCount)  (methodCount)  Mode  Cnt    Score    Error  Units
    GCPatchingNmethodCost.fullGC                       0           5000  avgt    3   88.865 ± 19.299  ms/op
    GCPatchingNmethodCost.fullGC                       2           5000  avgt    3  146.184 ± 11.531  ms/op
    GCPatchingNmethodCost.fullGC                       4           5000  avgt    3  186.429 ± 16.257  ms/op
    GCPatchingNmethodCost.fullGC                       8           5000  avgt    3  262.933 ± 13.071  ms/op
    GCPatchingNmethodCost.systemGC                     0           5000  avgt    3   90.572 ± 14.750  ms/op
    GCPatchingNmethodCost.systemGC                     2           5000  avgt    3  148.335 ± 21.456  ms/op
    GCPatchingNmethodCost.systemGC                     4           5000  avgt    3  190.828 ± 12.268  ms/op
    GCPatchingNmethodCost.systemGC                     8           5000  avgt    3  265.768 ± 46.669  ms/op
    GCPatchingNmethodCost.youngGC                      0           5000  avgt    3   10.219 ±  0.877  ms/op
    GCPatchingNmethodCost.youngGC                      2           5000  avgt    3   19.035 ±  2.699  ms/op
    GCPatchingNmethodCost.youngGC                      4           5000  avgt    3   26.005 ±  2.179  ms/op
    GCPatchingNmethodCost.youngGC                      8           5000  avgt    3   42.322 ± 85.691  ms/op
    
    -------------
    
    Commit messages:
     - Merge branch 'master' into JDK-8370947
     - Add deferred icache invalidation to all places; Add JMH microbench
     - 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance
    
    Changes: https://git.openjdk.org/jdk/pull/28328/files
      Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=00
      Issue: https://bugs.openjdk.org/browse/JDK-8370947
      Stats: 380 lines in 17 files changed: 340 ins; 4 del; 36 mod
      Patch: https://git.openjdk.org/jdk/pull/28328.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28328/head:pull/28328
    
    PR: https://git.openjdk.org/jdk/pull/28328
    
    From eastigeevich at openjdk.org  Thu Nov 20 14:58:07 2025
    From: eastigeevich at openjdk.org (Evgeny Astigeevich)
    Date: Thu, 20 Nov 2025 14:58:07 GMT
    Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact
     on GenZGC performance
    In-Reply-To: 
    References: 
    Message-ID: 
    
    On Fri, 14 Nov 2025 18:16:55 GMT, Evgeny Astigeevich  wrote:
    
    > Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1.
    >  
    > Neoverse-N1 implementations mitigate erratum 1542419 with a workaround:
    > - Disable coherent icache.
    > - Trap IC IVAU instructions.
    > - Execute:
    >    - `tlbi vae3is, xzr`
    >    - `dsb sy`
    >  
    >  `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address).  It waits for all memory accesses using in-scope old translation information to complete before it is considered complete.
    >  
    > As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests:
    > 
    > "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround."
    > 
    > This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions.
    > 
    > Changes include:
    > 
    > * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization.
    > * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address.
    > * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures.
    > * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk.
    > 
    > Benchmarking results: Neoverse-N1 r3p1 (Graviton 2)
    > 
    > - Baseline
    > 
    > $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGC...
    
    Based on https://github.com/openjdk/jdk/compare/master...xmas92:jdk:deferred_icache_invalidation
    
    Hi @fisk @theRealAph @xmas92 @shipilev 
    
    I created this draft PR based on @xmas92 work https://github.com/openjdk/jdk/compare/master...xmas92:jdk:deferred_icache_invalidation
    
    Axel wrote about his implementation in [JDK-8370947](https://bugs.openjdk.org/browse/JDK-8370947):
    
    > The implementation I linked is very aarch64 centric. I would like to create a bit nicer abstraction for this to allow easier adaption for other platforms. 
    
    I see his changes touch other backends. I tried to minimize changes and to avoid them in other backends.
    I don't think the concept of deferred icache invalidation will be used anywhere but for Neoverse-N1 errata.
    
    This PR does not cover all cases in ZGC at the moment. That can be done as soon as we agree on the proper way to fix this.
    
    I'd like to hear your opinion on which way we should choose:
    - Abstraction of deferred icache invalidation supported in all backends.
    - Concrete implementation focused on Neoverse-N1.
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3533986189
    PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3534111731
    
    From aboldtch at openjdk.org  Thu Nov 20 15:04:10 2025
    From: aboldtch at openjdk.org (Axel Boldt-Christmas)
    Date: Thu, 20 Nov 2025 15:04:10 GMT
    Subject: RFR: 8371346: ZGC: Flexible heap base selection [v2]
    In-Reply-To: 
    References: 
    Message-ID: 
    
    > ZGC reserves a virtual address range for its heap with one high order bit set which is referred to as the heap base. Internally we then often represent heap addresses as offset from this heap base.
    > 
    > Currently we select one specific heap base at the start based on MaxHeapSize and the current system properties.
    > 
    > With instrumented builds or custom launchers, it may be that we are unable to reserve a usable address range using that heap base. Currently we just give up if this happens and exit the VM.
    > 
    > This is problematic when using instrumented builds such as ASAN where there are certain address ranges it uses which often clash with the default ZGC heap base.
    > 
    > I propose that we are more flexible when selecting the heap base, and we start as we do today at our preferred location, but are able to retry other compatible heap bases within some broader limits.
    > 
    > The implementation will now start at the recommended or required heap base, whichever is larger, and first try to reserve the desired reservation size (normally 16 * MaxHeapSize). If no heap base can accommodate this desired size, it will attempt to find at least the required size and use that.
    > 
    > On linux x86_64 we will now also probe for the heap base rather than hard coding the max heap base as we did previously. This is beneficial when there are address space restrictions (such as with ASAN), and when there are none, we only do a couple of extra system calls at most. 
    > 
    > There are some changes to the gc+init logging. The ZAddressOffsetMax is adjusted to always be a correct upper bound. And the exit path when reservation fails is cleaned up, so that we exit early when we know that the external virtual memory limits will prohibit the heap reservation.
    > 
    > Performance testing shows no significant differences.
    > 
    > Testing:
    > * GHA
    > * Running ZGC tier1-8 on Oracle supported platforms
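The search order described above (every candidate base with the desired size first, then every candidate with the required size) can be sketched as follows. This is only an illustration under stated assumptions: `HeapSelection`, `ReserveFn` and `select_heap` are hypothetical names, the reservation call is a caller-supplied callback rather than a real platform call, and the actual patch may probe intermediate sizes or order candidates differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical result of a successful search: where the heap starts and how
// much address space was reserved for it.
struct HeapSelection {
  uintptr_t base;
  size_t    reserved_size;
};

// Hypothetical callback standing in for the platform reservation call.
// Returns true if [base, base + size) could be reserved.
using ReserveFn = std::function<bool(uintptr_t base, size_t size)>;

// Try every candidate base (ordered by preference) with the desired size
// first; only if none fits, retry all candidates with the required size.
// Nothing outside 'result' is modified while searching.
bool select_heap(const std::vector<uintptr_t>& candidate_bases,
                 size_t desired_size,
                 size_t required_size,
                 const ReserveFn& try_reserve,
                 HeapSelection* result) {
  const size_t sizes[] = {desired_size, required_size};
  for (size_t size : sizes) {
    for (uintptr_t base : candidate_bases) {
      if (try_reserve(base, size)) {
        result->base = base;
        result->reserved_size = size;
        return true;
      }
    }
  }
  return false;  // The caller decides how to report the failure and exit.
}
```

Passing the reservation function in as a callback keeps the sketch testable without touching real address space; a restricted environment such as ASAN can be simulated by a callback that rejects the preferred bases.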
    
    Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision:
    
     - Small fixes
     - Merge remote-tracking branch 'upstream_jdk/master' into stefank_review_pr_28161
     - Fixes and cleanups
     - pr/28161_review
     - Merge remote-tracking branch 'upstream/master' into pr/28161
     - Initial Test Implementation
     - Initial implementation flexible heap base
     - Constrain ZAddressOffsetMax correctly when multi-partition fails
     - Log reserved size correctly when multi-partition fails
     - Cleanup headers
     - ... and 1 more: https://git.openjdk.org/jdk/compare/168f0580...1d7b2374
    
    -------------
    
    Changes:
      - all: https://git.openjdk.org/jdk/pull/28161/files
      - new: https://git.openjdk.org/jdk/pull/28161/files/bf5f9550..1d7b2374
    
    Webrevs:
     - full: https://webrevs.openjdk.org/?repo=jdk&pr=28161&range=01
     - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28161&range=00-01
    
      Stats: 263285 lines in 2103 files changed: 168784 ins; 56747 del; 37754 mod
      Patch: https://git.openjdk.org/jdk/pull/28161.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28161/head:pull/28161
    
    PR: https://git.openjdk.org/jdk/pull/28161
    
    From aboldtch at openjdk.org  Thu Nov 20 15:08:29 2025
    From: aboldtch at openjdk.org (Axel Boldt-Christmas)
    Date: Thu, 20 Nov 2025 15:08:29 GMT
    Subject: RFR: 8371346: ZGC: Flexible heap base selection [v2]
    In-Reply-To: 
    References: 
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 15:04:10 GMT, Axel Boldt-Christmas  wrote:
    
    >> ZGC reserves a virtual address range for its heap with one high order bit set which is referred to as the heap base. Internally we then often represent heap addresses as offset from this heap base.
    >> 
    >> Currently we select one specific heap base at the start based on MaxHeapSize and the current system properties.
    >> 
    >> With instrumented builds or custom launchers, it may be that we are unable to reserve a usable address range using that heap base. Currently we just give up if this happens and exit the VM.
    >> 
    >> This is problematic when using instrumented builds such as ASAN where there are certain address ranges it uses which often clash with the default ZGC heap base.
    >> 
    >> I propose that we are more flexible when selecting the heap base, and we start as we do today at our preferred location, but are able to retry other compatible heap bases within some broader limits.
    >> 
    >> The implementation will now start at the recommended or required heap base, whichever is larger, and first try to reserve the desired reservation size (normally 16 * MaxHeapSize). If no heap base can accommodate this desired size, it will attempt to find at least the required size and use that.
    >> 
    >> On linux x86_64 we will now also probe for the heap base rather than hard coding the max heap base as we did previously. This is beneficial when there are address space restrictions (such as with ASAN), and when there are none, we only do a couple of extra system calls at most. 
    >> 
    >> There are some changes to the gc+init logging. The ZAddressOffsetMax is adjusted to always be a correct upper bound. And the exit path when reservation fails is cleaned up, so that we exit early when we know that the external virtual memory limits will prohibit the heap reservation.
    >> 
    >> Performance testing shows no significant differences.
    >> 
    >> Testing:
    >> * GHA
    >> * Running ZGC tier1-8 on Oracle supported platforms
    >
    > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision:
    > 
    >  - Small fixes
    >  - Merge remote-tracking branch 'upstream_jdk/master' into stefank_review_pr_28161
    >  - Fixes and cleanups
    >  - pr/28161_review
    >  - Merge remote-tracking branch 'upstream/master' into pr/28161
    >  - Initial Test Implementation
    >  - Initial implementation flexible heap base
    >  - Constrain ZAddressOffsetMax correctly when multi-partition fails
    >  - Log reserved size correctly when multi-partition fails
    >  - Cleanup headers
    >  - ... and 1 more: https://git.openjdk.org/jdk/compare/3d39bb21...1d7b2374
    
    We redesigned how the `ZVirtualMemoryManager` works a bit. We no longer update global state while trying to find a reservation. Instead, we have created a distinct separation between when we are looking for a heap and when we have found a heap.
    
    Prior to finding a heap we do not use our heap-based types `zaddress`, `zpointer`, `zoffset` etc., and if we did, they would assert. Then at the point when we are satisfied with a specific heap selection, we set up our global state based on this heap selection, once and only once.
    
    We then start transferring our heap selection into our heap data structures which use these heap based types.
    
    As a consequence we had to rewrite some of the tests.
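A minimal sketch of the "search first, then set up global state exactly once" split described above. The names `HeapChoice` and `ToyHeapGlobals` are invented for illustration and do not correspond to the actual ZGC classes; the asserts only model the idea that heap-based conversions are illegal before a heap has been selected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical outcome of the search phase; plain data, no global effects.
struct HeapChoice {
  uintptr_t base;
  size_t    size;
};

// Hypothetical holder for the state that heap-based types depend on.
class ToyHeapGlobals {
  bool      _initialized = false;
  uintptr_t _base = 0;
  size_t    _size = 0;

public:
  // Called once, after the search phase is satisfied with a selection.
  void initialize(const HeapChoice& choice) {
    assert(!_initialized && "heap selection must be set up exactly once");
    _base = choice.base;
    _size = choice.size;
    _initialized = true;
  }

  bool is_initialized() const { return _initialized; }

  // Converts an absolute address into an offset from the heap base.
  // Using this before initialize() trips the first assert.
  size_t offset_of(uintptr_t addr) const {
    assert(_initialized && "heap-based types used before heap selection");
    assert(addr >= _base && addr - _base < _size && "address outside heap");
    return addr - _base;
  }
};
```

Keeping the search result as plain data means a failed attempt leaves nothing to roll back, which is the property the redesign is after.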
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28161#issuecomment-3558539681
    
    From tschatzl at openjdk.org  Thu Nov 20 15:08:56 2025
    From: tschatzl at openjdk.org (Thomas Schatzl)
    Date: Thu, 20 Nov 2025 15:08:56 GMT
    Subject: RFR: 8357445: G1: Time-Based Heap Uncommit During Idle Periods
     [v10]
    In-Reply-To: <1FJGEob7g_zkJaboRxyS0hPJ1c590V6zZipDrZ3XpOw=.08c1cf33-af6b-49de-b94d-d4f2ca8f5a22@github.com>
    References: 
     <1FJGEob7g_zkJaboRxyS0hPJ1c590V6zZipDrZ3XpOw=.08c1cf33-af6b-49de-b94d-d4f2ca8f5a22@github.com>
    Message-ID: 
    
    On Fri, 17 Oct 2025 05:19:03 GMT, Monica Beckwith  wrote:
    
    >> **Implements:** https://bugs.openjdk.org/browse/JDK-8357445
    >> 
    >> Implement time-based heap uncommit for G1 during idle periods.
    >> 
    >> Key changes:
    >> - Added G1HeapEvaluationTask for periodic heap evaluation
    >> - Switch from G1ServiceTask to PeriodicTask for improved scheduling
    >> - Implemented time-based heap sizing policy with configurable uncommit delay
    >> - Added region activity tracking with last access timestamps
    >> - Integrated VM_G1ShrinkHeap operation for safe heap shrinking
    >> - Added new G1 flags: G1UseTimeBasedHeapSizing, G1TimeBasedEvaluationIntervalMillis, G1UncommitDelayMillis, G1MinRegionsToUncommit
    >> - Added 'sizing' log tag for heap sizing operations
    >> 
    >> Comprehensive Test Coverage:
    >> - Enhanced TestG1RegionUncommit: minimum heap boundaries, concurrent allocation/uncommit scenarios
    >> - Enhanced TestTimeBasedHeapSizing: humongous object handling, rapid allocation cycles, edge cases
    >> - Enhanced TestTimeBasedRegionTracking: concurrent region access, lifecycle transition validation
    >> - Enhanced TestTimeBasedHeapConfig: parameter boundary values, small heap configurations
    >> 
    >> This ensures time-based heap uncommit works correctly while maintaining all safety guarantees and test expectations.
    >
    > Monica Beckwith has updated the pull request incrementally with two additional commits since the last revision:
    > 
    >  - Fix compilation errors after master merge
    >    
    >    - Fix syntax error in g1CollectedHeap.cpp (extra closing brace)
    >    - Update API call from short_term_pause_time_ratio() to short_term_gc_time_ratio()
    >    
    >    These fixes resolve compilation issues that occurred after merging with
    >    upstream master due to API changes in the OpenJDK codebase.
    >  - 8357445: Address feedback for G1 time-based heap sizing
    >    
    >    - Fix indentation in log_debug statement in shrink_helper (suggested by @tschatzl)
    >    - Change terminology from 'inactive' to 'idle' throughout time-based heap sizing
    >    - Update flag descriptions in g1_globals.hpp to use 'idle' terminology
    >    - Fix remaining trailing whitespace in test files
    >    
    >    Addresses all outstanding review comments from @tschatzl
    
    I stopped reviewing after >40 issues (still finding some while writing) because
    - some of them were unresolved from last time. Please resolve comments that have been addressed or comment why the change did not address them. Having old unresolved comments is annoying for re-reviews because they make the reviewer unsure whether there is some need for further clarification or not.
    - new code introduced the same issues again (e.g. use of `is_empty()`) without comment on why it can be used, after giving the reason why it can't be used.
    - the main issue that the time-based uncommit still interferes with gc based sizing without restriction still applies.
    - another is that the VM operation still does not re-evaluate the decisions, just uncommits a set number of regions. Possibly not even those that were found as candidates earlier afaict.
    - the only improvement I remember has been that `ServiceTask` is used now
    
    src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 830:
    
    > 828: 
    > 829:   // Note: Region timestamps are updated automatically when regions transition to free state
    > 830:   // via set_free() calls, so no blanket reset is needed here
    
    Comment sentences must have a full stop at the end. (I am only mentioning this here; from a brief look, all of them are missing it.) See e.g. the comment right below.
    
    src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1122:
    
    > 1120:   // We should only reach here from the service thread during idle time
    > 1121:   // but ensure any GC alloc regions are abandoned
    > 1122:   _allocator->abandon_gc_alloc_regions();
    
    Maybe check that they are abandoned (i.e. there are no gc alloc regions) in an assert instead of doing work.
    
    src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1175:
    
    > 1173:   // This preserves the original GC-triggered shrinking behavior
    > 1174:   log_debug(gc, ergo, heap)("Heap shrink requested: removing %u regions (%zuB)",
    > 1175:                             num_regions_to_remove, shrink_bytes);
    
    This log message duplicates the one just a few lines below. Also I would prefer to keep `Heap Resize` as "topic" of the message, and in the message explain that we shrank the heap.
    I need to look at the log messages we generate as a whole; there seems to be some more repetition. One in `G1CollectedHeap::shrink()` already reads:
    > "Heap resize. Requested shrink amount: %zuB aligned shrink amount: %zuB",
    
    src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1187:
    
    > 1185:     log_debug(gc, heap)("Heap shrink details: requested=%zuB actual=%zuB "
    > 1186:                         "regions_removed=%u heap_capacity=%zuB",
    > 1187:                         shrink_bytes, shrunk_bytes, num_regions_removed, capacity());
    
    It would be nice to not repeat the same information at different levels; one can check the current level with `LogTarget::is_enabled()`, and make the one at the lower level a superset of the other.
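One way to read the suggestion above is: check whether the more verbose level is enabled, and if so emit one message that contains everything the terser message would have said. The sketch below uses a toy `ToyLog` type as a stand-in; it is not the HotSpot unified-logging API, and the message texts are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Toy stand-in for a logging configuration; NOT the HotSpot API.
enum class Level { Info = 0, Debug = 1 };

struct ToyLog {
  Level configured;  // Most verbose level that is currently enabled.

  bool is_enabled(Level level) const {
    return static_cast<int>(level) <= static_cast<int>(configured);
  }
};

// Builds the single message to emit for a heap shrink. When debug output is
// enabled, the message starts with the info-level text and appends details,
// so the same fact is not reported twice at two levels.
std::string shrink_message(const ToyLog& log, size_t requested,
                           size_t actual, unsigned regions_removed) {
  char buf[160];
  if (log.is_enabled(Level::Debug)) {
    std::snprintf(buf, sizeof(buf),
                  "Heap resize. Shrunk by %zuB (requested %zuB, %u regions removed)",
                  actual, requested, regions_removed);
  } else {
    std::snprintf(buf, sizeof(buf), "Heap resize. Shrunk by %zuB", actual);
  }
  return std::string(buf);
}
```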
    
    src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1190:
    
    > 1188:     policy()->record_new_heap_size(num_committed_regions());
    > 1189:   } else {
    > 1190:     log_debug(gc, ergo, heap)("Did not shrink the heap (heap shrinking operation failed)");
    
    Should probably have a "Heap Resize." tag in front. And later maybe fix all this by using the existing `resizing` tag.
    
    src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1239:
    
    > 1237:   // Always schedule a VM operation for proper synchronization with GC
    > 1238:   // The VM operation will re-evaluate which regions to uncommit at the time of execution
    > 1239:   VM_G1ShrinkHeap op(this, shrink_bytes);
    
    The VM operation *must* re-evaluate the decisions. There is no need to pass on this value. A GC might have been scheduled before this, and just blindly shrinking will destroy existing resizing policies.
    
    Also, the VM operation must be aware of shutdown etc; it is probably best if `VM_ShrinkHeap` inherits from `VM_GC_Operation` which does most of this stuff already.
    
    It should be a matter of amending `doit_prologue` so that it redoes the calculation to determine the actual regions to uncommit (and if they happen to be zero at that point after all, aborts the VM operation by returning the appropriate return value there).
    
    src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1241:
    
    > 1239:   VM_G1ShrinkHeap op(this, shrink_bytes);
    > 1240:   VMThread::execute(&op);
    > 1241:   return true;                       // Pages were requested to be released.
    
    The return value is never used. Remove.
    
    src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1352:
    
    > 1350:   _region_attr() {
    > 1351: 
    > 1352:   _heap_evaluation_task = nullptr;
    
    Move into constructor initializer list.
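For readers less familiar with the idiom, the requested change amounts to the difference below. The class names are placeholders, not the real `G1CollectedHeap`; only the shape of the constructor matters.

```cpp
#include <cassert>

struct ToyEvaluationTask;  // Hypothetical type, only used through a pointer.

// Before: the member is default-initialized (indeterminate for a raw
// pointer) and only assigned inside the constructor body.
class ToyHeapBefore {
  ToyEvaluationTask* _task;

public:
  ToyHeapBefore() {
    _task = nullptr;
  }
  bool has_task() const { return _task != nullptr; }
};

// After: the member is initialized in the initializer list, next to the
// other members, so it never exists in an unassigned state.
class ToyHeapAfter {
  ToyEvaluationTask* _task;

public:
  ToyHeapAfter() : _task(nullptr) { }
  bool has_task() const { return _task != nullptr; }
};
```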
    
    src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1618:
    
    > 1616:     _heap_evaluation_task = new G1HeapEvaluationTask(this, _heap_sizing_policy);
    > 1617:     _service_thread->register_task(_heap_evaluation_task);
    > 1618:     log_debug(gc, init)("G1 Time-Based Heap Evaluation task registered and scheduled");
    
    Unnecessary log message. The `gc, task` log message when registering contains the same information.
    
    src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1622:
    
    > 1620:     assert(_heap_evaluation_task == nullptr, "pre-condition");
    > 1621:   }
    > 1622: 
    
    I think this one is unnecessary. The variable has just been initialized to `nullptr` in the same method.
    
    src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 2723:
    
    > 2721:   // Note: Region timestamps are updated automatically when regions transition to free state
    > 2722:   // via set_free() calls, so no blanket reset is needed here
    > 2723: 
    
    The change _must_ reset the timestamp otherwise it will significantly interfere with the gctimeratio/AHS based resizing. This time-based uncommit is a helper when the other does not kick in because of too infrequent gcs.
    
    E.g. consider the case when/if a GC is scheduled (put into the VM operation queue) just before the GC. GC happens, shrinks the heap, then the shrink VM operation occurs, shrinking again by an amount determined before that GC.
    
    src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 1012:
    
    > 1010:   // Deactivate a specific region by index.
    > 1011:   void deactivate_region_at(uint region_index) { _hrm.shrink_at(region_index, 1); }
    > 1012: 
    
    Unused. Remove.
    
    src/hotspot/share/gc/g1/g1HeapEvaluationTask.cpp line 46:
    
    > 44:   log_debug(gc, sizing)("Starting uncommit evaluation");
    > 45: 
    > 46:   ResourceMark rm; // Ensure temporary resources are released
    
    Which ones are those?
    
    src/hotspot/share/gc/g1/g1HeapEvaluationTask.cpp line 55:
    
    > 53:     SuspendibleThreadSetJoiner sts;
    > 54:     resize_amount = _heap_sizing_policy->evaluate_heap_resize_for_uncommit();
    > 55:   }
    
    If a GC happened just before, the evaluation should be aborted. As commented in the `request_heap_shrink()` method, it must re-evaluate the decision anyway, and/or maybe even abort the VM operation in the VM operation prologue if a GC has occurred somewhere in between.

    Basically the code needs to take the `Heap_lock` to get the current GC counts, passing them to the VM operation, which then evaluates in its prologue whether it should actually run. I.e. something like
    
    
      {
        MutexLocker ml(Heap_lock);
        resize_amount = ...
        gc_count_before = ...
      }
    
      [...]
      if (resize_amount > 0) {
        VM_Shrink_Heap op(..., gc_count_before, ...);
        
      }
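The pseudocode above can be turned into a small stand-alone model. Everything here is a stand-in: `ToyHeap` replaces the collected heap and `Heap_lock`, `ToyShrinkOp` replaces the VM operation, and the `gc_in_between` parameter only exists to simulate a GC sneaking in between evaluation and execution. It is meant to show the gc-count handshake, not the real VM operation machinery.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stand-ins for Heap_lock and the heap's GC counter.
struct ToyHeap {
  std::mutex lock;
  unsigned   total_collections = 0;
  size_t     capacity = 0;
};

// Hypothetical shrink operation that remembers the GC count seen at
// evaluation time and refuses to run if a GC has happened since then.
class ToyShrinkOp {
  ToyHeap& _heap;
  unsigned _gc_count_before;
  size_t   _shrink_bytes;

public:
  ToyShrinkOp(ToyHeap& heap, unsigned gc_count_before, size_t shrink_bytes)
    : _heap(heap), _gc_count_before(gc_count_before), _shrink_bytes(shrink_bytes) { }

  // Models the prologue: returning false means the operation is skipped.
  bool prologue() {
    std::lock_guard<std::mutex> guard(_heap.lock);
    return _heap.total_collections == _gc_count_before;
  }

  // Models the operation body; in this toy it is only called single-threaded.
  void doit() { _heap.capacity -= _shrink_bytes; }
};

// Returns true if the heap was actually shrunk.
bool evaluate_and_shrink(ToyHeap& heap, size_t resize_amount, bool gc_in_between) {
  unsigned gc_count_before;
  {
    // Sample the GC count under the same lock as the sizing decision.
    std::lock_guard<std::mutex> guard(heap.lock);
    gc_count_before = heap.total_collections;
  }
  if (resize_amount == 0) {
    return false;
  }
  if (gc_in_between) {
    // Simulation only: a GC completes before the shrink operation runs.
    std::lock_guard<std::mutex> guard(heap.lock);
    heap.total_collections++;
  }
  ToyShrinkOp op(heap, gc_count_before, resize_amount);
  if (!op.prologue()) {
    return false;  // Stale decision; leave sizing to the GC that just ran.
  }
  op.doit();
  return true;
}
```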
    
    src/hotspot/share/gc/g1/g1HeapEvaluationTask.cpp line 60:
    
    > 58: 
    > 59:   if (resize_amount > 0) {
    > 60:     log_info(gc, sizing)("Uncommit evaluation: shrinking heap by %zuMB using time-based selection", resize_amount / M);
    
    At least at info level, more context is needed.
    
    src/hotspot/share/gc/g1/g1HeapRegion.cpp line 123:
    
    > 121: void G1HeapRegion::hr_clear(bool clear_space) {
    > 122:   set_top(bottom());
    > 123:   record_activity(); // Update timestamp when region becomes available
    
    `hr_clear` unconditionally calls `set_free`, so this call is superfluous.
    
    src/hotspot/share/gc/g1/g1HeapRegion.cpp line 160:
    
    > 158:   if (!is_free()) {
    > 159:     report_region_type_change(G1HeapRegionTraceType::Free);
    > 160:     record_activity(); // Record timestamp when region becomes free
    
    Make this unconditional. At startup, the default value of the region's type is `Free`, so this will not show up if the call in `hr_clear()` is removed.
    
    src/hotspot/share/gc/g1/g1HeapRegionManager.cpp line 318:
    
    > 316: }
    > 317: 
    > 318: 
    
    Unnecessary newline.
    
    src/hotspot/share/gc/g1/g1HeapRegionManager.cpp line 647:
    
    > 645:       }
    > 646:     }
    > 647:   }
    
    Please do not re-implement existing functionality like heap region iteration. Use the `G1CollectedHeap::heap_region_iterate()` API for that which does all but the `is_free()` check for you anyway.
    
    Also the `current_time` can/should be taken once at the beginning of the iteration.
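A rough model of both points (reuse a generic region iteration with a closure, and sample the clock once before iterating) might look like this. `ToyRegion`, `ToyRegionClosure` and `toy_region_iterate` are invented for the example; the convention that returning `true` from the closure stops the iteration is an assumption modelled on how such closures commonly work.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical region record: whether it is free and when it was last used.
struct ToyRegion {
  bool     free;
  uint64_t last_access_ms;
};

// Hypothetical closure interface; returning true stops the iteration.
class ToyRegionClosure {
public:
  virtual ~ToyRegionClosure() { }
  virtual bool do_region(ToyRegion* r) = 0;
};

// Generic iteration that callers reuse instead of writing their own loop.
void toy_region_iterate(std::vector<ToyRegion>& regions, ToyRegionClosure* cl) {
  for (ToyRegion& r : regions) {
    if (cl->do_region(&r)) {
      return;
    }
  }
}

// Counts free regions that have been idle for at least 'delay_ms'.
class IdleFreeRegionCounter : public ToyRegionClosure {
  const uint64_t _now_ms;    // Sampled once, before the iteration starts.
  const uint64_t _delay_ms;
  unsigned       _count;

public:
  IdleFreeRegionCounter(uint64_t now_ms, uint64_t delay_ms)
    : _now_ms(now_ms), _delay_ms(delay_ms), _count(0) { }

  bool do_region(ToyRegion* r) override {
    // The only region-specific work left to the closure is the free check
    // and the age comparison against the single timestamp.
    if (r->free &&
        r->last_access_ms <= _now_ms &&
        _now_ms - r->last_access_ms >= _delay_ms) {
      _count++;
    }
    return false;  // Visit all regions.
  }

  unsigned count() const { return _count; }
};
```

Using one timestamp for the whole pass also makes the result independent of how long the iteration itself takes.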
    
    src/hotspot/share/gc/g1/g1HeapRegionManager.cpp line 656:
    
    > 654:   // Sort regions by access time (oldest first) using simple bubble sort
    > 655:   // This is fine since the number of empty regions is typically small
    > 656:   int n = empty_regions.length();
    
    While I do not mind the use of bubble sort (if appropriate), if performance does not matter please consider making maintenance easiest, so use the `qsort` library function.
    
    That makes the code much more compact and reviewers do not need to review the implementation for errors. (I did not even bother reviewing it).
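As an illustration of how compact the `qsort` variant is, here is a stand-alone sketch that sorts region records oldest-first. The `RegionAge` struct and function names are assumptions for the example; only `std::qsort` itself is the standard library function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical record for a free region and the time it was last touched.
struct RegionAge {
  unsigned index;
  uint64_t last_access_ms;
};

// Comparator for std::qsort: ascending timestamp, i.e. oldest first.
// Uses comparisons instead of subtraction to avoid overflow on uint64_t.
static int compare_by_last_access(const void* a, const void* b) {
  const RegionAge* ra = static_cast<const RegionAge*>(a);
  const RegionAge* rb = static_cast<const RegionAge*>(b);
  if (ra->last_access_ms < rb->last_access_ms) return -1;
  if (ra->last_access_ms > rb->last_access_ms) return 1;
  return 0;
}

// Sorts 'count' records in place so that the oldest region comes first.
void sort_oldest_first(RegionAge* regions, size_t count) {
  std::qsort(regions, count, sizeof(RegionAge), compare_by_last_access);
}
```

The comparator is the only part a reviewer has to check, which is the maintenance argument made above.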
    
    src/hotspot/share/gc/g1/g1HeapRegionManager.cpp line 686:
    
    > 684:     shrink_at(region_index, 1);
    > 685:     removed++;
    > 686:   }
    
    I think I need to see this policy in action, blindly removing all too-old regions seems to be too severe an action. 
    
    These free regions are also needed for the young gen + old gen, which has been sized appropriately. It should probably at least take them into account.
    
    src/hotspot/share/gc/g1/g1HeapRegionManager.hpp line 293:
    
    > 291:   // Used after GC operations or shrink operations that may affect region state
    > 292: 
    > 293: 
    
    Leftover comment?
    
    src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 463:
    
    > 461: 
    > 462:     virtual bool do_heap_region(G1HeapRegion* r) {
    > 463:       if (r->is_empty() && _policy->should_uncommit_region(r)) {
    
    Another wrong `is_empty()` here.
    
    src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 499:
    
    > 497:         (*_inactive_regions)++;
    > 498:         // Stop early if we have enough candidates
    > 499:         if ((uint)_candidates->length() >= _max_candidates) {
    
    (Fwiw, _max_candidates is always _candidates->capacity() or so)
    
    src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 535:
    
    > 533: 
    > 534:     // Only count if region is ready for shrinking
    > 535:     if (hr->is_empty() && hr->is_free()) {
    
    `is_empty()` is wrong, already mentioned this in the last review.
    
    src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 584:
    
    > 582: 
    > 583:   // Back off during allocation pressure - only evaluate when truly idle
    > 584:   if (_analytics != nullptr) {
    
    There must always be an `_analytics` instance. Potentially there could be an assert in the constructor.
    
    src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 586:
    
    > 584:   if (_analytics != nullptr) {
    > 585:     double gc_time_ratio = _analytics->short_term_gc_time_ratio();
    > 586:     if (gc_time_ratio > 0.05) { // 5% GC time still indicates pressure
    
    GC time ratio is no indicator for idleness. Also, the default is 4% right now (i.e. what G1 aims for), so this value would be higher than the goal. The time ratio is also user-settable: some users may accept a higher GC usage than others.
    
    I understand the need for this change to kind of see if there is "pressure". However, this would be implied by GCs occurring in between (i.e. if the `ServiceTask` did not occur/did not find a need for uncommitting regions between GCs due to time stamps being too high, then there is nothing to do anyway).
    
    src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 596:
    
    > 594:   MutexLocker ml(Heap_lock);
    > 595: 
    > 596:   ResourceMark rm; // Ensure GrowableArray resources are properly released
    
    I can't find a use of `GrowableArray` in this particular method. It may be used in a callee, but then put it there.
    
    src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 598:
    
    > 596:   ResourceMark rm; // Ensure GrowableArray resources are properly released
    > 597: 
    > 598:   // Count regions eligible for uncommit (don't store them - VM operation will re-evaluate)
    
    No, the VM operation still does not re-evaluate the decision in this code. It just directly uncommits a number of random regions.
    
    src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 600:
    
    > 598:   // Count regions eligible for uncommit (don't store them - VM operation will re-evaluate)
    > 599:   uint idle_count = count_uncommit_candidates();
    > 600:   uint total_regions = _g1h->max_num_regions();
    
    (This) `total_regions` is unused in this code (my IDE says so).
    
    src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 608:
    
    > 606:   if (idle_count >= G1MinRegionsToUncommit) {
    > 607:     size_t region_size = G1HeapRegion::GrainBytes;
    > 608:     size_t current_heap = _g1h->capacity();
    
    Please make this name more specific, i.e. something like `current_capacity`.
    
    src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 618:
    
    > 616:                          current_heap, min_heap, region_size, max_shrink_bytes, InitialHeapSize);
    > 617: 
    > 618:     if (max_shrink_bytes > 0 && region_size > 0) {
    
    `region_size` can never be 0.
    
    src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 631:
    
    > 629: 
    > 630:       // Total regions we must keep available = young gen + G1's standard reserve
    > 631:       size_t reserved_regions = young_gen_regions + g1_reserve_regions;
    
    `G1ReservePercent` should only ever be accounted from the heap capacity. This just adds them together; the reserve only needs to be taken into account when getting close to the maximum capacity.
    
    src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 641:
    
    > 639:       // This prevents expensive re-commits during the next GC or allocation burst
    > 640:       // We add G1MinRegionsToUncommit as a small safety buffer beyond G1's standard reserves
    > 641:       size_t min_regions_after_uncommit = reserved_regions + G1MinRegionsToUncommit;
    
    The comment would rather indicate a `MAX2` use... The name of the variable also indicates that this is a minimum number of regions one would want to uncommit, not an additional buffer.
    
    src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 654:
    
    > 652:       // Limited by G1MinRegionsToUncommit to avoid thrashing
    > 653:       size_t available_for_uncommit = idle_count;
    > 654:       if (available_for_uncommit < G1MinRegionsToUncommit) {
    
    Line 606 has this condition to enter here:
    
    if (idle_count >= G1MinRegionsToUncommit) {
    
    and now the code checks if `idle_count < G1MinRegionsToUncommit`...
    
    -------------
    
    Changes requested by tschatzl (Reviewer).
    
    PR Review: https://git.openjdk.org/jdk/pull/26240#pullrequestreview-3487515863
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545821604
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545800333
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545884474
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545883836
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545908305
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546084792
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546005896
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546032894
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546029717
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546035489
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546040077
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546044801
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545797272
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546112119
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545940328
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546114664
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546120282
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546147069
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546152175
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545812518
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546164391
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546165826
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546349010
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546418145
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546363576
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546205486
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546255263
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546263408
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546438358
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546264762
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546288288
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546268711
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546276900
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546279093
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546329255
    
    From tschatzl at openjdk.org  Thu Nov 20 15:08:57 2025
    From: tschatzl at openjdk.org (Thomas Schatzl)
    Date: Thu, 20 Nov 2025 15:08:57 GMT
    Subject: RFR: 8357445: G1: Time-Based Heap Uncommit During Idle Periods
     [v10]
    In-Reply-To: 
    References: 
     <1FJGEob7g_zkJaboRxyS0hPJ1c590V6zZipDrZ3XpOw=.08c1cf33-af6b-49de-b94d-d4f2ca8f5a22@github.com>
     
    Message-ID: <4-lSi3Xwuj2NTiGTrgWGzMATRl-QD-76XfQWE-1PQ9E=.62f09eb8-d986-48da-bb11-e59d54e503f0@github.com>
    
    On Thu, 20 Nov 2025 12:12:44 GMT, Thomas Schatzl  wrote:
    
    >> Monica Beckwith has updated the pull request incrementally with two additional commits since the last revision:
    >> 
    >>  - Fix compilation errors after master merge
    >>    
    >>    - Fix syntax error in g1CollectedHeap.cpp (extra closing brace)
    >>    - Update API call from short_term_pause_time_ratio() to short_term_gc_time_ratio()
    >>    
    >>    These fixes resolve compilation issues that occurred after merging with
    >>    upstream master due to API changes in the OpenJDK codebase.
    >>  - 8357445: Address feedback for G1 time-based heap sizing
    >>    
    >>    - Fix indentation in log_debug statement in shrink_helper (suggested by @tschatzl)
    >>    - Change terminology from 'inactive' to 'idle' throughout time-based heap sizing
    >>    - Update flag descriptions in g1_globals.hpp to use 'idle' terminology
    >>    - Fix remaining trailing whitespace in test files
    >>    
    >>    Addresses all outstanding review comments from @tschatzl
    >
    > src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1122:
    > 
    >> 1120:   // We should only reach here from the service thread during idle time
    >> 1121:   // but ensure any GC alloc regions are abandoned
    >> 1122:   _allocator->abandon_gc_alloc_regions();
    > 
    > Maybe check that they are abandoned (i.e. there are no gc alloc regions) in an assert instead of doing work.
    
    It's probably best to check that the current GC cause is `GCCause::_no_gc` (and use the `GCCauseSetter` in the VM operation) in an assert here.
    Maybe it's worth making an extra `GCCause` for this, but right now I do not see a reason. As far as I can tell, the default cause is `_no_gc` anyway... (so actually nothing to do in the VM operation).
    
    > src/hotspot/share/gc/g1/g1HeapEvaluationTask.cpp line 60:
    > 
    >> 58: 
    >> 59:   if (resize_amount > 0) {
    >> 60:     log_info(gc, sizing)("Uncommit evaluation: shrinking heap by %zuMB using time-based selection", resize_amount / M);
    > 
    > At least at info level, more context is needed.
    
    Other tasks use the `gc, task` tags for such status messages afaict.
    
    > src/hotspot/share/gc/g1/g1HeapRegionManager.cpp line 686:
    > 
    >> 684:     shrink_at(region_index, 1);
    >> 685:     removed++;
    >> 686:   }
    > 
    > I think I need to see this policy in action, blindly removing all too-old regions seems to be too severe an action. 
    > 
    > These free regions are also needed for the young gen + old gen, which has been sized appropriately. It should probably at least take them into account.
    
    Also the code *must* reset the timestamp for regions not selected for the reasons stated in another comment.
    
    > src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 535:
    > 
    >> 533: 
    >> 534:     // Only count if region is ready for shrinking
    >> 535:     if (hr->is_empty() && hr->is_free()) {
    > 
    > `is_empty()` is wrong, already mentioned this in the last review.
    
    All free regions are empty.
    
    > src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 641:
    > 
    >> 639:       // This prevents expensive re-commits during the next GC or allocation burst
    >> 640:       // We add G1MinRegionsToUncommit as a small safety buffer beyond G1's standard reserves
    >> 641:       size_t min_regions_after_uncommit = reserved_regions + G1MinRegionsToUncommit;
    > 
    > The comment rather suggests a `MAX2` use. The name of the variable also indicates that this is a minimum number of regions one would want to uncommit, not an additional buffer.
    
    Also, do you have evidence that this additional buffer is actually required?
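
    To make the two readings discussed above concrete, here is a minimal sketch. The function names are hypothetical and not taken from the patch, `std::max` stands in for HotSpot's `MAX2`, and the "floor" variant is only one possible interpretation of the review comment.

    ```cpp
    #include <algorithm>
    #include <cassert>
    #include <cstddef>
    #include <cstdint>

    // Reading used by the quoted code: the flag value is an extra buffer
    // added on top of the reserve.
    inline size_t min_regions_as_buffer(size_t reserved_regions, size_t flag_value) {
      assert(reserved_regions <= SIZE_MAX - flag_value);  // guard against overflow
      return reserved_regions + flag_value;
    }

    // Alternative reading: the flag value is a floor, so the larger of the
    // two values wins (std::max stands in for MAX2 here).
    inline size_t min_regions_as_floor(size_t reserved_regions, size_t flag_value) {
      return std::max(reserved_regions, flag_value);
    }
    ```

    With a reserve of 10 regions and a flag value of 4, the first variant keeps 14 regions committed and the second keeps 10, so the choice directly changes how much can be uncommitted.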
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545849232
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545946650
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546167399
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546407974
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546402918
    
    From tschatzl at openjdk.org  Thu Nov 20 15:09:00 2025
    From: tschatzl at openjdk.org (Thomas Schatzl)
    Date: Thu, 20 Nov 2025 15:09:00 GMT
    Subject: RFR: 8357445: G1: Time-Based Heap Uncommit During Idle Periods
     [v2]
    In-Reply-To: 
    References: 
     
     
     
    Message-ID: <0ZVPIZLcqAHQbZcXdkDlKApkZg-NsX45H0J12VheHn4=.07ea2740-d68d-453f-8cf5-e6c35c7a2e9d@github.com>
    
    On Wed, 16 Jul 2025 22:00:14 GMT, Monica Beckwith  wrote:
    
    >> src/hotspot/share/gc/g1/g1HeapEvaluationTask.cpp line 36:
    >> 
    >>> 34: #include "utilities/globalDefinitions.hpp"
    >>> 35: 
    >>> 36: G1HeapEvaluationTask::G1HeapEvaluationTask(G1CollectedHeap* g1h, G1HeapSizingPolicy* heap_sizing_policy) :
    >> 
    >> `G1HeapEvaluationTask` class name does not seem to clearly state the purpose of the class. But I cannot come up with a better name.
    >
    > I can see that while 'G1HeapEvaluationTask' is generic, it kind of does describe what the class does (evaluates heap sizing decisions). The time-based nature is captured in the feature flag and comments. Open to suggestions if you have a preference.
    
    Maybe `ReEvaluateHeapSizeTask` or something that gives more information about the purpose.
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546048603
    
    From tschatzl at openjdk.org  Thu Nov 20 15:09:02 2025
    From: tschatzl at openjdk.org (Thomas Schatzl)
    Date: Thu, 20 Nov 2025 15:09:02 GMT
    Subject: RFR: 8357445: G1: Time-Based Heap Uncommit During Idle Periods
     [v7]
    In-Reply-To: 
    References: 
     
     
     
     
    Message-ID: <4kNU86aSKhlX0mfU5t4AUGohZz0xpFZI1CL0K65PUjI=.81696fd6-2b6c-4199-b0fa-179fea98528e@github.com>
    
    On Wed, 27 Aug 2025 12:00:41 GMT, Thomas Schatzl  wrote:
    
    >> Fwiw, to avoid startup delays (calling `Ticks::now()` on thousands of regions is expensive - I did not test, but it seems reasonable and should be measured), it might be useful to do that initialization lazily (first time the background thread executes).
    >
    > There is potentially the same issue about calling `Ticks::now()` for thousands of freed regions during GC.
    
    This call is still unnecessary. The constructor will call `initialize` which calls `hr_clear()` which does the initialization. Just set it to zero (or some "invalid value" here like other members show).
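
    A minimal sketch of the "invalid value plus lazy initialization" idea, under stated assumptions: `Region` is a hypothetical stand-in rather than the real `G1HeapRegion`, and `std::chrono::steady_clock` replaces `Ticks`. Constructing a region never reads the clock; the timestamp is set when the region is freed or on the first evaluation.

    ```cpp
    #include <cassert>
    #include <chrono>

    using Clock = std::chrono::steady_clock;

    // Hypothetical stand-in for a heap region; not the real G1HeapRegion.
    struct Region {
      // A default-constructed time_point (the clock's epoch) is the
      // "not yet set" sentinel, so construction does not read the clock.
      Clock::time_point last_freed{};

      bool has_timestamp() const { return last_freed != Clock::time_point{}; }

      // Called when the region is cleared/freed.
      void record_freed(Clock::time_point now) {
        assert(now != Clock::time_point{});  // the sentinel is not a valid timestamp
        last_freed = now;
      }

      // Called by the periodic task: the first visit only stamps the region,
      // later visits compare the elapsed time against the delay.
      bool idle_longer_than(Clock::time_point now, Clock::duration delay) {
        if (!has_timestamp()) {
          record_freed(now);  // lazy initialization on first evaluation
          return false;
        }
        return (now - last_freed) > delay;
      }
    };
    ```

    Whether the lazy path is actually cheaper than stamping every region up front would still need to be measured, as noted above.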
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546108011
    
    From tschatzl at openjdk.org  Thu Nov 20 15:09:04 2025
    From: tschatzl at openjdk.org (Thomas Schatzl)
    Date: Thu, 20 Nov 2025 15:09:04 GMT
    Subject: RFR: 8357445: G1: Time-Based Heap Uncommit During Idle Periods
     [v7]
    In-Reply-To: 
    References: 
     
     
     
    Message-ID: 
    
    On Wed, 27 Aug 2025 11:59:26 GMT, Thomas Schatzl  wrote:
    
    >> src/hotspot/share/gc/g1/g1HeapRegion.hpp line 257:
    >> 
    >>> 255:   uint _node_index;
    >>> 256: 
    >>> 257:   // Time-based heap sizing: tracks last allocation/access time
    >> 
    >> Suggestion:
    >> 
    >>   // Time-based heap sizing: tracks last allocation/access time.
    >
    > The comment is wrong/outdated too, only tracks last freeing (clearing) time. Maybe there is a better name than `_last_access_time` for the member too.
    
    Still an issue...
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546141588
    
    From tschatzl at openjdk.org  Thu Nov 20 15:09:08 2025
    From: tschatzl at openjdk.org (Thomas Schatzl)
    Date: Thu, 20 Nov 2025 15:09:08 GMT
    Subject: RFR: 8357445: G1: Time-Based Heap Uncommit During Idle Periods
     [v7]
    In-Reply-To: 
    References: 
     
     
    Message-ID: 
    
    On Wed, 27 Aug 2025 11:57:45 GMT, Thomas Schatzl  wrote:
    
    >> Monica Beckwith has updated the pull request incrementally with one additional commit since the last revision:
    >> 
    >>   Remove unused static _uncommit_delay member and accessor
    >
    > src/hotspot/share/gc/g1/g1HeapRegion.hpp line 564:
    > 
    >> 562:   void record_activity() {
    >> 563:     _last_access_timestamp = Ticks::now();
    >> 564:   }
    > 
    > Does not seem to be used outside `G1HeapRegion`. Remove, or better just inline since it only seems to be used once. Or put the implementation into the cpp file (and make `private`).
    
    Since my comments make this call single-use, the change can inline it.
    
    > src/hotspot/share/gc/g1/g1HeapRegion.hpp line 578:
    > 
    >> 576:     Tickspan elapsed = current_time - _last_access_timestamp;
    >> 577:     return elapsed > delay;
    >> 578:   }
    > 
    > Does not seem to be used? In general, I would tend to avoid putting policy stuff into `G1HeapRegion` which should just be a container for region state members. Also, this method is too complex imho to put into the `.hpp` file directly (should be in `.inline.hpp`), but since this method should be removed anyway....
    
    Also still unused.
    
    > src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 488:
    > 
    >> 486: bool G1HeapSizingPolicy::should_uncommit_region(G1HeapRegion* hr) const {
    >> 487:   // Note: Caller already guarantees hr->is_empty() is true
    >> 488:   // Empty regions should always be free and not in collection set in normal operation
    > 
    > Sentences should end with a full stop in comments (I am stopping commenting on this here, but there are quite a few of those).
    
    Also not fixed? I commented on it again in another place.
    
    > src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 521:
    > 
    >> 519:   // Must hold Heap_lock during heap resizing
    >> 520:   MutexLocker ml(Heap_lock);
    >> 521: 
    > 
    > Since this can be entered right after a garbage collection, maybe the evaluation should be immediately exited in that case? Garbage collection already found the "right" heap size, no need for idle calculation.
    > Otoh, `get_uncommit_candidates()` should find nothing in this case.
    
    Not sure if this comment is still relevant or has been addressed. See above for basically the same comment (afaiu).
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546144314
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546146548
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546173161
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546260963
    
    From tschatzl at openjdk.org  Thu Nov 20 15:09:10 2025
    From: tschatzl at openjdk.org (Thomas Schatzl)
    Date: Thu, 20 Nov 2025 15:09:10 GMT
    Subject: RFR: 8357445: G1: Time-Based Heap Uncommit During Idle Periods
     [v10]
    In-Reply-To: <4-lSi3Xwuj2NTiGTrgWGzMATRl-QD-76XfQWE-1PQ9E=.62f09eb8-d986-48da-bb11-e59d54e503f0@github.com>
    References: 
     <1FJGEob7g_zkJaboRxyS0hPJ1c590V6zZipDrZ3XpOw=.08c1cf33-af6b-49de-b94d-d4f2ca8f5a22@github.com>
     
     <4-lSi3Xwuj2NTiGTrgWGzMATRl-QD-76XfQWE-1PQ9E=.62f09eb8-d986-48da-bb11-e59d54e503f0@github.com>
    Message-ID: 
    
    On Thu, 20 Nov 2025 14:57:34 GMT, Thomas Schatzl  wrote:
    
    >> src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 535:
    >> 
    >>> 533: 
    >>> 534:     // Only count if region is ready for shrinking
    >>> 535:     if (hr->is_empty() && hr->is_free()) {
    >> 
    >> `is_empty()` is wrong, already mentioned this in the last review.
    >
    > All free regions are empty.
    
    The whole method seems unnecessary. All candidates passed are valid candidates since the `is_free()` check should have already occurred (and it did).
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546414536
    
    From eastigeevich at openjdk.org  Thu Nov 20 15:21:28 2025
    From: eastigeevich at openjdk.org (Evgeny Astigeevich)
    Date: Thu, 20 Nov 2025 15:21:28 GMT
    Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact
     on GenZGC performance
    In-Reply-To: 
    References: 
    Message-ID: <2aXm6GWgRH7tBHQI0MBefmSAU9r6uZUGBmVxsb7Pevk=.869a508e-53c8-4525-b382-83279a12c3af@github.com>
    
    On Fri, 14 Nov 2025 18:16:55 GMT, Evgeny Astigeevich  wrote:
    
    > Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1.
    >  
    > Neoverse-N1 implementations mitigate erratum 1542419 with a workaround:
    > - Disable coherent icache.
    > - Trap IC IVAU instructions.
    > - Execute:
    >    - `tlbi vae3is, xzr`
    >    - `dsb sy`
    >  
    >  `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address).  It waits for all memory accesses using in-scope old translation information to complete before it is considered complete.
    >  
    > As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests:
    > 
    > "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround."
    > 
    > This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions.
    > 
    > Changes include:
    > 
    > * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization.
    > * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address.
    > * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures.
    > * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk.
    > 
    > Benchmarking results: Neoverse-N1 r3p1 (Graviton 2)
    > 
    > - Baseline
    > 
    > $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGC...
    
    @fisk, @xmas92 
    I added a JMH microbenchmark and its results from Graviton 2.
    I added deferred icache invalidation to ZGC where it's needed.
    
    I found one place where I am not sure: https://github.com/openjdk/jdk/blob/b9ee9541cffb6c5a737b08a69ae04472b3bcab3e/src/hotspot/share/gc/z/zHeapIterator.cpp#L364
    
    
    ```cpp
    class ZHeapIteratorNMethodClosure : public NMethodClosure {
    private:
      OopClosure* const        _cl;
      BarrierSetNMethod* const _bs_nm;
    
    public:
      ZHeapIteratorNMethodClosure(OopClosure* cl)
        : _cl(cl),
          _bs_nm(BarrierSet::barrier_set()->barrier_set_nmethod()) {}
    
      virtual void do_nmethod(nmethod* nm) {
        // If ClassUnloading is turned off, all nmethods are considered strong,
        // not only those on the call stacks. The heap iteration might happen
        // before the concurrent processing of the code cache, make sure that
        // all nmethods have been processed before visiting the oops.
        _bs_nm->nmethod_entry_barrier(nm);
    
        ZNMethod::nmethod_oops_do(nm, _cl);
      }
    };
    ``` 
    
    For `_bs_nm->nmethod_entry_barrier(nm)` we use deferred icache invalidation. I'm not sure it is safe for `ZNMethod::nmethod_oops_do(nm, _cl)`.  It looks like it is safe because the code is called at a safepoint: `ZHeap::object_iterate`, `ZHeap::object_and_field_iterate_for_verify` and `ZHeap::parallel_object_iterator`.
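
    For readers unfamiliar with the idea, the batching can be sketched roughly as follows. This is not the code from the PR: `DeferredInvalidationScope`, `invalidate_icache` and `patch_sites` are hypothetical names, and the invalidation is simulated by a counter instead of a real icache flush.

    ```cpp
    #include <cassert>

    // Counts how often the (expensive, trapped) invalidation would be issued.
    // A real VM would flush the icache here; this sketch only counts.
    static int g_invalidations = 0;

    static void invalidate_icache(const void* /*addr*/) { ++g_invalidations; }

    // Hypothetical scope object: while it is alive, patch sites only mark it
    // dirty; a single invalidation is issued when the scope ends.
    class DeferredInvalidationScope {
      const void* _code_start;
      bool _dirty = false;
    public:
      explicit DeferredInvalidationScope(const void* code_start)
        : _code_start(code_start) { assert(code_start != nullptr); }
      ~DeferredInvalidationScope() {
        if (_dirty) invalidate_icache(_code_start);
      }
      DeferredInvalidationScope(const DeferredInvalidationScope&) = delete;
      DeferredInvalidationScope& operator=(const DeferredInvalidationScope&) = delete;

      void note_patched() { _dirty = true; }
    };

    // Patches n sites in one piece of code, either invalidating per site or
    // deferring to the end of the scope.
    static void patch_sites(unsigned* code, int n, bool defer) {
      DeferredInvalidationScope scope(code);
      for (int i = 0; i < n; i++) {
        code[i] = 0;  // stand-in for rewriting an instruction
        if (defer) {
          scope.note_patched();
        } else {
          invalidate_icache(&code[i]);
        }
      }
    }
    ```

    Patching eight sites costs eight invalidations without deferral and one with it, which is the effect the workaround is after on affected cores.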
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3558617025
    
    From aboldtch at openjdk.org  Thu Nov 20 15:26:46 2025
    From: aboldtch at openjdk.org (Axel Boldt-Christmas)
    Date: Thu, 20 Nov 2025 15:26:46 GMT
    Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact
     on GenZGC performance
    In-Reply-To: 
    References: 
    Message-ID: 
    
    On Fri, 14 Nov 2025 18:16:55 GMT, Evgeny Astigeevich  wrote:
    
    I think the implementation is fine. We can always extend it later if we find that other platforms or hardware needs this sort of treatment.
    
    My knowledge and experience with arm hardware implementation specifics are rather lacking. So I cannot comment on the validity of the assertions made here w.r.t. only invalidating the first instruction in the nmethod etc.
    
    Hopefully some of our resident arm experts can chime in.
    
    -------------
    
    Marked as reviewed by aboldtch (Reviewer).
    
    PR Review: https://git.openjdk.org/jdk/pull/28328#pullrequestreview-3488451104
    
    From eastigeevich at openjdk.org  Thu Nov 20 15:32:40 2025
    From: eastigeevich at openjdk.org (Evgeny Astigeevich)
    Date: Thu, 20 Nov 2025 15:32:40 GMT
    Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact
     on GenZGC performance
    In-Reply-To: 
    References: 
    Message-ID: <3bQyg5gwLN8cQyCu3lohteMWQlbXIueLRgy9UI2Dhgk=.abb8339c-74f9-4ffa-b89d-b75b6056226e@github.com>
    
    On Fri, 14 Nov 2025 18:16:55 GMT, Evgeny Astigeevich  wrote:
    
    @fisk, @xmas92
    The added microbenchmark shows notable regressions when an nmethod has no accesses to object fields:
    
    Benchmark                            Score    Error   Units
    GCPatchingNmethodCost.fullGC:base    73.937 ± 17.764  ms/op
    GCPatchingNmethodCost.systemGC:base  77.495 ± 11.963  ms/op
    GCPatchingNmethodCost.youngGC:base    9.955 ±  1.649  ms/op
    GCPatchingNmethodCost.fullGC:fix     88.865 ± 19.299  ms/op  +20.1%
    GCPatchingNmethodCost.systemGC:fix   90.572 ± 14.750  ms/op  +16.9%
    GCPatchingNmethodCost.youngGC:fix    10.219 ±  0.877  ms/op   +2.7%
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3558673810
    
    From aboldtch at openjdk.org  Thu Nov 20 15:32:43 2025
    From: aboldtch at openjdk.org (Axel Boldt-Christmas)
    Date: Thu, 20 Nov 2025 15:32:43 GMT
    Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact
     on GenZGC performance
    In-Reply-To: 
    References: 
    Message-ID: 
    
    On Fri, 14 Nov 2025 18:16:55 GMT, Evgeny Astigeevich  wrote:
    
    src/hotspot/cpu/aarch64/icache_aarch64.hpp line 63:
    
    > 61:     // the performance impact due to this workaround."
    > 62:     //
    > 63:     // As the address for icache invalidation is not relevant, we use the nmethod's code start address.
    
    > As the address for icache invalidation is not relevant
    
    Is this only because of the Neoverse-N1 workaround?
    
    _If that is the case we could reach this point if `NeoverseN1Errata1542419` is either set by the user or mislabeled on some CPU without this workaround._
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2546530821
    
    From eosterlund at openjdk.org  Thu Nov 20 15:39:36 2025
    From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=)
    Date: Thu, 20 Nov 2025 15:39:36 GMT
    Subject: RFR: 8367319: Add os interfaces to get machine and container
     values separately [v4]
    In-Reply-To: 
    References: 
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 13:21:07 GMT, Casper Norrbin  wrote:
    
    >> Hi everyone,
    >> 
    >> The current `os::` layer on Linux hides whether the JVM is running inside a container or not. When running inside a container, we replace machine values with container values where applicable, without telling the user of these methods. For most use cases this is fine: users only care about the returned value. For other use cases, however, where the value originated is important. Two examples:
    >> 
    >> - A user might need the physical cpu count of the machine, but `os::active_processor_count()` only returns the limited container value, which also represents something slightly different.
    >> - A user might want the container memory limit and the physical RAM size, but `os::physical_memory()` only gives one number.
    >> 
    >> To solve this, every function that mixed container/machine values now has two explicit variants, prefixed with `machine_` and `container_`. These use the bool return + out-parameter interface, with the container functions only working on Linux. The original methods remain and continue to return the same mixed values.
    >> 
    >> In addition, container-specific accessors for the memory soft limit and the memory throttle limit have been added, as these values matter when running in a containerized environment.
    >> 
    >> `OSContainer::active_processor_count()` has also been changed to return `double` instead of `int`. The previous implementation rounded the quota/period ratio up to produce an integer for `os::active_processor_count()`. Now, when the value is requested directly from the new container API it makes more sense to preserve this fraction rather than rounding it up. We can thus keep the exact value for those that want it, then round it up to keep the same behavior in `os::active_processor_count()`.
    >> 
    >> Testing:
    >> - Oracle tiers 1-5
    >> - Container tests on cgroup v1 and v2 hosts.
    >
    > Casper Norrbin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits:
    > 
    >  - Merge branch 'master' into separate-container-machine-values
    >  - Merge branch 'master' into separate-container-machine-values
    >  - Move methods to Machine/Container inner classes + clarifying documentation
    >  - Merge branch 'master' into separate-container-machine-values
    >  - Fixed print type
    >  - separate-machine-container-functions
    
    Marked as reviewed by eosterlund (Reviewer).
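
    As a rough illustration of the interface style described in the quoted PR text (bool return plus out-parameter, fractional container value, rounding up in the mixed accessor), consider the sketch below. The function names and signatures are hypothetical and do not mirror the actual `os::` or `OSContainer` API, and the cap at the machine CPU count is an assumption of this sketch.

    ```cpp
    #include <cassert>
    #include <cmath>

    // Hypothetical container query in the bool-return + out-parameter style:
    // returns false (and leaves `out` untouched) when no CPU limit is set.
    inline bool container_cpu_limit(long quota_us, long period_us, double& out) {
      if (quota_us <= 0 || period_us <= 0) {
        return false;  // no limit configured
      }
      out = static_cast<double>(quota_us) / static_cast<double>(period_us);
      return true;
    }

    // Mixed accessor keeping the old behaviour: use the container value,
    // rounded up, when present; otherwise fall back to the machine value.
    // Capping at machine_cpus is an assumption of this sketch.
    inline int active_processor_count(int machine_cpus, long quota_us, long period_us) {
      assert(machine_cpus > 0);
      double limit = 0.0;
      if (container_cpu_limit(quota_us, period_us, limit)) {
        int rounded = static_cast<int>(std::ceil(limit));
        return rounded < machine_cpus ? rounded : machine_cpus;
      }
      return machine_cpus;
    }
    ```

    A quota of 150000 with a period of 100000 yields 1.5 from the container query, while the mixed accessor still reports 2 processors.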
    
    -------------
    
    PR Review: https://git.openjdk.org/jdk/pull/27646#pullrequestreview-3488512169
    
    From aph at openjdk.org  Thu Nov 20 15:54:55 2025
    From: aph at openjdk.org (Andrew Haley)
    Date: Thu, 20 Nov 2025 15:54:55 GMT
    Subject: RFR: 8371893: [macOS aarch64] use dead_strip linker option to
     reduce binary size
    In-Reply-To: 
    References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com>
     
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 14:36:31 GMT, Matthias Baesken  wrote:
    
    > > On further consideration, I remembered that many of the helpers are debug-only, so that shouldn't be a problem
    > 
    > Thanks for looking into it. I can disable the dead_strip switch for the debug-builds, so it would not eliminate these helpers.
    
    Right, but we'll need to make sure we don't remove the non-debug ones. We don't want to exclude stuff in `src/hotspot/share/utilities/debug.cpp` or explicit helpers in the back ends.
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28319#issuecomment-3558785722
    
    From mbaesken at openjdk.org  Thu Nov 20 16:00:30 2025
    From: mbaesken at openjdk.org (Matthias Baesken)
    Date: Thu, 20 Nov 2025 16:00:30 GMT
    Subject: RFR: 8371893: [macOS aarch64] use dead_strip linker option to
     reduce binary size [v3]
    In-Reply-To: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com>
    References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com>
    Message-ID: 
    
    > The dead_strip linker option on macOS removes functions and data that are unreachable by the entry point or exported symbols.
    > Setting it can reduce the size of some binaries we generate quite a lot, for example (product build, Xcode 15 is used) :
    > (before -> after setting the option)
    > 
    > 1.4M -> 1.1M images/jdk/lib/libfontmanager.dylib
    > 264K -> 248K images/jdk/lib/libjavajpeg.dylib
    > 152K -> 132K images/jdk/lib/libjli.dylib
    > 388K -> 296K images/jdk/lib/liblcms.dylib
    > 164K -> 128K images/jdk/lib/libzip.dylib
    > 
    > 
    > and libjvm :
    > 
    > 20M -> 18M images/jdk/lib/server/libjvm.dylib
    > 146M -> 137M images/jdk/lib/server/libjvm.dylib.dSYM
    
    Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision:
    
      Fix Windows issues
    
    -------------
    
    Changes:
      - all: https://git.openjdk.org/jdk/pull/28319/files
      - new: https://git.openjdk.org/jdk/pull/28319/files/07251ffe..b41966b8
    
    Webrevs:
     - full: https://webrevs.openjdk.org/?repo=jdk&pr=28319&range=02
     - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28319&range=01-02
    
      Stats: 4 lines in 1 file changed: 2 ins; 2 del; 0 mod
      Patch: https://git.openjdk.org/jdk/pull/28319.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28319/head:pull/28319
    
    PR: https://git.openjdk.org/jdk/pull/28319
    
    From mbaesken at openjdk.org  Thu Nov 20 16:08:23 2025
    From: mbaesken at openjdk.org (Matthias Baesken)
    Date: Thu, 20 Nov 2025 16:08:23 GMT
    Subject: RFR: 8371626: [linux] use icf=all for linking libraries [v3]
    In-Reply-To: 
    References: 
     
    Message-ID: 
    
    On Wed, 12 Nov 2025 15:46:09 GMT, Matthias Baesken  wrote:
    
    >> Identical code folding can reduce the size of some libs, especially libjvm. However not all linkers support the flag/feature so we have to limit it to some environments.
    >> See for example this manpage :
    >> https://manpages.debian.org/testing/lld-7/ld.lld-7.1
    >> 
    >> 
    >> sizes of libjvm.so with / without -icf=all
    >> linux aarch64 : 25888 / 27112 K
    >> linux x86_64 : 27952 / 29072 K
    >> 
    >> 
    >> (for most other native libs the identical code folding has no effect, because there is nothing to fold)
    >
    > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision:
    > 
    >   Limit icf to release builds
    
    sizes of libjvm.so with / without -icf=all
    linux x86_64 : 27952 / 29072 K
    
    
    With `-icf=safe`, libjvm.so on linux x86_64 is 28656 K, so the lib is still smaller, but not by as much as with `-icf=all`.
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28236#issuecomment-3558843260
    
    From aph at openjdk.org  Thu Nov 20 16:28:45 2025
    From: aph at openjdk.org (Andrew Haley)
    Date: Thu, 20 Nov 2025 16:28:45 GMT
    Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact
     on GenZGC performance
    In-Reply-To: 
    References: 
    Message-ID: <85dBRXdwqMOffQvXGI9J_zhfLnwZ0LrY_Wj4w0nrpbM=.88de1041-c752-49aa-8ab2-600b92f8559d@github.com>
    
    On Fri, 14 Nov 2025 18:16:55 GMT, Evgeny Astigeevich  wrote:
    
    > Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1.
    >  
    > Neoverse-N1 implementations mitigate erratum 1542419 with a workaround:
    > - Disable coherent icache.
    > - Trap IC IVAU instructions.
    > - Execute:
    >    - `tlbi vae3is, xzr`
    >    - `dsb sy`
    >  
    >  `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address).  It waits for all memory accesses using in-scope old translation information to complete before it is considered complete.
    >  
    > As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests:
    > 
    > "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround."
    > 
    > This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions.
    > 
    > Changes include:
    > 
    > * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization.
    > * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address.
    > * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures.
    > * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk.
    > 
    > Benchmarking results: Neoverse-N1 r3p1 (Graviton 2)
    > 
    > - Baseline
    > 
    > $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGC...
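The batching idea in the quoted description can be sketched roughly as follows. This is a minimal illustration, not the PR's `ICacheInvalidationContext`: `DemoDeferredICacheInvalidation`, `demo_invalidate_icache` and `demo_patch_sites_deferred` are hypothetical names, and the "invalidation" only increments a counter so the effect of deferral is visible.

```cpp
#include <cassert>

// Stand-in for the real cache maintenance call (assumption for this
// sketch): it only counts how often an invalidation would be issued.
static int g_demo_invalidations = 0;
static void demo_invalidate_icache(const void* /*addr*/) {
  ++g_demo_invalidations;
}

// Hypothetical RAII context: patch sites call request() instead of
// invalidating immediately; the destructor issues at most one
// invalidation for the whole batch.
class DemoDeferredICacheInvalidation {
  const void* _code_begin;
  bool _pending = false;

 public:
  explicit DemoDeferredICacheInvalidation(const void* code_begin)
      : _code_begin(code_begin) {}
  DemoDeferredICacheInvalidation(const DemoDeferredICacheInvalidation&) = delete;
  DemoDeferredICacheInvalidation& operator=(const DemoDeferredICacheInvalidation&) = delete;

  void request() { _pending = true; }

  ~DemoDeferredICacheInvalidation() {
    if (_pending) {
      demo_invalidate_icache(_code_begin);
    }
  }
};

// Pretends to patch n sites and returns how many invalidations that cost.
static int demo_patch_sites_deferred(int n) {
  const int before = g_demo_invalidations;
  char code[16] = {0};
  {
    DemoDeferredICacheInvalidation ctx(code);
    for (int i = 0; i < n; i++) {
      code[i % 16] = 1;  // pretend to patch one instruction
      ctx.request();     // record the need to invalidate, but defer it
    }
  }  // ctx destroyed here: one invalidation if anything was patched
  return g_demo_invalidations - before;
}
```

In this sketch, patching 100 sites costs a single invalidation, while an empty batch costs none. Which address is passed does not matter here, mirroring the remark that the nmethod's code start address is used.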
    
    src/hotspot/cpu/aarch64/icache_aarch64.hpp line 64:
    
    > 62:     //
    > 63:     // As the address for icache invalidation is not relevant, we use the nmethod's code start address.
    > 64:     ICache::invalidate_word(_nm->code_begin());
    
    Rather than call `ICache::invalidate_word()`, I believe we should explicitly execute the instructions in the workaround.
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2546750740
    
    From mbaesken at openjdk.org  Thu Nov 20 16:31:25 2025
    From: mbaesken at openjdk.org (Matthias Baesken)
    Date: Thu, 20 Nov 2025 16:31:25 GMT
    Subject: RFR: 8371893: [macOS aarch64] use dead_strip linker option to
     reduce binary size [v3]
    In-Reply-To: 
    References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com>
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 16:00:30 GMT, Matthias Baesken  wrote:
    
    >> The dead_strip linker option on macOS removes functions and data that are unreachable by the entry point or exported symbols.
    >> Setting it can reduce the size of some binaries we generate quite a lot, for example (product build, Xcode 15 is used) :
    >> (before -> after setting the option)
    >> 
    >> 1.4M -> 1.1M images/jdk/lib/libfontmanager.dylib
    >> 264K -> 248K images/jdk/lib/libjavajpeg.dylib
    >> 152K -> 132K images/jdk/lib/libjli.dylib
    >> 388K -> 296K images/jdk/lib/liblcms.dylib
    >> 164K -> 128K images/jdk/lib/libzip.dylib
    >> 
    >> 
    >> and libjvm :
    >> 
    >> 20M -> 18M images/jdk/lib/server/libjvm.dylib
    >> 146M -> 137M images/jdk/lib/server/libjvm.dylib.dSYM
    >
    > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision:
    > 
    >   Fix Windows issues
    
    Interestingly, we already seem to rely on the 'used' attribute in some places in the OpenJDK codebase, see
    https://github.com/search?q=repo%3Aopenjdk%2Fjdk%20ATTRIBUTE_USED&type=code
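For readers unfamiliar with the attribute, a minimal sketch assuming GCC or Clang. `DEMO_ATTRIBUTE_USED` and `demo_debug_helper` are hypothetical names for this example, not the JDK's definitions.

```cpp
#include <cassert>

// Hypothetical macro for this sketch. GCC and Clang accept the 'used'
// attribute, which asks the compiler to emit the definition even when
// nothing in the translation unit references it.
#if defined(__GNUC__) || defined(__clang__)
#define DEMO_ATTRIBUTE_USED __attribute__((used))
#else
#define DEMO_ATTRIBUTE_USED
#endif

// A debugger-only helper: normally never called from code, so without
// the attribute the compiler is free to drop this static function.
static DEMO_ATTRIBUTE_USED int demo_debug_helper(int x) {
  return x * 2;
}
```

Whether the linker's dead stripping also keeps such a symbol depends on the toolchain and object format, so that would still need to be verified for the Apple linker.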
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28319#issuecomment-3558965415
    
    From aph at openjdk.org  Thu Nov 20 16:37:22 2025
    From: aph at openjdk.org (Andrew Haley)
    Date: Thu, 20 Nov 2025 16:37:22 GMT
    Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact
     on GenZGC performance
    In-Reply-To: 
    References: 
    Message-ID: 
    
    On Fri, 14 Nov 2025 18:16:55 GMT, Evgeny Astigeevich  wrote:
    
    > Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1.
    >  
    > Neoverse-N1 implementations mitigate erratum 1542419 with a workaround:
    > - Disable coherent icache.
    > - Trap IC IVAU instructions.
    > - Execute:
    >    - `tlbi vae3is, xzr`
    >    - `dsb sy`
    >  
    >  `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address).  It waits for all memory accesses using in-scope old translation information to complete before it is considered complete.
    >  
    > As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests:
    > 
    > "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround."
    > 
    > This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions.
    > 
    > Changes include:
    > 
    > * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization.
    > * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address.
    > * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures.
    > * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk.
    > 
    > Benchmarking results: Neoverse-N1 r3p1 (Graviton 2)
    > 
    > - Baseline
    > 
    > $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGC...
    
    I think we'll also want a workaround for `CodeBuffer::relocate_code_to()`.
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3558978597
    
    From eastigeevich at openjdk.org  Thu Nov 20 16:37:25 2025
    From: eastigeevich at openjdk.org (Evgeny Astigeevich)
    Date: Thu, 20 Nov 2025 16:37:25 GMT
    Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact
     on GenZGC performance
    In-Reply-To: 
    References: 
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 15:27:37 GMT, Axel Boldt-Christmas  wrote:
    
    > Is this only because of the Neoverse-N1 workaround?
    
    Yes, it is.
    
    > If that is the case we could reach this point if NeoverseN1Errata1542419 is either set by the user or mislabeled on some CPU without this workaround.
    
    We only set NeoverseN1Errata1542419 to true if the CPU is a Neoverse N1 with the erratum and the flag has not been set by the user. We rely on Linux kernel cpuinfo correctly providing us with the Neoverse N1 revision. I think it's worth explicitly checking for all affected revisions. This will mitigate Linux kernels not setting revisions correctly.
    We can issue a warning if a user sets it to true and the CPU is not a Neoverse N1 with the erratum.
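The policy described above could look roughly like this. It is a sketch under stated assumptions, not the PR's code: `DemoCpuId`, `demo_is_n1_prior_to_r4p1` and `demo_decide` are hypothetical, the implementer and part values (0x41, 0xd0c) are what I understand cpuinfo reports for Neoverse N1, and "affected" simply follows the "prior to r4p1" wording of the PR description rather than an explicit list of revisions.

```cpp
#include <cassert>
#include <cstdint>

// Assumed identification values for this sketch.
constexpr uint32_t kDemoImplementerArm = 0x41;
constexpr uint32_t kDemoPartNeoverseN1 = 0xd0c;

struct DemoCpuId {
  uint32_t implementer;
  uint32_t part;
  uint32_t variant;   // the X in rXpY
  uint32_t revision;  // the Y in rXpY
};

// True for a Neoverse N1 older than r4p1.
static bool demo_is_n1_prior_to_r4p1(const DemoCpuId& id) {
  if (id.implementer != kDemoImplementerArm || id.part != kDemoPartNeoverseN1) {
    return false;
  }
  return id.variant < 4 || (id.variant == 4 && id.revision < 1);
}

enum class DemoFlagDecision { Enable, Disable, EnableWithWarning };

// The user's explicit choice wins; otherwise the flag follows detection.
// A warning is suggested when the user enables it on an unaffected CPU.
static DemoFlagDecision demo_decide(const DemoCpuId& id, bool user_set, bool user_value) {
  const bool affected = demo_is_n1_prior_to_r4p1(id);
  if (!user_set) {
    return affected ? DemoFlagDecision::Enable : DemoFlagDecision::Disable;
  }
  if (user_value && !affected) {
    return DemoFlagDecision::EnableWithWarning;
  }
  return user_value ? DemoFlagDecision::Enable : DemoFlagDecision::Disable;
}
```

Checking an explicit list of affected revisions, as suggested above, would replace the range comparison in `demo_is_n1_prior_to_r4p1`.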
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2546769534
    
    From eastigeevich at openjdk.org  Thu Nov 20 16:37:29 2025
    From: eastigeevich at openjdk.org (Evgeny Astigeevich)
    Date: Thu, 20 Nov 2025 16:37:29 GMT
    Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact
     on GenZGC performance
    In-Reply-To: 
    References: 
     <85dBRXdwqMOffQvXGI9J_zhfLnwZ0LrY_Wj4w0nrpbM=.88de1041-c752-49aa-8ab2-600b92f8559d@github.com>
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 16:32:51 GMT, Evgeny Astigeevich  wrote:
    
    >> src/hotspot/cpu/aarch64/icache_aarch64.hpp line 64:
    >> 
    >>> 62:     //
    >>> 63:     // As the address for icache invalidation is not relevant, we use the nmethod's code start address.
    >>> 64:     ICache::invalidate_word(_nm->code_begin());
    >> 
    >> Rather than call `ICache::invalidate_word()`, I believe we should explicitly execute the instructions in the workaround.
    >
    > We cannot execute `tlbi vae3is` here because it requires EL3. We are at EL0.
    
    Or do you mean `IC IVAU`?
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2546787152
    
    From eastigeevich at openjdk.org  Thu Nov 20 16:37:28 2025
    From: eastigeevich at openjdk.org (Evgeny Astigeevich)
    Date: Thu, 20 Nov 2025 16:37:28 GMT
    Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact
     on GenZGC performance
    In-Reply-To: <85dBRXdwqMOffQvXGI9J_zhfLnwZ0LrY_Wj4w0nrpbM=.88de1041-c752-49aa-8ab2-600b92f8559d@github.com>
    References: 
     <85dBRXdwqMOffQvXGI9J_zhfLnwZ0LrY_Wj4w0nrpbM=.88de1041-c752-49aa-8ab2-600b92f8559d@github.com>
    Message-ID: 
    
    On Thu, 20 Nov 2025 16:25:03 GMT, Andrew Haley  wrote:
    
    >> Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1.
    >>  
    >> Neoverse-N1 implementations mitigate erratum 1542419 with a workaround:
    >> - Disable coherent icache.
    >> - Trap IC IVAU instructions.
    >> - Execute:
    >>    - `tlbi vae3is, xzr`
    >>    - `dsb sy`
    >>  
    >>  `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address).  It waits for all memory accesses using in-scope old translation information to complete before it is considered complete.
    >>  
    >> As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests:
    >> 
    >> "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround."
    >> 
    >> This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions.
    >> 
    >> Changes include:
    >> 
    >> * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization.
    >> * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address.
    >> * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures.
    >> * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk.
    >> 
    >> Benchmarking results: Neoverse-N1 r3p1 (Graviton 2)
    >> 
    >> - Baseline
    >> 
    >> $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1...
    >
    > src/hotspot/cpu/aarch64/icache_aarch64.hpp line 64:
    > 
    >> 62:     //
    >> 63:     // As the address for icache invalidation is not relevant, we use the nmethod's code start address.
    >> 64:     ICache::invalidate_word(_nm->code_begin());
    > 
    > Rather than call `ICache::invalidate_word()`, I believe we should explicitly execute the instructions in the workaround.
    
    We cannot execute `tlbi vae3is` here because it requires EL3. We are at EL0.
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2546778840
    
    From mbaesken at openjdk.org  Thu Nov 20 16:44:08 2025
    From: mbaesken at openjdk.org (Matthias Baesken)
    Date: Thu, 20 Nov 2025 16:44:08 GMT
    Subject: RFR: 8371893: [macOS aarch64] use dead_strip linker option to
     reduce binary size [v3]
    In-Reply-To: 
    References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com>
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 16:00:30 GMT, Matthias Baesken  wrote:
    
    >> The dead_strip linker option on macOS removes functions and data that are unreachable by the entry point or exported symbols.
    >> Setting it can reduce the size of some binaries we generate quite a lot, for example (product build, Xcode 15 is used) :
    >> (before -> after setting the option)
    >> 
    >> 1.4M -> 1.1M images/jdk/lib/libfontmanager.dylib
    >> 264K -> 248K images/jdk/lib/libjavajpeg.dylib
    >> 152K -> 132K images/jdk/lib/libjli.dylib
    >> 388K -> 296K images/jdk/lib/liblcms.dylib
    >> 164K -> 128K images/jdk/lib/libzip.dylib
    >> 
    >> 
    >> and libjvm :
    >> 
    >> 20M -> 18M images/jdk/lib/server/libjvm.dylib
    >> 146M -> 137M images/jdk/lib/server/libjvm.dylib.dSYM
    >
    > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision:
    > 
    >   Fix Windows issues
    
    For the Apple linker there also seems to be a way to mark some sections with `S_ATTR_NO_DEAD_STRIP` to avoid the dead strip operation
    https://maskray.me/blog/2021-02-28-linker-garbage-collection
    'Sections with the S_ATTR_NO_DEAD_STRIP flag'
    But I am not sure how to do this in the HS code directly.
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28319#issuecomment-3559017189
    
    From cjplummer at openjdk.org  Thu Nov 20 17:23:47 2025
    From: cjplummer at openjdk.org (Chris Plummer)
    Date: Thu, 20 Nov 2025 17:23:47 GMT
    Subject: RFR: 8371893: [macOS aarch64] use dead_strip linker option to
     reduce binary size
    In-Reply-To: 
    References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com>
     
     
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 15:51:55 GMT, Andrew Haley  wrote:
    
    > > > On further consideration, I remembered that many of the helpers are debug-only, so that shouldn't be a problem
    > > 
    > > 
    > > Thanks for looking into it. I can disable the dead_strip switch for the debug-builds, so it would not eliminate these helpers.
    > 
    > Right, but we'll need to make sure we don't remove the non-debug ones. We don't want to exclude stuff in `src/hotspot/share/utilities/debug.cpp` or explicit helpers in the back ends.
    
    Most of the debug.cpp helpers are in product builds now. I think only pns() and pns2() are left out of product builds.
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28319#issuecomment-3559202033
    
    From erikj at openjdk.org  Thu Nov 20 17:42:25 2025
    From: erikj at openjdk.org (Erik Joelsson)
    Date: Thu, 20 Nov 2025 17:42:25 GMT
    Subject: RFR: 8371346: ZGC: Flexible heap base selection [v2]
    In-Reply-To: 
    References: 
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 15:04:10 GMT, Axel Boldt-Christmas  wrote:
    
    >> ZGC reserves a virtual address range for its heap with one high order bit set which is referred to as the heap base. Internally we then often represent heap addresses as offset from this heap base.
    >> 
    >> Currently we select one specific heap base at the start based on MaxHeapSize and the current system properties.
    >> 
    >> With instrumented builds or custom launchers, it may be that we are unable to reserve a usable address range using that heap base. Currently we just give up if this happens and exit the VM.
    >> 
    >> This is problematic when using instrumented builds such as ASAN where there are certain address ranges it uses which often clash with the default ZGC heap base.
    >> 
    >> I propose that we are more flexible when selecting the heap base, and we start as we do today at our preferred location, but are able to retry other compatible heap bases within some broader limits.
    >> 
    >> The implementation will now start at the recommended or required heap base, whichever is larger, and first try to reserve the desired reservation size (normally 16 * MaxHeapSize). If no heap base can accommodate this desired size, it will attempt to find at least the required size and use that.
    >> 
    >> On linux x86_64 we will now also probe for the heap base rather than hard coding the max heap base as we did previously. This is beneficial when there are address space restrictions (such as with ASAN), and when there are none, we only do a couple of extra system calls at most. 
    >> 
    >> There are some changes to the gc+init logging. The ZAddressOffsetMax is adjusted to always be a correct upper bound. And the exit path when reservation fails is cleaned up, so that we exit early when we know that the external virtual memory limits will prohibit the heap reservation.
    >> 
    >> Performance testing shows no significant differences.
    >> 
    >> Testing:
    >> * GHA
    >> * Running ZGC tier1-8 on Oracle supported platforms
    >
    > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision:
    > 
    >  - Small fixes
    >  - Merge remote-tracking branch 'upstream_jdk/master' into stefank_review_pr_28161
    >  - Fixes and cleanups
    >  - pr/28161_review
    >  - Merge remote-tracking branch 'upstream/master' into pr/28161
    >  - Initial Test Implementation
    >  - Initial implementation flexible heap base
    >  - Constrain ZAddressOffsetMax correctly when multi-partition fails
    >  - Log reserved size correctly when multi-partition fails
    >  - Cleanup headers
    >  - ... and 1 more: https://git.openjdk.org/jdk/compare/52f64ca1...1d7b2374
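The two-pass selection described in the quoted PR text (desired size first, then the required size) might be sketched as below. Everything here is hypothetical: `demo_select_heap_base`, `DemoReservation` and the `try_reserve` callback are illustration-only names, and stepping candidate bases by doubling is my assumption based on the heap base being a single high-order bit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

struct DemoReservation {
  uintptr_t base;
  size_t size;  // 0 means no usable heap base was found
};

// Callback standing in for the actual address space reservation:
// returns true if [base, base + size) could be reserved.
using DemoTryReserve = std::function<bool(uintptr_t, size_t)>;

static DemoReservation demo_select_heap_base(uintptr_t recommended_base,
                                             uintptr_t required_base,
                                             uintptr_t max_base,
                                             size_t desired_size,
                                             size_t required_size,
                                             const DemoTryReserve& try_reserve) {
  // Start at the recommended or required heap base, whichever is larger.
  const uintptr_t first_base = std::max(recommended_base, required_base);
  // Pass 1 tries the desired size at every candidate base;
  // pass 2 falls back to the required size.
  const size_t sizes[] = {desired_size, required_size};
  for (size_t size : sizes) {
    // base != 0 guards against a zero start and against the shift
    // overflowing past the top bit.
    for (uintptr_t base = first_base; base != 0 && base <= max_base; base <<= 1) {
      if (try_reserve(base, size)) {
        return DemoReservation{base, size};
      }
    }
  }
  return DemoReservation{0, 0};
}
```

With a callback that rejects the preferred base (as might happen under ASAN), the sketch moves on to the next candidate instead of giving up.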
    
    Build change looks ok.
    
    -------------
    
    Marked as reviewed by erikj (Reviewer).
    
    PR Review: https://git.openjdk.org/jdk/pull/28161#pullrequestreview-3489100120
    
    From eastigeevich at openjdk.org  Thu Nov 20 17:58:02 2025
    From: eastigeevich at openjdk.org (Evgeny Astigeevich)
    Date: Thu, 20 Nov 2025 17:58:02 GMT
    Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact
     on GenZGC performance
    In-Reply-To: 
    References: 
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 16:30:54 GMT, Andrew Haley  wrote:
    
    > I think we'll also want a workaround for `CodeBuffer::relocate_code_to()`.
    
    Also we need to fix `G1NMethodClosure::do_evacuation_and_fixup` and `ShenandoahNMethod::oops_do`. They use `nmethod::fix_oop_relocations`.
    
    Should we do it in this PR or in separate PRs?
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3559330327
    
    From aph at openjdk.org  Thu Nov 20 19:13:15 2025
    From: aph at openjdk.org (Andrew Haley)
    Date: Thu, 20 Nov 2025 19:13:15 GMT
    Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact
     on GenZGC performance
    In-Reply-To: 
    References: 
     
     
    Message-ID: <_BLA0EOxl2gTXmq40tjVb7c-UGoD7nRjoYilJ_XFjBg=.75e9f378-7cd3-4f06-acab-6424877407bd@github.com>
    
    On Thu, 20 Nov 2025 17:54:50 GMT, Evgeny Astigeevich  wrote:
    
    > > I think we'll also want a workaround for `CodeBuffer::relocate_code_to()`.
    > 
    > Also we need to fix `G1NMethodClosure::do_evacuation_and_fixup` and `ShenandoahNMethod::oops_do`. They use `nmethod::fix_oop_relocations`.
    > 
    > Should we do it in this PR or in separate PRs?
    
    Please, let's handle it all here.
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3559624499
    
    From kbarrett at openjdk.org  Thu Nov 20 20:09:40 2025
    From: kbarrett at openjdk.org (Kim Barrett)
    Date: Thu, 20 Nov 2025 20:09:40 GMT
    Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease
     [v8]
    In-Reply-To: 
    References: 
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 10:37:21 GMT, Anton Artemov  wrote:
    
    >> Hi, 
    >> 
    >> please consider the following changes:
    >> 
    >> In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. 
    >> 
    >> Tested in tiers 1 - 5.
    >
    > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision:
    > 
    >   8366671: Addressed reviewers' comments.
    
    A couple very minor nits, but I don't object as-is.
    
    src/hotspot/share/runtime/objectMonitor.cpp line 2040:
    
    > 2038: bool ObjectMonitor::notify_internal(JavaThread* current) {
    > 2039:   bool did_notify = false;
    > 2040:   {
    
    I don't think this extra level of block scope is needed. The only thing outside the end of this
    extra scope is the `return did_notify`, which could just as well be inside. Your call...
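To illustrate the point about the block scope, a minimal sketch assuming an RAII spin lock. `DemoSpinCriticalSection`, `g_demo_lock`, `g_demo_waiters` and `demo_notify_one` are hypothetical stand-ins, not the PR's `SpinCriticalSection` or the real `notify_internal`.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical RAII spin lock for this sketch: acquires in the
// constructor, releases in the destructor.
class DemoSpinCriticalSection {
  std::atomic_flag* _lock;

 public:
  explicit DemoSpinCriticalSection(std::atomic_flag* lock) : _lock(lock) {
    while (_lock->test_and_set(std::memory_order_acquire)) {
      // spin until the current holder clears the flag
    }
  }
  ~DemoSpinCriticalSection() { _lock->clear(std::memory_order_release); }

  DemoSpinCriticalSection(const DemoSpinCriticalSection&) = delete;
  DemoSpinCriticalSection& operator=(const DemoSpinCriticalSection&) = delete;
};

static std::atomic_flag g_demo_lock = ATOMIC_FLAG_INIT;
static int g_demo_waiters = 0;

// No extra block scope: the return statement sits inside the guarded
// region, and the lock is released when 'scs' is destroyed on return.
static bool demo_notify_one() {
  DemoSpinCriticalSection scs(&g_demo_lock);
  bool did_notify = false;
  if (g_demo_waiters > 0) {
    --g_demo_waiters;
    did_notify = true;
  }
  return did_notify;
}
```

The return value is computed before the destructor runs, so returning from inside the guarded region behaves the same as returning just after an inner scope closes.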
    
    src/hotspot/share/utilities/spinCriticalSection.hpp line 43:
    
    > 41: 
    > 42:   // Low-level leaf-lock primitives used to implement synchronization.
    > 43:   // Not for general synchronization use.
    
    This comment seems like it contains info that ought to be part of the class description, rather
    than on an implementation detail.
    
    -------------
    
    Marked as reviewed by kbarrett (Reviewer).
    
    PR Review: https://git.openjdk.org/jdk/pull/28264#pullrequestreview-3489707063
    PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2547502382
    PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2547512683
    
    From pchilanomate at openjdk.org  Thu Nov 20 20:52:05 2025
    From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
    Date: Thu, 20 Nov 2025 20:52:05 GMT
    Subject: RFR: 8364343: ThreadSnapshotFactory::get_thread_snapshot() crashes
     without JVMTI agent [v3]
    In-Reply-To: 
    References: 
    Message-ID: 
    
    > When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes.
    > 
    > This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes:
    > 
    > - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so for now I kept this simpler version.
    > 
    > - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we still need to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`.
    > 
    > - The code was previously structured in terms of mount and unm...
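The Dekker-style handshake described in the first bullet can be sketched with standard atomics. This is an illustration under assumptions, not the patch itself: it uses one global flag and one global counter instead of per-carrier and per-vthread state, every `demo_` and `g_demo_` name is hypothetical, and where the real code would take a slow path or wait, this sketch simply backs out and reports failure.

```cpp
#include <atomic>
#include <cassert>

static std::atomic<bool> g_demo_in_transition{false};
static std::atomic<int> g_demo_disable_count{0};

// Virtual-thread side: publish "in transition" first, then read the
// "transition disabled" counter.
static bool demo_try_start_transition() {
  g_demo_in_transition.store(true, std::memory_order_seq_cst);
  if (g_demo_disable_count.load(std::memory_order_seq_cst) > 0) {
    // Transitions are disabled: back off (a real VM would go to a slow path).
    g_demo_in_transition.store(false, std::memory_order_seq_cst);
    return false;
  }
  return true;
}

static void demo_end_transition() {
  g_demo_in_transition.store(false, std::memory_order_release);
}

// Disabler side: bump the counter first, then read the "in transition" flag.
static bool demo_try_disable_transitions() {
  g_demo_disable_count.fetch_add(1, std::memory_order_seq_cst);
  if (g_demo_in_transition.load(std::memory_order_seq_cst)) {
    // A transition is in flight: back out (a real VM would wait instead).
    g_demo_disable_count.fetch_sub(1, std::memory_order_seq_cst);
    return false;
  }
  return true;
}

static void demo_enable_transitions() {
  g_demo_disable_count.fetch_sub(1, std::memory_order_seq_cst);
}
```

Because each side performs a sequentially consistent write before its read, at least one of the two must observe the other's write, so a transition and a disable request cannot both proceed unaware of each other. The store-then-load on the transition side is where the extra fence mentioned above comes from.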
    
    Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
    
      Add Alan's comment in VirtualThread
    
    -------------
    
    Changes:
      - all: https://git.openjdk.org/jdk/pull/28361/files
      - new: https://git.openjdk.org/jdk/pull/28361/files/976486cd..205ae77b
    
    Webrevs:
     - full: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=02
     - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=01-02
    
      Stats: 12 lines in 1 file changed: 11 ins; 0 del; 1 mod
      Patch: https://git.openjdk.org/jdk/pull/28361.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361
    
    PR: https://git.openjdk.org/jdk/pull/28361
    
    From pchilanomate at openjdk.org  Thu Nov 20 20:55:22 2025
    From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
    Date: Thu, 20 Nov 2025 20:55:22 GMT
    Subject: RFR: 8364343: ThreadSnapshotFactory::get_thread_snapshot() crashes
     without JVMTI agent [v3]
    In-Reply-To: 
    References: 
     
     
    Message-ID: 
    
    On Wed, 19 Nov 2025 19:03:13 GMT, Patricio Chilano Mateo  wrote:
    
    >> src/java.base/share/classes/java/lang/VirtualThread.java line 1390:
    >> 
    >>> 1388:     }
    >>> 1389: 
    >>> 1390:     // -- JVM TI support --
    >> 
    >> We'll need to update this comment as it is no longer only for JVMTI.
    >> 
    >> This might be a good place for a block comment to define "transitions", covering the changing of thread identity, the continuation mount/unmount, and how the notifications to the VM support JVMTI and handshakes. Maybe I could contribute a block comment to include here?
    >
    > That would be great.
    
    Thanks, added the suggested comment.
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547640337
    
    From dlong at openjdk.org  Thu Nov 20 21:25:31 2025
    From: dlong at openjdk.org (Dean Long)
    Date: Thu, 20 Nov 2025 21:25:31 GMT
    Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact
     on GenZGC performance
    In-Reply-To: 
    References: 
    Message-ID: <6O0YDvGtf8yNNsqgZeZtyJlk6GlGVjXDKwOX-JcUIi4=.6c669dbd-4653-4282-93ef-8129d7c13bdd@github.com>
    
    On Fri, 14 Nov 2025 18:16:55 GMT, Evgeny Astigeevich  wrote:
    
    > Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1.
    >  
    > Neoverse-N1 implementations mitigate erratum 1542419 with a workaround:
    > - Disable coherent icache.
    > - Trap IC IVAU instructions.
    > - Execute:
    >    - `tlbi vae3is, xzr`
    >    - `dsb sy`
    >  
    >  `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address).  It waits for all memory accesses using in-scope old translation information to complete before it is considered complete.
    >  
    > As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests:
    > 
    > "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround."
    > 
    > This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions.
    > 
    > Changes include:
    > 
    > * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization.
    > * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address.
    > * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures.
    > * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk.
    > 
    > Benchmarking results: Neoverse-N1 r3p1 (Graviton 2)
    > 
    > - Baseline
    > 
    > $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGC...
    
    It seems a little disruptive to have to pass `defer_icache_invalidation` around so much.  What about attaching this information to the Thread or using a THREAD_LOCAL?
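One way to picture the thread-local alternative suggested above is a scoped object that records invalidation requests on the current thread and performs a single flush when the outermost scope ends. This is only a minimal sketch under stated assumptions: `DeferredICacheScope`, `request_invalidation`, `flush_icache_sketch`, and `g_flush_count` are hypothetical names invented for illustration, and the "flush" is a counter rather than a real cache operation. It is not the PR's `ICacheInvalidationContext`.

```cpp
#include <cassert>

// Hypothetical stand-in for a real ICache flush; it only counts calls
// so that the batching effect can be observed.
static int g_flush_count = 0;
static void flush_icache_sketch() { ++g_flush_count; }

// Hypothetical RAII scope. While a scope is active on the current thread,
// invalidation requests are recorded instead of being executed, so callers
// do not need an extra "defer" parameter threaded through their signatures.
class DeferredICacheScope {
  static thread_local DeferredICacheScope* _current;
  DeferredICacheScope* _prev;
  bool _pending = false;

public:
  DeferredICacheScope() : _prev(_current) { _current = this; }

  ~DeferredICacheScope() {
    assert(_current == this && "scopes must be strictly nested");
    _current = _prev;
    if (_pending) {
      if (_prev != nullptr) {
        _prev->_pending = true;   // let the outer scope do the single flush
      } else {
        flush_icache_sketch();    // outermost scope: one flush for the batch
      }
    }
  }

  DeferredICacheScope(const DeferredICacheScope&) = delete;
  DeferredICacheScope& operator=(const DeferredICacheScope&) = delete;

  // Called by patching code: flush now, or defer if a scope is active.
  static void request_invalidation() {
    if (_current != nullptr) {
      _current->_pending = true;
    } else {
      flush_icache_sketch();
    }
  }
};

thread_local DeferredICacheScope* DeferredICacheScope::_current = nullptr;
```

With this shape, code that patches many barriers would open one scope around the loop and call `request_invalidation()` as usual; whether the extra thread-local lookup is acceptable on the patching path is a separate question that the sketch does not answer.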
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3560101041
    
    From sviswanathan at openjdk.org  Thu Nov 20 21:34:56 2025
    From: sviswanathan at openjdk.org (Sandhya Viswanathan)
    Date: Thu, 20 Nov 2025 21:34:56 GMT
    Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements
     [v2]
    In-Reply-To: 
    References: 
     
    Message-ID: 
    
    On Mon, 17 Nov 2025 23:35:44 GMT, Volodymyr Paprotski  wrote:
    
    >> - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline 
    >>    - `SignatureBench.MLDSA` is 1.2x-2.2x faster
    >>    - Note: there are no AVX2-SHA3 intrinsics yet (being reviewed: https://github.com/vpaprotsk/jdk/pull/7)
    >> - AVX512 intrinsic improvements are 1.24x-1.5x faster than the current version
    >>   - `SignatureBench.MLDSA` is up to 5% faster, never slower
    >> 
    >> Note on intrinsic:
    >> - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill.
    >> - Code was refactored to allow reuse of the same assembler (where possible) for AVX512 and AVX2
    >> 
    >> Tests and benchmarks:
    >> - Added a fuzz test to ensure Java and intrinsic produce exactly the same result
    >> - Added benchmark to measure the performance of intrinsic itself
    >> 
    >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java"
    >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2"
    >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1"
    >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1"
    >
    > Volodymyr Paprotski has updated the pull request incrementally with two additional commits since the last revision:
    > 
    >  - whitespace
    >  - address first comments
    
    src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 1283:
    
    > 1281:     // r1 = r1 & quotient; // copy 0 or keep as is, using EqMsk as filter
    > 1282:     for (int i = 0; i < regCnt; i++) {
    > 1283:       // FIXME: replace with void evmovdqul(Address dst, KRegister mask, XMMRegister src, bool merge, int vector_len);?
    
    Is the fixme a leftover?
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2547729185
    
    From dholmes at openjdk.org  Thu Nov 20 22:10:18 2025
    From: dholmes at openjdk.org (David Holmes)
    Date: Thu, 20 Nov 2025 22:10:18 GMT
    Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error:
     applying non-zero offset 1073741824 to null pointer [v11]
    In-Reply-To: 
    References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com>
     
    Message-ID: 
    
    On Thu, 30 Oct 2025 12:06:00 GMT, Afshin Zafari  wrote:
    
    >> The issue happens when the HeapMinBaseAddress option gets 0 as its input value. Since this option is used as an address, using 0 in pointer arithmetic is UB.
    >> The fix is to use `uintptr_t` instead of `address`/`char*`, etc.  In doing that, it was found that an overflow check does not work in all cases due to checking more conditions. That overflow check is changed too. We also need to check for overflow after aligning addresses and sizes of memory regions in this context. Assertions are added to check these cases.
    >> 
    >> Tests:
    >> linux-x64 tier1
    >
    > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision:
    > 
    >   fix arguments.cpp for HeapMinBaseAddress type.
    
    Okay nothing further from me. Thanks
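For readers following the quoted description, the idea of doing the address math on `uintptr_t` with explicit overflow checks can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: `align_up_checked` and `range_end_checked` are hypothetical helper names, and the real change uses HotSpot's own alignment utilities and assertions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: align `value` up to a power-of-two `alignment`.
// Returns false instead of wrapping around if the result would overflow.
inline bool align_up_checked(uintptr_t value, uintptr_t alignment, uintptr_t* result) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  const uintptr_t mask = alignment - 1;
  if (value > UINTPTR_MAX - mask) {
    return false;                       // value + mask would wrap
  }
  *result = (value + mask) & ~mask;
  return true;
}

// Hypothetical helper: compute the end of [base, base + size) on integers.
// A base of 0 is fine here because no pointer arithmetic is involved.
inline bool range_end_checked(uintptr_t base, size_t size, uintptr_t* end) {
  if (size > UINTPTR_MAX - base) {
    return false;                       // base + size would wrap
  }
  *end = base + size;
  return true;
}
```

For example, `range_end_checked(0, 1073741824, &end)` succeeds and yields 1073741824, the same offset that appears in the UBSan report in the subject line, without ever applying an offset to a null pointer.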
    
    -------------
    
    Marked as reviewed by dholmes (Reviewer).
    
    PR Review: https://git.openjdk.org/jdk/pull/26955#pullrequestreview-3490162115
    
    From dholmes at openjdk.org  Thu Nov 20 22:16:18 2025
    From: dholmes at openjdk.org (David Holmes)
    Date: Thu, 20 Nov 2025 22:16:18 GMT
    Subject: RFR: 8364343: ThreadSnapshotFactory::get_thread_snapshot() crashes
     without JVMTI agent [v3]
    In-Reply-To: 
    References: 
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 20:52:05 GMT, Patricio Chilano Mateo  wrote:
    
    >> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes.
    >> 
    >> This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes:
    >> 
    >> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits.
    >> An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so for now I kept this simpler version.
    >> 
    >> - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we still need to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`.
    >> 
    >> - The code was previously structured in t...
    >
    > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
    > 
    >   Add Alan's comment in VirtualThread
    
    As this involves a fairly significant design change I suggest updating the JBS issue to have a more informative title e.g. "Virtual Thread transition management needs to be independent of JVM TI"
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28361#issuecomment-3560304232
    
    From coleenp at openjdk.org  Thu Nov 20 22:29:01 2025
    From: coleenp at openjdk.org (Coleen Phillimore)
    Date: Thu, 20 Nov 2025 22:29:01 GMT
    Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease
     [v8]
    In-Reply-To: 
    References: 
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 10:37:21 GMT, Anton Artemov  wrote:
    
    >> Hi, 
    >> 
    >> please consider the following changes:
    >> 
    >> In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. 
    >> 
    >> Tested in tiers 1 - 5.
    >
    > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision:
    > 
    >   8366671: Addressed reviewers' comments.
    
    I like the new reduced version of this a lot.  Thank you for working through the comments and discussions.  I think this turned out well.
    I don't think it should be a commonly used utility, but it's in a central place so that code that needs it, like ObjectMonitor, can use it and not have to dig through thread to find something like this.
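As a rough picture of what a scoped spin-based critical section looks like, here is a minimal sketch assuming a plain `std::atomic<int>` lock word. `SpinGuardSketch` is a hypothetical name; it is not the `SpinCriticalSection` class from the PR, and it omits the pause and back-off logic a production spin lock would need.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical RAII guard: acquires a spin lock in the constructor and
// releases it in the destructor, so a short critical section is just a
// block scope. Intended only for very short, non-blocking sections.
class SpinGuardSketch {
  std::atomic<int>& _lock;   // 0 = free, 1 = held

public:
  explicit SpinGuardSketch(std::atomic<int>& lock) : _lock(lock) {
    int expected = 0;
    // Busy-wait until the lock word changes from 0 to 1.
    while (!_lock.compare_exchange_weak(expected, 1,
                                        std::memory_order_acquire,
                                        std::memory_order_relaxed)) {
      expected = 0;          // compare_exchange overwrote it on failure
    }
  }

  ~SpinGuardSketch() {
    assert(_lock.load(std::memory_order_relaxed) == 1);
    _lock.store(0, std::memory_order_release);
  }

  SpinGuardSketch(const SpinGuardSketch&) = delete;
  SpinGuardSketch& operator=(const SpinGuardSketch&) = delete;
};
```

A caller would write `{ SpinGuardSketch guard(lock_word); /* short critical section */ }`, which replaces a manually paired acquire and release call.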
    
    -------------
    
    Marked as reviewed by coleenp (Reviewer).
    
    PR Review: https://git.openjdk.org/jdk/pull/28264#pullrequestreview-3490236503
    
    From vpaprotski at openjdk.org  Thu Nov 20 22:55:07 2025
    From: vpaprotski at openjdk.org (Volodymyr Paprotski)
    Date: Thu, 20 Nov 2025 22:55:07 GMT
    Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements
     [v3]
    In-Reply-To: 
    References: 
    Message-ID: 
    
    > - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline 
    >    - `SignatureBench.MLDSA` is 1.2x-2.2x faster
    >    - Note: there are no AVX2-SHA3 intrinsics yet (being reviewed: https://github.com/vpaprotsk/jdk/pull/7)
    > - AVX512 intrinsic improvements are 1.24x-1.5x faster than the current version
    >   - `SignatureBench.MLDSA` is up to 5% faster, never slower
    > 
    > Note on intrinsic:
    > - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill.
    > - Code was refactored to allow reuse of the same assembler (where possible) for AVX512 and AVX2
    > 
    > Tests and benchmarks:
    > - Added a fuzz test to ensure Java and intrinsic produce exactly the same result
    > - Added benchmark to measure the performance of intrinsic itself
    > 
    > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java"
    > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2"
    > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1"
    > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1"
    
    Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision:
    
      next set of comments
    
    -------------
    
    Changes:
      - all: https://git.openjdk.org/jdk/pull/28136/files
      - new: https://git.openjdk.org/jdk/pull/28136/files/e9133401..b04f4f0d
    
    Webrevs:
     - full: https://webrevs.openjdk.org/?repo=jdk&pr=28136&range=02
     - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28136&range=01-02
    
      Stats: 424 lines in 2 files changed: 1 ins; 423 del; 0 mod
      Patch: https://git.openjdk.org/jdk/pull/28136.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28136/head:pull/28136
    
    PR: https://git.openjdk.org/jdk/pull/28136
    
    From sviswanathan at openjdk.org  Thu Nov 20 23:09:54 2025
    From: sviswanathan at openjdk.org (Sandhya Viswanathan)
    Date: Thu, 20 Nov 2025 23:09:54 GMT
    Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements
     [v3]
    In-Reply-To: 
    References: 
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 22:55:07 GMT, Volodymyr Paprotski  wrote:
    
    >> - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline 
    >>    - `SignatureBench.MLDSA` is 1.2x-2.2x faster
    >>    - Note: there are no AVX2-SHA3 intrinsics yet (being reviewed: https://github.com/vpaprotsk/jdk/pull/7)
    >> - AVX512 intrinsic improvements are 1.24x-1.5x faster than the current version
    >>   - `SignatureBench.MLDSA` is up to 5% faster, never slower
    >> 
    >> Note on intrinsic:
    >> - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill.
    >> - Code was refactored to allow reuse of the same assembler (where possible) for AVX512 and AVX2
    >> 
    >> Tests and benchmarks:
    >> - Added a fuzz test to ensure Java and intrinsic produce exactly the same result
    >> - Added benchmark to measure the performance of intrinsic itself
    >> 
    >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java"
    >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2"
    >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1"
    >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1"
    >
    > Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision:
    > 
    >   next set of comments
    
    Looks good to me.
    
    -------------
    
    Marked as reviewed by sviswanathan (Reviewer).
    
    PR Review: https://git.openjdk.org/jdk/pull/28136#pullrequestreview-3490441448
    
    From pchilanomate at openjdk.org  Thu Nov 20 23:10:48 2025
    From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
    Date: Thu, 20 Nov 2025 23:10:48 GMT
    Subject: RFR: 8364343: Virtual Thread transition management needs to be
     independent of JVM TI [v4]
    In-Reply-To: 
    References: 
    Message-ID: 
    
    > When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes.
    > 
    > This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes:
    > 
    > - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits.
    > An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so for now I kept this simpler version.
    > 
    > - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we still need to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`.
    > 
    > - The code was previously structured in terms of mount and un...
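The Dekker-style handshake described in the quoted summary can be sketched with two atomics: each side first publishes its own intent with a sequentially consistent store, then reads the other side's flag. This is a minimal illustration, assuming a single "in transition" bit and a single counter; `TransitionState` and the four functions are hypothetical names, not HotSpot fields or the `MountUnmountDisabler` API, and the real code tracks per-vthread and global counters and blocks in a slow path rather than returning a boolean.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical state shared by a virtual thread and a disabling thread.
struct TransitionState {
  std::atomic<bool> in_transition{false};
  std::atomic<int>  disable_count{0};
};

// Virtual-thread side: publish "in transition" BEFORE reading the counter.
// Returns true if the fast path may proceed, false if the caller must back
// off to a slow path because transitions are currently disabled.
inline bool try_start_transition(TransitionState& s) {
  s.in_transition.store(true, std::memory_order_seq_cst);
  if (s.disable_count.load(std::memory_order_seq_cst) != 0) {
    s.in_transition.store(false, std::memory_order_seq_cst);
    return false;
  }
  return true;
}

inline void end_transition(TransitionState& s) {
  s.in_transition.store(false, std::memory_order_release);
}

// Disabler side: increment the counter BEFORE reading the bit.
// Returns true if no transition was in flight, false if the disabler
// has to wait for the in-flight transition to finish.
inline bool try_disable_transitions(TransitionState& s) {
  s.disable_count.fetch_add(1, std::memory_order_seq_cst);
  return !s.in_transition.load(std::memory_order_seq_cst);
}

inline void enable_transitions(TransitionState& s) {
  int prev = s.disable_count.fetch_sub(1, std::memory_order_seq_cst);
  assert(prev > 0);
  (void)prev;
}
```

Because both sides store before they load, and the operations are sequentially consistent, at least one of them is guaranteed to observe the other: either the virtual thread sees a nonzero counter, or the disabler sees the "in transition" bit. The sequentially consistent store-then-load on the virtual-thread side corresponds to the "extra fence" that the description mentions.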
    
    Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
    
      Rename VM methods for endFirstTransition/startFinalTransition
    
    -------------
    
    Changes:
      - all: https://git.openjdk.org/jdk/pull/28361/files
      - new: https://git.openjdk.org/jdk/pull/28361/files/205ae77b..10534b33
    
    Webrevs:
     - full: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=03
     - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=02-03
    
      Stats: 20 lines in 8 files changed: 0 ins; 0 del; 20 mod
      Patch: https://git.openjdk.org/jdk/pull/28361.diff
      Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361
    
    PR: https://git.openjdk.org/jdk/pull/28361
    
    From pchilanomate at openjdk.org  Thu Nov 20 23:10:49 2025
    From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
    Date: Thu, 20 Nov 2025 23:10:49 GMT
    Subject: RFR: 8364343: Virtual Thread transition management needs to be
     independent of JVM TI [v3]
    In-Reply-To: 
    References: 
     
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 22:14:15 GMT, David Holmes  wrote:
    
    > As this involves a fairly significant design change I suggest updating the JBS issue to have a more informative title e.g. "Virtual Thread transition management needs to be independent of JVM TI"
    >
    Yes, that's better. Updated.
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28361#issuecomment-3560533185
    
    From pchilanomate at openjdk.org  Thu Nov 20 23:10:52 2025
    From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
    Date: Thu, 20 Nov 2025 23:10:52 GMT
    Subject: RFR: 8364343: Virtual Thread transition management needs to be
     independent of JVM TI [v4]
    In-Reply-To: 
    References: 
     
     
    Message-ID: 
    
    On Wed, 19 Nov 2025 19:01:18 GMT, Patricio Chilano Mateo  wrote:
    
    >> src/java.base/share/native/libjava/VirtualThread.c line 38:
    >> 
    >>> 36:     { "startFinalTransition",     "()V",  (void *)&JVM_VirtualThreadEnd },
    >>> 37:     { "startTransition",          "(Z)V", (void *)&JVM_VirtualThreadStartTransition },
    >>> 38:     { "endTransition",            "(Z)V", (void *)&JVM_VirtualThreadEndTransition },
    >> 
    >> I wonder if JVM_VirtualThreadStart and JVM_VirtualThreadEnd should be renamed to have EndFirstTransition and StartFinalTransition in the names so it's easy to follow through from the Java code down to MountUnmountDisabler::start_transition/end_transition.
    >
    > How about removing these methods and just have an extra boolean parameter in `start/endTransition`?
    > https://github.com/pchilano/jdk/compare/JDK-8364343...pchilano:jdk:startEndTransitionsOnly
    
    I renamed the methods as suggested. I remembered that we separated ThreadStart/ThreadEnd in 8306028 for future improvements related to JVMTI. Not sure if that's still relevant but in any case probably better to leave that discussion for a separate bug.
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547987864
    
    From vpaprotski at openjdk.org  Thu Nov 20 23:17:36 2025
    From: vpaprotski at openjdk.org (Volodymyr Paprotski)
    Date: Thu, 20 Nov 2025 23:17:36 GMT
    Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements
     [v2]
    In-Reply-To: 
    References: 
     
     
     
     
    Message-ID: <-NP71XXG0bisxVHds8O-uXhLZqbnVLijJoJDwVq2ZBk=.2478c442-fc34-4ba0-9811-1f910ee3ee36@github.com>
    
    On Wed, 19 Nov 2025 22:40:41 GMT, Sergey Kuksenko  wrote:
    
    > I understand your reasons. The question is whether you'll need the microbenchmark in the future. If no (or probably no), please remove the micro. If needed, please move it from the "org.openjdk.bench.javax.crypto.full" package to "org.openjdk.bench.javax.crypto". It is supposed to have only public API micros in packages "small" and "full"
    
    @kuksenko I decided to just remove it. If anyone wants it back, it's in my git history (I usually keep my branches after merge).
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3560569194
    
    From vpaprotski at openjdk.org  Thu Nov 20 23:17:39 2025
    From: vpaprotski at openjdk.org (Volodymyr Paprotski)
    Date: Thu, 20 Nov 2025 23:17:39 GMT
    Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements
     [v2]
    In-Reply-To: 
    References: 
     
     
    Message-ID: 
    
    On Thu, 20 Nov 2025 21:31:31 GMT, Sandhya Viswanathan  wrote:
    
    >> Volodymyr Paprotski has updated the pull request incrementally with two additional commits since the last revision:
    >> 
    >>  - whitespace
    >>  - address first comments
    >
    > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 1283:
    > 
    >> 1281:     // r1 = r1 & quotient; // copy 0 or keep as is, using EqMsk as filter
    >> 1282:     for (int i = 0; i < regCnt; i++) {
    >> 1283:       // FIXME: replace with void evmovdqul(Address dst, KRegister mask, XMMRegister src, bool merge, int vector_len);?
    > 
    > Is the fixme a leftover?
    
    Yes. Removed. (I think I was considering merging this instruction with the storeXmm, but there really isn't a good way to do that.)
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2548005781
    
    From vlivanov at openjdk.org  Thu Nov 20 23:36:43 2025
    From: vlivanov at openjdk.org (Vladimir Ivanov)
    Date: Thu, 20 Nov 2025 23:36:43 GMT
    Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v3]
    In-Reply-To: 
    References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com>
     
    Message-ID: <_C1_-yzeixcKbR2NfmnM4MEl3InsR6cTTzmoT-vMSBY=.032aae46-e951-4c76-91e6-fc7a8fe8b73c@github.com>
    
    On Wed, 19 Nov 2025 12:34:30 GMT, Coleen Phillimore  wrote:
    
    >> ArrayKlass doesn't set AccessFlags so don't look for them there.  See CR for details.
    >> Fixed SA and jvmci.  @iwanowww Can you check that I changed C2 correctly (we talked about this in August).
    >> Tested with tier1-4.  5-7 in progress.
    >
    > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
    > 
    >   Revert a couple more InstanceKlass::casts also to get GHA to restart.
    
    src/hotspot/share/opto/compile.cpp line 1729:
    
    > 1727:       if (flat->offset() == in_bytes(Klass::super_check_offset_offset()))
    > 1728:         alias_type(idx)->set_rewritable(false);
    > 1729:       if (flat->isa_instklassptr() && flat->offset() == in_bytes(InstanceKlass::access_flags_offset()))
    
    I'd place the check separately. Otherwise, looks good. 
    
    diff --git a/src/hotspot/share/opto/compile.cpp b/src/hotspot/share/opto/compile.cpp
    index 6babc13e1b3..9215c0fc03f 100644
    --- a/src/hotspot/share/opto/compile.cpp
    +++ b/src/hotspot/share/opto/compile.cpp
    @@ -1726,8 +1726,6 @@ Compile::AliasType* Compile::find_alias_type(const TypePtr* adr_type, bool no_cr
           }
           if (flat->offset() == in_bytes(Klass::super_check_offset_offset()))
             alias_type(idx)->set_rewritable(false);
    -      if (flat->offset() == in_bytes(Klass::access_flags_offset()))
    -        alias_type(idx)->set_rewritable(false);
           if (flat->offset() == in_bytes(Klass::misc_flags_offset()))
             alias_type(idx)->set_rewritable(false);
           if (flat->offset() == in_bytes(Klass::java_mirror_offset()))
    @@ -1735,6 +1733,11 @@ Compile::AliasType* Compile::find_alias_type(const TypePtr* adr_type, bool no_cr
           if (flat->offset() == in_bytes(Klass::secondary_super_cache_offset()))
             alias_type(idx)->set_rewritable(false);
         }
    +    if (flat->isa_instklassptr()) {
    +      if (flat->offset() == in_bytes(InstanceKlass::access_flags_offset())) {
    +        alias_type(idx)->set_rewritable(false);
    +      }
    +    }
         // %%% (We would like to finalize JavaThread::threadObj_offset(),
         // but the base pointer type is not distinctive enough to identify
         // references into JavaThread.)
    
    -------------
    
    PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2548046511
    
    From vlivanov at openjdk.org  Thu Nov 20 23:43:30 2025
    From: vlivanov at openjdk.org (Vladimir Ivanov)
    Date: Thu, 20 Nov 2025 23:43:30 GMT
    Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements
     [v2]
    In-Reply-To: <-NP71XXG0bisxVHds8O-uXhLZqbnVLijJoJDwVq2ZBk=.2478c442-fc34-4ba0-9811-1f910ee3ee36@github.com>
    References: 
     
     
     
     
     <-NP71XXG0bisxVHds8O-uXhLZqbnVLijJoJDwVq2ZBk=.2478c442-fc34-4ba0-9811-1f910ee3ee36@github.com>
    Message-ID: 
    
    On Thu, 20 Nov 2025 23:13:41 GMT, Volodymyr Paprotski  wrote:
    
    > If anyone wants it back, it's in my git history (I usually keep my branches after merge).
    
    You could put a comment with the link into JBS issue to make it easier to discover later. (Or just attach the source file there.)
    
    -------------
    
    PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3560656214
    
    From sparasa at openjdk.org  Thu Nov 20 23:56:16 2025
    From: sparasa at openjdk.org (Srinivas Vamsi Parasa)
    Date: Thu, 20 Nov 2025 23:56:16 GMT
    Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512
    Message-ID: 
    
    The goal of this PR is to fix the performance regression in Arrays.fill() x86 stubs caused by masked AVX stores. The fix is to replace the masked AVX stores with store instructions without masks (i.e. unmasked stores). `fill32_masked()` and `fill64_masked()` stubs are replaced with `fill32_unmasked()` and `fill64_unmasked()` respectively.
    
    To speed up unmasked stores, array fills for sizes < 64 bytes are broken down into sequences of 32B, 16B, 8B, 4B, 2B and 1B stores, depending on the size.
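The size decomposition described above can be sketched in portable C++: each set bit of the size selects one store of the corresponding power-of-two width. This only illustrates the decomposition, under the assumption that stores are issued from widest to narrowest; `fill_tail_unmasked` is a hypothetical name, `std::memset` stands in for the individual vector and scalar store instructions, and the actual stubs in the PR are generated assembly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical sketch: fill fewer than 64 bytes using at most one store of
// each width 32, 16, 8, 4, 2 and 1 bytes, chosen by the bits of `size`.
// For example, 45 bytes = 32 + 8 + 4 + 1.
inline void fill_tail_unmasked(uint8_t* dst, uint8_t value, size_t size) {
  assert(size < 64);
  for (size_t chunk = 32; chunk >= 1; chunk >>= 1) {
    if (size & chunk) {
      std::memset(dst, value, chunk);   // stands in for one unmasked store
      dst += chunk;
    }
  }
}
```

Since no store is masked and none extends past `dst + size`, at most six stores are needed for any size below 64 bytes.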
    
    
    ### **Performance comparison for byte array fills in a loop for 1 million times**
    
    
    
    UseAVX=3 ByteArray Size | +OptimizeFill (Masked store stub) [secs] | -OptimizeFill (No stub) [secs] | This PR: +OptimizeFill (Unmasked store stub) [secs]
    -- | -- | -- | --
    1 | 0.46 | 0.14 | 0.263
    2 | 0.46 | 0.16 | 0.264
    5 | 0.46 | 0.29 | 0.30
    10 | 0.46 | 0.58 | 0.32
    15 | 0.46 | 0.42 | 0.276
    16 | 0.46 | 0.46 | 0.32
    17 | 0.21 | 0.5 | 0.3
    20 | 0.21 | 0.37 | 0.3
    25 | 0.21 | 0.59 | 0.288
    31 | 0.21 | 0.53 | 0.284
    32 | 0.21 | 0.58 | 0.322
    35 | 0.5 | 0.77 | 0.29
    40 | 0.5 | 0.61 | 0.367
    45 | 0.5 | 0.52 | 0.324
    48 | 0.5 | 0.66 | 0.368
    49 | 0.22 | 0.69 | 0.342
    50 | 0.22 | 0.78 | 0.346
    55 | 0.22 | 0.67 | 0.3
    60 | 0.22 | 0.67 | 0.322
    64 | 0.22 | 0.82 | 0.362
    70 | 0.51 | 1.1 | 0.32
    80 | 0.49 | 0.89 | 0.37
    90 | 0.225 | 0.68 | 0.343
    100 | 0.54 | 1.09 | 0.41
    110 | 0.6 | 0.98 | 0.36
    120 | 0.26 | 0.75 | 0.386
    128 | 0.266 | 1.1 | 0.402
    
    ------------- Commit messages: - 8349452: Fix performance regression for Arrays.fill() with AVX512 Changes: https://git.openjdk.org/jdk/pull/28442/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8349452 Stats: 120 lines in 2 files changed: 101 ins; 5 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/28442.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28442/head:pull/28442 PR: https://git.openjdk.org/jdk/pull/28442 From duke at openjdk.org Thu Nov 20 23:56:48 2025 From: duke at openjdk.org (Ruben) Date: Thu, 20 Nov 2025 23:56:48 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 [v4] In-Reply-To: References: Message-ID: > The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. > > The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. > > According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. 
Ruben has updated the pull request incrementally with one additional commit since the last revision: Refine `first_check_size` definitions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28192/files - new: https://git.openjdk.org/jdk/pull/28192/files/3a014376..00ea0e14 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28192&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28192&range=02-03 Stats: 5 lines in 5 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/28192.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28192/head:pull/28192 PR: https://git.openjdk.org/jdk/pull/28192 From duke at openjdk.org Fri Nov 21 00:02:37 2025 From: duke at openjdk.org (Ruben) Date: Fri, 21 Nov 2025 00:02:37 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 In-Reply-To: References: <-h6G9ajUWQwDRcUMOtyI_YCUCkXz3pzRggJk_UaxM-0=.a8c772aa-2f09-48c0-9cfb-17e624393eb0@github.com> Message-ID: On Wed, 19 Nov 2025 04:28:04 GMT, Dean Long wrote: >>> However, I still have not identified a way to ensure the deopt handler stub ends at a page boundary in a unit test. >> >> The latest update implements an alternative way to detect the failure early during testing - via the newly added assertion in the `emit_deopt_handler`. >> >> @adinn, @dean-long, @TheRealMDoerr, would it be possible for you to review the latest version of the PR? >> >> Is there any additional testing you would recommend to perform before this can be integrated? > >> Is there any additional testing you would recommend to perform before this can be integrated? > > Oracle likes to make sure the final version passes in our CI. I got burned last time testing an earlier version and not the final version. Thank you for the reviews, @dean-long, @adinn, @TheRealMDoerr. 
@dean-long, I've updated the patch according to your suggestions - also applied the same to other `nativeInst_*` files except `nativeInst_x86.hpp` as the instruction size is not fixed in this case. > the final version passes in our CI If this version of the patch looks suitable, would it be possible to start the CI testing please? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28192#issuecomment-3560702086 From dholmes at openjdk.org Fri Nov 21 01:00:58 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 21 Nov 2025 01:00:58 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 23:10:48 GMT, Patricio Chilano Mateo wrote: >> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. >> >> This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: >> >> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter.
Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. >> An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so for now I kept this simpler version. >> >> - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we still need to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. >> >> - The code was previously structured in t... > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Rename VM methods for endFirstTransition/startFinalTransition Hi Patricio, this is another significant piece of work. I have taken an initial pass through trying to digest the main parts - can't comment on the C2 code or the Java side. I have made a few minor comments/suggestions. Thanks src/hotspot/share/prims/jvm.cpp line 3668: > 3666: if (!DoJVMTIVirtualThreadTransitions) { > 3667: assert(!JvmtiExport::can_support_virtual_threads(), "sanity check"); > 3668: return; Does this not still need checking somewhere?
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 162: > 160: // be executed once we go back to Java. If this is an unmount, the handshake that the > 161: // disabler executed against this carrier thread already provided the needed synchronization. > 162: // This matches the release fence in xx_enable_for_one()/xx_enable_for_all(). Subtle. Do we have comments where the fences are to ensure people realize the fence is serving this purpose? src/hotspot/share/runtime/mountUnmountDisabler.cpp line 277: > 275: > 276: // Start of the critical region. Prevent future memory > 277: // operations to be ordered before we read the transition flag. Does this refer to `java_lang_Thread::is_in_VTMS_transition(_vthread())`? If so perhaps that should internally perform the `load_acquire`? src/hotspot/share/runtime/mountUnmountDisabler.cpp line 278: > 276: // Start of the critical region. Prevent future memory > 277: // operations to be ordered before we read the transition flag. > 278: // This matches the release fence in end_transition(). Suggestion: // This pairs with the release fence in end_transition(). src/hotspot/share/runtime/mountUnmountDisabler.cpp line 307: > 305: // Block while some mount/unmount transitions are in progress. > 306: // Debug version fails and prints diagnostic information. > 307: for (JavaThreadIteratorWithHandle jtiwh; JavaThread *jt = jtiwh.next(); ) { This looks very odd, having an assignment in the loop condition check and no actual loop-update expression. src/hotspot/share/runtime/mountUnmountDisabler.cpp line 316: > 314: // operations to be ordered before we read the transition flags. > 315: // This matches the release fence in end_transition(). > 316: OrderAccess::acquire(); Surely the use of the iterator already provides the necessary ordering guarantee here as well. ? src/hotspot/share/runtime/mountUnmountDisabler.cpp line 327: > 325: // End of the critical section. 
Prevent previous memory operations to > 326: // be ordered after we clear the clear the disable transition flag. > 327: // This matches the equivalent acquire fence in start_transition(). Suggestion: // This pairs with the acquire in start_transition(). I just realized you are using "fence" to describe release and acquire memory barrier semantics. Given we have an operation `fence` I find this confusing for the reader - especially when we also have a `release_store_fence` operation which might be confused with "release fence". src/hotspot/share/runtime/mountUnmountDisabler.cpp line 370: > 368: assert(VTMSTransition_lock->owned_by_self() || SafepointSynchronize::is_at_safepoint(), "Must be locked"); > 369: assert(_global_start_transition_disable_count >= 0, ""); > 370: AtomicAccess::store(&_global_start_transition_disable_count, _global_start_transition_disable_count + 1); Suggestion: AtomicAccess::inc(&_global_start_transition_disable_count); src/hotspot/share/runtime/mountUnmountDisabler.cpp line 376: > 374: assert(VTMSTransition_lock->owned_by_self() || SafepointSynchronize::is_at_safepoint(), "Must be locked"); > 375: assert(_global_start_transition_disable_count > 0, ""); > 376: AtomicAccess::store(&_global_start_transition_disable_count, _global_start_transition_disable_count - 1); Suggestion: AtomicAccess::dec(&_global_start_transition_disable_count); src/hotspot/share/runtime/mountUnmountDisabler.hpp line 52: > 50: // parameter is_SR: suspender or resumer > 51: MountUnmountDisabler(bool exlusive = false); > 52: MountUnmountDisabler(oop thread_oop); What does the comment mean here? 
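As a reading aid for the "pairs with" wording discussed in the comments above, here is a minimal sketch of a release store pairing with an acquire load. It uses standard C++ `std::atomic` rather than HotSpot's `OrderAccess`/`AtomicAccess`, and the names `payload`, `ready`, `publish`, `consume` and `run_once` are made up for illustration; none of them appear in the patch.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only (not from the patch).
static int payload = 0;                 // plain data written before publication
static std::atomic<bool> ready{false};  // publication flag

// Writer: the release store keeps the earlier write to `payload`
// from being reordered after the store to `ready`.
void publish(int value) {
  payload = value;
  ready.store(true, std::memory_order_release);
}

// Reader: the acquire load pairs with the release store above, so once
// `ready` is observed as true the write to `payload` is visible as well.
int consume() {
  while (!ready.load(std::memory_order_acquire)) {
    std::this_thread::yield();
  }
  return payload;
}

// Runs one writer and one reader and returns what the reader saw.
int run_once(int value) {
  ready.store(false, std::memory_order_relaxed);
  int seen = -1;
  std::thread reader([&seen] { seen = consume(); });
  std::thread writer([value] { publish(value); });
  writer.join();
  reader.join();
  return seen;
}
```

Neither side uses a full two-way fence here, which is presumably why the review asks for "release"/"acquire" wording instead of "fence" in those comments.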
------------- PR Review: https://git.openjdk.org/jdk/pull/28361#pullrequestreview-3490207826 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547887801 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548145054 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548157390 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548150552 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548160373 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548161340 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548168223 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548169846 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548170787 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548174392 From dholmes at openjdk.org Fri Nov 21 01:01:00 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 21 Nov 2025 01:01:00 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v3] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 20:52:05 GMT, Patricio Chilano Mateo wrote: >> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. >> >> This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`.
The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: >> >> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. >> An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so for now I kept this simpler version. >> >> - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we still need to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. >> >> - The code was previously structured in t...
> > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Add Alan's comment in VirtualThread src/hotspot/share/classfile/javaClasses.cpp line 1757: > 1755: jint* addr = java_thread->field_addr(_VTMS_transition_disable_count_offset); > 1756: int val = AtomicAccess::load(addr); > 1757: AtomicAccess::store(addr, val + 1); Suggestion: AtomicAccess::inc(addr); src/hotspot/share/classfile/javaClasses.cpp line 1764: > 1762: jint* addr = java_thread->field_addr(_VTMS_transition_disable_count_offset); > 1763: int val = AtomicAccess::load(addr); > 1764: AtomicAccess::store(addr, val - 1); Suggestion: AtomicAccess::dec(addr); src/hotspot/share/opto/runtime.hpp line 740: > 738: return vthread_transition_Type(); > 739: } > 740: I do not know C2 but this looks really strange - 4 different functions all return the same thing. ??? src/hotspot/share/runtime/handshake.cpp line 374: > 372: JavaThread* target = java_lang_Thread::thread(carrier_thread); > 373: assert(target != nullptr, ""); > 374: // Technically there is need for a ThreadsListHandle since the target Suggestion: // Technically there is no need for a ThreadsListHandle since the target ? src/hotspot/share/runtime/mountUnmountDisabler.cpp line 147: > 145: MonitorLocker ml(VTMSTransition_lock); > 146: while (is_start_transition_disabled(current, vth())) { > 147: ml.wait(200); I see a lot of timed-waits throughout this code. Is that because we poll rather than synchronizing properly? All this potential busy-waiting is surely going to cause performance glitches. 
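To illustrate the `AtomicAccess::inc`/`AtomicAccess::dec` suggestions above: a separate atomic load followed by an atomic store is not one atomic read-modify-write. The sketch below uses standard C++ `std::atomic` as a stand-in; `counter`, `inc_load_store`, `inc_rmw` and `count_with_rmw` are hypothetical names. Whether the callers in the patch are already serialized (by a lock or a safepoint), which would make the two-step form safe, is not something this sketch can tell.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical counter, standing in for a "disable count" field.
static std::atomic<int> counter{0};

// Two separate atomic accesses: another thread can update the counter
// between the load and the store, so an increment can be lost unless
// something else already serializes all writers.
void inc_load_store() {
  int val = counter.load(std::memory_order_relaxed);
  counter.store(val + 1, std::memory_order_relaxed);
}

// One atomic read-modify-write: no update can be lost.
void inc_rmw() {
  counter.fetch_add(1, std::memory_order_relaxed);
}

// Runs `threads` threads that each call inc_rmw() `iters` times
// and returns the final counter value.
int count_with_rmw(int threads, int iters) {
  counter.store(0);
  std::vector<std::thread> pool;
  for (int t = 0; t < threads; t++) {
    pool.emplace_back([iters] {
      for (int i = 0; i < iters; i++) inc_rmw();
    });
  }
  for (auto& th : pool) th.join();
  return counter.load();
}
```

With `inc_rmw` the total is always `threads * iters`; running the same loop with `inc_load_store` may come up short when the threads really run concurrently.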
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547864726 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547863852 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547884313 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547900707 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547963241 From dholmes at openjdk.org Fri Nov 21 01:04:59 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 21 Nov 2025 01:04:59 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 00:35:38 GMT, David Holmes wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename VM methods for endFirstTransition/startFinalTransition > > src/hotspot/share/runtime/mountUnmountDisabler.cpp line 162: > >> 160: // be executed once we go back to Java. If this is an unmount, the handshake that the >> 161: // disabler executed against this carrier thread already provided the needed synchronization. >> 162: // This matches the release fence in xx_enable_for_one()/xx_enable_for_all(). > > Subtle. Do we have comments where the fences are to ensure people realize the fence is serving this purpose? I also forgot to suggest a wording change: say "pairs with" rather than "matches". Reading back through I realize now I have misunderstood many of these comments. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548189421 From sparasa at openjdk.org Fri Nov 21 01:13:47 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Fri, 21 Nov 2025 01:13:47 GMT Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v2] In-Reply-To: References: Message-ID: > The goal of this PR is to fix the performance regression in Arrays.fill() x86 stubs caused by masked AVX stores. The fix is to replace the masked AVX stores with store instructions without masks (i.e. unmasked stores). `fill32_masked()` and `fill64_masked()` stubs are replaced with `fill32_unmasked()` and `fill64_unmasked()` respectively. > > To speedup unmasked stores, array fills for sizes < 64 bytes are broken down into sequences of 32B, 16B, 8B, 4B, 2B and 1B stores, depending on the size. > > > ### **Performance comparison for byte array fills in a loop for 1 million times** > > >
> UseAVX=3 ByteArray Size | +OptimizeFill (Masked store stub) [secs] | -OptimizeFill (No stub) [secs] | --->This PR: +OptimizeFill (Unmasked store stub) [secs]
> -- | -- | -- | --
> 1 | 0.46 | 0.14 | 0.263
> 2 | 0.46 | 0.16 | 0.264
> 5 | 0.46 | 0.29 | 0.30
> 10 | 0.46 | 0.58 | 0.32
> 15 | 0.46 | 0.42 | 0.276
> 16 | 0.46 | 0.46 | 0.32
> 17 | 0.21 | 0.5 | 0.3
> 20 | 0.21 | 0.37 | 0.3
> 25 | 0.21 | 0.59 | 0.288
> 31 | 0.21 | 0.53 | 0.284
> 32 | 0.21 | 0.58 | 0.322
> 35 | 0.5 | 0.77 | 0.29
> 40 | 0.5 | 0.61 | 0.367
> 45 | 0.5 | 0.52 | 0.324
> 48 | 0.5 | 0.66 | 0.368
> 49 | 0.22 | 0.69 | 0.342
> 50 | 0.22 | 0.78 | 0.346
> 55 | 0.22 | 0.67 | 0.3
> 60 | 0.22 | 0.67 | 0.322
> 64 | 0.22 | 0.82 | 0.362
> 70 | 0.51 | 1.1 | 0.32
> 80 | 0.49 | 0.89 | 0.37
> 90 | 0.225 | 0.68 | 0.343
> 100 | 0.54 | 1.09 | 0.41
> 110 | 0.6 | 0.98 | 0.36
> 120 | 0.26 | 0.75 | 0.386
> 128 | 0.266 | 1.1 | 0.402
>
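The size breakdown mentioned in the PR description (fills under 64 bytes split into 32B, 16B, 8B, 4B, 2B and 1B stores) can be pictured as a decomposition of the size into powers of two. This is only a sketch of the arithmetic, assuming one store per set bit of the size; the actual stub emits x86 instructions and may pick its stores differently. `store_widths` is a made-up helper, not a function from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: lists the store widths (in bytes) that together
// cover a fill of `size` bytes, for size < 64, using only
// 32/16/8/4/2/1-byte stores.
std::vector<std::size_t> store_widths(std::size_t size) {
  assert(size < 64);
  std::vector<std::size_t> widths;
  for (std::size_t w = 32; w >= 1; w /= 2) {
    if (size & w) {  // this power of two is part of the size
      widths.push_back(w);
    }
  }
  return widths;
}
```

For example, a 45-byte fill would come out as one 32-byte, one 8-byte, one 4-byte and one 1-byte store under this assumption.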
Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: undo size check for fill64_masked ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28442/files - new: https://git.openjdk.org/jdk/pull/28442/files/f18b385e..ee1db381 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=00-01 Stats: 9 lines in 1 file changed: 0 ins; 9 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28442.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28442/head:pull/28442 PR: https://git.openjdk.org/jdk/pull/28442 From dholmes at openjdk.org Fri Nov 21 01:26:21 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 21 Nov 2025 01:26:21 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 23:10:48 GMT, Patricio Chilano Mateo wrote: >> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. >> >> This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism.
Here is a summary of the key changes: >> >> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. >> An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so for now I kept this simpler version. >> >> - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we still need to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. >> >> - The code was previously structured in t... > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Rename VM methods for endFirstTransition/startFinalTransition > we follow the classic Dekker pattern for the required synchronization.
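For readers unfamiliar with the pattern named in the quoted summary, here is a minimal sketch of a Dekker-style handshake: each side stores its own flag, then reads the other side's flag, with a two-way barrier in between. It uses standard C++ `std::atomic` and a sequentially consistent fence as an assumption about what the barrier has to be; it is not the code from the patch, and all names (`in_transition`, `transition_disabled`, `try_start_transition`, `try_disable`, `both_won_once`) are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical flags, loosely modelled on the quoted description.
static std::atomic<int> in_transition{0};        // set by the transitioning thread
static std::atomic<int> transition_disabled{0};  // set by the disabling thread

// Transitioning side: publish "in transition", two-way fence, then read
// the disable flag. Returns true if the fast path may proceed.
bool try_start_transition() {
  in_transition.store(1, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);
  if (transition_disabled.load(std::memory_order_relaxed) != 0) {
    in_transition.store(0, std::memory_order_relaxed);  // back off
    return false;
  }
  return true;
}

// Disabling side: publish "disabled", two-way fence, then read the
// in-transition flag. Returns true if no transition was seen in progress.
bool try_disable() {
  transition_disabled.store(1, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return in_transition.load(std::memory_order_relaxed) == 0;
}

// Races the two sides once; returns true if both "won", which the
// fences are meant to rule out.
bool both_won_once() {
  in_transition.store(0);
  transition_disabled.store(0);
  bool a = false, b = false;
  std::thread t1([&a] { a = try_start_transition(); });
  std::thread t2([&b] { b = try_disable(); });
  t1.join();
  t2.join();
  return a && b;
}
```

If the fences were weakened to acquire or release only, each thread's store could still be delayed past its own load, and both sides could read 0. In the real code the losing side would presumably wait and retry rather than simply return false.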
My understanding is that Dekker requires a "full fence" between the accesses, not just ordering memory barriers. The two variables involved must be published to all readers for the algorithm to work. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28361#issuecomment-3560918527 From kvn at openjdk.org Fri Nov 21 01:28:23 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 21 Nov 2025 01:28:23 GMT Subject: RFR: 8347248: Fingerprinter::size_of_parameters() should not be used for getting number of parameters In-Reply-To: References: <7Fcu3CgMjNmMCmAsDggYG_ArF-VvTvyyWS4iNJSn4yo=.cdb91fc1-e72c-4c19-8a00-96062f511284@github.com> Message-ID: On Wed, 19 Nov 2025 17:53:10 GMT, Ioi Lam wrote: >> So it looks like the use of "arg" here refers to "arg slot" meaning these variables and methods could be renamed to be more clear. What do you think of renaming methods like `arg_modified` to `arg_slot_modified`? > > I think it's OK to rename `arg_count` to `arg_size`. There's quite a lot of existing code that does this. `arg_size` is understood to be the "number of slots". > > https://github.com/openjdk/jdk/blob/9ea8201b7494fe9107d4abd78c02ac765a5751d4/src/hotspot/share/opto/graphKit.cpp#L2365-L2366 > > https://github.com/openjdk/jdk/blob/9ea8201b7494fe9107d4abd78c02ac765a5751d4/src/hotspot/share/ci/bcEscapeAnalyzer.cpp#L192-L196 > > I am not sure about adding "slot" to "arg_modified". While there are some use of the word "slot" in the compiler APIs, it's not common. I vote for `size_of_args` suggested by Dean. I would rename local variable too. 
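The distinction behind this naming discussion (number of parameters versus number of argument slots) can be shown with a small sketch. In the JVM, `long` and `double` values occupy two slots, so the two numbers differ for many signatures. The helper below walks a method descriptor string and counts both; `ArgInfo` and `arg_info` are made-up names, the code does not validate its input, and it is not how HotSpot's `Fingerprinter` is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct ArgInfo {
  int count;  // number of declared parameters
  int slots;  // number of argument slots they occupy
};

// Hypothetical helper: walks the parameter part of a JVM method descriptor
// such as "(IJLjava/lang/String;D)V". Assumes a well-formed descriptor and
// ignores the receiver of instance methods.
ArgInfo arg_info(const std::string& descriptor) {
  ArgInfo info{0, 0};
  std::size_t i = 1;  // skip '('
  while (i < descriptor.size() && descriptor[i] != ')') {
    bool is_array = false;
    while (descriptor[i] == '[') {  // array dimensions
      is_array = true;
      i++;
    }
    char c = descriptor[i];
    if (c == 'L') {  // class type: skip to the terminating ';'
      i = descriptor.find(';', i);
    }
    i++;
    info.count++;
    // long (J) and double (D) take two slots; arrays of them are references.
    info.slots += (!is_array && (c == 'J' || c == 'D')) ? 2 : 1;
  }
  return info;
}
```

For `(IJLjava/lang/String;D)V` this gives 4 parameters but 6 slots, which is the kind of mismatch the issue title warns about.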
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28380#discussion_r2548222005 From fyang at openjdk.org Fri Nov 21 03:41:46 2025 From: fyang at openjdk.org (Fei Yang) Date: Fri, 21 Nov 2025 03:41:46 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v5] In-Reply-To: References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> Message-ID: On Tue, 18 Nov 2025 09:27:44 GMT, Hamlin Li wrote: >> Hi, >> >> This PR adds CMoveF/D on riscv, which enables vectorization of statements like: `op_1 bop op_2 ? res_f_d_1 : res_f_d_2 in a loop`. >> >> This PR is also a preparation for further vectorization in https://github.com/openjdk/jdk/pull/28231. >> >> Previously it was https://github.com/openjdk/jdk/pull/25341, but at that time C2 SLP had some issues with unsigned comparison, which are now fixed, so it's good to continue the work. >> >> # Test >> ## Jtreg >> >> in progress... >> >> ## Performance >> >> Column names meanings: >> * p: with patch >> * p+v: with patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on >> * m: without patch >> * m+v: without patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on >> >> #### Average improvement >> >> NOTE: With only this PR, it brings performance benefit in case of `CMoveF+CmpF`, `CMoveD+CmpD`, `CMoveF+CmpI`, `CMoveD+CmpL`. The data below is based on fully implementing the vectorization of `CMoveI/L/F/D+CmpI/L/F/D`, which will be achieved by https://github.com/openjdk/jdk/pull/28231. >> >> For details, check the performance data in https://github.com/openjdk/jdk/pull/25341 on riscv.
>> >> Opt (m/p) | Opt (m+v/p+v) | Opt (p/p+v) | Opt (m/p+v) >> -- | -- | -- | -- >> 1.022782609 | 2.198717391 | 2.162673913 | 2.199 >> >> > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > replace assert with log_warning src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1590: > 1588: // jump if cmp1 < cmp2 or either is NaN > 1589: // not jump (i.e. move src to dst) if cmp1 >= cmp2 > 1590: float_blt(cmp1, cmp2, no_set); I compared this with the existing `MacroAssembler::cmov_cmp_fp_ge` [1] and I witnessed some difference in the case of `NaN` handling. In `MacroAssembler::cmov_cmp_fp_ge`, we set the `is_unordered` param to true when calling `float_blt` or `double_blt`, which is not the case here. I assume we need similar handling here as well, right? [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L1338 src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1636: > 1634: // jump if cmp1 <= cmp2 or either is NaN > 1635: // not jump (i.e. move src to dst) if cmp1 > cmp2 > 1636: float_ble(cmp1, cmp2, no_set); Same question here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2548424215 PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2548424568 From vpetko at openjdk.org Fri Nov 21 05:26:10 2025 From: vpetko at openjdk.org (Vladimir Petko) Date: Fri, 21 Nov 2025 05:26:10 GMT Subject: RFR: 8352567: [s390x] disable JFR tests requiring JFR stubs Message-ID: JFR stubs are not [implemented](https://github.com/openjdk/jdk/blame/06ba6cf3a137a6cdf572a876a46d18e51c248451/src/hotspot/cpu/s390/sharedRuntime_s390.cpp#L3412). Add platform requirement to JFR tests that require JFR stubs to skip them on S390x. 
Testing:
- s390x:

==============================
Test summary
==============================
   TEST                                                                                       TOTAL  PASS  FAIL ERROR  SKIP
   jtreg:test/hotspot/jtreg/applications/ctw/modules/jdk_jfr.java                                0     0     0     0     0
   jtreg:test/hotspot/jtreg/compiler/intrinsics/TestReturnOopSetForJFRWriteCheckpoint.java       0     0     0     0     0
   jtreg:test/jdk/jdk/jfr                                                                      630   577     0     0    53
==============================
TEST SUCCESS

- amd64:

==============================
Test summary
==============================
   TEST                                                                                       TOTAL  PASS  FAIL ERROR  SKIP
   jtreg:test/hotspot/jtreg/applications/ctw/modules/jdk_jfr.java                                1     1     0     0     0
   jtreg:test/hotspot/jtreg/compiler/intrinsics/TestReturnOopSetForJFRWriteCheckpoint.java       1     1     0     0     0
   jtreg:test/jdk/jdk/jfr                                                                      629   622     0     0     7
==============================
TEST SUCCESS

------------- Commit messages: - fix(lint): add missing comma - chore: disable JFR tests on s390x Changes: https://git.openjdk.org/jdk/pull/28444/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28444&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8352567 Stats: 22 lines in 20 files changed: 20 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28444.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28444/head:pull/28444 PR: https://git.openjdk.org/jdk/pull/28444 From qpzhang at openjdk.org Fri Nov 21 06:37:46 2025 From: qpzhang at openjdk.org (Patrick Zhang) Date: Fri, 21 Nov 2025 06:37:46 GMT Subject: RFR: 8365991: AArch64: Ignore BlockZeroingLowLimit when UseBlockZeroing is false [v7] In-Reply-To: References: Message-ID: On Fri, 17 Oct 2025 04:19:42 GMT, Patrick Zhang wrote: >> Issue: >> In AArch64 port, `UseBlockZeroing` is by default set to true and `BlockZeroingLowLimit` is initialized to 256. If `DC ZVA` is supported, `BlockZeroingLowLimit` is later updated to `4 * VM_Version::zva_length()`. When `UseBlockZeroing` is set to false, all related conditional checks should ignore `BlockZeroingLowLimit`.
However, the function `MacroAssembler::zero_words(Register base, uint64_t cnt)` still evaluates the lower limit and bases its code generation logic on it, which seems to be an incomplete conditional check. >> >> This PR: >> 1. Reset `BlockZeroingLowLimit` to `4 * VM_Version::zva_length()` or 256 with a warning message if it was manually configured from the default while `UseBlockZeroing` is disabled. >> 2. Added necessary comments in `MacroAssembler::zero_words(Register base, uint64_t cnt)` and `MacroAssembler::zero_words(Register ptr, Register cnt)` to explain why we do not check `UseBlockZeroing` in the outer part of these functions. Instead, the decision is delegated to the stub function `zero_blocks`, which encapsulates the DC ZVA instructions and serves as the inner implementation of `zero_words`. This approach helps better control the increase in code cache size during array or object instance initialization. >> 3. Added more testing sizes to `test/micro/org/openjdk/bench/vm/gc/RawAllocationRate.java` to better cover scenarios involving smaller arrays and objects. >> >> Tests: >> 1. Performance tests on the bundled JMH `vm.compiler.ClearMemory`, and `vm.gc.RawAllocationRate` (including `arrayTest` and `instanceTest`) showed no obvious regression. Negative tests with `jdk/bin/java -jar images/test/micro/benchmarks.jar RawAllocationRate.arrayTest_C1 -bm thrpt -gc false -wi 0 -w 30 -i 1 -r 30 -t 1 -f 1 -tu s -jvmArgs "-XX:-UseBlockZeroing -XX:BlockZeroingLowLimit=8" -p size=32` demonstrated good wall times on `zero_words_reg_imm` calls, as expected. >> 2. Jtreg tier1 test on Ampere Altra, AmpereOne, Graviton2 and 3, tier2 on Altra. No new issues found. Passed tests of GHA Sanity Checks. > > Patrick Zhang has updated the pull request incrementally with one additional commit since the last revision: > > Refine the count types to pass mac and win builds > > Signed-off-by: Patrick Zhang Hi, The status of this PR has not been updated for a couple of weeks.
I think I've addressed the feedback provided, but I haven't seen any further comments or decisions. Please let me know if there's anything else I can do to improve the patch or if it's ready to move forward. Thanks for your time! ------------- PR Comment: https://git.openjdk.org/jdk/pull/26917#issuecomment-3561617302 From duke at openjdk.org Fri Nov 21 07:46:22 2025 From: duke at openjdk.org (Jean-Noël Rouvignac) Date: Fri, 21 Nov 2025 07:46:22 GMT Subject: RFR: 6960970: Debugger very slow during stepping [v2] In-Reply-To: References: Message-ID: <-6UQvhvzWxm9r6rtvS4EjiVn-to2xAa64MJi-9_-zss=.fe61f491-31d9-4e75-932a-8e9a9f6a45d1@github.com> On Thu, 20 Nov 2025 04:45:24 GMT, Serguei Spitsyn wrote: >> This change fixes a long-standing performance issue related to the debugger single stepping that is using JVMTI `FramePop` events as a part of step over handling. The performance issue is that the target thread continues its execution in very slow `interp-only` mode in a context of frame marked for `FramePop` notification with the JVMTI `NotifyFramePop`. It includes other method calls recursively upon a return from the frame. >> >> This fix is to avoid enabling the `interp-only` mode for threads when `FramePop` events are enabled with JVMTI `SetEventNotificationMode`. Instead, the target frame has been deoptimized and kept interpreted by disabling `OSR` in the function `InterpreterRuntime::frequency_counter_overflow_inner()`. Additionally, some tweaks are applied in several places where the `java_thread->is_interp_only_mode()` is checked. >> The other details will be provided in the first PR request comment. >> It is considered to file a SCR for this update as `FramePop` events do not enforce the `interp-only` mode for a target thread anymore which might break some expectations (the behavior has been changed).
>> >> Testing: >> - test `serviceability/jvmti/vthread/ThreadStateTest` was updated to provide some extra test coverage >> - submitted mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > cleanup: removed an old code fragment in frame.cpp test/jdk/com/sun/jdi/EATests.java line 3068: > 3066: // frame[4]: EATestsTarget.main(java.lang.String[]) > 3067: > 3068: env.stepOverLine(thread); // needed to keep target thread interp-only, so dontinline_brkpt_iret is not inligned inligned => inlined ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28407#discussion_r2548831869 From mbaesken at openjdk.org Fri Nov 21 08:33:24 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 21 Nov 2025 08:33:24 GMT Subject: RFR: 8371893: [macOS aarch64] use dead_strip linker option to reduce binary size In-Reply-To: References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com> Message-ID: On Wed, 19 Nov 2025 13:05:24 GMT, Matthias Baesken wrote: > As usual, maybe it is possible to split this patch between hotspot and the libraries? Maybe we should for now limit the dead_strip to the JDK native libs ? This would avoid the trouble with the HS debug helpers coding and the serviceability agent (but in the long run we have this trouble also with linktime-gc and probably LTO which can be enabled by OpenJDK configure, not only with this macOS-linker related flag this PR is about ). 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28319#issuecomment-3561953592 From shade at openjdk.org Fri Nov 21 09:13:46 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 21 Nov 2025 09:13:46 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 09:06:54 GMT, Aleksey Shipilev wrote: > We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. > > The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier1` > - [ ] Linux x86_64 server fastdebug, `all` Sample experiments show this saves ~1.6% of code: $ for I in `seq 1 3`; do build/linux-x86_64-server-release/images/jdk/bin/java -Xcomp -XX:+CITime 2>&1 | grep "nmethod code"; done # Before nmethod code size : 5764304 bytes nmethod code size : 5764336 bytes nmethod code size : 5764480 bytes # After (-1.6%) nmethod code size : 5670136 bytes nmethod code size : 5670136 bytes nmethod code size : 5670168 bytes $ for I in `seq 1 3`; do build/linux-x86_64-server-release/images/jdk/bin/java -Xcomp -XX:+CITime Hello.java 2>&1 | grep "nmethod code"; done # Before nmethod code size : 25394184 bytes nmethod code size : 25394552 bytes nmethod code size : 25393968 bytes # After (-1.6%) nmethod code size : 24988544 bytes nmethod code size : 24991696 bytes nmethod code size : 24991040 bytes ------------- PR Comment: https://git.openjdk.org/jdk/pull/28446#issuecomment-3562073445 From shade at openjdk.org Fri Nov 21 09:13:45 2025 From: 
shade at openjdk.org (Aleksey Shipilev) Date: Fri, 21 Nov 2025 09:13:45 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code Message-ID: We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. Additional testing: - [x] Linux x86_64 server fastdebug, `tier1` - [ ] Linux x86_64 server fastdebug, `all` ------------- Commit messages: - Also reflow generate_pre_barrier_slow_path, so it is obvious the branches are short - More touchups - Also optimize queue insertion - Touchups - WIP Changes: https://git.openjdk.org/jdk/pull/28446/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28446&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372285 Stats: 71 lines in 1 file changed: 21 ins; 28 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/28446.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28446/head:pull/28446 PR: https://git.openjdk.org/jdk/pull/28446 From aartemov at openjdk.org Fri Nov 21 09:25:27 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Fri, 21 Nov 2025 09:25:27 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v9] In-Reply-To: References: Message-ID: <3bBZzigernOcTkARE9am0ZmHR9NWsmp3xa0ksSLYiE8=.981d0f10-5b41-40c8-a35c-f953b0d1df08@github.com> > Hi, > > please consider the following changes: > > In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. 
The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. > > Tested in tiers 1 - 5. Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8366671: Addressed reviewer's comments. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28264/files - new: https://git.openjdk.org/jdk/pull/28264/files/8ad139ab..4952da2c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28264&range=07-08 Stats: 65 lines in 2 files changed: 7 ins; 9 del; 49 mod Patch: https://git.openjdk.org/jdk/pull/28264.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28264/head:pull/28264 PR: https://git.openjdk.org/jdk/pull/28264 From aartemov at openjdk.org Fri Nov 21 09:25:31 2025 From: aartemov at openjdk.org (Anton Artemov) Date: Fri, 21 Nov 2025 09:25:31 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v8] In-Reply-To: References: Message-ID: <3i7NLxDP3ghzUoa1k4mIs5WiMFcqTgwA8MNi2k5D35E=.5af83c29-b036-4c49-a21d-dd0725fec08f@github.com> On Thu, 20 Nov 2025 20:01:09 GMT, Kim Barrett wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8366671: Addressed reviewers' comments. > > src/hotspot/share/runtime/objectMonitor.cpp line 2040: > >> 2038: bool ObjectMonitor::notify_internal(JavaThread* current) { >> 2039: bool did_notify = false; >> 2040: { > > I don't think this extra level of block scope is needed. The only thing outside the end of this > extra scope is the `return did_notify`, which could just as well be inside. Your call... Agree, removed. 
> src/hotspot/share/utilities/spinCriticalSection.hpp line 43: > >> 41: >> 42: // Low-level leaf-lock primitives used to implement synchronization. >> 43: // Not for general synchronization use. > > This comment seems like it contains info that ought to be part of the class description, rather > than on an implementation detail. I modified it a bit and moved to the class description. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2549069235 PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2549068563 From tschatzl at openjdk.org Fri Nov 21 09:29:14 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 21 Nov 2025 09:29:14 GMT Subject: RFR: 8372179: Remove Unused ConcurrentHashTable::MultiGetHandle Message-ID: Hi all, please review the removal of `ConcurrentHashTable::MultiGetHandle` which is never used anywhere but in gtests. Testing: gha Thanks, Thomas ------------- Commit messages: - * remove whitespace - d - * remove Changes: https://git.openjdk.org/jdk/pull/28396/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28396&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372179 Stats: 37 lines in 3 files changed: 0 ins; 34 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/28396.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28396/head:pull/28396 PR: https://git.openjdk.org/jdk/pull/28396 From mdoerr at openjdk.org Fri Nov 21 09:33:24 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 21 Nov 2025 09:33:24 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 [v4] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 23:56:48 GMT, Ruben wrote: >> The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. >> >> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. 
This change removes the extra step on the way to the generated blob. >> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > > Ruben has updated the pull request incrementally with one additional commit since the last revision: > > Refine `first_check_size` definitions Still good. ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28192#pullrequestreview-3491889191 From tschatzl at openjdk.org Fri Nov 21 09:51:26 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 21 Nov 2025 09:51:26 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 09:06:54 GMT, Aleksey Shipilev wrote: > We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. > > The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier1` > - [ ] Linux x86_64 server fastdebug, `all` Changes requested by tschatzl (Reviewer). src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 131: > 129: __ cmpptr(addr, count); > 130: __ jcc(Assembler::belowEqual, loop); > 131: __ jmpb(done); Not related to this line, but for `jcc` there is also a `jccb` variant that could be used (line 121); you actually used it in other code. 
Since these short jumps have a signed displacement, they can also be used for backward jumps. (E.g. in below `__jmp(next_card)`, but maybe I'm overlooking something. src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 209: > 207: // Jump out if done, or fall-through to runtime > 208: __ bind(L_null); > 209: __ jmp(L_done); Maybe add a comment that we do not know the distance of `L_done` we use the long form or something (I assume that's the reason for not using `jmpb` here). ------------- PR Review: https://git.openjdk.org/jdk/pull/28446#pullrequestreview-3491945514 PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2549146822 PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2549149717 From shade at openjdk.org Fri Nov 21 09:56:57 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 21 Nov 2025 09:56:57 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 09:45:54 GMT, Thomas Schatzl wrote: >> We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. >> >> The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. 
>> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier1` >> - [ ] Linux x86_64 server fastdebug, `all` > > src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 131: > >> 129: __ cmpptr(addr, count); >> 130: __ jcc(Assembler::belowEqual, loop); >> 131: __ jmpb(done); > > Not related to this line, but for `jcc` there is also a `jccb` variant that could be used (line 121); you actually used it in other code. Since these short jumps have a signed displacement, they can also be used for backward jumps. (E.g. in below `__jmp(next_card)`, but maybe I'm overlooking something. Backward jumps are actually shortened automatically, because assembler knows when their offset is small. Only forward branches need this special (forward-looking) treatment: assembler has no advanced knowledge the jump can be short, so we have to tell it. This is our SOP: rely on automatic shortening where possible for backward branches, shorten the forward branches where it is obvious. Yes, I think we can shorten `__ jcc(Assembler::equal, is_clean_card);` too, let me try that. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2549169909 From shade at openjdk.org Fri Nov 21 10:02:50 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 21 Nov 2025 10:02:50 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v2] In-Reply-To: References: Message-ID: > We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. > > The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. 
> > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier1` > - [ ] Linux x86_64 server fastdebug, `all` Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision: - Comment - Shorten a few more branches ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28446/files - new: https://git.openjdk.org/jdk/pull/28446/files/6250ea8a..edbc74a2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28446&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28446&range=00-01 Stats: 4 lines in 1 file changed: 1 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/28446.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28446/head:pull/28446 PR: https://git.openjdk.org/jdk/pull/28446 From tschatzl at openjdk.org Fri Nov 21 10:02:52 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 21 Nov 2025 10:02:52 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v2] In-Reply-To: References: Message-ID: <8QyzpBwjNWQiM2uH4kRmzeuNLvDALA0zQsAEKRnFYFA=.9bdb5cd7-1328-463c-81e0-86b7c09f7a6f@github.com> On Fri, 21 Nov 2025 09:52:28 GMT, Aleksey Shipilev wrote: >> src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 131: >> >>> 129: __ cmpptr(addr, count); >>> 130: __ jcc(Assembler::belowEqual, loop); >>> 131: __ jmpb(done); >> >> Not related to this line, but for `jcc` there is also a `jccb` variant that could be used (line 121); you actually used it in other code. Since these short jumps have a signed displacement, they can also be used for backward jumps. (E.g. in below `__jmp(next_card)`, but maybe I'm overlooking something. > > Backward jumps are actually shortened automatically, because assembler knows when their offset is small. Only forward branches need this special (forward-looking) treatment: assembler has no advanced knowledge the jump can be short, so we have to tell it. 
This is our SOP: rely on automatic shortening where possible for backward branches, shorten the forward branches where it is obvious. > > Yes, I think we can shorten `__ jcc(Assembler::equal, is_clean_card);` too, let me try that. Okay, thanks for the information, I am good with that. Although I think it would not hurt to make it explicit that we cared. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2549190865 From shade at openjdk.org Fri Nov 21 10:02:53 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 21 Nov 2025 10:02:53 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v2] In-Reply-To: References: Message-ID: <-9vq439TDdNapNR06UwN05Fp5yXWN_VWQqWnEAkpWpY=.521ca031-6d71-4a09-a9b2-840aeb01b9d5@github.com> On Fri, 21 Nov 2025 09:46:59 GMT, Thomas Schatzl wrote: >> Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision: >> >> - Comment >> - Shorten a few more branches > > src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 209: > >> 207: // Jump out if done, or fall-through to runtime >> 208: __ bind(L_null); >> 209: __ jmp(L_done); > > Maybe add a comment that we do not know the distance of `L_done` we use the long form or something (I assume that's the reason for not using `jmpb` here). I don't see significant value in a comment like that, but why not, added. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2549197438 From shade at openjdk.org Fri Nov 21 10:07:35 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 21 Nov 2025 10:07:35 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v3] In-Reply-To: References: Message-ID: > We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. 
This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. > > The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier1` > - [ ] Linux x86_64 server fastdebug, `all` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Make some backward branches explicitly short ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28446/files - new: https://git.openjdk.org/jdk/pull/28446/files/edbc74a2..1f57d0d9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28446&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28446&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28446.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28446/head:pull/28446 PR: https://git.openjdk.org/jdk/pull/28446 From tschatzl at openjdk.org Fri Nov 21 10:07:35 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 21 Nov 2025 10:07:35 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v3] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 10:04:56 GMT, Aleksey Shipilev wrote: >> We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. 
>> >> The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier1` >> - [ ] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Make some backward branches explicitly short Looks good, thanks. I assume that the jmh writebarrier micros were run just in case. Fwiw, also the GHA failures earlier looked like infra issues. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28446#pullrequestreview-3492026049 From shade at openjdk.org Fri Nov 21 10:07:37 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 21 Nov 2025 10:07:37 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v3] In-Reply-To: <8QyzpBwjNWQiM2uH4kRmzeuNLvDALA0zQsAEKRnFYFA=.9bdb5cd7-1328-463c-81e0-86b7c09f7a6f@github.com> References: <8QyzpBwjNWQiM2uH4kRmzeuNLvDALA0zQsAEKRnFYFA=.9bdb5cd7-1328-463c-81e0-86b7c09f7a6f@github.com> Message-ID: On Fri, 21 Nov 2025 09:58:07 GMT, Thomas Schatzl wrote: >> Backward jumps are actually shortened automatically, because assembler knows when their offset is small. Only forward branches need this special (forward-looking) treatment: assembler has no advanced knowledge the jump can be short, so we have to tell it. This is our SOP: rely on automatic shortening where possible for backward branches, shorten the forward branches where it is obvious. >> >> Yes, I think we can shorten `__ jcc(Assembler::equal, is_clean_card);` too, let me try that. > > Okay, thanks for the information, I am good with that. Although I think it would not hurt to make it explicit that we cared. Right, we can make some of these backward branches explicitly short, just to drive the point home. Done in new commit, let's see if testing complains about any new shortenings. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2549210137 From shade at openjdk.org Fri Nov 21 10:10:32 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 21 Nov 2025 10:10:32 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v3] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 10:07:35 GMT, Aleksey Shipilev wrote: >> We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. >> >> The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier1` >> - [ ] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Make some backward branches explicitly short GHA failure is due to https://github.com/openjdk/jdk/pull/28445. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28446#issuecomment-3562313261 From shade at openjdk.org Fri Nov 21 10:49:14 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 21 Nov 2025 10:49:14 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v3] In-Reply-To: References: <8QyzpBwjNWQiM2uH4kRmzeuNLvDALA0zQsAEKRnFYFA=.9bdb5cd7-1328-463c-81e0-86b7c09f7a6f@github.com> Message-ID: On Fri, 21 Nov 2025 10:03:16 GMT, Aleksey Shipilev wrote: >> Okay, thanks for the information, I am good with that. Although I think it would not hurt to make it explicit that we cared. 
> > Right, we can make some of these backward branches explicitly short, just to drive the point home. Done in new commit, let's see if testing complains about any new shortenings. tier1 passes, running more tests now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2549343272 From roland at openjdk.org Fri Nov 21 11:19:51 2025 From: roland at openjdk.org (Roland Westrelin) Date: Fri, 21 Nov 2025 11:19:51 GMT Subject: RFR: 8351889: C2 crash: assertion failed: Base pointers must match (addp 344) [v3] In-Reply-To: References: Message-ID: > The test case has an out of loop `Store` with an `AddP` address > expression that has other uses and is in the loop body. Schematically, > only showing the address subgraph and the bases for the `AddP`s: > > > Store#195 -> AddP#133 -> AddP#134 -> CastPP#110 > -> CastPP#110 > > > Both `AddP`s have the same base, a `CastPP` that's also in the loop > body. > > That loop is a counted loop and only has 3 iterations so is fully > unrolled. First, one iteration is peeled: > > > /-> CastPP#110 > Store#195 -> Phi#360 -> AddP#133 -> AddP#134 -> CastPP#110 > -> AddP#277 -> AddP#278 -> CastPP#283 > -> CastPP#283 > > > > The `AddP`s and `CastPP` are cloned (because in the loop body). As > part of peeling, `PhaseIdealLoop::peeled_dom_test_elim()` is > called. It finds the test that guards `CastPP#283` in the peeled > iteration dominates and replaces the test that guards `CastPP#110` > (the test in the peeled iteration is the clone of the test in the > loop). That causes `CastPP#110`'s control to be updated to that of the > test in the peeled iteration and to be yanked from the loop. So now > `CastPP#283` and `CastPP#110` have the same inputs. 
> > Next unrolling happens: > > > /-> CastPP#110 > /-> AddP#400 -> AddP#401 -> CastPP#110 > Store#195 -> Phi#360 -> Phi#477 -> AddP#133 -> AddP#134 -> CastPP#110 > \ -> CastPP#110 > -> AddP#277 -> AddP#278 -> CastPP#283 > -> CastPP#283 > > > > `AddP`s are cloned once more but not the `CastPP`s because they are > both in the peeled iteration now. A new `Phi` is added. > > Next igvn runs. It's going to push the `AddP`s through the `Phi`s. > > Through `Phi#477`: > > > > /-> CastPP#110 > Store#195 -> Phi#360 -> AddP#510 -> Phi#509 -> AddP#401 -> CastPP#110 > \ -> AddP#134 -> CastPP#110 > -> AddP#277 -> AddP#278 -> CastPP#283 > -> CastPP#283 > > > > Through `Phi#360`: > > > /-> AddP#134 -> CastPP#110 > /-> Phi#509 -> AddP#401 -> CastPP#110 > Store#195 -> AddP#516 -> Phi#515 -> AddP#278 -> CastPP#283 > -> Phi#514 -> CastPP#283 > -> CastP#110 > > > Then `Phi#514` which has 2 `CastPP`s as input with identical inputs is > transformed into anot... Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 10 additional commits since the last revision: - Merge branch 'master' into JDK-8351889 - verif - Merge branch 'master' into JDK-8351889 - test seed - more - Merge branch 'master' into JDK-8351889 - Merge branch 'master' into JDK-8351889 - more - test - fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25386/files - new: https://git.openjdk.org/jdk/pull/25386/files/bf984838..d52f2ded Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25386&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25386&range=01-02 Stats: 525999 lines in 5163 files changed: 366993 ins; 99823 del; 59183 mod Patch: https://git.openjdk.org/jdk/pull/25386.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25386/head:pull/25386 PR: https://git.openjdk.org/jdk/pull/25386 From roland at openjdk.org Fri Nov 21 11:19:51 2025 From: roland at openjdk.org (Roland Westrelin) Date: Fri, 21 Nov 2025 11:19:51 GMT Subject: RFR: 8351889: C2 crash: assertion failed: Base pointers must match (addp 344) [v2] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 10:15:18 GMT, Roland Westrelin wrote: > This could be part of `VerifyIterativeGVN`. I added some verification code. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25386#issuecomment-3562560562 From alanb at openjdk.org Fri Nov 21 11:35:57 2025 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 21 Nov 2025 11:35:57 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v8] In-Reply-To: References: Message-ID: On Thu, 13 Nov 2025 19:55:24 GMT, Mat Carter wrote: >> Add jdk.management.AOTCacheMXBean. The interface provides a single action that when called will cause any hosted JVM currently recording AOT information will stop recording. Existing functionality is preserved: when stopped the JVM will create the required artifacts based on the execution mode. 
Conveniently as the application running on the JVM has not stopped (as was previously the only way to stop recording), the application will resume execution after the artifacts have been generated. >> >> The interface will return TRUE if a recording was successfully stopped, in all other cases (not recording etc.) will return FALSE >> >> It follows that invoking the action on a JVM that is recording, twice in succession, should (baring internal errors) produce the following two responses: >> >> TRUE >> FALSE >> >> Passes tier1 on linux (x64) and windows (x64) > > Mat Carter has updated the pull request incrementally with one additional commit since the last revision: > > Remove single whitespace I went through the plumbing to check the registration with the platform MBeanServer and everything looks okay (and consistent with how the other JDK-specific management interfaces are setup and registered). src/jdk.management/share/classes/jdk/management/HotSpotAOTCacheMXBean.java line 97: > 95: * successfully; {@code false} otherwise. > 96: */ > 97: public boolean endRecording(); Minor nit is that we usually use 4-space rather than 2-space indent in the java sources. You might want to check the /** .. */ comments in a few of the files as they are misaligned in a few places. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28010#issuecomment-3562626766 PR Review Comment: https://git.openjdk.org/jdk/pull/28010#discussion_r2549470742 From alanb at openjdk.org Fri Nov 21 12:04:29 2025 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 21 Nov 2025 12:04:29 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v8] In-Reply-To: References: Message-ID: On Thu, 13 Nov 2025 19:55:24 GMT, Mat Carter wrote: >> Add jdk.management.AOTCacheMXBean. The interface provides a single action that when called will cause any hosted JVM currently recording AOT information will stop recording. 
Existing functionality is preserved: when stopped the JVM will create the required artifacts based on the execution mode. Conveniently as the application running on the JVM has not stopped (as was previously the only way to stop recording), the application will resume execution after the artifacts have been generated. >> >> The interface will return TRUE if a recording was successfully stopped, in all other cases (not recording etc.) will return FALSE >> >> It follows that invoking the action on a JVM that is recording, twice in succession, should (barring internal errors) produce the following two responses: >> >> TRUE >> FALSE >> >> Passes tier1 on linux (x64) and windows (x64) > > Mat Carter has updated the pull request incrementally with one additional commit since the last revision: > > Remove single whitespace test/hotspot/jtreg/runtime/cds/appcds/aotCache/HotSpotAOTCacheMXBeanTest.java line 31: > 29: * @requires vm.cds.write.archived.java.heap > 30: * @library /test/jdk/lib/testlibrary /test/lib > 31: * /test/hotspot/jtreg/runtime/cds/appcds/aotCache/test-classes Is test-classes used? test/hotspot/jtreg/runtime/cds/appcds/aotCache/HotSpotAOTCacheMXBeanTest.java line 45: > 43: import jdk.test.lib.process.OutputAnalyzer; > 44: > 45: public class HotSpotAOTCacheMXBeanTest { In addition to testing the endRecording operation, we also need to test both the direct and indirect access to the MXBean. The test currently obtains a proxy with newPlatformMXBeanProxy (good) but the direct access with ManagementFactory.getPlatformMXBean is not tested right now. Another thing to test is that getObjectName returns the expected object name. test/hotspot/jtreg/runtime/cds/appcds/aotCache/HotSpotAOTCacheMXBeanTest.java line 92: > 90: } > 91: out.shouldNotContain("HotSpotAOTCacheMXBean is not available"); > 92: out.shouldNotContain("IOException occurred!"); Have you considered having the child terminate with an exception so the exit status is non-zero?
If you get a success status then the output should have the expected strings. A non-zero status would mean failure, and the test fails. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28010#discussion_r2549496991 PR Review Comment: https://git.openjdk.org/jdk/pull/28010#discussion_r2549513596 PR Review Comment: https://git.openjdk.org/jdk/pull/28010#discussion_r2549518555 From VicWang at zhaoxin.com Fri Nov 21 12:17:57 2025 From: VicWang at zhaoxin.com (Vic Wang(BJ-RD)) Date: Fri, 21 Nov 2025 12:17:57 +0000 Subject: Re: Re: Re: Improve UseAVX setting and add cpu descriptions for zhaoxin processors. In-Reply-To: <40e7e92c-2963-4952-85b9-70a9a9dcf60c@oracle.com> References: <58756de5-86a3-4963-a3ae-105455d1fbd0@oracle.com> <8e3eab4683d34791955eac618775cea3@zhaoxin.com> <40e7e92c-2963-4952-85b9-70a9a9dcf60c@oracle.com> Message-ID: Dear David, My company has previously signed the OCA. At that time, it was not required to provide GitHub usernames, so none were supplied. Could you please help me add the GitHub username to my existing Corporate OCA record? Or could you tell me how to add the GitHub username information? GitHub username: Double-Minds-JV Company name: Shanghai Zhaoxin Semiconductor Co., Ltd. Thank you very much for your assistance. Vic Wang >-----Original Message----- >From: David Holmes >Sent: 12 September 2025 13:45 >To: Vic Wang(BJ-RD) ; hotspot-dev at openjdk.org >Subject: Re: Re: Re: Improve UseAVX setting and add cpu descriptions for zhaoxin processors. > > > >On 12/09/2025 3:00 pm, Vic Wang(BJ-RD) wrote: >> Hi David, >> >> So what do I need to do next? Or just wait for the verification process? > >Just have to wait I'm afraid. > >David >----- > >> Best Regards! >> Vic Wang >> >> >>> -----Original Message----- >>> From: David Holmes >>> Sent: 12 September 2025 11:47 >>> To: Vic Wang(BJ-RD) ; hotspot-dev at openjdk.org >>> Subject: Re: Re: Improve UseAVX setting and add cpu descriptions for zhaoxin processors.
>>> >>> >>> On 12/09/2025 1:19 pm, Vic Wang(BJ-RD) wrote: >>>> Hi David, >>>> >>>> I have submitted a PR, please help to review: >>>> https://github.com/openjdk/jdk/pull/27241 >>>> >>>> It shows "OCA signatory status must be verified". >>>> My employer, Shanghai Zhaoxin Semiconductor, has already signed the OCA, and it can be found on http://oca.opensource.oracle.com. >>>> Please help me with this issue. >>> >>> This is your first submission via Github so it has to go through the verification process. >>> >>> David >>> >>>> Best Regards! >>>> Vic Wang >>>> >>>> >>>>> -----????----- >>>>> ???: David Holmes >>>>> ????: 2025?9?12? 9:06 >>>>> ???: Vic Wang(BJ-RD) ; hotspot-dev at openjdk.org >>>>> ??: Re: Improve UseAVX setting and add cpu descriptions for zhaoxin processors. >>>>> >>>>> >>>>> Hi Vic, >>>>> >>>>> It has been a long time since we heard from you. :) >>>>> >>>>> I have filed a JBS issue for this update: >>>>> >>>>> https://bugs.openjdk.org/browse/JDK-8367478 >>>>> >>>>> Cheers, >>>>> David >>>>> >>>>> On 11/09/2025 11:33 pm, Vic Wang(BJ-RD) wrote: >>>>>> Hi All >>>>>> >>>>>> Here is the patch that improving the UseAVX setting and add cpu descriptions for zhaoxin processors. >>>>>> Can you help to review the patch and assign a number for a pull request? >>>>>> Thank you! >>>>>> >>>>>> >>>>>> Patch in my fork repository: >>>>>> https://github.com/Double-Minds-JV/jdk/commit/06498b42ed54021b3ed1 >>>>>> 4a >>>>>> 9c >>>>>> cc9adf52e9360c9c >>>>>> >>>>>>> diff --git a/src/hotspot/cpu/x86/vm_version_x86.cpp >>>>>>> b/src/hotspot/cpu/x86/vm_version_x86.cpp >>>>>>> index 094ab370190..4043b29f18c 100644 >>>>>>> --- a/src/hotspot/cpu/x86/vm_version_x86.cpp >>>>>>> +++ b/src/hotspot/cpu/x86/vm_version_x86.cpp >>>>>>> @@ -931,9 +931,17 @@ void VM_Version::get_processor_features() { >>>>>>> if (UseSSE < 1) >>>>>>> _features.clear_feature(CPU_SSE); >>>>>>> >>>>>>> - //since AVX instructions is slower than SSE in some ZX cpus, force USEAVX=0. 
>>>>>>> - if (is_zx() && ((cpu_family() == 6) || (cpu_family() == 7))) { >>>>>>> - UseAVX = 0; >>>>>>> + // ZX cpus specific settings >>>>>>> + if (is_zx() && FLAG_IS_DEFAULT(UseAVX)) { >>>>>>> + if (cpu_family() == 7) { >>>>>>> + if (extended_cpu_model() == 0x5B || extended_cpu_model() == 0x6B) { >>>>>>> + UseAVX = 1; >>>>>>> + } else if (extended_cpu_model() == 0x1B || extended_cpu_model() == 0x3B) { >>>>>>> + UseAVX = 0; >>>>>>> + } >>>>>>> + } else if (cpu_family() == 6) { >>>>>>> + UseAVX = 0; >>>>>>> + } >>>>>>> } >>>>>>> >>>>>>> // UseSSE is set to the smaller of what hardware supports >>>>>>> and what @@ -2592,6 +2600,7 @@ void >>>>>>> VM_Version::resolve_cpu_information_details(void) { >>>>>>> >>>>>>> const char* VM_Version::cpu_family_description(void) { >>>>>>> int cpu_family_id = extended_cpu_family(); >>>>>>> + int cpu_model_id = extended_cpu_model(); >>>>>>> if (is_amd()) { >>>>>>> if (cpu_family_id < ExtendedFamilyIdLength_AMD) { >>>>>>> return _family_id_amd[cpu_family_id]; @@ -2605,6 >>>>>>> +2614,22 @@ const char* VM_Version::cpu_family_description(void) { >>>>>>> return _family_id_intel[cpu_family_id]; >>>>>>> } >>>>>>> } >>>>>>> + if (is_zx()) { >>>>>>> + if (cpu_family_id == 7) { >>>>>>> + switch (cpu_model_id) { >>>>>>> + case 0x1B: >>>>>>> + return "wudaokou"; >>>>>>> + case 0x3B: >>>>>>> + return "lujiazui"; >>>>>>> + case 0x5B: >>>>>>> + return "yongfeng"; >>>>>>> + case 0x6B: >>>>>>> + return "shijidadao"; >>>>>>> + } >>>>>>> + } else if (cpu_family_id == 6) { >>>>>>> + return "zhangjiang"; >>>>>>> + } >>>>>>> + } >>>>>>> if (is_hygon()) { >>>>>>> return "Dhyana"; >>>>>>> } >>>>>>> @@ -2624,6 +2649,9 @@ int VM_Version::cpu_type_description(char* const buf, size_t buf_len) { >>>>>>> } else if (is_amd()) { >>>>>>> cpu_type = "AMD"; >>>>>>> x64 = cpu_is_em64t() ? " AMD64" : ""; >>>>>>> + } else if (is_zx()) { >>>>>>> + cpu_type = "Zhaoxin"; >>>>>>> + x64 = cpu_is_em64t() ? 
" x86_64" : ""; >>>>>>> } else if (is_hygon()) { >>>>>>> cpu_type = "Hygon"; >>>>>>> x64 = cpu_is_em64t() ? " AMD64" : ""; @@ -3236,6 +3264,12 >>>>>>> @@ int VM_Version::allocate_prefetch_distance(bool use_watermark_prefetch) { >>>>>>> } else { >>>>>>> return 128; // Athlon >>>>>>> } >>>>>>> + } else if (is_zx()) { >>>>>>> + if (supports_sse2()) { >>>>>>> + return 256; >>>>>>> + } else { >>>>>>> + return 128; >>>>>>> + } >>>>>>> } else { // Intel >>>>>>> if (supports_sse3() && is_intel_server_family()) { >>>>>>> if (supports_sse4_2() && supports_ht()) { // Nehalem >>>>>>> based cpus >>>>>> >>>>>> I have run the jtreg tests after applying the patch, the test-summary shows follows: >>>>>>> ============================== >>>>>>> Test summary >>>>>>> ============================== >>>>>>> TEST TOTAL PASS FAIL ERROR SKIP >>>>>>> jtreg:test/hotspot/jtreg:tier1 3107 2797 0 0 310 >>>>>>> jtreg:test/jdk:tier1 2513 2472 0 0 41 >>>>>>> jtreg:test/langtools:tier1 4668 4656 0 0 12 >>>>>>> jtreg:test/jaxp:tier1 0 0 0 0 0 >>>>>>> jtreg:test/lib-test:tier1 38 38 0 0 0 >>>>>>> ============================== >>>>>>> TEST SUCCESS >>>>>> >>>>>> Best Regards! >>>>>> Vic Wang >>>>>> >>>>>> >>>>>> ????? >>>>>> ????????????????????????????????????????????????????? >>>>>> CONFIDENTIAL NOTE: >>>>>> This email contains confidential or legally privileged information and is for the sole use of its intended recipient. Any unauthorized review, use, copying or forwarding of this email or the content of this email is strictly prohibited. >> >> >> >> ????? >> ????????????????????????????????????????????????????? >> CONFIDENTIAL NOTE: >> This email contains confidential or legally privileged information and is for the sole use of its intended recipient. Any unauthorized review, use, copying or forwarding of this email or the content of this email is strictly prohibited. ????? ????????????????????????????????????????????????????? 
CONFIDENTIAL NOTE: This email contains confidential or legally privileged information and is for the sole use of its intended recipient. Any unauthorized review, use, copying or forwarding of this email or the content of this email is strictly prohibited. From dalibor.topic at oracle.com Fri Nov 21 12:34:48 2025 From: dalibor.topic at oracle.com (Dalibor Topic) Date: Fri, 21 Nov 2025 13:34:48 +0100 Subject: =?UTF-8?B?UmU6IOWbnuWkjTog5Zue5aSNOiDlm57lpI06IEltcHJvdmUgVXNlQVZY?= =?UTF-8?Q?_setting_and_add_cpu_descriptions_for_zhaoxin_processors=2E?= In-Reply-To: References: <58756de5-86a3-4963-a3ae-105455d1fbd0@oracle.com> <8e3eab4683d34791955eac618775cea3@zhaoxin.com> <40e7e92c-2963-4952-85b9-70a9a9dcf60c@oracle.com> Message-ID: <96733067-c6b6-4989-a1be-bf651965459a@oracle.com> On 21/11/2025 13:17, Vic Wang(BJ-RD) wrote: > Dear David, > > My company has previously signed the OCA. At that time, it was not required to provide GitHub usernames, so none were supplied. > Could you please help me add the GitHub username to my existing Corporate OCA record? Or how to add GitHub username information? > > GitHub username: Double-Minds-JV > Company name: Shanghai Zhaoxin Semiconductor Co., Ltd. Thanks, Vis - verification request sent. cheers, dalibor topic > > Thank you very much for your assistance. > Vic Wang > > >> -----????----- >> ???: David Holmes >> ????: 2025?9?12? 13:45 >> ???: Vic Wang(BJ-RD) ; hotspot-dev at openjdk.org >> ??: Re: ??: ??: Improve UseAVX setting and add cpu descriptions for zhaoxin processors. >> >> >> >> [??????????? ????] >> >> On 12/09/2025 3:00 pm, Vic Wang(BJ-RD) wrote: >>> Hi David, >>> >>> So what do I need to do next? Or just wait for the verification process? >> >> Just have to wait I'm afraid. >> >> David >> ----- >> >>> Best Regards! >>> Vic Wang >>> >>> >>>> -----????----- >>>> ???: David Holmes >>>> ????: 2025?9?12? 
11:47 >>>> ???: Vic Wang(BJ-RD) ; hotspot-dev at openjdk.org >>>> ??: Re: ??: Improve UseAVX setting and add cpu descriptions for zhaoxin processors. >>>> >>>> >>>> On 12/09/2025 1:19 pm, Vic Wang(BJ-RD) wrote: >>>>> Hi David, >>>>> >>>>> I have submitted a PR, please help to review: >>>>> https://github.com/openjdk/jdk/pull/27241 >>>>> >>>>> It shows "OCA signatory status must be verified". >>>>> My employer, Shanghai Zhaoxin Semiconductor, has already signed the OCA, and it can be found on http://oca.opensource.oracle.com. >>>>> Please help me with this issue. >>>> >>>> This is your first submission via Github so it has to go through the verification process. >>>> >>>> David >>>> >>>>> Best Regards! >>>>> Vic Wang >>>>> >>>>> >>>>>> -----????----- >>>>>> ???: David Holmes >>>>>> ????: 2025?9?12? 9:06 >>>>>> ???: Vic Wang(BJ-RD) ; hotspot-dev at openjdk.org >>>>>> ??: Re: Improve UseAVX setting and add cpu descriptions for zhaoxin processors. >>>>>> >>>>>> >>>>>> Hi Vic, >>>>>> >>>>>> It has been a long time since we heard from you. :) >>>>>> >>>>>> I have filed a JBS issue for this update: >>>>>> >>>>>> https://bugs.openjdk.org/browse/JDK-8367478 >>>>>> >>>>>> Cheers, >>>>>> David >>>>>> >>>>>> On 11/09/2025 11:33 pm, Vic Wang(BJ-RD) wrote: >>>>>>> Hi All >>>>>>> >>>>>>> Here is the patch that improving the UseAVX setting and add cpu descriptions for zhaoxin processors. >>>>>>> Can you help to review the patch and assign a number for a pull request? >>>>>>> Thank you! 
>>>>>>> >>>>>>> >>>>>>> Patch in my fork repository: >>>>>>> https://github.com/Double-Minds-JV/jdk/commit/06498b42ed54021b3ed1 >>>>>>> 4a >>>>>>> 9c >>>>>>> cc9adf52e9360c9c >>>>>>> >>>>>>>> diff --git a/src/hotspot/cpu/x86/vm_version_x86.cpp >>>>>>>> b/src/hotspot/cpu/x86/vm_version_x86.cpp >>>>>>>> index 094ab370190..4043b29f18c 100644 >>>>>>>> --- a/src/hotspot/cpu/x86/vm_version_x86.cpp >>>>>>>> +++ b/src/hotspot/cpu/x86/vm_version_x86.cpp >>>>>>>> @@ -931,9 +931,17 @@ void VM_Version::get_processor_features() { >>>>>>>> if (UseSSE < 1) >>>>>>>> _features.clear_feature(CPU_SSE); >>>>>>>> >>>>>>>> - //since AVX instructions is slower than SSE in some ZX cpus, force USEAVX=0. >>>>>>>> - if (is_zx() && ((cpu_family() == 6) || (cpu_family() == 7))) { >>>>>>>> - UseAVX = 0; >>>>>>>> + // ZX cpus specific settings >>>>>>>> + if (is_zx() && FLAG_IS_DEFAULT(UseAVX)) { >>>>>>>> + if (cpu_family() == 7) { >>>>>>>> + if (extended_cpu_model() == 0x5B || extended_cpu_model() == 0x6B) { >>>>>>>> + UseAVX = 1; >>>>>>>> + } else if (extended_cpu_model() == 0x1B || extended_cpu_model() == 0x3B) { >>>>>>>> + UseAVX = 0; >>>>>>>> + } >>>>>>>> + } else if (cpu_family() == 6) { >>>>>>>> + UseAVX = 0; >>>>>>>> + } >>>>>>>> } >>>>>>>> >>>>>>>> // UseSSE is set to the smaller of what hardware supports >>>>>>>> and what @@ -2592,6 +2600,7 @@ void >>>>>>>> VM_Version::resolve_cpu_information_details(void) { >>>>>>>> >>>>>>>> const char* VM_Version::cpu_family_description(void) { >>>>>>>> int cpu_family_id = extended_cpu_family(); >>>>>>>> + int cpu_model_id = extended_cpu_model(); >>>>>>>> if (is_amd()) { >>>>>>>> if (cpu_family_id < ExtendedFamilyIdLength_AMD) { >>>>>>>> return _family_id_amd[cpu_family_id]; @@ -2605,6 >>>>>>>> +2614,22 @@ const char* VM_Version::cpu_family_description(void) { >>>>>>>> return _family_id_intel[cpu_family_id]; >>>>>>>> } >>>>>>>> } >>>>>>>> + if (is_zx()) { >>>>>>>> + if (cpu_family_id == 7) { >>>>>>>> + switch (cpu_model_id) { >>>>>>>> + case 0x1B: 
>>>>>>>> + return "wudaokou"; >>>>>>>> + case 0x3B: >>>>>>>> + return "lujiazui"; >>>>>>>> + case 0x5B: >>>>>>>> + return "yongfeng"; >>>>>>>> + case 0x6B: >>>>>>>> + return "shijidadao"; >>>>>>>> + } >>>>>>>> + } else if (cpu_family_id == 6) { >>>>>>>> + return "zhangjiang"; >>>>>>>> + } >>>>>>>> + } >>>>>>>> if (is_hygon()) { >>>>>>>> return "Dhyana"; >>>>>>>> } >>>>>>>> @@ -2624,6 +2649,9 @@ int VM_Version::cpu_type_description(char* const buf, size_t buf_len) { >>>>>>>> } else if (is_amd()) { >>>>>>>> cpu_type = "AMD"; >>>>>>>> x64 = cpu_is_em64t() ? " AMD64" : ""; >>>>>>>> + } else if (is_zx()) { >>>>>>>> + cpu_type = "Zhaoxin"; >>>>>>>> + x64 = cpu_is_em64t() ? " x86_64" : ""; >>>>>>>> } else if (is_hygon()) { >>>>>>>> cpu_type = "Hygon"; >>>>>>>> x64 = cpu_is_em64t() ? " AMD64" : ""; @@ -3236,6 +3264,12 >>>>>>>> @@ int VM_Version::allocate_prefetch_distance(bool use_watermark_prefetch) { >>>>>>>> } else { >>>>>>>> return 128; // Athlon >>>>>>>> } >>>>>>>> + } else if (is_zx()) { >>>>>>>> + if (supports_sse2()) { >>>>>>>> + return 256; >>>>>>>> + } else { >>>>>>>> + return 128; >>>>>>>> + } >>>>>>>> } else { // Intel >>>>>>>> if (supports_sse3() && is_intel_server_family()) { >>>>>>>> if (supports_sse4_2() && supports_ht()) { // Nehalem >>>>>>>> based cpus >>>>>>> >>>>>>> I have run the jtreg tests after applying the patch, the test-summary shows follows: >>>>>>>> ============================== >>>>>>>> Test summary >>>>>>>> ============================== >>>>>>>> TEST TOTAL PASS FAIL ERROR SKIP >>>>>>>> jtreg:test/hotspot/jtreg:tier1 3107 2797 0 0 310 >>>>>>>> jtreg:test/jdk:tier1 2513 2472 0 0 41 >>>>>>>> jtreg:test/langtools:tier1 4668 4656 0 0 12 >>>>>>>> jtreg:test/jaxp:tier1 0 0 0 0 0 >>>>>>>> jtreg:test/lib-test:tier1 38 38 0 0 0 >>>>>>>> ============================== >>>>>>>> TEST SUCCESS >>>>>>> >>>>>>> Best Regards! >>>>>>> Vic Wang >>>>>>> >>>>>>> >>>>>>> ????? >>>>>>> ????????????????????????????????????????????????????? 
>>>>>>> CONFIDENTIAL NOTE:
>>>>>>> This email contains confidential or legally privileged information and is for the sole use of its intended recipient. Any unauthorized review, use, copying or forwarding of this email or the content of this email is strictly prohibited.
>>>
>>>
>>>
>>> ?????
>>> ?????????????????????????????????????????????????????
>>> CONFIDENTIAL NOTE:
>>> This email contains confidential or legally privileged information and is for the sole use of its intended recipient. Any unauthorized review, use, copying or forwarding of this email or the content of this email is strictly prohibited.
>
>
> ?????
> ?????????????????????????????????????????????????????
> CONFIDENTIAL NOTE:
> This email contains confidential or legally privileged information and is for the sole use of its intended recipient. Any unauthorized review, use, copying or forwarding of this email or the content of this email is strictly prohibited.

--
Dalibor Topic
Consulting Product Manager
Phone: +494089091214 , Mobile: +491737185961

Oracle Global Services Germany GmbH
Hauptverwaltung: Riesstr. 25, D-80992 München
Registergericht: Amtsgericht München, HRB 246209
Geschäftsführer: Ralf Herrmann

From shade at openjdk.org  Fri Nov 21 12:35:24 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Fri, 21 Nov 2025 12:35:24 GMT
Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v3]
In-Reply-To:
References:
Message-ID:

On Fri, 21 Nov 2025 10:04:56 GMT, Thomas Schatzl wrote:

> I assume that the jmh writebarrier micros were run just in case.

As expected, I see no real impact on EPYC machine, as we realistically only touch gc-active and/or slow-paths:

Benchmark                                                                         Mode  Cnt     Score    Error  Units

# ----- Baseline
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathNullLarge          avgt   12  2074.042 ± 33.941  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathNullSmall          avgt   12    31.908 ±  0.020  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathNullYoungLarge     avgt   12  2052.188 ±  2.993  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathNullYoungSmall     avgt   12    31.923 ±  0.127  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathOldToYoungLarge    avgt   12  2648.758 ± 12.689  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathOldToYoungSmall    avgt   12    41.843 ±  6.851  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathRealLarge          avgt   12  1860.052 ± 41.707  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathRealSmall          avgt   12    29.635 ±  0.026  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathYoungToOldLarge    avgt   12  2647.011 ±  3.035  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathYoungToOldSmall    avgt   12    40.217 ±  0.053  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathYoungToYoungLarge  avgt   12  1838.099 ± 11.536  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathYoungToYoungSmall  avgt   12    29.637 ±  0.031  ns/op
WriteBarrier.WithDefaultUnrolling.testFieldWriteBarrierFastPath                   avgt   12     1.694 ±  0.001  ns/op
WriteBarrier.WithDefaultUnrolling.testFieldWriteBarrierFastPathYoungRef           avgt   12     2.709 ±  0.001  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathNullLarge              avgt   12  2245.868 ±  1.523  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathNullSmall              avgt   12    36.056 ±  0.008  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathNullYoungLarge         avgt   12  2247.127 ±  7.293  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathNullYoungSmall         avgt   12    36.046 ±  0.012  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathOldToYoungLarge        avgt   12  2812.237 ± 32.421  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathOldToYoungSmall        avgt   12    44.899 ±  0.258  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathRealLarge              avgt   12  2251.210 ± 18.101  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathRealSmall              avgt   12    36.018 ±  0.011  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathYoungToOldLarge        avgt   12  2821.869 ± 32.633  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathYoungToOldSmall        avgt   12    44.800 ±  0.018  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathYoungToYoungLarge      avgt   12  2247.837 ± 14.136  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathYoungToYoungSmall      avgt   12    36.021 ±  0.015  ns/op
WriteBarrier.WithoutUnrolling.testFieldWriteBarrierFastPath                       avgt   12     1.694 ±  0.001  ns/op
WriteBarrier.WithoutUnrolling.testFieldWriteBarrierFastPathYoungRef               avgt   12     2.710 ±  0.001  ns/op

# ----- Patched
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathNullLarge          avgt   12  2058.748 ± 11.193  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathNullSmall          avgt   12    31.943 ±  0.031  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathNullYoungLarge     avgt   12  2052.097 ±  1.134  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathNullYoungSmall     avgt   12    31.927 ±  0.021  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathOldToYoungLarge    avgt   12  2661.495 ± 36.916  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathOldToYoungSmall    avgt   12    40.327 ±  0.463  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathRealLarge          avgt   12  1841.228 ±  7.491  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathRealSmall          avgt   12    29.644 ±  0.021  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathYoungToOldLarge    avgt   12  2671.222 ± 45.797  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathYoungToOldSmall    avgt   12    40.214 ±  0.073  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathYoungToYoungLarge  avgt   12  1833.984 ±  9.946  ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathYoungToYoungSmall  avgt   12    29.635 ±  0.070  ns/op
WriteBarrier.WithDefaultUnrolling.testFieldWriteBarrierFastPath                   avgt   12     1.694 ±  0.001  ns/op
WriteBarrier.WithDefaultUnrolling.testFieldWriteBarrierFastPathYoungRef           avgt   12     2.710 ±  0.001  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathNullLarge              avgt   12  2244.271 ± 37.550  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathNullSmall              avgt   12    36.044 ±  0.006  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathNullYoungLarge         avgt   12  2245.466 ± 18.204  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathNullYoungSmall         avgt   12    36.036 ±  0.009  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathOldToYoungLarge        avgt   12  2811.951 ± 26.061  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathOldToYoungSmall        avgt   12    44.692 ±  0.041  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathRealLarge              avgt   12  2241.369 ±  0.614  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathRealSmall              avgt   12    36.019 ±  0.014  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathYoungToOldLarge        avgt   12  2827.016 ± 43.966  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathYoungToOldSmall        avgt   12    44.700 ±  0.060  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathYoungToYoungLarge      avgt   12  2242.395 ±  5.700  ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathYoungToYoungSmall      avgt   12    36.018 ±  0.010  ns/op
WriteBarrier.WithoutUnrolling.testFieldWriteBarrierFastPath                       avgt   12     1.693 ±  0.001  ns/op
WriteBarrier.WithoutUnrolling.testFieldWriteBarrierFastPathYoungRef               avgt   12     2.710 ±  0.001  ns/op

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28446#issuecomment-3562839053

From eastigeevich at openjdk.org  Fri Nov 21 13:10:55 2025
From: eastigeevich at openjdk.org (Evgeny Astigeevich)
Date: Fri, 21 Nov 2025 13:10:55 GMT
Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance
In-Reply-To:
References:
Message-ID:

On Thu, 20 Nov 2025 15:22:23 GMT, Axel Boldt-Christmas wrote:

>> Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1.
>>
>> Neoverse-N1 implementations mitigate erratum 1542419 with a workaround:
>> - Disable coherent icache.
>> - Trap IC IVAU instructions.
>> - Execute:
>>   - `tlbi vae3is, xzr`
>>   - `dsb sy`
>>
>> `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address). It waits for all memory accesses using in-scope old translation information to complete before it is considered complete.
>>
>> As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests:
>>
>> "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround."
>>
>> This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions.
>>
>> Changes include:
>>
>> * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum.
>>   The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization.
>> * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address.
>> * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures.
>> * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk.
>>
>> Benchmarking results: Neoverse-N1 r3p1 (Graviton 2)
>>
>> - Baseline
>>
>> $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1...
>
> I think the implementation is fine. We can always extend it later if we find that other platforms or hardware needs this sort of treatment.
>
> My knowledge and experience with arm hardware implementation specifics are rather lacking. So I cannot comment on the validity of the assertions made here w.r.t. only invalidating the first instruction in the nmethod etc.
>
> Hopefully some of our resident arm experts can chime in.

@xmas92

> The added microbenchmark shows interesting regressions when an nmethod has no accesses to object's fields:
>
> ```
> Benchmark                             Score    Error  Units
> GCPatchingNmethodCost.fullGC:base    73.937 ± 17.764  ms/op
> GCPatchingNmethodCost.systemGC:base  77.495 ± 11.963  ms/op
> GCPatchingNmethodCost.youngGC:base    9.955 ±  1.649  ms/op
> GCPatchingNmethodCost.fullGC:fix     88.865 ± 19.299  ms/op  +20.1%
> GCPatchingNmethodCost.systemGC:fix   90.572 ± 14.750  ms/op  +16.9%
> GCPatchingNmethodCost.youngGC:fix    10.219 ±  0.877  ms/op   +2.7%
> ```

I think I might have an idea what causes the regressions. I'll be debugging it.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3562960050

From coleenp at openjdk.org  Fri Nov 21 14:09:55 2025
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Fri, 21 Nov 2025 14:09:55 GMT
Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v3]
In-Reply-To: <_C1_-yzeixcKbR2NfmnM4MEl3InsR6cTTzmoT-vMSBY=.032aae46-e951-4c76-91e6-fc7a8fe8b73c@github.com>
References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> <_C1_-yzeixcKbR2NfmnM4MEl3InsR6cTTzmoT-vMSBY=.032aae46-e951-4c76-91e6-fc7a8fe8b73c@github.com>
Message-ID: <_1-urAH6nhGRC5fXZnBvC60QvUAA7KA3ekz5sRD9MpQ=.edd7e8df-5459-4b89-a02d-5da88ce76c59@github.com>

On Thu, 20 Nov 2025 23:33:30 GMT, Vladimir Ivanov wrote:

>> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Revert a couple more InstanceKlass::casts also to get GHA to restart.
>
> src/hotspot/share/opto/compile.cpp line 1729:
>
>> 1727:       if (flat->offset() == in_bytes(Klass::super_check_offset_offset()))
>> 1728:         alias_type(idx)->set_rewritable(false);
>> 1729:       if (flat->isa_instklassptr() && flat->offset() == in_bytes(InstanceKlass::access_flags_offset()))
>
> I'd place the check separately. Otherwise, looks good.
> > diff --git a/src/hotspot/share/opto/compile.cpp b/src/hotspot/share/opto/compile.cpp > index 6babc13e1b3..9215c0fc03f 100644 > --- a/src/hotspot/share/opto/compile.cpp > +++ b/src/hotspot/share/opto/compile.cpp > @@ -1726,8 +1726,6 @@ Compile::AliasType* Compile::find_alias_type(const TypePtr* adr_type, bool no_cr > } > if (flat->offset() == in_bytes(Klass::super_check_offset_offset())) > alias_type(idx)->set_rewritable(false); > - if (flat->offset() == in_bytes(Klass::access_flags_offset())) > - alias_type(idx)->set_rewritable(false); > if (flat->offset() == in_bytes(Klass::misc_flags_offset())) > alias_type(idx)->set_rewritable(false); > if (flat->offset() == in_bytes(Klass::java_mirror_offset())) > @@ -1735,6 +1733,11 @@ Compile::AliasType* Compile::find_alias_type(const TypePtr* adr_type, bool no_cr > if (flat->offset() == in_bytes(Klass::secondary_super_cache_offset())) > alias_type(idx)->set_rewritable(false); > } > + if (flat->isa_instklassptr()) { > + if (flat->offset() == in_bytes(InstanceKlass::access_flags_offset())) { > + alias_type(idx)->set_rewritable(false); > + } > + } > // %%% (We would like to finalize JavaThread::threadObj_offset(), > // but the base pointer type is not distinctive enough to identify > // references into JavaThread.) Yes that looks better. There aren't enough {} in that bit of code but I won't add more to existing code. Thanks for your help with the C2 code. 
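
For readers following the restructuring discussed above, a minimal standalone sketch of the control flow may help. It does not use HotSpot's `TypePtr` or `Compile::AliasType`: the names `FlatPtr`, `KlassLayout`, and `is_non_rewritable`, and every offset value, are hypothetical stand-ins chosen for illustration, and it assumes the enclosing block in `compile.cpp` tests for a klass pointer. The only point is that the generic `Klass` offsets are checked for any klass pointer, while the access-flags offset is checked in its own block, for instance-klass pointers only.

```cpp
#include <cassert>

// Stand-in for the flattened address type; NOT HotSpot's TypePtr.
struct FlatPtr {
  bool is_klass_ptr;       // models flat->isa_klassptr() (assumed outer test)
  bool is_inst_klass_ptr;  // models flat->isa_instklassptr()
  int  offset;             // models flat->offset()
};

// Hypothetical field offsets; the real ones come from Klass/InstanceKlass.
struct KlassLayout {
  int super_check_offset;
  int misc_flags;
  int java_mirror;
  int secondary_super_cache;
  int inst_access_flags;   // access flags, now an InstanceKlass field
};

// Returns true where the real code would call
// alias_type(idx)->set_rewritable(false).
inline bool is_non_rewritable(const FlatPtr& flat, const KlassLayout& l) {
  assert(flat.offset >= 0);
  if (flat.is_klass_ptr) {
    if (flat.offset == l.super_check_offset)    return true;
    if (flat.offset == l.misc_flags)            return true;
    if (flat.offset == l.java_mirror)           return true;
    if (flat.offset == l.secondary_super_cache) return true;
  }
  // Separate block: only instance klasses carry access flags.
  if (flat.is_inst_klass_ptr) {
    if (flat.offset == l.inst_access_flags) return true;
  }
  return false;
}
```

With this shape, an address at the access-flags offset is only treated as non-rewritable when the pointer is known to be an instance klass, which matches the intent stated in the PR description that `ArrayKlass` does not set access flags.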
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2549873241 From ayang at openjdk.org Fri Nov 21 14:23:15 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 21 Nov 2025 14:23:15 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v3] In-Reply-To: References: Message-ID: <6GM7vtD34m6llQY1qgeXuLxEonBmLaQWIBJYRgu-dzk=.bcf2a835-1dab-4108-8d74-1779b947990e@github.com> On Fri, 21 Nov 2025 10:07:35 GMT, Aleksey Shipilev wrote: >> We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. >> >> The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier1` >> - [ ] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Make some backward branches explicitly short Marked as reviewed by ayang (Reviewer). src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 208: > 206: > 207: // Jump out if done, or fall-through to runtime. > 208: // "Done" is far away, so jump cannot be short. I believe "Done" refers to `L_done`, so I wonder if we use that directly. 
------------- PR Review: https://git.openjdk.org/jdk/pull/28446#pullrequestreview-3492959775 PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2549912395 From coleenp at openjdk.org Fri Nov 21 14:53:03 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 21 Nov 2025 14:53:03 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v4] In-Reply-To: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: > ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details. > Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August). > Tested with tier1-4. 5-7 in progress. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Reformatting compile.cpp ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28371/files - new: https://git.openjdk.org/jdk/pull/28371/files/1060463b..06d6a186 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28371&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28371&range=02-03 Stats: 8 lines in 1 file changed: 6 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28371.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28371/head:pull/28371 PR: https://git.openjdk.org/jdk/pull/28371 From mbaesken at openjdk.org Fri Nov 21 15:28:24 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 21 Nov 2025 15:28:24 GMT Subject: RFR: 8372348: Adjust some UL / JFR string deduplication output messages Message-ID: <7-XRp229KFw3V2bFPMnaWaoPdF3ZCYVNuViEo2O7eUI=.e0247347-188b-44bf-935c-6b0186026fd3@github.com> There is some UL output in the string deduplication code that is not very clear and has room for improvement. 
The number of inspected strings should be shown, and the text for the new unknown strings is changed.
(The description of the new JFR string dedup event is also slightly adjusted.)

-------------

Commit messages:
 - JDK-8372348

Changes: https://git.openjdk.org/jdk/pull/28455/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28455&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8372348
  Stats: 13 lines in 2 files changed: 2 ins; 0 del; 11 mod
  Patch: https://git.openjdk.org/jdk/pull/28455.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28455/head:pull/28455

PR: https://git.openjdk.org/jdk/pull/28455

From shade at openjdk.org  Fri Nov 21 16:09:20 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Fri, 21 Nov 2025 16:09:20 GMT
Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v3]
In-Reply-To: <6GM7vtD34m6llQY1qgeXuLxEonBmLaQWIBJYRgu-dzk=.bcf2a835-1dab-4108-8d74-1779b947990e@github.com>
References: <6GM7vtD34m6llQY1qgeXuLxEonBmLaQWIBJYRgu-dzk=.bcf2a835-1dab-4108-8d74-1779b947990e@github.com>
Message-ID:

On Fri, 21 Nov 2025 14:20:06 GMT, Albert Mingkun Yang wrote:

>> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Make some backward branches explicitly short
>
> src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 208:
>
>> 206:
>> 207:   // Jump out if done, or fall-through to runtime.
>> 208:   // "Done" is far away, so jump cannot be short.
>
> I believe "Done" refers to `L_done`, so I wonder if we use that directly.

Yes, it is about `L_done`. Fixed the comment.
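
As background for the "jump cannot be short" remark, here is a small sketch of the encoding constraint involved. On x86, the short forms of `jmp`/`jcc` carry a signed 8-bit displacement measured from the end of the branch instruction, so the target has to lie within -128..+127 bytes of that point. The helper name `fits_short_branch` and the plain integer addresses are hypothetical; this is not how HotSpot's assembler performs the check, only an illustration of the range rule.

```cpp
#include <cassert>
#include <cstdint>

// True if a branch whose encoding ends at `next_insn` can reach `target`
// with a signed 8-bit displacement (the x86 "short" jmp/jcc form).
// Addresses are plain integers here purely for illustration.
inline bool fits_short_branch(std::int64_t next_insn, std::int64_t target) {
  const std::int64_t disp = target - next_insn;
  return disp >= -128 && disp <= 127;
}
```

A label bound only a few instructions away passes this test, whereas a label bound after a runtime call sequence typically does not, which is why such a jump keeps the longer 32-bit displacement form.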
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2550250301 From shade at openjdk.org Fri Nov 21 16:09:17 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 21 Nov 2025 16:09:17 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v4] In-Reply-To: References: Message-ID: > We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. > > The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier1` > - [ ] Linux x86_64 server fastdebug, `all` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 10 additional commits since the last revision: - Adjust label name - Merge branch 'master' into JDK-8372285-g1-barrier-micro - Make some backward branches explicitly short - Comment - Shorten a few more branches - Also reflow generate_pre_barrier_slow_path, so it is obvious the branches are short - More touchups - Also optimize queue insertion - Touchups - WIP ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28446/files - new: https://git.openjdk.org/jdk/pull/28446/files/1f57d0d9..c23bac46 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28446&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28446&range=02-03 Stats: 1409 lines in 83 files changed: 622 ins; 421 del; 366 mod Patch: https://git.openjdk.org/jdk/pull/28446.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28446/head:pull/28446 PR: https://git.openjdk.org/jdk/pull/28446 From aph at openjdk.org Fri Nov 21 16:20:58 2025 From: aph at openjdk.org (Andrew Haley) Date: Fri, 21 Nov 2025 16:20:58 GMT Subject: RFR: 8365991: AArch64: Ignore BlockZeroingLowLimit when UseBlockZeroing is false [v7] In-Reply-To: References: Message-ID: On Fri, 17 Oct 2025 04:19:42 GMT, Patrick Zhang wrote: >> Issue: >> In AArch64 port, `UseBlockZeroing` is by default set to true and `BlockZeroingLowLimit` is initialized to 256. If `DC ZVA` is supported, `BlockZeroingLowLimit` is later updated to `4 * VM_Version::zva_length()`. When `UseBlockZeroing` is set to false, all related conditional checks should ignore `BlockZeroingLowLimit`. However, the function `MacroAssembler::zero_words(Register base, uint64_t cnt)` still evaluates the lower limit and bases its code generation logic on it, which seems to be an incomplete conditional check. >> >> This PR: >> 1. Reset `BlockZeroingLowLimit` to `4 * VM_Version::zva_length()` or 256 with a warning message if it was manually configured from the default while `UseBlockZeroing` is disabled. >> 2. 
Added necessary comments in `MacroAssembler::zero_words(Register base, uint64_t cnt)` and `MacroAssembler::zero_words(Register ptr, Register cnt)` to explain why we do not check `UseBlockZeroing` in the outer part of these functions. Instead, the decision is delegated to the stub function `zero_blocks`, which encapsulates the DC ZVA instructions and serves as the inner implementation of `zero_words`. This approach helps better control the increase in code cache size during array or object instance initialization. >> 3. Added more testing sizes to `test/micro/org/openjdk/bench/vm/gc/RawAllocationRate.java` to better cover scenarios involving smaller arrays and objects.. >> >> Tests: >> 1. Performance tests on the bundled JMH `vm.compiler.ClearMemory`, and `vm.gc.RawAllocationRate` (including `arrayTest` and `instanceTest`) showed no obvious regression. Negative tests with `jdk/bin/java -jar images/test/micro/benchmarks.jar RawAllocationRate.arrayTest_C1 -bm thrpt -gc false -wi 0 -w 30 -i 1 -r 30 -t 1 -f 1 -tu s -jvmArgs "-XX:-UseBlockZeroing -XX:BlockZeroingLowLimit=8" -p size=32` demonstrated good wall times on `zero_words_reg_imm` calls, as expected. >> 2. Jtreg ter1 test on Ampere Altra, AmpereOne, Graviton2 and 3, tier2 on Altra. No new issues found. Passed tests of GHA Sanity Checks. > > Patrick Zhang has updated the pull request incrementally with one additional commit since the last revision: > > Refine the count types to pass mac and win builds > > Signed-off-by: Patrick Zhang src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 6207: > 6205: assert(ptr == r10 && cnt == r11, "mismatch in register usage"); > 6206: RuntimeAddress zero_blocks = RuntimeAddress(StubRoutines::aarch64::zero_blocks()); > 6207: assert(zero_blocks.target() != nullptr, "zero_blocks stub has not been generated"); What is the point of this change? 
src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 6236: > 6234: // The zero_blocks routine has already performed the necessary > 6235: // adjustments to r10 and r11, ensuring they are correctly set > 6236: // for subsequent processing. Suggestion: // A few words remain. zero_blocks() has adjusted r10 so that it // points to the remaining words and adjusted the count in r11. src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 6268: > 6266: // and necessary unrolled str/stp expanding when the condition is not met. > 6267: // This approach also helps prevent sudden increases in code cache size > 6268: // when zeroing large memory areas in many places. Suggestion: // There is no need to check UseBlockZeroing here because that is // delegated to the zero_blocks stub. The code here is inlined, so // it is important to keep it small. src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 6273: > 6271: result = zero_words(r10, r11); > 6272: } else { > 6273: #ifndef PRODUCT What is this change for? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26917#discussion_r2550284013 PR Review Comment: https://git.openjdk.org/jdk/pull/26917#discussion_r2550286280 PR Review Comment: https://git.openjdk.org/jdk/pull/26917#discussion_r2550291161 PR Review Comment: https://git.openjdk.org/jdk/pull/26917#discussion_r2550292504 From aph at openjdk.org Fri Nov 21 16:26:16 2025 From: aph at openjdk.org (Andrew Haley) Date: Fri, 21 Nov 2025 16:26:16 GMT Subject: RFR: 8365991: AArch64: Ignore BlockZeroingLowLimit when UseBlockZeroing is false [v7] In-Reply-To: References: Message-ID: <17Am3maOIHe82gwhbVA8pSeWT5v9U8v-7d_VYHurskk=.1e51f71c-0640-4408-a96a-97ae8a5904e5@github.com> On Fri, 17 Oct 2025 04:19:42 GMT, Patrick Zhang wrote: >> Issue: >> In AArch64 port, `UseBlockZeroing` is by default set to true and `BlockZeroingLowLimit` is initialized to 256. If `DC ZVA` is supported, `BlockZeroingLowLimit` is later updated to `4 * VM_Version::zva_length()`. 
When `UseBlockZeroing` is set to false, all related conditional checks should ignore `BlockZeroingLowLimit`. However, the function `MacroAssembler::zero_words(Register base, uint64_t cnt)` still evaluates the lower limit and bases its code generation logic on it, which seems to be an incomplete conditional check. >> >> This PR: >> 1. Reset `BlockZeroingLowLimit` to `4 * VM_Version::zva_length()` or 256 with a warning message if it was manually configured from the default while `UseBlockZeroing` is disabled. >> 2. Added necessary comments in `MacroAssembler::zero_words(Register base, uint64_t cnt)` and `MacroAssembler::zero_words(Register ptr, Register cnt)` to explain why we do not check `UseBlockZeroing` in the outer part of these functions. Instead, the decision is delegated to the stub function `zero_blocks`, which encapsulates the DC ZVA instructions and serves as the inner implementation of `zero_words`. This approach helps better control the increase in code cache size during array or object instance initialization. >> 3. Added more testing sizes to `test/micro/org/openjdk/bench/vm/gc/RawAllocationRate.java` to better cover scenarios involving smaller arrays and objects.. >> >> Tests: >> 1. Performance tests on the bundled JMH `vm.compiler.ClearMemory`, and `vm.gc.RawAllocationRate` (including `arrayTest` and `instanceTest`) showed no obvious regression. Negative tests with `jdk/bin/java -jar images/test/micro/benchmarks.jar RawAllocationRate.arrayTest_C1 -bm thrpt -gc false -wi 0 -w 30 -i 1 -r 30 -t 1 -f 1 -tu s -jvmArgs "-XX:-UseBlockZeroing -XX:BlockZeroingLowLimit=8" -p size=32` demonstrated good wall times on `zero_words_reg_imm` calls, as expected. >> 2. Jtreg ter1 test on Ampere Altra, AmpereOne, Graviton2 and 3, tier2 on Altra. No new issues found. Passed tests of GHA Sanity Checks. 
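Item 1 of the quoted description (resetting a user-supplied `BlockZeroingLowLimit` when `UseBlockZeroing` is off) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the PR: `ZeroingFlags`, `default_low_limit` and `normalize_low_limit` are hypothetical names, the warning text is invented, and a `zva_length` of 0 is assumed to mean that `DC ZVA` is unavailable.

```cpp
#include <cstdio>

// Hypothetical mirror of the relevant flag state; the field names are
// assumptions made for this sketch and are not HotSpot globals.
struct ZeroingFlags {
  bool use_block_zeroing;        // stands in for UseBlockZeroing
  bool low_limit_set_by_user;    // BlockZeroingLowLimit given on the command line
  int  block_zeroing_low_limit;  // stands in for BlockZeroingLowLimit, in bytes
};

// Default described in the PR text: 4 * zva_length when DC ZVA is
// supported, otherwise 256. zva_length == 0 means "not supported" here.
inline int default_low_limit(int zva_length) {
  return zva_length > 0 ? 4 * zva_length : 256;
}

// If block zeroing is disabled but the user changed the limit, warn and
// fall back to the default. Returns true when the user value was overridden.
inline bool normalize_low_limit(ZeroingFlags& f, int zva_length) {
  if (!f.use_block_zeroing && f.low_limit_set_by_user) {
    std::fprintf(stderr,
                 "warning: BlockZeroingLowLimit has no effect when "
                 "UseBlockZeroing is disabled; resetting it\n");
    f.block_zeroing_low_limit = default_low_limit(zva_length);
    return true;
  }
  return false;
}
```

With `-XX:-UseBlockZeroing -XX:BlockZeroingLowLimit=8` and a 64-byte ZVA block, this sketch would warn and leave the limit at 256, so later code generation never sees the ignored user value.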
> > Patrick Zhang has updated the pull request incrementally with one additional commit since the last revision: > > Refine the count types to pass mac and win builds > > Signed-off-by: Patrick Zhang src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 6281: > 6279: #endif > 6280: // Use 16 words as the block size which is 128 bytes on 64-bit systems. > 6281: // A complete loop body will be 8 STPs unrolled there. Suggestion: // Use 16 words (128 bytes) as the block size. src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 6282: > 6280: // Use 16 words as the block size which is 128 bytes on 64-bit systems. > 6281: // A complete loop body will be 8 STPs unrolled there. > 6282: const int block_size = 16; Naming this constant `block_size` only adds to any confusion, IMO. src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 694: > 692: // Process words with length exceeding the predefined > 693: // block size threshold. The loop body will be unrolled based on > 694: // the number of STPs calculated below. Suggestion: // Process any remaining blocks not handled by the stub. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26917#discussion_r2550303883 PR Review Comment: https://git.openjdk.org/jdk/pull/26917#discussion_r2550306382 PR Review Comment: https://git.openjdk.org/jdk/pull/26917#discussion_r2550310021 From vpaprotski at openjdk.org Fri Nov 21 17:17:53 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Fri, 21 Nov 2025 17:17:53 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v2] In-Reply-To: References: <-NP71XXG0bisxVHds8O-uXhLZqbnVLijJoJDwVq2ZBk=.2478c442-fc34-4ba0-9811-1f910ee3ee36@github.com> Message-ID: <0zIQmkXqAv1UktDyJ4wh83qqB7FGS9bM80Z3562IuHs=.f1499d7d-b025-4cf4-b7a9-b9436d0f9ab3@github.com> On Thu, 20 Nov 2025 23:39:05 GMT, Vladimir Ivanov wrote: >>> I understand your reasons. The question is whether you'll need the microbenchmark in the future. 
If no (or probably no), please remove the micro. If needed, please move it from the "org.openjdk.bench.javax.crypto.full" package to "org.openjdk.bench.javax.crypto". It is supposed to have only public API micros in packages "small" and "full" >> >> @kuksenko I decided to just remove it. If anyone wants it back, its in my git history (I usually keep my branches after merge..) > >> If anyone wants it back, its in my git history (I usually keep my branches after merge..) > > You could put a comment with the link into JBS issue to make it easier to discover later. (Or just attach the source file there.) @iwanowww thanks for the suggestion! attached to JBS. @mcpowers would you mind running your internal test suite for this PR? I am thinking of integrating early next week, if no objections; getting close to the release deadline, dont want to cut it even closer.. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3563948572 From ayang at openjdk.org Fri Nov 21 18:08:01 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 21 Nov 2025 18:08:01 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v4] In-Reply-To: References: Message-ID: <5Kq7vQWy6cVryk4TSboKZYFuf8vSZxBFk0sltlmUNvk=.abff54bb-3a68-4775-b369-50d260418faf@github.com> On Fri, 21 Nov 2025 16:09:17 GMT, Aleksey Shipilev wrote: >> We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. >> >> The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. 
>> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier1` >> - [ ] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: > > - Adjust label name > - Merge branch 'master' into JDK-8372285-g1-barrier-micro > - Make some backward branches explicitly short > - Comment > - Shorten a few more branches > - Also reflow generate_pre_barrier_slow_path, so it is obvious the branches are short > - More touchups > - Also optimize queue insertion > - Touchups > - WIP Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28446#pullrequestreview-3493844075 From kvn at openjdk.org Fri Nov 21 18:33:00 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 21 Nov 2025 18:33:00 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v4] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 16:09:17 GMT, Aleksey Shipilev wrote: >> We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. >> >> The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier1` >> - [ ] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: > > - Adjust label name > - Merge branch 'master' into JDK-8372285-g1-barrier-micro > - Make some backward branches explicitly short > - Comment > - Shorten a few more branches > - Also reflow generate_pre_barrier_slow_path, so it is obvious the branches are short > - More touchups > - Also optimize queue insertion > - Touchups > - WIP Comments. src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 92: > 90: void G1BarrierSetAssembler::gen_write_ref_array_post_barrier(MacroAssembler* masm, DecoratorSet decorators, > 91: Register addr, Register count, Register tmp) { > 92: Label done; Since you are touching this code can you add `L_` to labels in this code? This is our usual practice for labels to clear see them. src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 193: > 191: // Is the previous value null? > 192: __ testptr(pre_val, pre_val); > 193: __ jccb(Assembler::equal, L_null); I know that this short jump will be fused to one instruction with testptr on modern x86. But you will have jump-to-jump sequence. So you may win size wise but "throughput" could be worser. Especially if it is "fast" path. Can you check performance of these changes vs using `jcc(Assembler::equal, L_done);` here. src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 282: > 280: Register thread = r15_thread; > 281: > 282: Label done; Please use `L_done`. 
------------- PR Review: https://git.openjdk.org/jdk/pull/28446#pullrequestreview-3493923593 PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2550629465 PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2550627841 PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2550630291 From sspitsyn at openjdk.org Fri Nov 21 19:45:04 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 21 Nov 2025 19:45:04 GMT Subject: RFR: 6960970: Debugger very slow during stepping [v3] In-Reply-To: References: Message-ID: > This change fixes a long standing performance issue related to the debugger single stepping that is using JVMTI `FramePop` events as a part of step over handling. The performance issue is that the target thread continues its execution in very slow `interp-only` mode in a context of frame marked for `FramePop` notification with the JVMTI `NotifyFramePop`. It includes other method calls recursively upon a return from the frame. > > This fix is to avoid enabling the `interp-only` mode for threads when `FramePop` events are enabled with JVMTI `SetEventNotificationMode`. Instead, the target frame has been deoptimized and kept interpreted by disabling `OSR` in the function `InterpreterRuntime::frequency_counter_overflow_inner()`. Additionally, some tweaks are applied in several places where the `java_thread->is_interp_only_mode()` is checked. > The other details will be provided in the first PR request comment. > It is considered to file a SCR for this update a `FramePop` events do not enforce the `interp-only` mode for a target thread anymore which might break some expectations (the behavior has been changed). 
> > Testing: > - test `serviceability/jvmti/vthread/ThreadStateTest` was updated to provide some extra test coverage > - submitted mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: fix typo in a EATests.java comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28407/files - new: https://git.openjdk.org/jdk/pull/28407/files/b3cffe5a..5989906c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28407&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28407&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28407.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28407/head:pull/28407 PR: https://git.openjdk.org/jdk/pull/28407 From sspitsyn at openjdk.org Fri Nov 21 19:45:08 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 21 Nov 2025 19:45:08 GMT Subject: RFR: 6960970: Debugger very slow during stepping [v2] In-Reply-To: <-6UQvhvzWxm9r6rtvS4EjiVn-to2xAa64MJi-9_-zss=.fe61f491-31d9-4e75-932a-8e9a9f6a45d1@github.com> References: <-6UQvhvzWxm9r6rtvS4EjiVn-to2xAa64MJi-9_-zss=.fe61f491-31d9-4e75-932a-8e9a9f6a45d1@github.com> Message-ID: On Fri, 21 Nov 2025 07:43:04 GMT, Jean-Noël Rouvignac wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> cleanup: removed an old code fragment in frame.cpp > > test/jdk/com/sun/jdi/EATests.java line 3068: > >> 3066: // frame[4]: EATestsTarget.main(java.lang.String[]) >> 3067: >> 3068: env.stepOverLine(thread); // needed to keep target thread interp-only, so dontinline_brkpt_iret is not inligned > > inligned => inlined Thank you for catching it! Fixed now.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28407#discussion_r2550785711 From dlong at openjdk.org Fri Nov 21 19:57:58 2025 From: dlong at openjdk.org (Dean Long) Date: Fri, 21 Nov 2025 19:57:58 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 [v4] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 23:56:48 GMT, Ruben wrote: >> The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. >> >> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. >> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > > Ruben has updated the pull request incrementally with one additional commit since the last revision: > > Refine `first_check_size` definitions CI testing results are good. ------------- Marked as reviewed by dlong (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28192#pullrequestreview-3494168899 From kvn at openjdk.org Fri Nov 21 20:34:54 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 21 Nov 2025 20:34:54 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 17:32:22 GMT, Chad Rakoczy wrote: > [JDK-8371046](https://bugs.openjdk.org/browse/JDK-8371046) > > This pull request fixes two crashes (see below) and adds `InvalidationReason::RELOCATED` to better describe why an nmethod is marked not entrant during relocation. > > --- > > #### 1. Test Bug > > It's possible for an `nmethod` to be unloaded without its `_state` being explicitly set to `not_entrant`.
Checking only `is_in_use()` isn't sufficient, since the `nmethod` may already be in the process of unloading and therefore may not have a lock (as with ZGC, where `nmethods` are locked individually). > > The fix adds an additional `is_unloading()` check in WhiteBox before acquiring the lock. > > This issue was reproducible fairly consistently (every few runs) by executing `compiler/whitebox/StressNMethodRelocation.java` with `-XX:+UseZGC -XX:ReservedCodeCacheSize=32m` > > > After applying this patch, the original crash stopped occurring, though a more infrequent crash was still observed. > > --- > > #### 2. Implementation Bug > > `nmethod::relocate` works by copying the instructions of an `nmethod` and then adjusting the call sites to account for new PC-relative offsets. > > Previously, this fix-up happened *after* calling `post_init()`, which registers the `nmethod` and makes it visible to the GC. This introduced a race condition where the GC might attempt to resolve a call site before it had been fixed. > > The fix ensures that all call sites are patched **before** the `nmethod` is registered. > > In testing, the crash previously occurred roughly 60 times in 5,000 runs (~1.2%). With this patch, no crashes were observed in the same number of runs. Tobias submitted testing for these changes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28241#issuecomment-3564501490 From kvn at openjdk.org Fri Nov 21 20:47:01 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 21 Nov 2025 20:47:01 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 17:32:22 GMT, Chad Rakoczy wrote: > [JDK-8371046](https://bugs.openjdk.org/browse/JDK-8371046) > > This pull request fixes two crashes (see below) and adds `InvalidationReason::RELOCATED` to better describe why an nmethod is marked not entrant during relocation. > > --- > > #### 1.
Test Bug > > It's possible for an `nmethod` to be unloaded without its `_state` being explicitly set to `not_entrant`. Checking only `is_in_use()` isn't sufficient, since the `nmethod` may already be in the process of unloading and therefore may not have a lock (as with ZGC, where `nmethods` are locked individually). > > The fix adds an additional `is_unloading()` check in WhiteBox before acquiring the lock. > > This issue was reproducible fairly consistently (every few runs) by executing `compiler/whitebox/StressNMethodRelocation.java` with `-XX:+UseZGC -XX:ReservedCodeCacheSize=32m` > > > After applying this patch, the original crash stopped occurring, though a more infrequent crash was still observed. > > --- > > #### 2. Implementation Bug > > `nmethod::relocate` works by copying the instructions of an `nmethod` and then adjusting the call sites to account for new PC-relative offsets. > > Previously, this fix-up happened *after* calling `post_init()`, which registers the `nmethod` and makes it visible to the GC. This introduced a race condition where the GC might attempt to resolve a call site before it had been fixed. > > The fix ensures that all call sites are patched **before** the `nmethod` is registered. > > In testing, the crash previously occurred roughly 60 times in 5,000 runs (~1.2%). With this patch, no crashes were observed in the same number of runs. src/hotspot/share/code/nmethod.cpp line 1508: > 1506: #ifdef USE_TRAMPOLINE_STUB_FIX_OWNER > 1507: // Direct calls may no longer be in range and the use of a trampoline may now be required. > 1508: // Instead, allow trampoline relocations to update their owners and perform the necessary checks. `Instead` is wrong word here I think. May be `Otherwise`. Also where you add trampoline in new nmethod's copy if needed? I don't see it in `fix_relocation_after_move()`.
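The ordering described under "Implementation Bug" in the quoted PR text (patch every call site first, register the copy afterwards) can be illustrated with a small stand-alone sketch. Everything in it is hypothetical: `CodeCopy`, `g_published` and `relocate_and_publish` are invented for illustration and are not HotSpot types or functions, and a release store to an atomic pointer merely stands in for whatever registration step makes the copy visible to a concurrent reader.

```cpp
#include <atomic>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a relocatable blob of code. Each entry is the
// PC-relative offset (target - call_site_pc) of one call site.
struct CodeCopy {
  std::vector<int64_t> call_offsets;
};

// Slot through which a concurrent reader discovers new copies. In the PR
// the reader is the GC; here it is simply any thread doing an acquire load.
static std::atomic<CodeCopy*> g_published{nullptr};

// Copies `src`, rebases every PC-relative offset for a move of `delta`
// bytes, and only then publishes the copy. Because the release store comes
// last, a reader that sees the pointer also sees fully patched call sites.
// The caller owns the returned object.
inline CodeCopy* relocate_and_publish(const CodeCopy& src, int64_t delta) {
  CodeCopy* copy = new CodeCopy(src);
  for (int64_t& off : copy->call_offsets) {
    // The call site moved by +delta while the target stayed put, so
    // target - (pc + delta) == old_offset - delta.
    off -= delta;
  }
  g_published.store(copy, std::memory_order_release);
  return copy;
}
```

Swapping the loop and the store reproduces the shape of the race the PR describes: the copy would become visible while some offsets still refer to the old location.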
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28241#discussion_r2550915038 From sviswanathan at openjdk.org Fri Nov 21 22:07:08 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Fri, 21 Nov 2025 22:07:08 GMT Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v2] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 01:13:47 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to fix the performance regression in Arrays.fill() x86 stubs caused by masked AVX stores. The fix is to replace the masked AVX stores with store instructions without masks (i.e. unmasked stores). `fill32_masked()` and `fill64_masked()` stubs are replaced with `fill32_unmasked()` and `fill64_unmasked()` respectively. >> >> To speedup unmasked stores, array fills for sizes < 64 bytes are broken down into sequences of 32B, 16B, 8B, 4B, 2B and 1B stores, depending on the size. >> >> >> ### **Performance comparison for byte array fills in a loop for 1 million times** >> >> >>
>>
>> UseAVX=3 ByteArray Size | +OptimizeFill (Masked store stub) [secs] | -OptimizeFill (No stub) [secs] | --->This PR: +OptimizeFill (Unmasked store stub) [secs]
>> -- | -- | -- | --
>> 1 | 0.46 | 0.14 | 0.263
>> 2 | 0.46 | 0.16 | 0.264
>> 5 | 0.46 | 0.29 | 0.30
>> 10 | 0.46 | 0.58 | 0.32
>> 15 | 0.46 | 0.42 | 0.276
>> 16 | 0.46 | 0.46 | 0.32
>> 17 | 0.21 | 0.5 | 0.3
>> 20 | 0.21 | 0.37 | 0.3
>> 25 | 0.21 | 0.59 | 0.288
>> 31 | 0.21 | 0.53 | 0.284
>> 32 | 0.21 | 0.58 | 0.322
>> 35 | 0.5 | 0.77 | 0.29
>> 40 | 0.5 | 0.61 | 0.367
>> 45 | 0.5 | 0.52 | 0.324
>> 48 | 0.5 | 0.66 | 0.368
>> 49 | 0.22 | 0.69 | 0.342
>> 50 | 0.22 | 0.78 | 0.346
>> 55 | 0.22 | 0.67 | 0.3
>> 60 | 0.22 | 0.67 | 0.322
>> 64 | 0.22 | 0.82 | 0.362
>> 70 | 0.51 | 1.1 | 0.32
>> 80 | 0.49 | 0.89 | 0.37
>> 90 | 0.225 | 0.68 | 0.343
>> 100 | 0.54 | 1.09 | 0.41
>> 110 | 0.6 | 0.98 | 0.36
>> 120 | 0.26 | 0.75 | 0.386
>> 128 | 0.266 | 1.1 | 0.402
>>
    > > Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > undo size check for fill64_masked The pre-submit test seem to be unrelated to the PR changes. A fresh merge with tip might resolve those. src/hotspot/cpu/x86/macroAssembler_x86.cpp line 9245: > 9243: } > 9244: > 9245: void MacroAssembler::fill32_unmasked(uint shift, Register dst, int disp, XMMRegister xmm, This could be called as fill32_tail. Also good to replace overall fill32_masked with fill32_tail. src/hotspot/cpu/x86/macroAssembler_x86.cpp line 9305: > 9303: } > 9304: > 9305: void MacroAssembler::fill64_unmasked(uint shift, Register dst, int disp, This could be called as fill64_tail. Also good to replace overall fill64_masked with fill64_tail. src/hotspot/cpu/x86/macroAssembler_x86.cpp line 9362: > 9360: jcc(Assembler::greater, L_fill_64_bytes); > 9361: fill32_unmasked(shift, to, 0, xtmp, count, rtmp); > 9362: jmp(L_exit); Instead of repeating fill32_unmasked multiple time, you could jmp to say L_fill_32_tail and have the fill32_unmasked code there one time. src/hotspot/cpu/x86/macroAssembler_x86.cpp line 9380: > 9378: bind(L_fill_96_bytes); > 9379: cmpq(count, 96 >> shift); > 9380: jcc(Assembler::greater, L_fill_128_bytes); With the suggestion to have fill32_unmasked and fill64_unmasked one time, you may be able to retain the jccb. src/hotspot/cpu/x86/macroAssembler_x86.cpp line 9383: > 9381: fill64(to, 0, xtmp); > 9382: subq(count, 64 >> shift); > 9383: fill32_unmasked(shift, to, 64, xtmp, count, rtmp); Instead of repeating fill64_unmasked multiple time, you could jmp to say L_fill_64_tail and have the fill64_unmasked code there one time. 
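The decomposition mentioned in the quoted PR description (fills below 64 bytes broken into 32B, 16B, 8B, 4B, 2B and 1B stores) can be pictured with the following sketch. It is one plausible reading written in plain C++ and is not the assembler stub from the PR: `fill_tail_unmasked` is a hypothetical helper, and each `memset` call merely stands in for a single vector or scalar store of that width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not HotSpot code: fills n bytes (n < 64) at dst with
// `value`, issuing at most one store for each width 32, 16, 8, 4, 2 and 1.
// Every set bit in n selects exactly one store width, so the tail is
// covered without a masked store. Returns the number of stores issued.
inline int fill_tail_unmasked(uint8_t* dst, size_t n, uint8_t value) {
  assert(n < 64 && "larger fills are assumed to be handled elsewhere");
  int stores = 0;
  for (size_t width = 32; width >= 1; width >>= 1) {
    if (n & width) {
      std::memset(dst, value, width);  // stands in for one store of `width` bytes
      dst += width;
      ++stores;
    }
  }
  return stores;
}
```

For example, a 45-byte fill becomes stores of 32, 8, 4 and 1 bytes. A single shared routine of this shape is also what the review comments above point toward when they suggest jumping to one `L_fill_32_tail` or `L_fill_64_tail` label instead of repeating the tail code at each call site.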
------------- PR Comment: https://git.openjdk.org/jdk/pull/28442#issuecomment-3564755769 PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2551064898 PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2551066035 PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2551070541 PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2551074095 PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2551071856 From eastigeevich at openjdk.org Fri Nov 21 22:21:52 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Fri, 21 Nov 2025 22:21:52 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v2] In-Reply-To: References: Message-ID: > Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1. > > Neoverse-N1 implementations mitigate erratum 1542419 with a workaround: > - Disable coherent icache. > - Trap IC IVAU instructions. > - Execute: > - `tlbi vae3is, xzr` > - `dsb sy` > > `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address). It waits for all memory accesses using in-scope old translation information to complete before it is considered complete. > > As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests: > > "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround." > > This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. 
The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions. > > Changes include: > > * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization. > * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address. > * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures. > * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk. > > Benchmarking results: Neoverse-N1 r3p1 (Graviton 2) > > - Baseline > > $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGC... 
Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision:

  Use THREAD_LOCAL deferred_icache_invalidation instead of parameter

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/28328/files
  - new: https://git.openjdk.org/jdk/pull/28328/files/18476044..b60317e9

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=00-01

  Stats: 56 lines in 15 files changed: 17 ins; 13 del; 26 mod
  Patch: https://git.openjdk.org/jdk/pull/28328.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28328/head:pull/28328

PR: https://git.openjdk.org/jdk/pull/28328

From duke at openjdk.org Fri Nov 21 22:25:55 2025
From: duke at openjdk.org (Chad Rakoczy)
Date: Fri, 21 Nov 2025 22:25:55 GMT
Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC
In-Reply-To:
References:
Message-ID:

On Fri, 21 Nov 2025 20:43:33 GMT, Vladimir Kozlov wrote:

>> [JDK-8371046](https://bugs.openjdk.org/browse/JDK-8371046)
>>
>> This pull request fixes two crashes (see below) and adds `InvalidationReason::RELOCATED` to better describe why an nmethod is marked not entrant during relocation.
>>
>> ---
>>
>> #### 1. Test Bug
>>
>> It's possible for an `nmethod` to be unloaded without its `_state` being explicitly set to `not_entrant`. Checking only `is_in_use()` isn't sufficient, since the `nmethod` may already be in the process of unloading and therefore may not have a lock (as with ZGC, where `nmethods` are locked individually).
>>
>> The fix adds an additional `is_unloading()` check in WhiteBox before acquiring the lock.
>>
>> This issue was reproducible fairly consistently (every few runs) by executing `compiler/whitebox/StressNMethodRelocation.java` with `-XX:+UseZGC -XX:ReservedCodeCacheSize=32m`
>>
>>
>> After applying this patch, the original crash stopped occurring, though a more infrequent crash was still observed.
>>
>> ---
>>
>> #### 2. Implementation Bug
>>
>> `nmethod::relocate` works by copying the instructions of an `nmethod` and then adjusting the call sites to account for new PC-relative offsets.
>>
>> Previously, this fix-up happened *after* calling `post_init()`, which registers the `nmethod` and makes it visible to the GC. This introduced a race condition where the GC might attempt to resolve a call site before it had been fixed.
>>
>> The fix ensures that all call sites are patched **before** the `nmethod` is registered.
>>
>> In testing, the crash previously occurred roughly 60 times in 5,000 runs (~1.2%). With this patch, no crashes were observed in the same number of runs.
>
> src/hotspot/share/code/nmethod.cpp line 1508:
>
>> 1506: #ifdef USE_TRAMPOLINE_STUB_FIX_OWNER
>> 1507: // Direct calls may no longer be in range and the use of a trampoline may now be required.
>> 1508: // Instead, allow trampoline relocations to update their owners and perform the necessary checks.
>
> `Instead` is wrong word here I think. May be `Otherwise`.
>
> Also where you add trampoline in new nmethod's copy if needed? I don't see it in `fix_relocation_after_move()`.

We do not add trampolines to the new nmethod if they were not present in the original.

Does this comment better describe the need to do this?

    // A direct call whose destination was within the maximum branch range may now
    // be out of range after the nmethod is moved.
    //
    // CallRelocation::fix_relocation_after_move() does not perform range checks and
    // assumes that the call target is always directly reachable. If we were to call
    // it unconditionally, it could incorrectly rewrite a call site whose target now
    // requires a trampoline, leaving the call out of range.
    //
    // When a call site has an associated trampoline, we skip the normal call
    // relocation here. The corresponding trampoline_stub_Relocation will handle both
    // the call site and the trampoline, including performing the required range
    // checks and updating the call to branch through the trampoline if required.
    //
    // If no trampoline exists for the call, we know the target remains within the
    // direct-branch range and CallRelocation::fix_relocation_after_move() is safe.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28241#discussion_r2551119626

From eastigeevich at openjdk.org Fri Nov 21 23:02:07 2025
From: eastigeevich at openjdk.org (Evgeny Astigeevich)
Date: Fri, 21 Nov 2025 23:02:07 GMT
Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance
In-Reply-To: <6O0YDvGtf8yNNsqgZeZtyJlk6GlGVjXDKwOX-JcUIi4=.6c669dbd-4653-4282-93ef-8129d7c13bdd@github.com>
References: <6O0YDvGtf8yNNsqgZeZtyJlk6GlGVjXDKwOX-JcUIi4=.6c669dbd-4653-4282-93ef-8129d7c13bdd@github.com>
Message-ID:

On Thu, 20 Nov 2025 21:23:16 GMT, Dean Long wrote:

> It seems a little disruptive to have to pass `defer_icache_invalidation` around so much. What about attaching this information to the Thread or using a THREAD_LOCAL?

I switched to a THREAD_LOCAL. Initially it regressed fullGC compared to the version with the parameter:

- Parameter:

    Benchmark                     (accessedFieldCount)  (methodCount)  Mode  Cnt    Score    Error  Units
    GCPatchingNmethodCost.fullGC                     0           5000  avgt    3   88.865 ± 19.299  ms/op
    GCPatchingNmethodCost.fullGC                     2           5000  avgt    3  146.184 ± 11.531  ms/op
    GCPatchingNmethodCost.fullGC                     4           5000  avgt    3  186.429 ± 16.257  ms/op
    GCPatchingNmethodCost.fullGC                     8           5000  avgt    3  262.933 ± 13.071  ms/op

- THREAD_LOCAL

    Benchmark                     (accessedFieldCount)  (methodCount)  Mode  Cnt    Score    Error  Units
    GCPatchingNmethodCost.fullGC                     0           5000  avgt    3   93.899 ± 14.870  ms/op
    GCPatchingNmethodCost.fullGC                     2           5000  avgt    3  152.872 ± 13.566  ms/op
    GCPatchingNmethodCost.fullGC                     4           5000  avgt    3  194.425 ± 37.851  ms/op
    GCPatchingNmethodCost.fullGC                     8           5000  avgt    3  271.826 ± 47.908  ms/op

I found that `ZBarrierSetAssembler::patch_barrier_relocation` is only used when icache invalidation is deferred. I replaced a check of the thread local value with a check of `NeoverseN1Errata1542419`. This restored the performance:

    Benchmark                     (accessedFieldCount)  (methodCount)  Mode  Cnt    Score    Error  Units
    GCPatchingNmethodCost.fullGC                     0           5000  avgt    3   84.919 ± 31.411  ms/op
    GCPatchingNmethodCost.fullGC                     2           5000  avgt    3  141.862 ±  7.026  ms/op
    GCPatchingNmethodCost.fullGC                     4           5000  avgt    3  184.921 ± 46.592  ms/op
    GCPatchingNmethodCost.fullGC                     8           5000  avgt    3  263.897 ± 48.271  ms/op

It might be that accesses to THREAD_LOCAL on Neoverse N1 are expensive. Should I try attaching info to Thread?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3564915607

From sparasa at openjdk.org Fri Nov 21 23:57:07 2025
From: sparasa at openjdk.org (Srinivas Vamsi Parasa)
Date: Fri, 21 Nov 2025 23:57:07 GMT
Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v3]
In-Reply-To:
References:
Message-ID:

> The goal of this PR is to fix the performance regression in Arrays.fill() x86 stubs caused by masked AVX stores. The fix is to replace the masked AVX stores with store instructions without masks (i.e. unmasked stores). `fill32_masked()` and `fill64_masked()` stubs are replaced with `fill32_unmasked()` and `fill64_unmasked()` respectively.
>
> To speedup unmasked stores, array fills for sizes < 64 bytes are broken down into sequences of 32B, 16B, 8B, 4B, 2B and 1B stores, depending on the size.
>
>
> ### **Performance comparison for byte array fills in a loop for 1 million times**
>
>
> UseAVX=3 ByteArray Size | +OptimizeFill (Masked store stub) [secs] | -OptimizeFill (No stub) [secs] | --->This PR: +OptimizeFill (Unmasked store stub) [secs]
> -- | -- | -- | --
> 1 | 0.46 | 0.14 | 0.263
> 2 | 0.46 | 0.16 | 0.264
> 5 | 0.46 | 0.29 | 0.299
> 10 | 0.46 | 0.58 | 0.303
> 15 | 0.46 | 0.42 | 0.271
> 16 | 0.46 | 0.46 | 0.32
> 17 | 0.21 | 0.5 | 0.299
> 20 | 0.21 | 0.37 | 0.299
> 25 | 0.21 | 0.59 | 0.282
> 31 | 0.21 | 0.53 | 0.273
> 32 | 0.21 | 0.58 | 0.199
> 35 | 0.5 | 0.77 | 0.259
> 40 | 0.5 | 0.61 | 0.33
> 45 | 0.5 | 0.52 | 0.281
> 48 | 0.5 | 0.66 | 0.32
> 49 | 0.22 | 0.69 | 0.3
> 50 | 0.22 | 0.78 | 0.3
> 55 | 0.22 | 0.67 | 0.292
> 60 | 0.22 | 0.67 | 0.3293
> 64 | 0.22 | 0.82 | 0.23
> 70 | 0.51 | 1.1 | 0.34
> 80 | 0.49 | 0.89 | 0.365
> 90 | 0.225 | 0.68 | 0.33
> 100 | 0.54 | 1.09 | 0.347
> 110 | 0.6 | 0.98 | 0.36
> 120 | 0.26 | 0.75 | 0.386
> 128 | 0.266 | 1.1 | 0.289

Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:

  refactor code to use fill32_tail at the end of the stub

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/28442/files
  - new: https://git.openjdk.org/jdk/pull/28442/files/ee1db381..1371d556

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=01-02

  Stats: 43 lines in 2 files changed: 20 ins; 9 del; 14 mod
  Patch: https://git.openjdk.org/jdk/pull/28442.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28442/head:pull/28442

PR: https://git.openjdk.org/jdk/pull/28442

From sparasa at openjdk.org Fri Nov 21 23:57:13 2025
From: sparasa at openjdk.org (Srinivas Vamsi Parasa)
Date: Fri, 21 Nov 2025 23:57:13 GMT
Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v2]
In-Reply-To:
References:
Message-ID:

On Fri, 21 Nov 2025 21:56:13 GMT, Sandhya Viswanathan wrote:
>> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: >> >> undo size check for fill64_masked > > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 9245: > >> 9243: } >> 9244: >> 9245: void MacroAssembler::fill32_unmasked(uint shift, Register dst, int disp, XMMRegister xmm, > > This could be called as fill32_tail. Also good to replace overall fill32_masked with fill32_tail. Please see this suggestion incorporated in the updated code. > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 9305: > >> 9303: } >> 9304: >> 9305: void MacroAssembler::fill64_unmasked(uint shift, Register dst, int disp, > > This could be called as fill64_tail. Also good to replace overall fill64_masked with fill64_tail. Please see this suggestion incorporated in the updated code. > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 9362: > >> 9360: jcc(Assembler::greater, L_fill_64_bytes); >> 9361: fill32_unmasked(shift, to, 0, xtmp, count, rtmp); >> 9362: jmp(L_exit); > > Instead of repeating fill32_unmasked multiple time, you could jmp to say L_fill_32_tail and have the fill32_unmasked code there one time. Please see this suggestion incorporated in the updated code. > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 9383: > >> 9381: fill64(to, 0, xtmp); >> 9382: subq(count, 64 >> shift); >> 9383: fill32_unmasked(shift, to, 64, xtmp, count, rtmp); > > Instead of repeating fill64_unmasked multiple time, you could jmp to say L_fill_64_tail and have the fill64_unmasked code there one time. Please see this suggestion incorporated in the updated code. 
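For readers following the thread, the tail decomposition described in the PR summary (fills below 64 bytes split into 32B, 16B, 8B, 4B, 2B and 1B stores) can be sketched in plain C++. This is only an illustration of the idea, not the stub itself: the real code is emitted through `MacroAssembler`, and the helper name `fill_tail_unmasked` and its signature are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not HotSpot code): fill `count` bytes at `dst` with
// `value`, where count < 64, using a descending sequence of fixed-size
// unmasked stores of 32, 16, 8, 4, 2 and 1 bytes.
inline void fill_tail_unmasked(std::uint8_t* dst, std::size_t count, std::uint8_t value) {
  for (std::size_t chunk = 32; chunk != 0; chunk >>= 1) {
    // Each set bit in `count` selects exactly one store of that size.
    if (count & chunk) {
      std::memset(dst, value, chunk);
      dst += chunk;
      count -= chunk;
    }
  }
}
```

Because every size below 64 is a unique sum of these powers of two, at most six stores are issued and none of them needs a mask.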
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2551288005 PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2551288279 PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2551289093 PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2551289710 From sparasa at openjdk.org Sat Nov 22 00:03:47 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Sat, 22 Nov 2025 00:03:47 GMT Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v4] In-Reply-To: References: Message-ID: > The goal of this PR is to fix the performance regression in Arrays.fill() x86 stubs caused by masked AVX stores. The fix is to replace the masked AVX stores with store instructions without masks (i.e. unmasked stores). `fill32_masked()` and `fill64_masked()` stubs are replaced with `fill32_unmasked()` and `fill64_unmasked()` respectively. > > To speedup unmasked stores, array fills for sizes < 64 bytes are broken down into sequences of 32B, 16B, 8B, 4B, 2B and 1B stores, depending on the size. 
>
>
> ### **Performance comparison for byte array fills in a loop for 1 million times**
>
>
> UseAVX=3 ByteArray Size | +OptimizeFill (Masked store stub) [secs] | -OptimizeFill (No stub) [secs] | --->This PR: +OptimizeFill (Unmasked store stub) [secs]
> -- | -- | -- | --
> 1 | 0.46 | 0.14 | 0.263
> 2 | 0.46 | 0.16 | 0.264
> 5 | 0.46 | 0.29 | 0.299
> 10 | 0.46 | 0.58 | 0.303
> 15 | 0.46 | 0.42 | 0.271
> 16 | 0.46 | 0.46 | 0.32
> 17 | 0.21 | 0.5 | 0.299
> 20 | 0.21 | 0.37 | 0.299
> 25 | 0.21 | 0.59 | 0.282
> 31 | 0.21 | 0.53 | 0.273
> 32 | 0.21 | 0.58 | 0.199
> 35 | 0.5 | 0.77 | 0.259
> 40 | 0.5 | 0.61 | 0.33
> 45 | 0.5 | 0.52 | 0.281
> 48 | 0.5 | 0.66 | 0.32
> 49 | 0.22 | 0.69 | 0.3
> 50 | 0.22 | 0.78 | 0.3
> 55 | 0.22 | 0.67 | 0.292
> 60 | 0.22 | 0.67 | 0.3293
> 64 | 0.22 | 0.82 | 0.23
> 70 | 0.51 | 1.1 | 0.34
> 80 | 0.49 | 0.89 | 0.365
> 90 | 0.225 | 0.68 | 0.33
> 100 | 0.54 | 1.09 | 0.347
> 110 | 0.6 | 0.98 | 0.36
> 120 | 0.26 | 0.75 | 0.386
> 128 | 0.266 | 1.1 | 0.289

Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:

  undo jccb to jcc change as needed

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/28442/files
  - new: https://git.openjdk.org/jdk/pull/28442/files/1371d556..57dc6c4a

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=02-03

  Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/28442.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28442/head:pull/28442

PR: https://git.openjdk.org/jdk/pull/28442

From sparasa at openjdk.org Sat Nov 22 00:03:49 2025
From: sparasa at openjdk.org (Srinivas Vamsi Parasa)
Date: Sat, 22 Nov 2025 00:03:49 GMT
Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v2]
In-Reply-To:
References:
Message-ID:

On Fri, 21 Nov 2025 22:00:12 GMT, Sandhya Viswanathan wrote:

>> Srinivas Vamsi Parasa
has updated the pull request incrementally with one additional commit since the last revision: >> >> undo size check for fill64_masked > > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 9380: > >> 9378: bind(L_fill_96_bytes); >> 9379: cmpq(count, 96 >> shift); >> 9380: jcc(Assembler::greater, L_fill_128_bytes); > > With the suggestion to have fill32_unmasked and fill64_unmasked one time, you may be able to retain the jccb. The jccb to jcc was undone as needed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2551311201 From kvn at openjdk.org Sat Nov 22 00:08:48 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 22 Nov 2025 00:08:48 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 22:22:53 GMT, Chad Rakoczy wrote: >> src/hotspot/share/code/nmethod.cpp line 1508: >> >>> 1506: #ifdef USE_TRAMPOLINE_STUB_FIX_OWNER >>> 1507: // Direct calls may no longer be in range and the use of a trampoline may now be required. >>> 1508: // Instead, allow trampoline relocations to update their owners and perform the necessary checks. >> >> `Instead` is wrong word here I think. May be `Otherwise`. >> >> Also where you add trampoline in new nmethod's copy if needed? I don't see it in `fix_relocation_after_move()`. > > We do not add trampolines to the new nmethod if they were not present in the original. > > Does this comment better describe the need to do this? > > // A direct call whose destination was within the maximum branch range may now > // be out of range after the nmethod is moved. > // > // CallRelocation::fix_relocation_after_move() does not perform range checks and > // assumes that the call target is always directly reachable. If we were to call > // it unconditionally, it could incorrectly rewrite a call site whose target now > // requires a trampoline, leaving the call out of range. 
> // > // When a call site has an associated trampoline, we skip the normal call > // relocation here. The corresponding trampoline_stub_Relocation will handle both > // the call site and the trampoline, including performing the required range > // checks and updating the call to branch through the trampoline if required. > // > // If no trampoline exists for the call, we know the target remains within the > // direct-branch range and CallRelocation::fix_relocation_after_move() is safe. Okay, I now get it that the comments try to explain why we need to call fix_relocation_after_move(). I am not questioning this. My question is about the case when you can't patch address in existing instructions set. I assume you should bailout from this cloning or you should always generate instruction set pin original nmethod assuming far distance. Okay, there are 3 cases as I understand: 1. There was trampoline call in original nmethod. We do nothing here (hit `continue`) because the trampoline code will be updated (I see its is guarded by `#ifdef USE_TRAMPOLINE_STUB_FIX_OWNER`). Good. 2. There was no trampoline call in original nmethod and new nmethod still in range of destination address and set of instructions allows `fix_relocation_after_move()` correctly update destination. 3. There was no trampoline call in original nmethod and new nmethod not in range of destination address and existing instruction set is not enough to reconstruct address - there is need for trampoline call or more complex set of instruction to construct destination. My question is how you handle 3rd case? And how you distinguish 2 and 3 cases? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28241#discussion_r2551327903 From kvn at openjdk.org Sat Nov 22 00:08:49 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 22 Nov 2025 00:08:49 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 00:04:33 GMT, Vladimir Kozlov wrote: >> We do not add trampolines to the new nmethod if they were not present in the original. >> >> Does this comment better describe the need to do this? >> >> // A direct call whose destination was within the maximum branch range may now >> // be out of range after the nmethod is moved. >> // >> // CallRelocation::fix_relocation_after_move() does not perform range checks and >> // assumes that the call target is always directly reachable. If we were to call >> // it unconditionally, it could incorrectly rewrite a call site whose target now >> // requires a trampoline, leaving the call out of range. >> // >> // When a call site has an associated trampoline, we skip the normal call >> // relocation here. The corresponding trampoline_stub_Relocation will handle both >> // the call site and the trampoline, including performing the required range >> // checks and updating the call to branch through the trampoline if required. >> // >> // If no trampoline exists for the call, we know the target remains within the >> // direct-branch range and CallRelocation::fix_relocation_after_move() is safe. > > Okay, I now get it that the comments try to explain why we need to call fix_relocation_after_move(). > I am not questioning this. My question is about the case when you can't patch address in existing instructions set. I assume you should bailout from this cloning or you should always generate instruction set pin original nmethod assuming far distance. > > Okay, there are 3 cases as I understand: > 1. There was trampoline call in original nmethod. 
We do nothing here (hit `continue`) because the trampoline code will be updated (I see its is guarded by `#ifdef USE_TRAMPOLINE_STUB_FIX_OWNER`). Good. > 2. There was no trampoline call in original nmethod and new nmethod still in range of destination address and set of instructions allows `fix_relocation_after_move()` correctly update destination. > 3. There was no trampoline call in original nmethod and new nmethod not in range of destination address and existing instruction set is not enough to reconstruct address - there is need for trampoline call or more complex set of instruction to construct destination. > > My question is how you handle 3rd case? And how you distinguish 2 and 3 cases? I also assume that trampoline's code instructions can construct far distance address. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28241#discussion_r2551333425 From dlong at openjdk.org Sat Nov 22 00:15:56 2025 From: dlong at openjdk.org (Dean Long) Date: Sat, 22 Nov 2025 00:15:56 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance In-Reply-To: References: <6O0YDvGtf8yNNsqgZeZtyJlk6GlGVjXDKwOX-JcUIi4=.6c669dbd-4653-4282-93ef-8129d7c13bdd@github.com> Message-ID: On Fri, 21 Nov 2025 22:58:42 GMT, Evgeny Astigeevich wrote: > Should I try attaching info to Thread? It probably won't help, because Thread::current() is going to access a thread-local. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3565060394 From dlong at openjdk.org Sat Nov 22 00:22:59 2025 From: dlong at openjdk.org (Dean Long) Date: Sat, 22 Nov 2025 00:22:59 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v2] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 22:21:52 GMT, Evgeny Astigeevich wrote: >> Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". 
It is fixed in Neoverse N1 r4p1. >> >> Neoverse-N1 implementations mitigate erratum 1542419 with a workaround: >> - Disable coherent icache. >> - Trap IC IVAU instructions. >> - Execute: >> - `tlbi vae3is, xzr` >> - `dsb sy` >> >> `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address). It waits for all memory accesses using in-scope old translation information to complete before it is considered complete. >> >> As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests: >> >> "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround." >> >> This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions. >> >> Changes include: >> >> * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization. >> * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address. >> * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures. 
>> * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk. >> >> Benchmarking results: Neoverse-N1 r3p1 (Graviton 2) >> >> - Baseline >> >> $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1... > > Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: > > Use THREAD_LOCAL deferred_icache_invalidation instead of parameter If I understand correctly, the whole icache is flushed, so the actual nmethod* is irrelevant. So instead of `ICacheInvalidationContext icic(nm)` for every different "nm", can't we just do `ICacheInvalidationContext icic(true)` one time, outside the nmethod loop? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3565078961 From duke at openjdk.org Sat Nov 22 00:46:39 2025 From: duke at openjdk.org (Chad Rakoczy) Date: Sat, 22 Nov 2025 00:46:39 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 00:06:06 GMT, Vladimir Kozlov wrote: >> Okay, I now get it that the comments try to explain why we need to call fix_relocation_after_move(). >> I am not questioning this. My question is about the case when you can't patch address in existing instructions set. I assume you should bailout from this cloning or you should always generate instruction set pin original nmethod assuming far distance. >> >> Okay, there are 3 cases as I understand: >> 1. There was trampoline call in original nmethod. We do nothing here (hit `continue`) because the trampoline code will be updated (I see its is guarded by `#ifdef USE_TRAMPOLINE_STUB_FIX_OWNER`). Good. >> 2. 
There was no trampoline call in original nmethod and new nmethod still in range of destination address and set of instructions allows `fix_relocation_after_move()` correctly update destination. >> 3. There was no trampoline call in original nmethod and new nmethod not in range of destination address and existing instruction set is not enough to reconstruct address - there is need for trampoline call or more complex set of instruction to construct destination. >> >> My question is how you handle 3rd case? And how you distinguish 2 and 3 cases? > > I also assume that trampoline's code instructions can construct far distance address. > My question is how you handle 3rd case? And how you distinguish 2 and 3 cases? We should never run into the 3rd case. If a trampoline _may_ be needed it will be there. A trampoline will not be generated only if the destination is known to always be reachable. Here are some situations where this could happen: - no far branches (code cache size <= branch range) - runtime call is reachable from anywhere in code cache - (code cache begin - runtime call <= branch range) && (code cache end - runtime call <= branch range) Whether or not a trampoline is generated is dependent on the callee destination not the caller address. So we can't have the case where a trampoline is needed for a given call but it doesn't exist. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28241#discussion_r2551465689 From sviswanathan at openjdk.org Sat Nov 22 01:20:35 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Sat, 22 Nov 2025 01:20:35 GMT Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v2] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 23:53:17 GMT, Srinivas Vamsi Parasa wrote: >> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 9305: >> >>> 9303: } >>> 9304: >>> 9305: void MacroAssembler::fill64_unmasked(uint shift, Register dst, int disp, >> >> This could be called as fill64_tail. 
Also good to replace overall fill64_masked with fill64_tail.
>
> Please see this suggestion incorporated in the updated code.

Thanks @vamsi-parasa. It will also be good to remove fill64_masked and fill32_masked overall.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2551555992

From sspitsyn at openjdk.org Sat Nov 22 09:00:50 2025
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Sat, 22 Nov 2025 09:00:50 GMT
Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v3]
In-Reply-To:
References:
Message-ID:

On Thu, 20 Nov 2025 22:55:35 GMT, David Holmes wrote:

>> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Add Alan's comment in VirtualThread
>
> src/hotspot/share/runtime/mountUnmountDisabler.cpp line 147:
>
>> 145: MonitorLocker ml(VTMSTransition_lock);
>> 146: while (is_start_transition_disabled(current, vth())) {
>> 147: ml.wait(200);
>
> I see a lot of timed-waits throughout this code. Is that because we poll rather than synchronizing properly? All this potential busy-waiting is surely going to cause performance glitches.

The timeouts are for reliability purposes only. Technically, they are not needed and can be removed after this code becomes stable. The `wait()` calls are inside a while loop which rechecks the loop-ending conditions.
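The pattern being described (a timed wait inside a loop that rechecks its condition) can be sketched with standard C++ primitives. This is a minimal illustration only, assuming a hypothetical `TransitionGate` type in place of `VTMSTransition_lock` and the disable counters; it is not the HotSpot `MonitorLocker` code.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the state guarded by VTMSTransition_lock.
struct TransitionGate {
  std::mutex lock;
  std::condition_variable cv;
  int disable_count = 0;  // > 0 means transitions are currently disabled

  // Block until transitions are enabled. The condition is rechecked after
  // every wakeup, so the 200 ms timeout only bounds how long a missed
  // notification could delay the waiter; it does not affect correctness.
  void wait_until_enabled() {
    std::unique_lock<std::mutex> ml(lock);
    while (disable_count > 0) {
      cv.wait_for(ml, std::chrono::milliseconds(200));
    }
  }

  void disable() {
    std::lock_guard<std::mutex> guard(lock);
    ++disable_count;
  }

  void enable() {
    {
      std::lock_guard<std::mutex> guard(lock);
      --disable_count;
    }
    cv.notify_all();  // the normal wakeup path; waiters do not rely on the timeout
  }
};
```

With `enable()` always notifying, the waiter normally wakes on the notification rather than on the timeout, which matches the point that the timeouts are a safety net and not a polling mechanism.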

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2552610777

From sspitsyn at openjdk.org Sat Nov 22 09:00:49 2025
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Sat, 22 Nov 2025 09:00:49 GMT
Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4]
In-Reply-To:
References:
Message-ID: <6WDKeD8iQTXDPhR0ohPvA6KVufW0uXHBmyZ5oOWfYWI=.44d63e7f-fbdb-47b6-8b36-f0d0cb35fb91@github.com>

On Thu, 20 Nov 2025 23:10:48 GMT, Patricio Chilano Mateo wrote:

>> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes.
>>
>> This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes:
>>
>> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization.
That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits.
>> An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis show no difference with current mainline so for now I kept this simpler version.
>>
>> - Ending the transition doesn't need to check if transitions are disabled (equivalent to not need to poll when transitioning from unsafe to safe safepoint state). But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`.
>>
>> - The code was previously structured in t...
>
> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
>
> Rename VM methods for endFirstTransition/startFinalTransition

src/hotspot/share/runtime/javaThread.cpp line 1173:

> 1171: bool JavaThread::java_suspend(bool register_vthread_SR) {
> 1172: #if INCLUDE_JVMTI
> 1173: // Suspending a JavaThread in VTMS transition or disabling VTMS transitions can cause deadlocks.

Q: I wonder if the `#if INCLUDE_JVMTI` and `#endif` can be removed here.

src/hotspot/share/runtime/mountUnmountDisabler.cpp line 126:

> 124: || global_start_transition_disable_count() > base_disable_count
> 125: JVMTI_ONLY(|| (JvmtiVTSuspender::is_vthread_suspended(java_lang_Thread::thread_id(vthread)) || thread->is_suspended()));
> 126: }

I like this approach with the JVMTIStartTransition and JVMTIEndTransition helper classes.
It is a nice way to decouple the JVMTI part of the protocol. Introducing the `is_start_transition_disabled()` function was also long desired. Also, I like the functions `start_transition()` and `end_transition()` became pretty simple and clean! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2552502964 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2552624330 From sspitsyn at openjdk.org Sat Nov 22 09:00:52 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 22 Nov 2025 09:00:52 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 00:52:05 GMT, David Holmes wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename VM methods for endFirstTransition/startFinalTransition > > src/hotspot/share/runtime/mountUnmountDisabler.hpp line 52: > >> 50: // parameter is_SR: suspender or resumer >> 51: MountUnmountDisabler(bool exlusive = false); >> 52: MountUnmountDisabler(oop thread_oop); > > What does the comment mean here? This comment is stale now and must be removed. The parameter `is_SR` is being replaced with the `exclusive`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2552577039 From sspitsyn at openjdk.org Sat Nov 22 09:15:49 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 22 Nov 2025 09:15:49 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 23:10:48 GMT, Patricio Chilano Mateo wrote: >> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. 
However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. >> >> This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: >> >> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. >> An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so for now I kept this simpler version.
>> >> - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we still need to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. >> >> - The code was previously structured in t... > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Rename VM methods for endFirstTransition/startFinalTransition I've completed my first pass through this update, and it looks pretty solid in general. I'm going to make another pass next week. src/hotspot/share/prims/jvm.cpp line 3682: > 3680: JVM_ENTRY(void, JVM_VirtualThreadEndTransition(JNIEnv* env, jobject vthread, jboolean is_mount)) > 3681: oop vt = JNIHandles::resolve_external_guard(vthread); > 3682: MountUnmountDisabler::end_transition(thread, vt, is_mount, false /*is_thread_start*/); The `JVM_VirtualThread*` functions have been nicely simplified.
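For readers following the synchronization argument in the quoted PR description, the Dekker-style handshake (each side stores its own flag *before* loading the other side's) can be sketched roughly as below. This is a minimal, hypothetical illustration built on `std::atomic`, not HotSpot code: the names `in_transition`, `disable_count`, `try_start_transition`, `end_transition`, `request_disable` and `release_disable` are all invented for the example, and sequentially consistent atomics stand in for whatever fences the real implementation uses.

```cpp
#include <atomic>

// Hypothetical stand-ins for the "in transition" bit and the
// "transition disabled" counter mentioned in the PR description.
std::atomic<bool> in_transition{false};
std::atomic<int>  disable_count{0};

// Virtual-thread side: publish "in transition" first, then read the counter.
// Returns true if the fast path may proceed; false means the caller has
// backed off and would have to take some slow path instead.
bool try_start_transition() {
  in_transition.store(true, std::memory_order_seq_cst);
  if (disable_count.load(std::memory_order_seq_cst) > 0) {
    in_transition.store(false, std::memory_order_seq_cst);  // back off
    return false;
  }
  return true;
}

void end_transition() {
  in_transition.store(false, std::memory_order_seq_cst);
}

// Disabler side: bump the counter first, then read the bit.
// Returns true if no transition was observed in progress.
bool request_disable() {
  disable_count.fetch_add(1, std::memory_order_seq_cst);
  return !in_transition.load(std::memory_order_seq_cst);
}

void release_disable() {
  disable_count.fetch_sub(1, std::memory_order_seq_cst);
}
```

Because both sides store before they load, and the operations are sequentially consistent, it cannot happen that the virtual thread reads a zero counter while the disabler simultaneously reads a clear "in transition" bit. In this sketch a disabler that gets `false` back would presumably have to wait for the transition to finish; how the actual patch handles that case is not shown here.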
------------- PR Review: https://git.openjdk.org/jdk/pull/28361#pullrequestreview-3496249775 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2552682664 From fandreuzzi at openjdk.org Sat Nov 22 11:37:26 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Sat, 22 Nov 2025 11:37:26 GMT Subject: RFR: 8372348: Adjust some UL / JFR string deduplication output messages In-Reply-To: <7-XRp229KFw3V2bFPMnaWaoPdF3ZCYVNuViEo2O7eUI=.e0247347-188b-44bf-935c-6b0186026fd3@github.com> References: <7-XRp229KFw3V2bFPMnaWaoPdF3ZCYVNuViEo2O7eUI=.e0247347-188b-44bf-935c-6b0186026fd3@github.com> Message-ID: <2r6k6Av9MTNmMxpbga6MJoUCu0vECsr7IDL6wAzJ_Ig=.c9106917-1635-4d3c-b6c4-f4c5f0659440@github.com> On Fri, 21 Nov 2025 15:19:54 GMT, Matthias Baesken wrote: > There is some UL output in the string deduplication code that is not very clear and has room for improvement. > The inspected strings number should be shown and the new unknown strings get a changed text. > (also the new JFR strip dedup event description is slightly adjusted) src/hotspot/share/gc/shared/stringdedup/stringDedupStat.cpp line 218: > 216: log_debug(stringdedup)(" Known: %12zu(%5.1f%%)", _known, known_percent); > 217: log_debug(stringdedup)(" Shared: %12zu(%5.1f%%)", _known_shared, known_shared_percent); > 218: log_debug(stringdedup)(" New unknown: %12zu(%5.1f%%)" STRDEDUP_BYTES_FORMAT, I'm wondering if just `Unknown` would be more self-explanatory than `New unknown`? 
We have `Known` already, and `New unknown` is the complement of `Known` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28455#discussion_r2553079222 From jsjolen at openjdk.org Sat Nov 22 12:14:22 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Sat, 22 Nov 2025 12:14:22 GMT Subject: RFR: 8372373: Make ResolutionErrorEntry's interface less susceptible to memory leaks Message-ID: Hi, Ioi discovered that the `ResolutionErrorEntry`'s two constructors are working in two different ways: One of them copies the string, and the other one requires the string to already be on the CHeap. This leads to code that's difficult to understand and increases the risk of memory leaks. This PR makes the interface more uniform. I've done a few style changes as well, hope those are OK. Here's a list of the callsites I found: constantPool.cpp:992: Strings are Resource-allocated and strdupped cpCache.cpp:746 Strings are Resource-allocated and strupped instanceKlass.cpp:317 Was manually CHeap allocated and transferred to REE, now wrapped in stringStream and strdupped instanceKlass.cpp:361 Was manually CHeap allocated and transferred to REE, now wrapped in stringStream and strdupped systemDictionary.cpp:1872 Potential memory leak via set_nest_host_error ------------- Commit messages: - Make ResolutionErrorEntry more uniform Changes: https://git.openjdk.org/jdk/pull/28466/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28466&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372373 Stats: 33 lines in 3 files changed: 2 ins; 17 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/28466.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28466/head:pull/28466 PR: https://git.openjdk.org/jdk/pull/28466 From fandreuzzi at openjdk.org Sat Nov 22 12:56:54 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Sat, 22 Nov 2025 12:56:54 GMT Subject: RFR: 8372324: jdk/jfr/event/gc/detailed/TestStringDeduplicationEvent.java#Parallel 
failed Message-ID: The assertion is used to validate a precondition for the test. As long as the deduplication happens inside the `RecordingStream` scope, a `StringDeduplication ` event will be recorded. Thus the assertion is not needed and can be removed. ------------- Commit messages: - volatile - remove Changes: https://git.openjdk.org/jdk/pull/28467/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28467&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372324 Stats: 6 lines in 1 file changed: 3 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28467.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28467/head:pull/28467 PR: https://git.openjdk.org/jdk/pull/28467 From jsjolen at openjdk.org Sat Nov 22 14:40:41 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Sat, 22 Nov 2025 14:40:41 GMT Subject: RFR: 8372373: Make ResolutionErrorEntry's interface less susceptible to memory leaks [v2] In-Reply-To: References: Message-ID: > Hi, > > Ioi discovered that the `ResolutionErrorEntry`'s two constructors are working in two different ways: One of them copies the string, and the other one requires the string to already be on the CHeap. This leads to code that's difficult to understand and increases the risk of memory leaks. This PR makes the interface more uniform. > > I've done a few style changes as well, hope those are OK. 
> > Here's a list of the callsites I found: > > constantPool.cpp:992: > Strings are Resource-allocated and strdupped > cpCache.cpp:746 > Strings are Resource-allocated and strdupped > instanceKlass.cpp:317 > Was manually CHeap allocated and transferred to REE, now wrapped in stringStream and strdupped > instanceKlass.cpp:361 > Was manually CHeap allocated and transferred to REE, now wrapped in stringStream and strdupped > systemDictionary.cpp:1872 > Potential memory leak via set_nest_host_error Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Some renaming so that you don't misunderstand ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28466/files - new: https://git.openjdk.org/jdk/pull/28466/files/ea21ad3e..8dc79c19 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28466&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28466&range=00-01 Stats: 14 lines in 2 files changed: 8 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/28466.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28466/head:pull/28466 PR: https://git.openjdk.org/jdk/pull/28466 From kvn at openjdk.org Sat Nov 22 16:57:50 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 22 Nov 2025 16:57:50 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 00:43:59 GMT, Chad Rakoczy wrote: >> I also assume that trampoline's code instructions can construct far distance address. > >> My question is how you handle 3rd case? And how you distinguish 2 and 3 cases? > > We should never run into the 3rd case. If a trampoline _may_ be needed it will be there. > > A trampoline will not be generated only if the destination is known to always be reachable.
Here are some situations where this could happen: > - no far branches (code cache size <= branch range) > - runtime call is reachable from anywhere in code cache > - (code cache begin - runtime call <= branch range) && (code cache end - runtime call <= branch range) > > Whether or not a trampoline is generated is dependent on the callee destination not the caller address. So we can't have the case where a trampoline is needed for a given call but it doesn't exist. Maybe we should change the assert to a guarantee in `Relocation::pd_set_call_destination()` to make sure we catch incorrect patching in the product VM. Looking at `NativeCall::set_destination_mt_safe`, `reachable` is calculated based on the distance between the address of the call instruction and the destination, which could be different for a cloned nmethod. On x86 we have a guarantee in `NativeCall::set_destination()`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28241#discussion_r2553244181 From kvn at openjdk.org Sat Nov 22 17:04:53 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 22 Nov 2025 17:04:53 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 17:32:22 GMT, Chad Rakoczy wrote: > [JDK-8371046](https://bugs.openjdk.org/browse/JDK-8371046) > > This pull request fixes two crashes (see below) and adds `InvalidationReason::RELOCATED` to better describe why an nmethod is marked not entrant during relocation. > > --- > > #### 1. Test Bug > > It's possible for an `nmethod` to be unloaded without its `_state` being explicitly set to `not_entrant`. Checking only `is_in_use()` isn't sufficient, since the `nmethod` may already be in the process of unloading and therefore may not have a lock (as with ZGC, where `nmethods` are locked individually). > > The fix adds an additional `is_unloading()` check in WhiteBox before acquiring the lock.
> > This issue was reproducible fairly consistently (every few runs) by executing `compiler/whitebox/StressNMethodRelocation.java` with `-XX:+UseZGC -XX:ReservedCodeCacheSize=32m` > > > After applying this patch, the original crash stopped occurring, though a more infrequent crash was still observed. > > --- > > #### 2. Implementation Bug > > `nmethod::relocate` works by copying the instructions of an `nmethod` and then adjusting the call sites to account for new PC-relative offsets. > > Previously, this fix-up happened *after* calling `post_init()`, which registers the `nmethod` and makes it visible to the GC. This introduced a race condition where the GC might attempt to resolve a call site before it had been fixed. > > The fix ensures that all call sites are patched **before** the `nmethod` is registered. > > In testing, the crash previously occurred roughly 60 times in 5,000 runs (~1.2%). With this patch, no crashes were observed in the same number of runs. Tobias's testing did not find any new failures. I am still concerned about patching. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28241#issuecomment-3566888592 From eosterlund at openjdk.org Sun Nov 23 10:36:47 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Sun, 23 Nov 2025 10:36:47 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v2] In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 00:20:04 GMT, Dean Long wrote: > If I understand correctly, the whole icache is flushed, so the actual nmethod* is irrelevant. So instead of `ICacheInvalidationContext icic(nm)` for every different "nm", can't we just do `ICacheInvalidationContext icic(true)` one time, outside the nmethod loop? We can't disarm an nmethod before flushing the instructions.
------------- PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3567803014 From jbhateja at openjdk.org Sun Nov 23 11:50:08 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sun, 23 Nov 2025 11:50:08 GMT Subject: RFR: 8370691: Add new Float16Vector type and enable intrinsification of vector operations supported by auto-vectorizer [v2] In-Reply-To: References: Message-ID: > Add a new Float16Vector type and corresponding concrete vector classes, in addition to existing primitive vector types, maintaining operation parity with the FloatVector type. > - Add necessary inline expander support. > - Enable intrinsification for a few vector operations, namely ADD/SUB/MUL/DIV/MAX/MIN/FMA. > - Use existing Float16 vector IR and backend support. > - Extended the existing VectorAPI JTREG test suite for the newly added HalffloatVector operations. > > The idea here is to first be at par with Float16 auto-vectorization support before intrinsifying new operations (conversions, reduction, etc). > > The following are the performance numbers for some of the selected Float16Vector benchmarking kernels compared to equivalent auto-vectorized Float16OperationsBenchmark kernels. > > Initial RFP[1] was floated on the panama-dev mailing list. > > Kindly review the draft PR and share your feedback.
> > Best Regards, > Jatin > > [1] https://mail.openjdk.org/pipermail/panama-dev/2025-August/021100.html Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Cleaning up interface as per review suggestions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28002/files - new: https://git.openjdk.org/jdk/pull/28002/files/c60d533c..ea3ef19b Webrevs: - full: Webrev is not available because diff is too large - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28002&range=00-01 Stats: 162997 lines in 187 files changed: 75266 ins; 74548 del; 13183 mod Patch: https://git.openjdk.org/jdk/pull/28002.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28002/head:pull/28002 PR: https://git.openjdk.org/jdk/pull/28002 From aph at openjdk.org Sun Nov 23 14:37:22 2025 From: aph at openjdk.org (Andrew Haley) Date: Sun, 23 Nov 2025 14:37:22 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v2] In-Reply-To: References: Message-ID: On Sun, 23 Nov 2025 10:32:15 GMT, Erik Österlund wrote: > > If I understand correctly, the whole icache is flushed, so the actual nmethod* is irrelevant. So instead of `ICacheInvalidationContext icic(nm)` for every different "nm", can't we just do `ICacheInvalidationContext icic(true)` one time, outside the nmethod loop? > > We can't disarm an nmethod before flushing the instructions. Sure, but you can't patch an nmethod until every thread that might be executing it has stopped. So if the threads are all stopped, why not postpone the disarmament until the end, just before you flush?
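The batching idea under discussion in this exchange (patch every nmethod, issue a single whole-icache flush, and keep disarming on the far side of that flush, which is the constraint Erik states) can be pictured with a toy model. This is only a sketch under assumptions: `FakeNMethod` and `process_batch` are invented for illustration and do not correspond to HotSpot's `nmethod` or `ICacheInvalidationContext`, the "flush" is just a log entry, and the model takes for granted that all threads are already stopped. Whether the proposal above intends exactly this ordering is not settled in the thread.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an nmethod: it only records whether its
// instructions were patched and whether its entry barrier is still armed.
struct FakeNMethod {
  bool patched = false;
  bool armed = true;
};

// Patch all nmethods, flush once, then disarm all of them.
// Returns the sequence of steps so the ordering can be inspected.
std::vector<std::string> process_batch(std::vector<FakeNMethod>& nms) {
  std::vector<std::string> log;
  for (FakeNMethod& nm : nms) {
    nm.patched = true;            // stand-in for patching instructions
    log.push_back("patch");
  }
  log.push_back("flush");         // stand-in for one whole-icache flush
  for (FakeNMethod& nm : nms) {
    nm.armed = false;             // disarm only after the flush
    log.push_back("disarm");
  }
  return log;
}
```

The point of the model is only the shape of the log: N patches, exactly one flush, then N disarms, instead of one flush per nmethod.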
------------- PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3568021134 From duke at openjdk.org Sun Nov 23 17:56:48 2025 From: duke at openjdk.org (duke) Date: Sun, 23 Nov 2025 17:56:48 GMT Subject: Withdrawn: 8358890: VM option -XX:AllowRedefinitionToAddDeleteMethods should be obsoleted then expired In-Reply-To: References: Message-ID: On Thu, 10 Jul 2025 01:12:11 GMT, Serguei Spitsyn wrote: > The VM option -XX:AllowRedefinitionToAddDeleteMethods was added in JDK 13 as a temporary backward compatibility flag under JDK-8192936 and was immediately marked as Deprecate. The fix is to obsolete this option in JDK 26 and expire in JDK 27. > > TBD: Need to submit a related CSR. > > There are two concerns which may require some negotiation with the Runtime (@coleenp @dcubed-ojdk @dholmes-ora) and SQE (@lmesnik) teams: > - Class redefinition/retransformation can impact lambda expressions which are supported with private methods > - Many tests depend on this VM option and are being removed. I'm not sure if it is okay to completely remove those e may want another way to handle this (e.g. problem-listing the impacted tests for now). > > Testing: > - mach5 tiers 1-6 are good > - may need to run mach5 tiers > 6 This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/26232 From vpetko at openjdk.org Sun Nov 23 19:40:53 2025 From: vpetko at openjdk.org (Vladimir Petko) Date: Sun, 23 Nov 2025 19:40:53 GMT Subject: RFR: 8352567: [s390x] disable JFR tests requiring JFR stubs In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 01:48:27 GMT, Vladimir Petko wrote: > JFR stubs are not [implemented](https://github.com/openjdk/jdk/blame/06ba6cf3a137a6cdf572a876a46d18e51c248451/src/hotspot/cpu/s390/sharedRuntime_s390.cpp#L3412). > Add platform requirement to JFR tests that require JFR stubs to skip them on S390x. 
> > Testing: > - s390x: > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR SKIP > jtreg:test/hotspot/jtreg/applications/ctw/modules/jdk_jfr.java > 0 0 0 0 0 > jtreg:test/hotspot/jtreg/compiler/intrinsics/TestReturnOopSetForJFRWriteCheckpoint.java > 0 0 0 0 0 > jtreg:test/jdk/jdk/jfr 630 577 0 0 53 > ============================== > TEST SUCCESS > > > - amd64: > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR SKIP > jtreg:test/hotspot/jtreg/applications/ctw/modules/jdk_jfr.java > 1 1 0 0 0 > jtreg:test/hotspot/jtreg/compiler/intrinsics/TestReturnOopSetForJFRWriteCheckpoint.java > 1 1 0 0 0 > jtreg:test/jdk/jdk/jfr 629 622 0 0 7 > ============================== > TEST SUCCESS @offamitkumar Hi, would it be possible to take a look? =) ------------- PR Comment: https://git.openjdk.org/jdk/pull/28444#issuecomment-3568269502 From serb at openjdk.org Sun Nov 23 22:39:32 2025 From: serb at openjdk.org (Sergey Bylokhov) Date: Sun, 23 Nov 2025 22:39:32 GMT Subject: RFR: 8345265: Minor improvements for LTO across all compilers [v2] In-Reply-To: References: Message-ID: On Tue, 17 Dec 2024 14:54:03 GMT, Julian Waters wrote: >> This is a general cleanup and improvement of LTO, as well as a quick fix to remove a workaround in the Makefiles that disabled LTO for g1ParScanThreadState.cpp due to the old poisoning mechanism causing trouble. The -Wno-attribute-warning change here can be removed once Kim's new poisoning solution is integrated. >> >> - -fno-omit-frame-pointer is added to gcc to stop the linker from emitting code without the frame pointer >> - -flto is set to $(JOBS) instead of auto to better match what the user requested >> - -Gy is passed to the Microsoft compiler. 
This does not fully fix LTO under Microsoft, but prevents warnings about -LTCG:INCREMENTAL at least > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision: > > - Merge branch 'openjdk:master' into patch-16 > - -fno-omit-frame-pointer in JvmFeatures.gmk > - Revert compilerWarnings_gcc.hpp > - General LTO fixes JvmFeatures.gmk > - Revert DISABLE_POISONING_STOPGAP compilerWarnings_gcc.hpp > - Merge branch 'openjdk:master' into patch-16 > - Revert os.cpp > - Fix memory leak in jvmciEnv.cpp > - Stopgap fix in os.cpp > - Declaration fix in compilerWarnings_gcc.hpp > - ... and 2 more: https://git.openjdk.org/jdk/compare/3cbd7aa6...9d05cb8e Just curious, how did you check that the ?no omit frame pointer? option is needed? Did you see any test failures? ------------- PR Comment: https://git.openjdk.org/jdk/pull/22464#issuecomment-3568414088 From dholmes at openjdk.org Mon Nov 24 01:38:02 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 24 Nov 2025 01:38:02 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v9] In-Reply-To: <3bBZzigernOcTkARE9am0ZmHR9NWsmp3xa0ksSLYiE8=.981d0f10-5b41-40c8-a35c-f953b0d1df08@github.com> References: <3bBZzigernOcTkARE9am0ZmHR9NWsmp3xa0ksSLYiE8=.981d0f10-5b41-40c8-a35c-f953b0d1df08@github.com> Message-ID: <56YOm7QzXMRjq9Whk6uOWUSWDBd7mqNMWsLVdXuDW6s=.f476cd41-5725-474d-920b-0a43f880828c@github.com> On Fri, 21 Nov 2025 09:25:27 GMT, Anton Artemov wrote: >> Hi, >> >> please consider the following changes: >> >> In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. 
The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. >> >> Tested in tiers 1 - 5. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8366671: Addressed reviewer's comments. src/hotspot/share/utilities/spinCriticalSection.hpp line 37: > 35: // we're concerned about native mutex_t or HotSpot Mutex:: latency. > 36: // The class uses low-level leaf-lock primitives to implement > 37: // synchronization. Not for general synchronization use. Suggestion: // This class uses low-level leaf-lock primitives to implement // synchronization and is not for general synchronization use. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28264#discussion_r2554480028 From dholmes at openjdk.org Mon Nov 24 02:21:19 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 24 Nov 2025 02:21:19 GMT Subject: RFR: 8372380: Make hs_err reporting more robust for unattached threads Message-ID: There were a number of places where the code called by hs_err reporting assumed/required an attached current thread. These would then cause secondary failures during hs_err reporting. Using a simple example of an unattached thread causing a SEGV I went through and eliminated all the problems I encountered. In some cases the thread dependency was obvious and easy to address directly, but in others we just skip that section at the top-level. 
Testing: - manual inspection of hs_err file, for different GCs - tiers 1-3 sanity Thanks ------------- Commit messages: - Merge branch 'master' into 8369112-crash-unattached - 8372380: Make hs_err reporting more robust for unattached threads Changes: https://git.openjdk.org/jdk/pull/28470/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28470&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372380 Stats: 10 lines in 4 files changed: 5 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/28470.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28470/head:pull/28470 PR: https://git.openjdk.org/jdk/pull/28470 From dholmes at openjdk.org Mon Nov 24 02:30:49 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 24 Nov 2025 02:30:49 GMT Subject: RFR: 8372380: Make hs_err reporting more robust for unattached threads [v2] In-Reply-To: References: Message-ID: > There were a number of places where the code called by hs_err reporting assumed/required an attached current thread. These would then cause secondary failures during hs_err reporting. Using a simple example of an unattached thread causing a SEGV I went through and eliminated all the problems I encountered. In some cases the thread dependency was obvious and easy to address directly, but in others we just skip that section at the top-level. 
> > Testing: > - manual inspection of hs_err file, for different GCs > - tiers 1-3 sanity > > Thanks David Holmes has updated the pull request incrementally with one additional commit since the last revision: Missing include ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28470/files - new: https://git.openjdk.org/jdk/pull/28470/files/64bdb1b2..5d59c7e9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28470&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28470&range=00-01 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28470.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28470/head:pull/28470 PR: https://git.openjdk.org/jdk/pull/28470 From dholmes at openjdk.org Mon Nov 24 02:34:39 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 24 Nov 2025 02:34:39 GMT Subject: RFR: 8372380: Make hs_err reporting more robust for unattached threads [v3] In-Reply-To: References: Message-ID: > There were a number of places where the code called by hs_err reporting assumed/required an attached current thread. These would then cause secondary failures during hs_err reporting. Using a simple example of an unattached thread causing a SEGV I went through and eliminated all the problems I encountered. In some cases the thread dependency was obvious and easy to address directly, but in others we just skip that section at the top-level. 
> > Testing: > - manual inspection of hs_err file, for different GCs > - tiers 1-3 sanity > > Thanks David Holmes has updated the pull request incrementally with one additional commit since the last revision: Fix include order ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28470/files - new: https://git.openjdk.org/jdk/pull/28470/files/5d59c7e9..57ad332b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28470&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28470&range=01-02 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28470.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28470/head:pull/28470 PR: https://git.openjdk.org/jdk/pull/28470 From jwaters at openjdk.org Mon Nov 24 08:03:44 2025 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 24 Nov 2025 08:03:44 GMT Subject: RFR: 8345265: Minor improvements for LTO across all compilers [v2] In-Reply-To: References: Message-ID: On Sun, 23 Nov 2025 22:35:53 GMT, Sergey Bylokhov wrote: > Just curious, how did you check that the ?no omit frame pointer? option is needed? Did you see any test failures? gcc documentation states that you typically need to pass the same options to the link step from the compile step, since the linker when LTO is active is actually the compiler in disguise. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22464#issuecomment-3569393733 From shade at openjdk.org Mon Nov 24 08:14:04 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 24 Nov 2025 08:14:04 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v4] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 18:27:55 GMT, Vladimir Kozlov wrote: >> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 10 additional commits since the last revision: >> >> - Adjust label name >> - Merge branch 'master' into JDK-8372285-g1-barrier-micro >> - Make some backward branches explicitly short >> - Comment >> - Shorten a few more branches >> - Also reflow generate_pre_barrier_slow_path, so it is obvious the branches are short >> - More touchups >> - Also optimize queue insertion >> - Touchups >> - WIP > > src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 92: > >> 90: void G1BarrierSetAssembler::gen_write_ref_array_post_barrier(MacroAssembler* masm, DecoratorSet decorators, >> 91: Register addr, Register count, Register tmp) { >> 92: Label done; > > Since you are touching this code can you add `L_` to labels in this code? > This is our usual practice for labels to clear see them. Done. > src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 282: > >> 280: Register thread = r15_thread; >> 281: >> 282: Label done; > > Please use `L_done`. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2555031155 PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2555031433 From mbaesken at openjdk.org Mon Nov 24 08:23:55 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 24 Nov 2025 08:23:55 GMT Subject: RFR: 8372348: Adjust some UL / JFR string deduplication output messages In-Reply-To: <2r6k6Av9MTNmMxpbga6MJoUCu0vECsr7IDL6wAzJ_Ig=.c9106917-1635-4d3c-b6c4-f4c5f0659440@github.com> References: <7-XRp229KFw3V2bFPMnaWaoPdF3ZCYVNuViEo2O7eUI=.e0247347-188b-44bf-935c-6b0186026fd3@github.com> <2r6k6Av9MTNmMxpbga6MJoUCu0vECsr7IDL6wAzJ_Ig=.c9106917-1635-4d3c-b6c4-f4c5f0659440@github.com> Message-ID: On Sat, 22 Nov 2025 11:33:47 GMT, Francesco Andreuzzi wrote: >> There is some UL output in the string deduplication code that is not very clear and has room for improvement. >> The inspected strings number should be shown and the new unknown strings get a changed text. 
>> (also the new JFR string dedup event description is slightly adjusted) > > src/hotspot/share/gc/shared/stringdedup/stringDedupStat.cpp line 218: > >> 216: log_debug(stringdedup)(" Known: %12zu(%5.1f%%)", _known, known_percent); >> 217: log_debug(stringdedup)(" Shared: %12zu(%5.1f%%)", _known_shared, known_shared_percent); >> 218: log_debug(stringdedup)(" New unknown: %12zu(%5.1f%%)" STRDEDUP_BYTES_FORMAT, > > I'm wondering if just `Unknown` would be more self-explanatory than `New unknown`? We have `Known` already, and `New unknown` is the complement of `Known` Maybe, but the variables are named `_new / _new_bytes` and the JFR fields also have 'new' in the name. So it may make sense to keep 'new' in the UL and related output. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28455#discussion_r2555064648 From mbaesken at openjdk.org Mon Nov 24 08:53:46 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 24 Nov 2025 08:53:46 GMT Subject: RFR: 8372324: jdk/jfr/event/gc/detailed/TestStringDeduplicationEvent.java#Parallel failed In-Reply-To: References: Message-ID: <1wO6zP2Fa09T1ESLYp4I-Fgw07qe5XHvBKM7Ft5MfPk=.884e2b16-ad2d-48c8-87ed-a0ea08ef3f73@github.com> On Sat, 22 Nov 2025 12:49:44 GMT, Francesco Andreuzzi wrote: > The assertion is used to validate a precondition for the test. As long as the deduplication happens inside the `RecordingStream` scope, a `StringDeduplication` event will be recorded. Thus the assertion is not needed and can be removed. Marked as reviewed by mbaesken (Reviewer). If you prefer, I can test your PR for some days in our CI to check for further issues.
------------- PR Review: https://git.openjdk.org/jdk/pull/28467#pullrequestreview-3499024406 PR Comment: https://git.openjdk.org/jdk/pull/28467#issuecomment-3569578283 From shade at openjdk.org Mon Nov 24 08:56:54 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 24 Nov 2025 08:56:54 GMT Subject: RFR: 8372380: Make hs_err reporting more robust for unattached threads [v3] In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 02:34:39 GMT, David Holmes wrote: >> There were a number of places where the code called by hs_err reporting assumed/required an attached current thread. These would then cause secondary failures during hs_err reporting. Using a simple example of an unattached thread causing a SEGV I went through and eliminated all the problems I encountered. In some cases the thread dependency was obvious and easy to address directly, but in others we just skip that section at the top-level. >> >> Testing: >> - manual inspection of hs_err file, for different GCs >> - tiers 1-3 sanity >> >> Thanks > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Fix include order Looks reasonable, but I have questions: src/hotspot/share/compiler/compilationMemoryStatistic.cpp line 1013: > 1011: st->cr(); > 1012: } > 1013: if (Thread::current_or_null_safe() != nullptr) { Looks like `check_before_reporting()` does some pre-checks, maybe move it here, and print a helpful message about detached threads? src/hotspot/share/utilities/vmError.cpp line 667: > 665: if (MemTracker::enabled() && > 666: NmtVirtualMemory_lock != nullptr && > 667: _thread != nullptr && I do wonder if we want to do the change downstream to cover all these cases? 
bool Mutex::owned_by_self() const { - return owner() == Thread::current(); + return owner() == Thread::current_or_null_safe(); } ------------- PR Review: https://git.openjdk.org/jdk/pull/28470#pullrequestreview-3499014592 PR Review Comment: https://git.openjdk.org/jdk/pull/28470#discussion_r2555196065 PR Review Comment: https://git.openjdk.org/jdk/pull/28470#discussion_r2555217153 From fandreuzzi at openjdk.org Mon Nov 24 08:57:24 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Mon, 24 Nov 2025 08:57:24 GMT Subject: RFR: 8372324: jdk/jfr/event/gc/detailed/TestStringDeduplicationEvent.java#Parallel failed In-Reply-To: <1wO6zP2Fa09T1ESLYp4I-Fgw07qe5XHvBKM7Ft5MfPk=.884e2b16-ad2d-48c8-87ed-a0ea08ef3f73@github.com> References: <1wO6zP2Fa09T1ESLYp4I-Fgw07qe5XHvBKM7Ft5MfPk=.884e2b16-ad2d-48c8-87ed-a0ea08ef3f73@github.com> Message-ID: On Mon, 24 Nov 2025 08:50:25 GMT, Matthias Baesken wrote: > If you prefer, I can test your PR for some days in our CI to check for further issues. That would be great, thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/28467#issuecomment-3569598225 From qpzhang at openjdk.org Mon Nov 24 09:22:40 2025 From: qpzhang at openjdk.org (Patrick Zhang) Date: Mon, 24 Nov 2025 09:22:40 GMT Subject: RFR: 8365991: AArch64: Ignore BlockZeroingLowLimit when UseBlockZeroing is false [v8] In-Reply-To: References: Message-ID: > Issue: > In AArch64 port, `UseBlockZeroing` is by default set to true and `BlockZeroingLowLimit` is initialized to 256. If `DC ZVA` is supported, `BlockZeroingLowLimit` is later updated to `4 * VM_Version::zva_length()`. When `UseBlockZeroing` is set to false, all related conditional checks should ignore `BlockZeroingLowLimit`. However, the function `MacroAssembler::zero_words(Register base, uint64_t cnt)` still evaluates the lower limit and bases its code generation logic on it, which seems to be an incomplete conditional check. > > This PR: > 1. 
Reset `BlockZeroingLowLimit` to `4 * VM_Version::zva_length()` or 256 with a warning message if it was manually configured from the default while `UseBlockZeroing` is disabled. > 2. Added necessary comments in `MacroAssembler::zero_words(Register base, uint64_t cnt)` and `MacroAssembler::zero_words(Register ptr, Register cnt)` to explain why we do not check `UseBlockZeroing` in the outer part of these functions. Instead, the decision is delegated to the stub function `zero_blocks`, which encapsulates the DC ZVA instructions and serves as the inner implementation of `zero_words`. This approach helps better control the increase in code cache size during array or object instance initialization. > 3. Added more testing sizes to `test/micro/org/openjdk/bench/vm/gc/RawAllocationRate.java` to better cover scenarios involving smaller arrays and objects. > > Tests: > 1. Performance tests on the bundled JMH `vm.compiler.ClearMemory` and `vm.gc.RawAllocationRate` (including `arrayTest` and `instanceTest`) showed no obvious regression. Negative tests with `jdk/bin/java -jar images/test/micro/benchmarks.jar RawAllocationRate.arrayTest_C1 -bm thrpt -gc false -wi 0 -w 30 -i 1 -r 30 -t 1 -f 1 -tu s -jvmArgs "-XX:-UseBlockZeroing -XX:BlockZeroingLowLimit=8" -p size=32` demonstrated good wall times on `zero_words_reg_imm` calls, as expected. > 2. Jtreg tier1 tests on Ampere Altra, AmpereOne, Graviton2 and 3, tier2 on Altra. No new issues found. Passed tests of GHA Sanity Checks.
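The flag handling described in item 1 of the PR description above can be sketched roughly as follows. This is a minimal illustration, not the HotSpot code: `ZeroingFlags`, `default_low_limit` and `normalize_low_limit` are hypothetical names, and treating a ZVA length of 0 as "DC ZVA not supported" is an assumption made only for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical stand-ins for the VM flags; not the HotSpot definitions.
struct ZeroingFlags {
  bool use_block_zeroing;            // models -XX:[+-]UseBlockZeroing
  uint64_t block_zeroing_low_limit;  // models -XX:BlockZeroingLowLimit=<n>
  bool low_limit_set_by_user;        // true if given on the command line
};

// Default described in the thread: 4 * ZVA length when DC ZVA is
// available, otherwise 256. zva_length == 0 means "no DC ZVA" here
// (an assumption of this sketch).
uint64_t default_low_limit(uint64_t zva_length) {
  assert(zva_length <= UINT64_MAX / 4);  // guard the multiplication
  return zva_length > 0 ? 4 * zva_length : 256;
}

// If block zeroing is disabled but the user changed the low limit,
// warn and fall back to the default. Returns true if the value was reset.
bool normalize_low_limit(ZeroingFlags& flags, uint64_t zva_length) {
  const uint64_t def = default_low_limit(zva_length);
  if (!flags.use_block_zeroing && flags.low_limit_set_by_user &&
      flags.block_zeroing_low_limit != def) {
    std::fprintf(stderr,
                 "warning: BlockZeroingLowLimit is ignored because "
                 "UseBlockZeroing is disabled\n");
    flags.block_zeroing_low_limit = def;
    return true;
  }
  return false;
}
```

Under these assumptions, the negative test configuration quoted above (`-XX:-UseBlockZeroing -XX:BlockZeroingLowLimit=8`) would end up with the default limit again rather than 8.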
Patrick Zhang has updated the pull request incrementally with one additional commit since the last revision: Improve the comments for zero_words funcs Signed-off-by: Patrick Zhang ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26917/files - new: https://git.openjdk.org/jdk/pull/26917/files/2bbc1d04..a23ec878 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26917&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26917&range=06-07 Stats: 14 lines in 2 files changed: 0 ins; 7 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/26917.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26917/head:pull/26917 PR: https://git.openjdk.org/jdk/pull/26917 From qpzhang at openjdk.org Mon Nov 24 09:22:42 2025 From: qpzhang at openjdk.org (Patrick Zhang) Date: Mon, 24 Nov 2025 09:22:42 GMT Subject: RFR: 8365991: AArch64: Ignore BlockZeroingLowLimit when UseBlockZeroing is false [v7] In-Reply-To: References: Message-ID: On Fri, 17 Oct 2025 04:19:42 GMT, Patrick Zhang wrote: >> Issue: >> In AArch64 port, `UseBlockZeroing` is by default set to true and `BlockZeroingLowLimit` is initialized to 256. If `DC ZVA` is supported, `BlockZeroingLowLimit` is later updated to `4 * VM_Version::zva_length()`. When `UseBlockZeroing` is set to false, all related conditional checks should ignore `BlockZeroingLowLimit`. However, the function `MacroAssembler::zero_words(Register base, uint64_t cnt)` still evaluates the lower limit and bases its code generation logic on it, which seems to be an incomplete conditional check. >> >> This PR: >> 1. Reset `BlockZeroingLowLimit` to `4 * VM_Version::zva_length()` or 256 with a warning message if it was manually configured from the default while `UseBlockZeroing` is disabled. >> 2. Added necessary comments in `MacroAssembler::zero_words(Register base, uint64_t cnt)` and `MacroAssembler::zero_words(Register ptr, Register cnt)` to explain why we do not check `UseBlockZeroing` in the outer part of these functions. 
Instead, the decision is delegated to the stub function `zero_blocks`, which encapsulates the DC ZVA instructions and serves as the inner implementation of `zero_words`. This approach helps better control the increase in code cache size during array or object instance initialization. >> 3. Added more testing sizes to `test/micro/org/openjdk/bench/vm/gc/RawAllocationRate.java` to better cover scenarios involving smaller arrays and objects. >> >> Tests: >> 1. Performance tests on the bundled JMH `vm.compiler.ClearMemory` and `vm.gc.RawAllocationRate` (including `arrayTest` and `instanceTest`) showed no obvious regression. Negative tests with `jdk/bin/java -jar images/test/micro/benchmarks.jar RawAllocationRate.arrayTest_C1 -bm thrpt -gc false -wi 0 -w 30 -i 1 -r 30 -t 1 -f 1 -tu s -jvmArgs "-XX:-UseBlockZeroing -XX:BlockZeroingLowLimit=8" -p size=32` demonstrated good wall times on `zero_words_reg_imm` calls, as expected. >> 2. Jtreg tier1 tests on Ampere Altra, AmpereOne, Graviton2 and 3, tier2 on Altra. No new issues found. Passed tests of GHA Sanity Checks. > > Patrick Zhang has updated the pull request incrementally with one additional commit since the last revision: > > Refine the count types to pass mac and win builds > > Signed-off-by: Patrick Zhang Thanks for the review, please see my updates and replies with the new commit.
------------- PR Review: https://git.openjdk.org/jdk/pull/26917#pullrequestreview-3498584631 From qpzhang at openjdk.org Mon Nov 24 09:22:43 2025 From: qpzhang at openjdk.org (Patrick Zhang) Date: Mon, 24 Nov 2025 09:22:43 GMT Subject: RFR: 8365991: AArch64: Ignore BlockZeroingLowLimit when UseBlockZeroing is false [v8] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 16:15:07 GMT, Andrew Haley wrote: >> Patrick Zhang has updated the pull request incrementally with one additional commit since the last revision: >> >> Improve the comments for zero_words funcs >> >> Signed-off-by: Patrick Zhang > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 6207: > >> 6205: assert(ptr == r10 && cnt == r11, "mismatch in register usage"); >> 6206: RuntimeAddress zero_blocks = RuntimeAddress(StubRoutines::aarch64::zero_blocks()); >> 6207: assert(zero_blocks.target() != nullptr, "zero_blocks stub has not been generated"); > > What is the point of this change? There are duplicates of getting the address of `zero_blocks()` and the assertion. The first was originally introduced by [1] and got subsequently duplicated nearby with [2]. Was there a specific reason to have one copy placed after the `br(LO, around)` and another before it? I tried removing one instance and tests did not report any issue. 
[1] 8179444: AArch64: Put zero_words on a diet, https://github.com/openjdk/jdk/commit/1ce2a362524b7b911062fcef4ace12a355bff651#diff-fe18bdf6585d1a0d4d510f382a568c4428334d4ad941581ecc10ec60ccafca4aR4971-R4972 [2] 8270947: AArch64: C1: use zero_words to initialize all objects https://github.com/openjdk/jdk/commit/6c68ce2d396c6fe02201daf2bdb8c164de807cc1#diff-0f4150a9c607ccd590bf256daa800c0276144682a92bc6bdced5e8bc1bb81f3aR4625-R4626 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26917#discussion_r2555186546 From qpzhang at openjdk.org Mon Nov 24 09:22:48 2025 From: qpzhang at openjdk.org (Patrick Zhang) Date: Mon, 24 Nov 2025 09:22:48 GMT Subject: RFR: 8365991: AArch64: Ignore BlockZeroingLowLimit when UseBlockZeroing is false [v7] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 16:15:53 GMT, Andrew Haley wrote: >> Patrick Zhang has updated the pull request incrementally with one additional commit since the last revision: >> >> Refine the count types to pass mac and win builds >> >> Signed-off-by: Patrick Zhang > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 6236: > >> 6234: // The zero_blocks routine has already performed the necessary >> 6235: // adjustments to r10 and r11, ensuring they are correctly set >> 6236: // for subsequent processing. > > Suggestion: > > // A few words remain. zero_blocks() has adjusted r10 so that it > // points to the remaining words and adjusted the count in r11. Updated accordingly > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 6268: > >> 6266: // and necessary unrolled str/stp expanding when the condition is not met. >> 6267: // This approach also helps prevent sudden increases in code cache size >> 6268: // when zeroing large memory areas in many places. > > Suggestion: > > // There is no need to check UseBlockZeroing here because that is > // delegated to the zero_blocks stub. The code here is inlined, so > // it is important to keep it small. 
Updated > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 6273: > >> 6271: result = zero_words(r10, r11); >> 6272: } else { >> 6273: #ifndef PRODUCT > > What is this change for? The reason of why I swapped the `if` and `else` code block is: 1). Initially, I intended to add a check for `UseBlockZeroing` to determine whether to call `zero_words_reg_reg`. Swapping the `if` and `else` branches makes it easier to compare the behavior with and without this additional condition. 2). Later, we decided not to check `UseBlockZeroing` here but I still didn't roll back this change because the comments of warning `There is no need to check UseBlockZeroing..` should be placed before such an if condition, instead of the old one. > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 6281: > >> 6279: #endif >> 6280: // Use 16 words as the block size which is 128 bytes on 64-bit systems. >> 6281: // A complete loop body will be 8 STPs unrolled there. > > Suggestion: > > // Use 16 words (128 bytes) as the block size. Updated > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 6282: > >> 6280: // Use 16 words as the block size which is 128 bytes on 64-bit systems. >> 6281: // A complete loop body will be 8 STPs unrolled there. >> 6282: const int block_size = 16; > > Naming this constant `block_size` only adds to any confusion, IMO. I wondered why `MacroAssembler::zero_words` uses 16 words to do `stp` unrolling, while `generate_zero_blocks()` 8 words (`const int MacroAssembler::zero_words_block_size = 8;`), so defined this variable to compare `8 vs 16` but did not find obvious performance difference. Regarding the var name `block_size`, could `unroll` or `unroll_words` be better? > src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 694: > >> 692: // Process words with length exceeding the predefined >> 693: // block size threshold. The loop body will be unrolled based on >> 694: // the number of STPs calculated below. 
> > Suggestion: > > // Process any remaining blocks not handled by the stub. Updated ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26917#discussion_r2554847408 PR Review Comment: https://git.openjdk.org/jdk/pull/26917#discussion_r2554861246 PR Review Comment: https://git.openjdk.org/jdk/pull/26917#discussion_r2555316021 PR Review Comment: https://git.openjdk.org/jdk/pull/26917#discussion_r2554863484 PR Review Comment: https://git.openjdk.org/jdk/pull/26917#discussion_r2554943081 PR Review Comment: https://git.openjdk.org/jdk/pull/26917#discussion_r2554946033 From duke at openjdk.org Mon Nov 24 09:45:02 2025 From: duke at openjdk.org (Ruben) Date: Mon, 24 Nov 2025 09:45:02 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 [v4] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 23:56:48 GMT, Ruben wrote: >> The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. >> >> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. >> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > > Ruben has updated the pull request incrementally with one additional commit since the last revision: > > Refine `first_check_size` definitions Thank you both. I am planning to submit the `/integrate` request in a few hours. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28192#issuecomment-3569791332 From shade at openjdk.org Mon Nov 24 09:45:14 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 24 Nov 2025 09:45:14 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v5] In-Reply-To: References: Message-ID: > We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. > > The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier1` > - [ ] Linux x86_64 server fastdebug, `all` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision: - More label renames - Jump back to L_done without shortening branch - Indent comments - Rename labels - Merge branch 'master' into JDK-8372285-g1-barrier-micro - Adjust label name - Merge branch 'master' into JDK-8372285-g1-barrier-micro - Make some backward branches explicitly short - Comment - Shorten a few more branches - ... 
and 5 more: https://git.openjdk.org/jdk/compare/33fcc038...b4c98d88 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28446/files - new: https://git.openjdk.org/jdk/pull/28446/files/c23bac46..b4c98d88 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28446&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28446&range=03-04 Stats: 1556 lines in 19 files changed: 1169 ins; 205 del; 182 mod Patch: https://git.openjdk.org/jdk/pull/28446.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28446/head:pull/28446 PR: https://git.openjdk.org/jdk/pull/28446 From shade at openjdk.org Mon Nov 24 09:45:17 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 24 Nov 2025 09:45:17 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v4] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 18:27:14 GMT, Vladimir Kozlov wrote: >> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: >> >> - Adjust label name >> - Merge branch 'master' into JDK-8372285-g1-barrier-micro >> - Make some backward branches explicitly short >> - Comment >> - Shorten a few more branches >> - Also reflow generate_pre_barrier_slow_path, so it is obvious the branches are short >> - More touchups >> - Also optimize queue insertion >> - Touchups >> - WIP > > src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 193: > >> 191: // Is the previous value null? >> 192: __ testptr(pre_val, pre_val); >> 193: __ jccb(Assembler::equal, L_null); > > I know that this short jump will be fused to one instruction with testptr on modern x86. But you will have a jump-to-jump sequence. So you may win size-wise, but "throughput" could be worse. Especially if it is the "fast" path.
> > Can you check performance of these changes vs using `jcc(Assembler::equal, L_done);` here. Well, this is technically a slow-path, I have not been able to measure any performance impact on targeted write barrier microbenchmarks. This place contributes about 0.14 pp to code size, though, so it might be a wash in the grand scheme of things: # baseline nmethod code size : 5744336 bytes nmethod code size : 5744304 bytes nmethod code size : 5738864 bytes # short (-1.65%) nmethod code size : 5650688 bytes nmethod code size : 5650656 bytes nmethod code size : 5650688 bytes # long (-1.51%) nmethod code size : 5658856 bytes nmethod code size : 5658856 bytes nmethod code size : 5658856 bytes I reverted back to `jcc(..., L_done)` to avoid any perf regression risk. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2555449065 From shade at openjdk.org Mon Nov 24 09:48:52 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 24 Nov 2025 09:48:52 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v6] In-Reply-To: References: Message-ID: <6LdV5NiSfkvLTkYDsgV2jFyw43VFzOBzaGj2Enmgrnc=.b0ddbe6b-cd41-4913-867f-dec57ed79547@github.com> > We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. > > The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. 
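The code-density argument in this thread rests on the size difference between the two x86 conditional-jump encodings: the short form (what `jccb` requests) takes 2 bytes, the near form (`jcc`) takes 6 bytes. A minimal sketch of that arithmetic follows; the helper names are hypothetical and this is not how the HotSpot assembler is structured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if a branch displacement (measured from the end of the jump
// instruction) fits the signed 8-bit field of a short Jcc.
bool fits_rel8(int64_t disp) {
  return disp >= INT8_MIN && disp <= INT8_MAX;
}

// Encoded size in bytes of an x86 conditional jump in 64-bit mode:
//   short form: 1 opcode byte + rel8               = 2 bytes
//   near form:  2 opcode bytes (0x0F 0x8x) + rel32 = 6 bytes
std::size_t jcc_encoded_size(int64_t disp) {
  assert(disp >= INT32_MIN && disp <= INT32_MAX);  // must fit rel32
  return fits_rel8(disp) ? 2 : 6;
}

// Bytes saved when `count` near-form branches can be emitted short.
std::size_t bytes_saved(std::size_t count) {
  return count * (6 - 2);
}
```

Each branch that can be shortened saves 4 bytes, which only adds up because the barrier stubs are emitted many times across compiled code.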
> > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier1` > - [ ] Linux x86_64 server fastdebug, `all` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Indenting was still off ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28446/files - new: https://git.openjdk.org/jdk/pull/28446/files/b4c98d88..797ae9c4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28446&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28446&range=04-05 Stats: 7 lines in 1 file changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/28446.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28446/head:pull/28446 PR: https://git.openjdk.org/jdk/pull/28446 From epeter at openjdk.org Mon Nov 24 09:49:57 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 24 Nov 2025 09:49:57 GMT Subject: RFR: 8351889: C2 crash: assertion failed: Base pointers must match (addp 344) [v2] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 10:13:31 GMT, Roland Westrelin wrote: > > Do you know why we insert a new `CastPP` there, and why it is put not at the ctrl of the CastPP, but of the phi? I suppose the ctrl of the phi is correct, but we do lose information there, and that later prevents the `CastPP` to common. > > When the `Phi` is removed because all of its inputs are the same once uncasted, there is a risk of losing a dependency. To prevent that, a `CastPP` is inserted. All we know is that some casts along some inputs of the `Phi` may carry a dependency that we don't want to loose. The only possible control for the `CastPP` then is the one of the `Phi`. In general we can probably not do anything better. But in this case that fails here, we could have looked at both `CastPP`, and seen that they have the same ctrl, and used that one, no? But I'm not sure that is worth it yet. > The duplication comes from loop body cloning so I'm not sure how we could prevent the duplication. 
We could try to common the CastPP nodes once PhaseIdealLoop::peeled_dom_test_elim() is called. Right, that could be an option. Do you think that is worth it? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25386#issuecomment-3569810303 From jsikstro at openjdk.org Mon Nov 24 11:01:21 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Mon, 24 Nov 2025 11:01:21 GMT Subject: RFR: 8372150: Parallel: Tighten requirements around heap sizes with NUMA and Large Pages [v3] In-Reply-To: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com> References: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com> Message-ID: > Hello, > > Today, Parallel decides to opt out of using Large pages if the heap size, either minimum, initial or maximum, does not cover enough Large pages for all spaces. Additionally, if we don't get enough heap size for at least one OS page per MutableNUMASpace (one per NUMA-node), Parallel decides to run in a NUMA-degraded mode, where it skips allocating memory locally for some NUMA-nodes. Both of these issues are problematic if we want to start the JVM with a default initial heap size that is equal to the minimum heap size (see [JDK-8371986](https://bugs.openjdk.org/browse/JDK-8371986)). To solve this, we should consider making sure that the minimum heap size is always enough to cover precisely one page per space, where the page size may be Large or not. > > For completeness, when user-provided settings for UseNUMA, UseLargePages and heap sizes can't be satisfied at the same time, one must be prioritised over others. Today, we prioritise heap size settings over both UseNUMA and UseLargePages. This change suggests shifting the (primary) priority to UseNUMA and UseLargePages, by bumping MinHeapSize, InitialHeapSize and MaxHeapSize to an adequate number, if not already enough.
By bumping the minimum heap size to an adequate number, we are also bumping the lower-limit for the initial heap size and maximum heap size, which must be equal to or greater than the minimum heap size. > > However, a problem with this approach is that if the Large page size is very large (e.g., 512MB or 1GB), the minimum, initial, and maybe the maximum heap size will be bumped to a very large number as well. To mitigate this impact, we look at what Large page size can be used based on the maximum heap size instead. This is because, running the JVM in default configuration, the maximum heap size will almost always be large enough to cover enough Large pages, so we bump the minimum and initial to that value instead. But, if the maximum heap size is not enough, we opt-out of using Large pages, which is consistent with the old behavior. > >
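One possible reading of the policy described above, as a minimal sketch: pick the largest page size that the maximum heap size can cover for every space, then raise the minimum and initial sizes to match. Everything here is an assumption for illustration: `HeapSizes`, `choose_page_size`, `bump_for_page_size` and the constant `kSpaces = 4` (taken from the "4 * page-size" log message quoted in this thread) are hypothetical and not taken from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the three heap-size settings; not HotSpot code.
struct HeapSizes {
  std::size_t min_size;
  std::size_t initial_size;
  std::size_t max_size;
};

// Assumption: four spaces must each cover at least one page.
constexpr std::size_t kSpaces = 4;

// Returns the largest candidate page size for which max_heap covers one
// page per space, or base_page if no large page size qualifies
// (that is, large pages are not used).
std::size_t choose_page_size(const std::vector<std::size_t>& large_page_sizes,
                             std::size_t base_page, std::size_t max_heap) {
  std::size_t best = base_page;
  for (std::size_t page : large_page_sizes) {
    if (page > best && max_heap / kSpaces >= page) {
      best = page;
    }
  }
  return best;
}

// Raises the sizes so that the minimum covers one page per space and
// min <= initial <= max still holds afterwards.
HeapSizes bump_for_page_size(HeapSizes h, std::size_t page_size) {
  assert(page_size > 0);
  const std::size_t needed = kSpaces * page_size;
  h.min_size = std::max(h.min_size, needed);
  h.initial_size = std::max(h.initial_size, h.min_size);
  h.max_size = std::max(h.max_size, h.initial_size);
  return h;
}
```

Under these assumptions, a `-Xms2M -Xmx1G` configuration with 2M large pages would keep large pages and have its minimum and initial sizes raised to 8M, while a 1G page size would be rejected because the maximum heap cannot cover four such pages.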
> > Min and Initial heap sizes before/after (expandable section) > > Before changes. We always get Min&Initial 2MB that we request: > > java -XX:+UseParallelGC -Xms2M -Xmx1G > Alignments: Space 512K, Heap 2M > Heap Min Capacity: 2M > Heap Initial Capacity: 2M > > java -XX:+UseParallelGC -XX:+UseLargePages -Xms2M -Xmx1G > MinHeapSize (2097152) must be large enough for 4 * page-size; Disabling UseLargePages for heap > Alignments: Space 512K, Heap 2M... Joel Sikström has updated the pull request incrementally with three additional commits since the last revision: - Choose large page size based on MaxHeapSize - Revert "8372150: Parallel: Tighten requirements around MinHeapSize with NUMA and Large Pages" This reverts commit c02e08ade597193d70d1eb21036845bdd0304d51. - Revert "Albert review feedback" This reverts commit 66928d22112c1ac516e4b654c28249fdedf0dba9. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28394/files - new: https://git.openjdk.org/jdk/pull/28394/files/66928d22..232f1a70 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28394&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28394&range=01-02 Stats: 195 lines in 16 files changed: 111 ins; 64 del; 20 mod Patch: https://git.openjdk.org/jdk/pull/28394.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28394/head:pull/28394 PR: https://git.openjdk.org/jdk/pull/28394 From mli at openjdk.org Mon Nov 24 11:34:11 2025 From: mli at openjdk.org (Hamlin Li) Date: Mon, 24 Nov 2025 11:34:11 GMT Subject: RFR: 8371869: RISC-V: too many warnings when build on BPI-F3 SBC [v3] In-Reply-To: References: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com> Message-ID: On Thu, 20 Nov 2025 11:19:33 GMT, Fei Yang wrote: >> Hi, please consider this riscv-specific change.
>> >> I witnessed 400+ warning messages when doing a native build on BPI-F3 SBC running kernel 6.6.63: >> >> `OpenJDK 64-Bit Server VM warning: Cannot enable UseZvfh, it's missing dependent extension(s) v (disabled), Zfh (enabled)` >> >> The warning messages indicate that we won't auto-enable extensions like `Zvfh` due to lack of vector support on old kernels. >> I think these warning messages could be confusing to people. It might be more reasonable to just log these messages. >> This also unifies the way of logging prefering `log_info`. It doesn't seem necessary to me to use `log_debug` in this case. >> >> After this change, the log on BPI-F3 SBC looks like: >> >> $ java -Xlog:all -version >> >> ...... >> [0.011s][info][os ] Linux kernels before 6.8.5 (current 6.6.63) have a known bug when using Vector and signals. >> [0.011s][info][os ] Vector not enabled automatically via hwprobe, but can be turned on with -XX:+UseRVV. >> [0.011s][info][os,cpu ] Enabled RV64 feature "a" >> [0.011s][info][os,cpu ] Enabled RV64 feature "c" >> [0.011s][info][os,cpu ] Enabled RV64 feature "d" >> [0.011s][info][os,cpu ] Enabled RV64 feature "f" >> [0.011s][info][os,cpu ] Enabled RV64 feature "i" >> [0.011s][info][os,cpu ] Enabled RV64 feature "m" >> [0.011s][info][os,cpu ] Enabled RV64 feature "Zba" >> [0.011s][info][os,cpu ] Enabled RV64 feature "Zbb" >> [0.011s][info][os,cpu ] Enabled RV64 feature "Zbs" >> [0.011s][info][os,cpu ] Enabled RV64 feature "Zfh" >> [0.011s][info][os,cpu ] Enabled RV64 feature "Zfhmin" >> [0.011s][info][os,cpu ] Disabled RV64 feature "Zvfh" (missing dependent extension(s): v (disabled), Zfh (enabled)) >> [0.011s][info][os,cpu ] Enabled RV64 feature "marchid" (-9223372035378380799) >> [0.011s][info][os,cpu ] Enabled RV64 feature "mimpid" (1152921505839391232) >> [0.011s][info][os,cpu ] Enabled RV64 feature "mvendorid" (1808) >> [0.011s][info][os,cpu ] Enabled RV64 feature "satp_mode" (39) >> [0.011s][info][os,cpu ] Enabled RV64 feature "unaligned_scalar" 
(3) >> [0.011s][info][os,cpu ] Enabled RV64 feature "zicboz_block_size" (64) >> [0.011s][info][os,cpu ] Zifencei not found, required by Linux, enabling. >> [0.012s][info][os,cpu ] CPU: total 8 (initial active 8) spacemit,x60 rv64 rva rvc rvd rvf rvi rvm zba zbb zbs zfh zfhmin >> ...... > > Fei Yang has updated the pull request incrementally with one additional commit since the last revision: > > Review Looks good. Thanks! ------------- Marked as reviewed by mli (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28340#pullrequestreview-3499929643 From mli at openjdk.org Mon Nov 24 11:56:26 2025 From: mli at openjdk.org (Hamlin Li) Date: Mon, 24 Nov 2025 11:56:26 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v6] In-Reply-To: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> Message-ID: <7kh5C9nj7bf6432cG35kDDvV6zhnKEspe8AcYetJ1do=.e1d9ebd3-d80d-4621-8c1e-c77dc721d0df@github.com> > Hi, > > This PR adds CMoveF/D on riscv, which enables vectorization of statements like `op_1 bop op_2 ? res_f_d_1 : res_f_d_2` in a loop. > > This PR is also a preparation for further vectorization in https://github.com/openjdk/jdk/pull/28231. > > Previously it was https://github.com/openjdk/jdk/pull/25341, but at that time C2 SLP had an issue with unsigned comparison, which is now fixed, so it's good to continue the work. > > # Test > ## Jtreg > > in progress... > > ## Performance > > Column name meanings: > * p: with patch > * p+v: with patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on > * m: without patch > * m+v: without patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on > > #### Average improvement > > NOTE: With only this PR, it brings a performance benefit in the cases of `CMoveF+CmpF`, `CMoveD+CmpD`, `CMoveF+CmpI`, `CMoveD+CmpL`.
The data below is based on fully implementing the vectorization of `CMoveI/L/F/D+CmpI/L/F/D`, which will be achieved by https://github.com/openjdk/jdk/pull/28231.
>
> For details, check the performance data in https://github.com/openjdk/jdk/pull/25341 on riscv.
>
> Opt (m/p) | Opt (m+v/p+v) | Opt (p/p+v) | Opt (m/p+v)
> -- | -- | -- | --
> 1.022782609 | 2.198717391 | 2.162673913 | 2.199
>
>

Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:

  fix is_unordered

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/28309/files
  - new: https://git.openjdk.org/jdk/pull/28309/files/572a7b74..46b32186

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28309&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28309&range=04-05

  Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/28309.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28309/head:pull/28309

PR: https://git.openjdk.org/jdk/pull/28309

From mli at openjdk.org Mon Nov 24 11:56:31 2025
From: mli at openjdk.org (Hamlin Li)
Date: Mon, 24 Nov 2025 11:56:31 GMT
Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v5]
In-Reply-To:
References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com>
Message-ID:

On Fri, 21 Nov 2025 03:35:03 GMT, Fei Yang wrote:

>> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>>
>> replace assert with log_warning
>
> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1590:
>
>> 1588: // jump if cmp1 < cmp2 or either is NaN
>> 1589: // not jump (i.e. move src to dst) if cmp1 >= cmp2
>> 1590: float_blt(cmp1, cmp2, no_set);
>
> I compared this with the existing `MacroAssembler::cmov_cmp_fp_ge` [1] and I witnessed some difference in the case of `NaN` handling.
In `MacroAssembler::cmov_cmp_fp_ge`, we set the `is_unordered` param to true when calling `float_blt` or `double_blt`, which is not the case here. I assume we need similar handling here as well, right? > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L1338 Make sense, fixed. > src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1636: > >> 1634: // jump if cmp1 <= cmp2 or either is NaN >> 1635: // not jump (i.e. move src to dst) if cmp1 > cmp2 >> 1636: float_ble(cmp1, cmp2, no_set); > > Same question here. Make sense, fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2556004073 PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2556004423 From epeter at openjdk.org Mon Nov 24 11:58:02 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 24 Nov 2025 11:58:02 GMT Subject: RFR: 8351889: C2 crash: assertion failed: Base pointers must match (addp 344) [v3] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 11:19:51 GMT, Roland Westrelin wrote: >> The test case has an out of loop `Store` with an `AddP` address >> expression that has other uses and is in the loop body. Schematically, >> only showing the address subgraph and the bases for the `AddP`s: >> >> >> Store#195 -> AddP#133 -> AddP#134 -> CastPP#110 >> -> CastPP#110 >> >> >> Both `AddP`s have the same base, a `CastPP` that's also in the loop >> body. >> >> That loop is a counted loop and only has 3 iterations so is fully >> unrolled. First, one iteration is peeled: >> >> >> /-> CastPP#110 >> Store#195 -> Phi#360 -> AddP#133 -> AddP#134 -> CastPP#110 >> -> AddP#277 -> AddP#278 -> CastPP#283 >> -> CastPP#283 >> >> >> >> The `AddP`s and `CastPP` are cloned (because in the loop body). As >> part of peeling, `PhaseIdealLoop::peeled_dom_test_elim()` is >> called. 
It finds the test that guards `CastPP#283` in the peeled >> iteration dominates and replaces the test that guards `CastPP#110` >> (the test in the peeled iteration is the clone of the test in the >> loop). That causes `CastPP#110`'s control to be updated to that of the >> test in the peeled iteration and to be yanked from the loop. So now >> `CastPP#283` and `CastPP#110` have the same inputs. >> >> Next unrolling happens: >> >> >> /-> CastPP#110 >> /-> AddP#400 -> AddP#401 -> CastPP#110 >> Store#195 -> Phi#360 -> Phi#477 -> AddP#133 -> AddP#134 -> CastPP#110 >> \ -> CastPP#110 >> -> AddP#277 -> AddP#278 -> CastPP#283 >> -> CastPP#283 >> >> >> >> `AddP`s are cloned once more but not the `CastPP`s because they are >> both in the peeled iteration now. A new `Phi` is added. >> >> Next igvn runs. It's going to push the `AddP`s through the `Phi`s. >> >> Through `Phi#477`: >> >> >> >> /-> CastPP#110 >> Store#195 -> Phi#360 -> AddP#510 -> Phi#509 -> AddP#401 -> CastPP#110 >> \ -> AddP#134 -> CastPP#110 >> -> AddP#277 -> AddP#278 -> CastPP#283 >> -> CastPP#283 >> >> >> >> Through `Phi#360`: >> >> >> /-> AddP#134 -> CastPP#110 >> /-> Phi#509 -> AddP#401 -> CastPP#110 >> Store#195 -> AddP#516 -> Phi#515 -> AddP#278 -> CastPP#283 >> -> Phi#514 -> CastPP#283 >> ... > > Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: > > - Merge branch 'master' into JDK-8351889 > - verif > - Merge branch 'master' into JDK-8351889 > - test seed > - more > - Merge branch 'master' into JDK-8351889 > - Merge branch 'master' into JDK-8351889 > - more > - test > - fix src/hotspot/share/opto/phaseX.cpp line 2085: > 2083: } > 2084: return false; > 2085: } Why not call it `verify_node_invariants_for`? You should also assert immediately. 
@benoitmaillard Is about to make that change for everything: https://github.com/openjdk/jdk/pull/28295 src/hotspot/share/opto/phaseX.hpp line 623: > 621: // '-XX:VerifyIterativeGVN=10000' > 622: return ((VerifyIterativeGVN % 100000) / 10000) == 1; > 623: } You will need to add extra documentation to the flag. And also there is a test that uses the flag. You should adjust it to enable this bit as well. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25386#discussion_r2556012167 PR Review Comment: https://git.openjdk.org/jdk/pull/25386#discussion_r2555714627 From fjiang at openjdk.org Mon Nov 24 14:51:00 2025 From: fjiang at openjdk.org (Feilong Jiang) Date: Mon, 24 Nov 2025 14:51:00 GMT Subject: RFR: 8371869: RISC-V: too many warnings when build on BPI-F3 SBC [v3] In-Reply-To: References: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com> Message-ID: On Thu, 20 Nov 2025 11:19:33 GMT, Fei Yang wrote: >> Hi, please consider this riscv-specific change. >> >> I witnessed 400+ warning messages when doing a native build on BPI-F3 SBC running kernel 6.6.63: >> >> `OpenJDK 64-Bit Server VM warning: Cannot enable UseZvfh, it's missing dependent extension(s) v (disabled), Zfh (enabled)` >> >> The warning messages indicate that we won't auto-enable extensions like `Zvfh` due to lack of vector support on old kernels. >> I think these warning messages could be confusing to people. It might be more reasonable to just log these messages. >> This also unifies the way of logging prefering `log_info`. It doesn't seem necessary to me to use `log_debug` in this case. >> >> After this change, the log on BPI-F3 SBC looks like: >> >> $ java -Xlog:all -version >> >> ...... >> [0.011s][info][os ] Linux kernels before 6.8.5 (current 6.6.63) have a known bug when using Vector and signals. >> [0.011s][info][os ] Vector not enabled automatically via hwprobe, but can be turned on with -XX:+UseRVV. 
>> [0.011s][info][os,cpu ] Enabled RV64 feature "a" >> [0.011s][info][os,cpu ] Enabled RV64 feature "c" >> [0.011s][info][os,cpu ] Enabled RV64 feature "d" >> [0.011s][info][os,cpu ] Enabled RV64 feature "f" >> [0.011s][info][os,cpu ] Enabled RV64 feature "i" >> [0.011s][info][os,cpu ] Enabled RV64 feature "m" >> [0.011s][info][os,cpu ] Enabled RV64 feature "Zba" >> [0.011s][info][os,cpu ] Enabled RV64 feature "Zbb" >> [0.011s][info][os,cpu ] Enabled RV64 feature "Zbs" >> [0.011s][info][os,cpu ] Enabled RV64 feature "Zfh" >> [0.011s][info][os,cpu ] Enabled RV64 feature "Zfhmin" >> [0.011s][info][os,cpu ] Disabled RV64 feature "Zvfh" (missing dependent extension(s): v (disabled), Zfh (enabled)) >> [0.011s][info][os,cpu ] Enabled RV64 feature "marchid" (-9223372035378380799) >> [0.011s][info][os,cpu ] Enabled RV64 feature "mimpid" (1152921505839391232) >> [0.011s][info][os,cpu ] Enabled RV64 feature "mvendorid" (1808) >> [0.011s][info][os,cpu ] Enabled RV64 feature "satp_mode" (39) >> [0.011s][info][os,cpu ] Enabled RV64 feature "unaligned_scalar" (3) >> [0.011s][info][os,cpu ] Enabled RV64 feature "zicboz_block_size" (64) >> [0.011s][info][os,cpu ] Zifencei not found, required by Linux, enabling. >> [0.012s][info][os,cpu ] CPU: total 8 (initial active 8) spacemit,x60 rv64 rva rvc rvd rvf rvi rvm zba zbb zbs zfh zfhmin >> ...... > > Fei Yang has updated the pull request incrementally with one additional commit since the last revision: > > Review Marked as reviewed by fjiang (Committer). 
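The `Disabled RV64 feature "Zvfh"` line in the quoted log reads like the outcome of a dependency check: the extension is only enabled when every extension it depends on is enabled. The following is a minimal sketch of that idea, not the HotSpot RISC-V code; `FeatureState` and `describe_feature` are hypothetical names introduced only for illustration, and only the message wording is taken from the log above.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical model (an assumption, not a HotSpot type): extension name -> enabled?
using FeatureState = std::map<std::string, bool>;

// Builds the log text for `feature`, given the extensions it depends on.
// An extension missing from `state` is treated as disabled.
std::string describe_feature(const std::string& feature,
                             const std::vector<std::string>& deps,
                             const FeatureState& state) {
  bool all_enabled = true;
  std::string detail;
  for (const std::string& dep : deps) {
    auto it = state.find(dep);
    bool enabled = (it != state.end()) && it->second;
    all_enabled = all_enabled && enabled;
    if (!detail.empty()) detail += ", ";
    detail += dep + (enabled ? " (enabled)" : " (disabled)");
  }
  if (all_enabled) {
    return "Enabled RV64 feature \"" + feature + "\"";
  }
  // Report the state of every dependency, as the quoted log line does.
  return "Disabled RV64 feature \"" + feature +
         "\" (missing dependent extension(s): " + detail + ")";
}
```

With `v` disabled and `Zfh` enabled, calling `describe_feature("Zvfh", {"v", "Zfh"}, state)` yields the same text as the `Zvfh` line in the log, which in the PR is emitted at info level instead of as a VM warning.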
------------- PR Review: https://git.openjdk.org/jdk/pull/28340#pullrequestreview-3500798209 From mbaesken at openjdk.org Mon Nov 24 14:53:33 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 24 Nov 2025 14:53:33 GMT Subject: RFR: 8371893: [macOS aarch64] use dead_strip linker option to reduce binary size [v4] In-Reply-To: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com> References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com> Message-ID: > The dead_strip linker option on macOS removes functions and data that are unreachable by the entry point or exported symbols. > Setting it can reduce the size of some binaries we generate quite a lot, for example (product build, Xcode 15 is used) : > (before -> after setting the option) > > 1.4M -> 1.1M images/jdk/lib/libfontmanager.dylib > 264K -> 248K images/jdk/lib/libjavajpeg.dylib > 152K -> 132K images/jdk/lib/libjli.dylib > 388K -> 296K images/jdk/lib/liblcms.dylib > 164K -> 128K images/jdk/lib/libzip.dylib > > > and libjvm : > > 20M -> 18M images/jdk/lib/server/libjvm.dylib > 146M -> 137M images/jdk/lib/server/libjvm.dylib.dSYM Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Set the -dead_strip linker option only for the JDK libs ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28319/files - new: https://git.openjdk.org/jdk/pull/28319/files/b41966b8..b63b9ca8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28319&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28319&range=02-03 Stats: 13 lines in 2 files changed: 2 ins; 10 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28319.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28319/head:pull/28319 PR: https://git.openjdk.org/jdk/pull/28319 From mbaesken at openjdk.org Mon Nov 24 15:01:56 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 24 
Nov 2025 15:01:56 GMT
Subject: RFR: 8371893: [macOS aarch64] use dead_strip linker option to reduce binary size
In-Reply-To:
References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com>
Message-ID:

On Fri, 21 Nov 2025 08:27:51 GMT, Matthias Baesken wrote:

> Maybe we should for now limit the dead_strip to the JDK native libs ?

Done; additionally, I set the dead_strip flag only for release builds (for (fast)debug builds we are most likely fine with larger binaries and debuginfo files).

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28319#issuecomment-3571211348

From duke at openjdk.org Mon Nov 24 15:46:20 2025
From: duke at openjdk.org (Zihao Lin)
Date: Mon, 24 Nov 2025 15:46:20 GMT
Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v13]
In-Reply-To:
References:
Message-ID:

> This patch removes the slice parameter from LoadNode::make.
>
> I have done more work, which removes the slice parameter from StoreNode::make.
>
> Mentioned in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805
>
> Hi team, I am new, and I'd appreciate any guidance. Thanks a lot!
Zihao Lin has updated the pull request incrementally with one additional commit since the last revision: Fix test failed ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24258/files - new: https://git.openjdk.org/jdk/pull/24258/files/329e290a..35ec9135 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=11-12 Stats: 21 lines in 1 file changed: 14 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From eastigeevich at openjdk.org Mon Nov 24 15:49:29 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 24 Nov 2025 15:49:29 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v2] In-Reply-To: References: Message-ID: On Sun, 23 Nov 2025 14:34:44 GMT, Andrew Haley wrote: > > > If I understand correctly, the whole icache is flushed, so the actual nmethod* is irrelevant. So instead of `ICacheInvalidationContext icic(nm)` for every different "nm", can't we just do `ICacheInvalidationContext icic(true)` one time, outside the nmethod loop? > > > > > > We can't disarm an nmethod before flushing the instructions. I don't think we flush the whole icache. We invalidate all translation entries (all VAs, all possible levels). I have not found any information that it would flush icache. I think TLBI is used as a heavyweight serialization barrier. It might force cores to synchronize their instruction fetch streams. We broadcast a TLB invalidation and wait for its completion. I think hardware data and instruction cache coherence still work. I also found https://lore.kernel.org/linux-arm-kernel/20191017174300.29770-1-james.morse at arm.com/ with more details on the errata workaround. These details look aligned with the hypothesis of a synchronization event to enforce ordering. 
The problem is:

> Neoverse-N1 cores with the 'COHERENT_ICACHE' feature may fetch stale instructions when software depends on prefetch-speculation-protection instead of explicit synchronization.

Prefetch-speculation-protection:

> JIT can generate new instructions at some new location, then update a
> branch in the executable instructions to point at the new location.
>
> Prefetch-speculation-protection guarantees that if another CPU sees
> the new branch, it also sees the new instructions that were written
> there.

I think that, in the case of armed/disarmed nmethods, we have explicit synchronization, not prefetch-speculation-protection. No thread executes an armed nmethod. If I am correct, disarming is the process of releasing an nmethod to allow its execution.

> > Sure, but you can't patch an nmethod until every thread that might be executing it has stopped. So if the threads are all stopped, why not postpone the disarmament until the end, just before you flush?

If my understanding is correct, we cannot disarm before flushing because disarming is like releasing a critical section. We must guarantee that all changes we've made are visible to all observers when we leave the critical section.

As I wrote in the JBS issue, we can:
- Get all nmethods armed
- Patch all of them
- Invalidate TLB
- Get all nmethods disarmed

This would complicate the fix a lot, and the performance gain is not worth it. I measured the theoretical performance when we don't do any invalidation: it is 3% - 4% better than the approach in this PR (invalidate TLB per nmethod).
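To make the trade-off concrete, here is a minimal sketch that contrasts the per-nmethod scheme with the four-step batched scheme listed above by counting invalidations. It is only an illustration under stated assumptions: `FakeNMethod`, `InvalidationCounter`, `patch_per_nmethod` and `patch_batched` are hypothetical names, nothing here is HotSpot code, and no real cache or TLB maintenance is performed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an nmethod (an assumption, not the HotSpot type).
struct FakeNMethod {
  bool armed = false;
  bool patched = false;
};

// Counts how many (expensive) invalidations would be issued.
struct InvalidationCounter {
  std::size_t count = 0;
  void invalidate() { ++count; }
};

// Per-nmethod scheme: arm, patch, invalidate, then disarm each nmethod in turn.
std::size_t patch_per_nmethod(std::vector<FakeNMethod>& nms) {
  InvalidationCounter ic;
  for (FakeNMethod& nm : nms) {
    nm.armed = true;
    nm.patched = true;
    ic.invalidate();   // changes must be visible before the nmethod is released
    nm.armed = false;  // disarming acts like leaving a critical section
  }
  return ic.count;
}

// Batched scheme: arm all, patch all, invalidate once, disarm all.
std::size_t patch_batched(std::vector<FakeNMethod>& nms) {
  InvalidationCounter ic;
  for (FakeNMethod& nm : nms) nm.armed = true;
  for (FakeNMethod& nm : nms) nm.patched = true;
  if (!nms.empty()) ic.invalidate();  // single invalidation covers all patches
  for (FakeNMethod& nm : nms) nm.armed = false;
  return ic.count;
}
```

For N nmethods the first function issues N invalidations and the second issues one. The sketch says nothing about the extra bookkeeping needed to keep every nmethod armed across the whole batch, which is the complication mentioned above.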
------------- PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3571471651 From mpowers at openjdk.org Mon Nov 24 16:32:52 2025 From: mpowers at openjdk.org (Mark Powers) Date: Mon, 24 Nov 2025 16:32:52 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v3] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 22:55:07 GMT, Volodymyr Paprotski wrote: >> - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline >> - `SignatureBench.MLDSA` is 1.2x-2.2x faster >> - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) >> - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version >> - `SignatureBench.MLDSA` is upto 5% faster, never slower >> >> Note on intrinsic: >> - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. >> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 >> >> Tests and benchmarks: >> - Added a fuzz test to ensure Java and intrinsic produces exactly same result >> - Added benchmark to measure the performance of intrinsic itself >> >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" > > Volodymyr Paprotski has updated the pull request incrementally with one additional commit since 
the last revision: > > next set of comments Always faster and never slower: SignatureBench.MLDSA with `+UseDilithiumIntrinsics` shows an average 1.61% improvement across all algorithms and data sizes. Measuring SignatureBench.MLDSA against a baseline build without the fix, shows an average 2.24% improvement across all algorithms and data sizes. There's nothing special about my benchmark. It's the one in OpenJDK (javax.crypto.full.SignatureBench). Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single ssbd mba ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req pku ospke avx512_vnni md_clear flush_l1d arch_capabilities ------------- PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3571668350 From duke at openjdk.org Mon Nov 24 16:42:40 2025 From: duke at openjdk.org (Ferenc Rakoczi) Date: Mon, 24 Nov 2025 16:42:40 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v3] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 22:55:07 GMT, Volodymyr Paprotski wrote: >> - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline >> - `SignatureBench.MLDSA` is 1.2x-2.2x faster >> - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) >> - AVX512 intrinsic improvements are 1.24x-1.5x faster 
then current version >> - `SignatureBench.MLDSA` is upto 5% faster, never slower >> >> Note on intrinsic: >> - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. >> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 >> >> Tests and benchmarks: >> - Added a fuzz test to ensure Java and intrinsic produces exactly same result >> - Added benchmark to measure the performance of intrinsic itself >> >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" > > Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: > > next set of comments Good work! I just found a few typos in the comments. src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 88: > 86: // +-----+-----+-----+-----+----- > 87: // > 88: // NOTE: size 0 and 1 are used for initial and final shuffles respectivelly of Typo: respectivelly -> respectively src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 248: > 246: // We do Montgomery multiplications of two AVX registers in 4 steps: > 247: // 1. Do the multiplications of the corresponding even numbered slots into > 248: // the odd numbered slots of a scratch2 register. 
Typo: scratch2 -> scratch src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 249: > 247: // 1. Do the multiplications of the corresponding even numbered slots into > 248: // the odd numbered slots of a scratch2 register. > 249: // 2. Swap the even and odd numbered slots of the original input registers.* Typo: unnecessary '*' at the end src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 250: > 248: // the odd numbered slots of a scratch2 register. > 249: // 2. Swap the even and odd numbered slots of the original input registers.* > 250: // 3. Similar to step 1, but into output register. Typo: into output register -> into an output register src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 253: > 251: // 4. Combine the outputs of step 1 and step 3 into the output of the Montgomery > 252: // multiplication. > 253: // (*For levels 0-6 in the Ntt and levels 1-7 of the inverse Ntt, need NOT swap Typo: unnecessary '(*' at the beginning src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 282: > 280: const XMMRegister* scratch = scratch1 == input1 ? output: scratch1; > 281: > 282: // scratch = input1_even*intput2_even Suggestion: // scratch = input1_even * intput2_even src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 479: > 477: // level 0 - 128 > 478: // scratch1 = coeffs3 * zetas1 > 479: // coeffs3, coeffs1 = coeffs1?scratch1 Suggestion: // coeffs3, coeffs1 = coeffs1 ? 
scratch1 src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 524: > 522: // coeffs1_2 = coeffs1_2 + scratch1 > 523: loadXmms(Zetas3, zetas, level * 512, vector_len, _masm); > 524: shuffle(Scratch1, Coeffs1_2, Coeffs2_2, distance * 32); //Coeffs2_2 freed Suggestion: // Coeffs2_2 freed src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 529: > 527: > 528: loadXmms(Zetas3, zetas, 4*64 + level * 512, vector_len, _masm); > 529: shuffle(Scratch1, Coeffs3_2, Coeffs4_2, distance * 32); //Coeffs4_2 freed Suggestion: // Coeffs4_2 freed src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 554: > 552: const XMMRegister Coeffs2_2[] = {xmm4, xmm5, xmm6, xmm7}; > 553: > 554: // Since we cannot fit the entire payload into registers, we process process input -> process the input src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 555: > 553: > 554: // Since we cannot fit the entire payload into registers, we process > 555: // input in two stages. First half, load 8 registers 32 integers each apart. First half -> For the first half src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 557: > 555: // input in two stages. First half, load 8 registers 32 integers each apart. 
> 556: // With one load, we can process level 0-2 (128-, 64- and 32-integers apart) > 557: // Remaining levels, load 8 registers from consecutive memory (16-, 8-, 4-, Remaining -> For the remaining src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 558: > 556: // With one load, we can process level 0-2 (128-, 64- and 32-integers apart) > 557: // Remaining levels, load 8 registers from consecutive memory (16-, 8-, 4-, > 558: // 2-, 1-integer appart) appart -> apart src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 559: > 557: // Remaining levels, load 8 registers from consecutive memory (16-, 8-, 4-, > 558: // 2-, 1-integer appart) > 559: // Levels 5, 6, 7 (4-, 2-, 1-integer appart) require shuffles within registers appart -> apart src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 560: > 558: // 2-, 1-integer appart) > 559: // Levels 5, 6, 7 (4-, 2-, 1-integer appart) require shuffles within registers > 560: // Other levels, shuffles can be done by re-aranging register order Other -> on the other re-aranging register order -> rearranging the register order src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 562: > 560: // Other levels, shuffles can be done by re-aranging register order > 561: > 562: // Four batches of 8 registers each, 128 bytes appart appart -> apart src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 701: > 699: // In each of these iterations half of the coefficients are added to and > 700: // subtracted from the other half of the coefficients then the result of > 701: // the substration is (Montgomery) multiplied by the corresponding zetas. 
substration -> subtraction (I know this was in my own comment :-( ) src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 850: > 848: } > 849: > 850: // Four batches of 8 registers each, 128 bytes appart appart -> apart ------------- PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3571728756 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556771999 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556825899 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556836110 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556839540 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556845331 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556853907 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556865521 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556913637 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556915972 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556943987 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556925142 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556945036 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556949814 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556953155 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556942168 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556956323 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556978873 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2556961642 From duke at openjdk.org Mon Nov 24 16:50:40 2025 From: duke at openjdk.org (duke) Date: Mon, 24 Nov 2025 16:50:40 GMT Subject: RFR: 8371458: [REDO] Remove exception handler stub code in C2 [v4] In-Reply-To: References: 
Message-ID: <9kFgZAXhct0uyRPPomMDRX4VXd2Xd2GhosMyIuDb4qM=.d6caf7cd-8360-4aef-90ee-0029c12f0f32@github.com> On Thu, 20 Nov 2025 23:56:48 GMT, Ruben wrote: >> The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. >> >> The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. >> >> According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. > > Ruben has updated the pull request incrementally with one additional commit since the last revision: > > Refine `first_check_size` definitions @ruben-arm Your change (at version 00ea0e143da03016531e301db673b1893cf9be64) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28192#issuecomment-3571760781 From duke at openjdk.org Mon Nov 24 17:02:07 2025 From: duke at openjdk.org (Ruben) Date: Mon, 24 Nov 2025 17:02:07 GMT Subject: Integrated: 8371458: [REDO] Remove exception handler stub code in C2 In-Reply-To: References: Message-ID: <4DuLwlcmvr3XcIxJnZPbuDumV2OiLwGZzM3N-zkS5-E=.16a23abc-f041-4321-8638-ee18429573c2@github.com> On Fri, 7 Nov 2025 11:07:40 GMT, Ruben wrote: > The original fix [JDK-8365047](https://bugs.openjdk.org/browse/JDK-8365047) was backed out by [JDK-8371388](https://bugs.openjdk.org/browse/JDK-8371388), this is the REDO. > > The C2 exception handler stub code is only a trampoline to the generated exception handler blob. This change removes the extra step on the way to the generated blob. 
> > According to some comments in the source code, the exception handler stub code used to be patched upon deoptimization, however presumably these comments are outdated as the patching upon deoptimization happens for post-call NOPs only. This pull request has now been integrated. Changeset: 21772600 Author: Ruben Ayrapetyan Committer: Martin Doerr URL: https://git.openjdk.org/jdk/commit/217726009492af5a1143c98b97cc39b580850c5d Stats: 640 lines in 46 files changed: 334 ins; 218 del; 88 mod 8371458: [REDO] Remove exception handler stub code in C2 Co-authored-by: Martin Doerr Reviewed-by: mdoerr, dlong ------------- PR: https://git.openjdk.org/jdk/pull/28192 From vpaprotski at openjdk.org Mon Nov 24 17:19:12 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Mon, 24 Nov 2025 17:19:12 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v3] In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 16:28:44 GMT, Mark Powers wrote: > SignatureBench.MLDSA with `+UseDilithiumIntrinsics` shows an average 1.61% improvement across all algorithms and data sizes. Measuring SignatureBench.MLDSA against a baseline build without the fix, shows an average 2.24% improvement across all algorithms and data sizes. Need bit of clarification.. (I think you are saying there is a regression?). - `+UseDilithiumIntrinsics` should be redundant (i.e. `vm_version_x86.cpp` should automatically detect and turn the feature on). - So if I read correctly.. the baseline measured is already has the original intrinsics (implicitly) enabled.. - therefore there is a 2.24% noise in the benchmark? In my measurements for AVX512 parts, I had seen between 0%->6% across `SignatureBench.MLDSA` - (some variation on desktop-vs-server parts..) 
- `SignatureBench.MLDSA.verify` was worse, only 0->2% depending on keysize (iirc, bigger portion of benchmark was in SHA3 instead) - `SignatureBench.MLDSA.sign` was better, 4-6% (also depending on datasize) That is also why I had included the other (deleted) microbenchmark.. `SignatureBench.MLDSA` has a lot of 'other things' (e.g. SHA3) also happening, so the AVX512 intrinsic changes were harder to differentiate from noise.. - I had measured ~25%-50% improvement on purely the 5 intrinsics changed.. Hence the claim 'never worse'.. A more precise claim..: - "New intrinsics seem to be better, but (at least for AVX512) existing intrinsics were already plenty good for MLDSA" ------------- PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3571871477 From eastigeevich at openjdk.org Mon Nov 24 17:49:27 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 24 Nov 2025 17:49:27 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v3] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 16:30:23 GMT, Evgeny Astigeevich wrote: >> src/hotspot/cpu/aarch64/icache_aarch64.hpp line 63: >> >>> 61: // the performance impact due to this workaround." >>> 62: // >>> 63: // As the address for icache invalidation is not relevant, we use the nmethod's code start address. >> >>> As the address for icache invalidation is not relevant >> >> Is this only because of the Neoverse-N1 workaround? >> >> _If that is the case we could reach this point if `NeoverseN1Errata1542419` is either set by the user or mislabeled on some CPU without this workaround._ > >> Is this only because of the Neoverse-N1 workaround? > > Yes, it is. > >> If that is the case we could reach this point if NeoverseN1Errata1542419 is either set by the user or mislabeled on some CPU without this workaround. > > We only set NeoverseN1Errata1542419 to true if CPU is Neoverse N1 with the errata and it is not set by an user. 
We rely on Linux kernel cpuinfo correctly providing us information about Neoverse N1 revision. I think it's worth to check explicitly all affected revisions. This will mitigate Linux kernels not correctly setting revisions. > We can issue a warning if an user sets it to true and CPU is not Neoverse N1 with the errata. I added explicit checks of Neoverse N1 revisions affected by the errata. Also a warning will be issued if NeoverseN1Errata1542419 is set for CPU not having the errata. I don't exit VM because an user can run some experiments using this option. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2557189615 From eastigeevich at openjdk.org Mon Nov 24 17:49:24 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 24 Nov 2025 17:49:24 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v3] In-Reply-To: References: Message-ID: > Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1. > > Neoverse-N1 implementations mitigate erratum 1542419 with a workaround: > - Disable coherent icache. > - Trap IC IVAU instructions. > - Execute: > - `tlbi vae3is, xzr` > - `dsb sy` > > `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address). It waits for all memory accesses using in-scope old translation information to complete before it is considered complete. > > As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests: > > "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround." 
>
> This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions.
>
> Changes include:
>
> * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization.
> * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address.
> * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures.
> * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk.
>
> Benchmarking results: Neoverse-N1 r3p1 (Graviton 2)
>
> - Baseline
>
> $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGC...
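
The batching idea in the PR description above can be illustrated with a small standalone sketch. This is not the PR's `ICacheInvalidationContext`; the class name `DeferredICacheInvalidation`, the helper `patch_n`, and the counting stub `invalidate_icache_stub` are all hypothetical, and the real invalidation is replaced by a counter so the effect of deferral is visible.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the expensive platform operation. On affected Neoverse N1
// parts every trapped IC IVAU costs a TLB invalidation; here we only count.
static int g_invalidations = 0;
static void invalidate_icache_stub(const void* /*any_code_address*/) {
  ++g_invalidations;
}

// Hypothetical scope object (assumption, not HotSpot code): collects
// "something was patched" and issues a single invalidation at scope exit.
class DeferredICacheInvalidation {
  const void* _code_start;  // any address works, so use the code start
  bool _defer;
  bool _pending = false;
 public:
  DeferredICacheInvalidation(const void* code_start, bool defer)
    : _code_start(code_start), _defer(defer) {}

  // Called after each patched instruction.
  void patched(const void* addr) {
    if (_defer) {
      _pending = true;               // remember that code changed
    } else {
      invalidate_icache_stub(addr);  // eager: one invalidation per patch
    }
  }

  ~DeferredICacheInvalidation() {
    if (_pending) {
      invalidate_icache_stub(_code_start);  // one invalidation per batch
    }
  }
};

// Patches up to 16 dummy instructions; returns the invalidations issued.
inline int patch_n(int n, bool defer) {
  static uint32_t code[16];
  int before = g_invalidations;
  {
    DeferredICacheInvalidation ctx(code, defer);
    for (int i = 0; i < n && i < 16; i++) {
      code[i] = 0xD503201F;  // AArch64 NOP encoding, used as dummy patch data
      ctx.patched(&code[i]);
    }
  }
  return g_invalidations - before;
}
```

With deferral enabled, patching eight locations costs one invalidation instead of eight, which is the effect the erratum notice asks the software to aim for.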
Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: Explicitly check Neoverse N1 revisions affected by errata 1542419 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28328/files - new: https://git.openjdk.org/jdk/pull/28328/files/b60317e9..20480771 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=01-02 Stats: 21 lines in 1 file changed: 12 ins; 3 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/28328.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28328/head:pull/28328 PR: https://git.openjdk.org/jdk/pull/28328 From mpowers at openjdk.org Mon Nov 24 17:57:44 2025 From: mpowers at openjdk.org (Mark Powers) Date: Mon, 24 Nov 2025 17:57:44 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v3] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 22:55:07 GMT, Volodymyr Paprotski wrote: >> - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline >> - `SignatureBench.MLDSA` is 1.2x-2.2x faster >> - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) >> - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version >> - `SignatureBench.MLDSA` is upto 5% faster, never slower >> >> Note on intrinsic: >> - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. 
>> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 >> >> Tests and benchmarks: >> - Added a fuzz test to ensure Java and intrinsic produces exactly same result >> - Added benchmark to measure the performance of intrinsic itself >> >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" > > Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: > > next set of comments The 2.24% improvement is the difference between `+UseDilithiumIntrinsics` and `-UseDilithiumIntrinsics.` I just repeated the testing that you documented in the description section of this PR on a different machine. My baseline is simply a build without your changes. I compared this with a build containing your changes and see a 2.24% improvement. Verification showed the least amount of improvement (same as what you observed). "never worse" is just my way of saying "always faster". 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3572037049

From coleenp at openjdk.org  Mon Nov 24 19:04:42 2025
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Mon, 24 Nov 2025 19:04:42 GMT
Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v18]
In-Reply-To:
References:
Message-ID:

On Thu, 20 Nov 2025 08:36:02 GMT, Johan Sjölen wrote:

>> Hi,
>>
>> This is a refactoring of the way that we store the Bootstrap method attribute in the ConstantPool class. We used to have a single `Array<u2>` which was divided into a section of `u4` offsets and a section which was the actual data. In this refactoring we make this split more clear, by actually allocating an `Array<u4>` to store the offsets in and an `Array<u2>` to store the data in. These arrays are then put into a `BSMAttributeEntries` class, which allows us to separate out the API from that of the rest of the `ConstantPool`.
>>
>> We had multiple instances of the code knowing the layout of the operands array and using this to do 'clever' ways of copying and inserting data into it. See `ConstantPool::copy_operands` and `ConstantPool::resize_operands`. I felt like we could do things in a simpler way, so I added the `start_/end_extension` protocol and added the `InsertionIterator` for this. See `ClassFileParser::parse_classfile_bootstrap_methods_attribute` for how this works. I put several relevant definitions into the inline file in hopes of encouraging the compiler to optimize these appropriately.
>>
>> For the Java SA code, I had to add a `U4Array` class. I also had to fix the vmstructs definitions, etc.
>>
>> On the whole, while this code is a bit less terse, I think it's a good API improvement and the underlying implementation of splitting up the operands array is also an improvement.
>>
>> Testing: Oracle Tier1-Tier5 has been run successfully multiple times. Before integration, I will merge with master and run these tiers again.
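
As a rough illustration of the split described above (one array of `u4` offsets, one array of data), here is a minimal standalone sketch. It is not HotSpot's `BSMAttributeEntries`: the class name `BsmEntriesSketch`, its methods, the use of `std::vector`, and the per-entry layout (bootstrap method reference, argument count, then arguments, each 16 bits wide) are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative only: a simplified stand-in for the two-array layout.
class BsmEntriesSketch {
  std::vector<uint32_t> _offsets;  // _offsets[i] = start of entry i in _data
  std::vector<uint16_t> _data;     // per entry: bsm_ref, argc, argc arguments
 public:
  // Appends one entry and returns its index.
  size_t append(uint16_t bsm_ref, const std::vector<uint16_t>& args) {
    _offsets.push_back(static_cast<uint32_t>(_data.size()));
    _data.push_back(bsm_ref);
    _data.push_back(static_cast<uint16_t>(args.size()));
    _data.insert(_data.end(), args.begin(), args.end());
    return _offsets.size() - 1;
  }

  size_t number_of_entries() const { return _offsets.size(); }

  // Accessors never need to know where the offsets section ends,
  // because offsets and data no longer share one array.
  uint16_t bsm_ref(size_t i) const { return _data[_offsets[i]]; }
  uint16_t argument_count(size_t i) const { return _data[_offsets[i] + 1]; }
  uint16_t argument(size_t i, size_t n) const {
    assert(n < argument_count(i));
    return _data[_offsets[i] + 2 + n];
  }
};
```

Appending an entry in this layout touches only the tail of each array, whereas a single combined array would need the data section shifted whenever the offsets section grows.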
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>
>   IDE doesn't help you with VM structs!

Looks good! This is great work.

-------------

Marked as reviewed by coleenp (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/27198#pullrequestreview-3501798483

From dfuchs at openjdk.org  Mon Nov 24 19:29:15 2025
From: dfuchs at openjdk.org (Daniel Fuchs)
Date: Mon, 24 Nov 2025 19:29:15 GMT
Subject: RFR: 8370691: Add new Float16Vector type and enable intrinsification of vector operations supported by auto-vectorizer [v3]
In-Reply-To: <15AReOBUAseO-BiCWHW7N-OSOcknDc0Box3c90cXRZU=.5d7341db-94ea-4cdf-b3cd-fabe414dd88d@github.com>
References: <15AReOBUAseO-BiCWHW7N-OSOcknDc0Box3c90cXRZU=.5d7341db-94ea-4cdf-b3cd-fabe414dd88d@github.com>
Message-ID: <_SMDMjDoXuDI_Sujt62HD_YewzTQQlvqMSkpffJKq3A=.64a03981-30d1-48e5-a767-d4121c617296@github.com>

On Thu, 13 Nov 2025 09:27:02 GMT, Jatin Bhateja wrote:

>>> > > Some quick comments.
>>> > > We should be consistent in the naming, and rename `Halfloat*` to `Float16*`.
>>> >
>>> >
>>> > I concur, especially since there are multiple 16-bit floating-point formats in use including the IEEE 754 float16 as well as bfloat16.
>>>
>>> There are nomenclature issues that I am facing. Currently, all the Float16 concrete classes use the Halffloat prefix i.e., Halffloat64Vector, Halffloat128Vector; converting these to Float16 looks a little confusing, i.e., Float1664Vector, Float16128Vector, etc Kindly suggest a better name to represent these classes.
>>
>> Maybe we move the shape to the end e.g., `Float16Vector128`, `IntVector128`, `IntVectorMax`?
>
>> > > > Some quick comments.
>> > > > We should be consistent in the naming, and rename `Halfloat*` to `Float16*`.
>> > >
>> > >
>> > > I concur, especially since there are multiple 16-bit floating-point formats in use including the IEEE 754 float16 as well as bfloat16.
>> >
>> >
>> > There are nomenclature issues that I am facing.
Currently, all the Float16 concrete classes use the Halffloat prefix i.e., Halffloat64Vector, Halffloat128Vector; converting these to Float16 looks a little confusing, i.e., Float1664Vector, Float16128Vector, etc Kindly suggest a better name to represent these classes. >> >> Maybe we move the shape to the end e.g., `Float16Vector128`, `IntVector128`, `IntVectorMax`? > > This looks good, since all these are concrete vector classes not exposed to users. @jatin-bhateja it looks like you should be merging latest changes from master; Some changes shown in the diff obviously do not belong to this fix: https://github.com/openjdk/jdk/pull/28002/files#diff-7798f606ce2bbf96fd99999c8c0ef9a4bb0455c128dd7e1249dea8db23d35402 Hopefully merging latest changes from master will make them go away? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28002#issuecomment-3571013379 From jbhateja at openjdk.org Mon Nov 24 19:29:14 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 24 Nov 2025 19:29:14 GMT Subject: RFR: 8370691: Add new Float16Vector type and enable intrinsification of vector operations supported by auto-vectorizer [v3] In-Reply-To: References: Message-ID: <6ma1bZs5YmEe_PtNmR69pVoJ_YAWy5fUQrsnnk8nH9M=.0594b623-f494-4af3-8e1c-f88120c53aca@github.com> > Add a new Float16lVector type and corresponding concrete vector classes, in addition to existing primitive vector types, maintaining operation parity with the FloatVector type. > - Add necessary inline expander support. > - Enable intrinsification for a few vector operations, namely ADD/SUB/MUL/DIV/MAX/MIN/FMA. > - Use existing Float16 vector IR and backend support. > - Extended the existing VectorAPI JTREG test suite for the newly added Float16Vector operations. > > The idea here is to first be at par with Float16 auto-vectorization support before intrinsifying new operations (conversions, reduction, etc). 
> > The following are the performance numbers for some of the selected Float16Vector benchmarking kernels compared to equivalent auto-vectorized Float16OperationsBenchmark kernels. > > image > > Initial RFP[1] was floated on the panama-dev mailing list. > > Kindly review the draft PR and share your feedback. > > Best Regards, > Jatin > > [1] https://mail.openjdk.org/pipermail/panama-dev/2025-August/021100.html Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8370691 - Cleanups - Adding support for custom basic type T_FLOAT16, passing BasicType lane types to inline expander entries - Cleaning up interface as per review suggestions - Some cleanups - Fix some JTREG failures - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8370691 - Revamped JTreg test generation and bug fixes - Cleanups - Removing redundant warmup constraint - ... 
and 5 more: https://git.openjdk.org/jdk/compare/8bafc2f0...f34d324f ------------- Changes: https://git.openjdk.org/jdk/pull/28002/files Webrev: Webrev is not available because diff is too large Stats: 509516 lines in 232 files changed: 281237 ins; 226539 del; 1740 mod Patch: https://git.openjdk.org/jdk/pull/28002.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28002/head:pull/28002 PR: https://git.openjdk.org/jdk/pull/28002 From jbhateja at openjdk.org Mon Nov 24 19:29:17 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 24 Nov 2025 19:29:17 GMT Subject: RFR: 8370691: Add new Float16Vector type and enable intrinsification of vector operations supported by auto-vectorizer [v3] In-Reply-To: References: <8hStIcvp252Ik7raxZL5BvFKKkXTflorjyOD9Cyakvc=.c5d1b302-5c49-46b1-91ba-2feda2e6a746@github.com> Message-ID: On Thu, 13 Nov 2025 19:47:52 GMT, Paul Sandoz wrote: >>> The basic type codes are declared and shared across Java and HotSpot - it's used in `LaneType`. Can we pass a single argument that is the basic type instead of two arguments. HotSpot should know from the basic type what the carrier class and also what the operation type without it being explicitly told, since presumably it knew the inverse - the basic type from the element class. >> >> Hi @PaulSandoz, T_HALFFLOAT used in LaneType is mainly used for differentiation of various cache keys used by conversion operation lookups. In principle, we can extend VM to acknowledge this new custom basic type on the lines of T_METADATA / T_ADDRESS; its scope for now will be restricted to VectorSupport. We can gradually expose this to C2 type, such that TypeVect for all Float16 VectorIR uses T_HALFFLOAT as its basic type; currently, we use T_SHORT as the lane type. Let me know if this looks reasonable > >> > The basic type codes are declared and shared across Java and HotSpot - it's used in `LaneType`. Can we pass a single argument that is the basic type instead of two arguments. 
HotSpot should know from the basic type what the carrier class and also what the operation type without it being explicitly told, since presumably it knew the inverse - the basic type from the element class. >> >> Hi @PaulSandoz, T_HALFFLOAT used in LaneType is mainly used for differentiation of various cache keys used by conversion operation lookups. In principle, we can extend VM to acknowledge this new custom basic type on the lines of T_METADATA / T_ADDRESS; its scope for now will be restricted to VectorSupport. We can gradually expose this to C2 type, such that TypeVect for all Float16 VectorIR uses T_HALFFLOAT as its basic type; currently, we use T_SHORT as the lane type. Let me know if this looks reasonable > > I am proposing something simpler, really as a temporary step until `Float16` becomes part of the `java.base` module. IIUC from the basic type we can reliably determine what the two arguments we currently passing are e.g., T_HALFFLOAT = { short.class, VECTOR_TYPE_FP16 }. So we don't need to pass two arguments, we can just pass one, the intrinsic can lookup the class and operation type kind. Hi @PaulSandoz, I have addressed your comments. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28002#issuecomment-3572377706 From sparasa at openjdk.org Mon Nov 24 19:44:57 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Mon, 24 Nov 2025 19:44:57 GMT Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v5] In-Reply-To: References: Message-ID: <3uO6m_WnNNyLIQSUL6-mmNSEmzJoka74zHSuURlhx7k=.b5bc45b7-7b44-41d5-96d8-366849aaaff7@github.com> > The goal of this PR is to fix the performance regression in Arrays.fill() x86 stubs caused by masked AVX stores. The fix is to replace the masked AVX stores with store instructions without masks (i.e. unmasked stores). `fill32_masked()` and `fill64_masked()` stubs are replaced with `fill32_unmasked()` and `fill64_unmasked()` respectively. 
>
> To speedup unmasked stores, array fills for sizes < 64 bytes are broken down into sequences of 32B, 16B, 8B, 4B, 2B and 1B stores, depending on the size.
>
> ### **Performance comparison for byte array fills in a loop for 1 million times**
>
> UseAVX=3 ByteArray Size | +OptimizeFill (Masked store stub) [secs] | -OptimizeFill (No stub) [secs] | --->This PR: +OptimizeFill (Unmasked store stub) [secs]
> -- | -- | -- | --
> 1 | 0.46 | 0.14 | 0.263
> 2 | 0.46 | 0.16 | 0.264
> 5 | 0.46 | 0.29 | 0.299
> 10 | 0.46 | 0.58 | 0.303
> 15 | 0.46 | 0.42 | 0.271
> 16 | 0.46 | 0.46 | 0.32
> 17 | 0.21 | 0.5 | 0.299
> 20 | 0.21 | 0.37 | 0.299
> 25 | 0.21 | 0.59 | 0.282
> 31 | 0.21 | 0.53 | 0.273
> 32 | 0.21 | 0.58 | 0.199
> 35 | 0.5 | 0.77 | 0.259
> 40 | 0.5 | 0.61 | 0.33
> 45 | 0.5 | 0.52 | 0.281
> 48 | 0.5 | 0.66 | 0.32
> 49 | 0.22 | 0.69 | 0.3
> 50 | 0.22 | 0.78 | 0.3
> 55 | 0.22 | 0.67 | 0.292
> 60 | 0.22 | 0.67 | 0.3293
> 64 | 0.22 | 0.82 | 0.23
> 70 | 0.51 | 1.1 | 0.34
> 80 | 0.49 | 0.89 | 0.365
> 90 | 0.225 | 0.68 | 0.33
> 100 | 0.54 | 1.09 | 0.347
> 110 | 0.6 | 0.98 | 0.36
> 120 | 0.26 | 0.75 | 0.386
> 128 | 0.266 | 1.1 | 0.289

Srinivas Vamsi Parasa has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains five additional commits since the last revision: - Merge branch 'master' of https://git.openjdk.java.net/jdk into fill_array - undo jccb to jcc change as needed - refactor code to use fill32_tail at the end of the stub - undo size check for fill64_masked - 8349452: Fix performance regression for Arrays.fill() with AVX512 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28442/files - new: https://git.openjdk.org/jdk/pull/28442/files/57dc6c4a..fd8b6c21 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=03-04 Stats: 6836 lines in 227 files changed: 4452 ins; 1415 del; 969 mod Patch: https://git.openjdk.org/jdk/pull/28442.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28442/head:pull/28442 PR: https://git.openjdk.org/jdk/pull/28442 From sparasa at openjdk.org Mon Nov 24 19:50:54 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Mon, 24 Nov 2025 19:50:54 GMT Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v6] In-Reply-To: References: Message-ID: > The goal of this PR is to fix the performance regression in Arrays.fill() x86 stubs caused by masked AVX stores. The fix is to replace the masked AVX stores with store instructions without masks (i.e. unmasked stores). `fill32_masked()` and `fill64_masked()` stubs are replaced with `fill32_unmasked()` and `fill64_unmasked()` respectively. > > To speedup unmasked stores, array fills for sizes < 64 bytes are broken down into sequences of 32B, 16B, 8B, 4B, 2B and 1B stores, depending on the size. 
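
The decomposition described above can be sketched in plain C++. This is an illustration only, not the x86 stub: the function name `fill_tail_unmasked` is hypothetical, and `memset` stands in for each fixed-width unmasked store that the real stub would emit as a vector or scalar instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the PR): fills `size` bytes (size < 64) at
// `dst` with `value` using a descending sequence of power-of-two-sized
// stores (32B, 16B, 8B, 4B, 2B, 1B). Returns the number of stores issued.
inline int fill_tail_unmasked(uint8_t* dst, uint8_t value, size_t size) {
  assert(size < 64 && "larger fills would be handled by a 64-byte main loop");
  int stores = 0;
  for (size_t chunk = 32; chunk >= 1; chunk >>= 1) {
    if (size & chunk) {                // this power of two is part of `size`
      std::memset(dst, value, chunk);  // stand-in for one unmasked store
      dst += chunk;
      ++stores;
    }
  }
  return stores;
}
```

For example, a 45-byte fill becomes four stores (32 + 8 + 4 + 1), and no mask register is needed for any of them.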
>
> ### **Performance comparison for byte array fills in a loop for 1 million times**
>
> UseAVX=3 ByteArray Size | +OptimizeFill (Masked store stub) [secs] | -OptimizeFill (No stub) [secs] | --->This PR: +OptimizeFill (Unmasked store stub) [secs]
> -- | -- | -- | --
> 1 | 0.46 | 0.14 | 0.185
> 2 | 0.46 | 0.16 | 0.195
> 3 | 0.46 | 0.176 | 0.199
> 4 | 0.46 | 0.244 | 0.207
> 5 | 0.46 | 0.29 | 0.32
> 10 | 0.46 | 0.58 | 0.303
> 15 | 0.46 | 0.42 | 0.271
> 16 | 0.46 | 0.46 | 0.32
> 17 | 0.21 | 0.5 | 0.299
> 20 | 0.21 | 0.37 | 0.299
> 25 | 0.21 | 0.59 | 0.282
> 31 | 0.21 | 0.53 | 0.273
> 32 | 0.21 | 0.58 | 0.199
> 35 | 0.5 | 0.77 | 0.259
> 40 | 0.5 | 0.61 | 0.33
> 45 | 0.5 | 0.52 | 0.281
> 48 | 0.5 | 0.66 | 0.32
> 49 | 0.22 | 0.69 | 0.3
> 50 | 0.22 | 0.78 | 0.3
> 55 | 0.22 | 0.67 | 0.292
> 60 | 0.22 | 0.67 | 0.3293
> 64 | 0.22 | 0.82 | 0.23
> 70 | 0.51 | 1.1 | 0.34
> 80 | 0.49 | 0.89 | 0.365
> 90 | 0.225 | 0.68 | 0.33
> 100 | 0.54 | 1.09 | 0.347
> 110 | 0.6 | 0.98 | 0.36
> 120 | 0.26 | 0.75 | 0.386
> 128 | 0.266 | 1.1 | 0.289

Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:

  fastpath for size <= 4 bytes

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/28442/files
  - new: https://git.openjdk.org/jdk/pull/28442/files/fd8b6c21..92ca9b92

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=04-05

  Stats: 19 lines in 1 file changed: 19 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/28442.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28442/head:pull/28442

PR: https://git.openjdk.org/jdk/pull/28442

From sparasa at openjdk.org  Mon Nov 24 20:23:22 2025
From: sparasa at openjdk.org (Srinivas Vamsi Parasa)
Date: Mon, 24 Nov 2025 20:23:22 GMT
Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v7]
In-Reply-To:
References:
Message-ID:
<-G_HWtFjmZ88xnJ8ScA-TbTeNlTdA_89HuRYFA78afk=.c947dbb4-b35b-4a1e-804d-689c17135576@github.com>

> The goal of this PR is to fix the performance regression in Arrays.fill() x86 stubs caused by masked AVX stores. The fix is to replace the masked AVX stores with store instructions without masks (i.e. unmasked stores). `fill32_masked()` and `fill64_masked()` stubs are replaced with `fill32_unmasked()` and `fill64_unmasked()` respectively.
>
> To speedup unmasked stores, array fills for sizes < 64 bytes are broken down into sequences of 32B, 16B, 8B, 4B, 2B and 1B stores, depending on the size.
>
> ### **Performance comparison for byte array fills in a loop for 1 million times**
>
> UseAVX=3 ByteArray Size | +OptimizeFill (Masked store stub) [secs] | -OptimizeFill (No stub) [secs] | --->This PR: +OptimizeFill (Unmasked store stub) [secs]
> -- | -- | -- | --
> 1 | 0.46 | 0.14 | 0.185
> 2 | 0.46 | 0.16 | 0.195
> 3 | 0.46 | 0.176 | 0.199
> 4 | 0.46 | 0.244 | 0.207
> 5 | 0.46 | 0.29 | 0.32
> 10 | 0.46 | 0.58 | 0.303
> 15 | 0.46 | 0.42 | 0.271
> 16 | 0.46 | 0.46 | 0.32
> 17 | 0.21 | 0.5 | 0.299
> 20 | 0.21 | 0.37 | 0.299
> 25 | 0.21 | 0.59 | 0.282
> 31 | 0.21 | 0.53 | 0.273
> 32 | 0.21 | 0.58 | 0.199
> 35 | 0.5 | 0.77 | 0.259
> 40 | 0.5 | 0.61 | 0.33
> 45 | 0.5 | 0.52 | 0.281
> 48 | 0.5 | 0.66 | 0.32
> 49 | 0.22 | 0.69 | 0.3
> 50 | 0.22 | 0.78 | 0.3
> 55 | 0.22 | 0.67 | 0.292
> 60 | 0.22 | 0.67 | 0.3293
> 64 | 0.22 | 0.82 | 0.23
> 70 | 0.51 | 1.1 | 0.34
> 80 | 0.49 | 0.89 | 0.365
> 90 | 0.225 | 0.68 | 0.33
> 100 | 0.54 | 1.09 | 0.347
> 110 | 0.6 | 0.98 | 0.36
> 120 | 0.26 | 0.75 | 0.386
> 128 | 0.266 | 1.1 | 0.289

Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:

  remove all masked stores altogether

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/28442/files
  - new: https://git.openjdk.org/jdk/pull/28442/files/92ca9b92..55e77c65

Webrevs:
 - full:
https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=05-06 Stats: 52 lines in 2 files changed: 0 ins; 46 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/28442.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28442/head:pull/28442 PR: https://git.openjdk.org/jdk/pull/28442 From sparasa at openjdk.org Mon Nov 24 20:26:20 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Mon, 24 Nov 2025 20:26:20 GMT Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v2] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 22:03:37 GMT, Sandhya Viswanathan wrote: >> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: >> >> undo size check for fill64_masked > > The pre-submit test seem to be unrelated to the PR changes. A fresh merge with tip might resolve those. @sviswa7: Please see the new changes: (1) fresh merge with master (2) removed all other uses of masked stores (fill32_masked, fill64_masked) in other places (3) added a fastpath for byte array fill size <=4. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28442#issuecomment-3572579136 From vpaprotski at openjdk.org Mon Nov 24 20:52:43 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Mon, 24 Nov 2025 20:52:43 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v4] In-Reply-To: References: Message-ID: > - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline > - `SignatureBench.MLDSA` is 1.2x-2.2x faster > - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) > - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version > - `SignatureBench.MLDSA` is upto 5% faster, never slower > > Note on intrinsic: > - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. > - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 > > Tests and benchmarks: > - Added a fuzz test to ensure Java and intrinsic produces exactly same result > - Added benchmark to measure the performance of intrinsic itself > > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" Volodymyr Paprotski has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: - Merge remote-tracking branch 'origin/master' into avx2-ntt - next set of comments - whitespace - address first comments - Merge remote-tracking branch 'origin/master' into avx2-ntt - add copyright, whitespace and test jtreg tags - Fixes and comments from Anas - AVX2 and AVX512 intrinsics for MLDSA ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28136/files - new: https://git.openjdk.org/jdk/pull/28136/files/b04f4f0d..cefa021a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28136&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28136&range=02-03 Stats: 242832 lines in 2033 files changed: 165193 ins; 41903 del; 35736 mod Patch: https://git.openjdk.org/jdk/pull/28136.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28136/head:pull/28136 PR: https://git.openjdk.org/jdk/pull/28136 From vpaprotski at openjdk.org Mon Nov 24 20:52:45 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Mon, 24 Nov 2025 20:52:45 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v3] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 22:55:07 GMT, Volodymyr Paprotski wrote: >> - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline >> - `SignatureBench.MLDSA` is 1.2x-2.2x faster >> - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) >> - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version >> - `SignatureBench.MLDSA` is upto 5% faster, never slower >> >> Note on intrinsic: >> - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. 
>> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 >> >> Tests and benchmarks: >> - Added a fuzz test to ensure Java and intrinsic produces exactly same result >> - Added benchmark to measure the performance of intrinsic itself >> >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" > > Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: > > next set of comments @mcpowers Thanks for tests! @ferakocz thanks for the review! I think I took them all in, except for the montMul comment section.. Not quite what I meant so tried to reword.. see if it helps any? 
------------- PR Review: https://git.openjdk.org/jdk/pull/28136#pullrequestreview-3502030137 From vpaprotski at openjdk.org Mon Nov 24 20:53:01 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Mon, 24 Nov 2025 20:53:01 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v3] In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 15:35:12 GMT, Ferenc Rakoczi wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: >> >> next set of comments > > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 88: > >> 86: // +-----+-----+-----+-----+----- >> 87: // >> 88: // NOTE: size 0 and 1 are used for initial and final shuffles respectivelly of > > Typo: respectivelly -> respectively done > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 248: > >> 246: // We do Montgomery multiplications of two AVX registers in 4 steps: >> 247: // 1. Do the multiplications of the corresponding even numbered slots into >> 248: // the odd numbered slots of a scratch2 register. > > Typo: scratch2 -> scratch I think I meant "the scratch2" register here.. reworded, please double check if its clearer.. > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 250: > >> 248: // the odd numbered slots of a scratch2 register. >> 249: // 2. Swap the even and odd numbered slots of the original input registers.* >> 250: // 3. Similar to step 1, but into output register. > > Typo: into output register -> into an output register used 'the' to be 'specific'.. (I think the lack of articles was causing the confusion.. "the scratch2 register is combined with the output register into scratch.. or something..) Also reworded step 4? > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 253: > >> 251: // 4. Combine the outputs of step 1 and step 3 into the output of the Montgomery >> 252: // multiplication. 
>> 253: // (*For levels 0-6 in the Ntt and levels 1-7 of the inverse Ntt, need NOT swap > > Typo: unnecessary '(*' at the beginning This was my attempt to add a note to second step.. spelled out "Note"? or can just remove, since swapping only happens on second step.. > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 282: > >> 280: const XMMRegister* scratch = scratch1 == input1 ? output: scratch1; >> 281: >> 282: // scratch = input1_even*intput2_even > > Suggestion: // scratch = input1_even * intput2_even done > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 479: > >> 477: // level 0 - 128 >> 478: // scratch1 = coeffs3 * zetas1 >> 479: // coeffs3, coeffs1 = coeffs1?scratch1 > > Suggestion: // coeffs3, coeffs1 = coeffs1 ? scratch1 done > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 524: > >> 522: // coeffs1_2 = coeffs1_2 + scratch1 >> 523: loadXmms(Zetas3, zetas, level * 512, vector_len, _masm); >> 524: shuffle(Scratch1, Coeffs1_2, Coeffs2_2, distance * 32); //Coeffs2_2 freed > > Suggestion: // Coeffs2_2 freed done > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 529: > >> 527: >> 528: loadXmms(Zetas3, zetas, 4*64 + level * 512, vector_len, _masm); >> 529: shuffle(Scratch1, Coeffs3_2, Coeffs4_2, distance * 32); //Coeffs4_2 freed > > Suggestion: // Coeffs4_2 freed done > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 554: > >> 552: const XMMRegister Coeffs2_2[] = {xmm4, xmm5, xmm6, xmm7}; >> 553: >> 554: // Since we cannot fit the entire payload into registers, we process > > process input -> process the input done > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 555: > >> 553: >> 554: // Since we cannot fit the entire payload into registers, we process >> 555: // input in two stages. First half, load 8 registers 32 integers each apart. > > First half -> For the first half done > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 557: > >> 555: // input in two stages. 
First half, load 8 registers 32 integers each apart. >> 556: // With one load, we can process level 0-2 (128-, 64- and 32-integers apart) >> 557: // Remaining levels, load 8 registers from consecutive memory (16-, 8-, 4-, > > Remaining -> For the remaining done > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 558: > >> 556: // With one load, we can process level 0-2 (128-, 64- and 32-integers apart) >> 557: // Remaining levels, load 8 registers from consecutive memory (16-, 8-, 4-, >> 558: // 2-, 1-integer appart) > > appart -> apart Thanks! Looks like I've always misspelled that word! :) > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 559: > >> 557: // Remaining levels, load 8 registers from consecutive memory (16-, 8-, 4-, >> 558: // 2-, 1-integer appart) >> 559: // Levels 5, 6, 7 (4-, 2-, 1-integer appart) require shuffles within registers > > appart -> apart done > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 560: > >> 558: // 2-, 1-integer appart) >> 559: // Levels 5, 6, 7 (4-, 2-, 1-integer appart) require shuffles within registers >> 560: // Other levels, shuffles can be done by re-aranging register order > > Other -> on the other > re-aranging register order -> rearranging the register order done > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 562: > >> 560: // Other levels, shuffles can be done by re-aranging register order >> 561: >> 562: // Four batches of 8 registers each, 128 bytes appart > > appart -> apart done > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 701: > >> 699: // In each of these iterations half of the coefficients are added to and >> 700: // subtracted from the other half of the coefficients then the result of >> 701: // the substration is (Montgomery) multiplied by the corresponding zetas. 
> > substration -> subtraction (I know this was in my own comment :-( ) done (funny, thats exactly how I say "substraction" in my head too :D ) > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 850: > >> 848: } >> 849: >> 850: // Four batches of 8 registers each, 128 bytes appart > > appart -> apart done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557555908 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557577559 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557589525 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557582314 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557592866 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557595337 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557596482 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557596698 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557599194 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557606458 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557606672 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557608631 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557611103 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557611341 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557616181 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557620647 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2557621206 From sviswanathan at openjdk.org Mon Nov 24 21:01:34 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Mon, 24 Nov 2025 21:01:34 GMT Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v7] In-Reply-To: 
<-G_HWtFjmZ88xnJ8ScA-TbTeNlTdA_89HuRYFA78afk=.c947dbb4-b35b-4a1e-804d-689c17135576@github.com> References: <-G_HWtFjmZ88xnJ8ScA-TbTeNlTdA_89HuRYFA78afk=.c947dbb4-b35b-4a1e-804d-689c17135576@github.com> Message-ID: <01tkIHs6ObmOGese9AZQeVc6DrZecc19jwaU08i0jm4=.6193430a-a7ef-4f44-8fe4-2b5d744c0b13@github.com> On Mon, 24 Nov 2025 20:23:22 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to fix the performance regression in Arrays.fill() x86 stubs caused by masked AVX stores. The fix is to replace the masked AVX stores with store instructions without masks (i.e. unmasked stores). `fill32_masked()` and `fill64_masked()` stubs are replaced with `fill32_unmasked()` and `fill64_unmasked()` respectively. >> >> To speedup unmasked stores, array fills for sizes < 64 bytes are broken down into sequences of 32B, 16B, 8B, 4B, 2B and 1B stores, depending on the size. >> >> >> ### **Performance comparison for byte array fills in a loop for 1 million times** >> >> >> UseAVX=3 ByteArray Size | +OptimizeFill (Masked store stub) [secs] | -OptimizeFill (No stub) [secs] | --->This PR: +OptimizeFill (Unmasked store stub) [secs] >> -- | -- | -- | -- >> 1 | 0.46 | 0.14 | 0.185 >> 2 | 0.46 | 0.16 | 0.195 >> 3 | 0.46 | 0.176 | 0.199 >> 4 | 0.46 | 0.244 | 0.207 >> 5 | 0.46 | 0.29 | 0.32 >> 10 | 0.46 | 0.58 | 0.303 >> 15 | 0.46 | 0.42 | 0.271 >> 16 | 0.46 | 0.46 | 0.32 >> 17 | 0.21 | 0.5 | 0.299 >> 20 | 0.21 | 0.37 | 0.299 >> 25 | 0.21 | 0.59 | 0.282 >> 31 | 0.21 | 0.53 | 0.273 >> 32 | 0.21 | 0.58 | 0.199 >> 35 | 0.5 | 0.77 | 0.259 >> 40 | 0.5 | 0.61 | 0.33 >> 45 | 0.5 | 0.52 | 0.281 >> 48 | 0.5 | 0.66 | 0.32 >> 49 | 0.22 | 0.69 | 0.3 >> 50 | 0.22 | 0.78 | 0.3 >> 55 | 0.22 | 0.67 | 0.292 >> 60 | 0.22 | 0.67 | 0.3293 >> 64 | 0.22 | 0.82 | 0.23 >> 70 | 0.51 | 1.1 | 0.34 >> 80 | 0.49 | 0.89 | 0.365 >> 90 | 0.225 | 0.68 | 0.33 >> 100 | 0.54 | 1.09 | 0.347 >> 110 | 0.6 | 0.98 | 0.36 >> 120 | 0.26 | 0.75 | 0.386 >> 128 | 0.266 | 1.1 | 0.289 > > Srinivas Vamsi Parasa has updated the pull 
request incrementally with one additional commit since the last revision: > > remove all masked stores altogether src/hotspot/cpu/x86/macroAssembler_x86.cpp line 5642: > 5640: BIND(L_tail); > 5641: addptr(cnt, 4); > 5642: jcc(Assembler::lessEqual, L_end); This also might work with jccb. src/hotspot/cpu/x86/macroAssembler_x86.cpp line 5768: > 5766: > 5767: decrement(cnt); > 5768: jcc(Assembler::negative, DONE); // Zero length This could remain as jccb. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2557679609 PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2557676057 From eosterlund at openjdk.org Mon Nov 24 21:00:30 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 24 Nov 2025 21:00:30 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v3] In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 17:49:24 GMT, Evgeny Astigeevich wrote: >> Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1. >> >> Neoverse-N1 implementations mitigate erratum 1542419 with a workaround: >> - Disable coherent icache. >> - Trap IC IVAU instructions. >> - Execute: >> - `tlbi vae3is, xzr` >> - `dsb sy` >> >> `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address). It waits for all memory accesses using in-scope old translation information to complete before it is considered complete. >> >> As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests: >> >> "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround." 
>> >> This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions. >> >> Changes include: >> >> * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization. >> * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address. >> * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures. >> * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk. >> >> Benchmarking results: Neoverse-N1 r3p1 (Graviton 2) >> >> - Baseline >> >> $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1... > > Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: > > Explicitly check Neoverse N1 revisions affected by errata 1542419 Yeah patching all nmethods as one unit is basically equivalent to making the code cache processing a STW operation. Last time we processed the code cache STW was JDK 11. A dark place I don't want to go back to. 
It can get pretty big and mess up latency. So I'm in favour of limiting the fix and not re-introduce STW code cache processing. Otherwise yes you are correct; we perform synchronous cross modifying code with no assumptions about instruction cache coherency because we didn't trust it would actually work for all ARM implementations. Seems like that was a good bet. We rely on it on x64 still though. It's a bit surprising to me if they invalidate all TLB entries, effectively ripping out the entire virtual address space, even when a range is passed in. If so, a horrible alternative might be to use mprotect to temporarily remove execution permission on the affected per nmethod pages, and detect over shooting in the signal handler, resuming execution when execution privileges are then restored immediately after. That should limit the affected VA to close to what is actually invalidated. But it would look horrible. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3572695548 From ascarpino at openjdk.org Mon Nov 24 21:04:35 2025 From: ascarpino at openjdk.org (Anthony Scarpino) Date: Mon, 24 Nov 2025 21:04:35 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v3] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 22:55:07 GMT, Volodymyr Paprotski wrote: >> - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline >> - `SignatureBench.MLDSA` is 1.2x-2.2x faster >> - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) >> - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version >> - `SignatureBench.MLDSA` is upto 5% faster, never slower >> >> Note on intrinsic: >> - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. 
>> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 >> >> Tests and benchmarks: >> - Added a fuzz test to ensure Java and intrinsic produces exactly same result >> - Added benchmark to measure the performance of intrinsic itself >> >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" > > Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: > > next set of comments Marked as reviewed by ascarpino (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/28136#pullrequestreview-3502221580 From sparasa at openjdk.org Mon Nov 24 21:13:39 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Mon, 24 Nov 2025 21:13:39 GMT Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v7] In-Reply-To: <01tkIHs6ObmOGese9AZQeVc6DrZecc19jwaU08i0jm4=.6193430a-a7ef-4f44-8fe4-2b5d744c0b13@github.com> References: <-G_HWtFjmZ88xnJ8ScA-TbTeNlTdA_89HuRYFA78afk=.c947dbb4-b35b-4a1e-804d-689c17135576@github.com> <01tkIHs6ObmOGese9AZQeVc6DrZecc19jwaU08i0jm4=.6193430a-a7ef-4f44-8fe4-2b5d744c0b13@github.com> Message-ID: On Mon, 24 Nov 2025 20:46:37 GMT, Sandhya Viswanathan wrote: >> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: >> >> remove all masked stores altogether > > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 5642: > >> 5640: BIND(L_tail); >> 5641: addptr(cnt, 4); >> 5642: jcc(Assembler::lessEqual, L_end); > > This also might work with jccb. Verified that jcc at 5642 is needed. The jcc at 5624 can be reverted to jccb. > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 5768: > >> 5766: >> 5767: decrement(cnt); >> 5768: jcc(Assembler::negative, DONE); // Zero length > > This could remain as jccb. Verified that this jccb to jcc change is needed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2557724635 PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2557727935 From sparasa at openjdk.org Mon Nov 24 21:13:42 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Mon, 24 Nov 2025 21:13:42 GMT Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v7] In-Reply-To: <-G_HWtFjmZ88xnJ8ScA-TbTeNlTdA_89HuRYFA78afk=.c947dbb4-b35b-4a1e-804d-689c17135576@github.com> References: <-G_HWtFjmZ88xnJ8ScA-TbTeNlTdA_89HuRYFA78afk=.c947dbb4-b35b-4a1e-804d-689c17135576@github.com> Message-ID: On Mon, 24 Nov 2025 20:23:22 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to fix the performance regression in Arrays.fill() x86 stubs caused by masked AVX stores. The fix is to replace the masked AVX stores with store instructions without masks (i.e. unmasked stores). `fill32_masked()` and `fill64_masked()` stubs are replaced with `fill32_unmasked()` and `fill64_unmasked()` respectively. >> >> To speedup unmasked stores, array fills for sizes < 64 bytes are broken down into sequences of 32B, 16B, 8B, 4B, 2B and 1B stores, depending on the size. 
>>
>>
>> ### **Performance comparison for byte array fills in a loop for 1 million times**
>>
>>
>> UseAVX=3 ByteArray Size | +OptimizeFill (Masked store stub) [secs] | -OptimizeFill (No stub) [secs] | --->This PR: +OptimizeFill (Unmasked store stub) [secs]
>> -- | -- | -- | --
>> 1 | 0.46 | 0.14 | 0.185
>> 2 | 0.46 | 0.16 | 0.195
>> 3 | 0.46 | 0.176 | 0.199
>> 4 | 0.46 | 0.244 | 0.207
>> 5 | 0.46 | 0.29 | 0.32
>> 10 | 0.46 | 0.58 | 0.303
>> 15 | 0.46 | 0.42 | 0.271
>> 16 | 0.46 | 0.46 | 0.32
>> 17 | 0.21 | 0.5 | 0.299
>> 20 | 0.21 | 0.37 | 0.299
>> 25 | 0.21 | 0.59 | 0.282
>> 31 | 0.21 | 0.53 | 0.273
>> 32 | 0.21 | 0.58 | 0.199
>> 35 | 0.5 | 0.77 | 0.259
>> 40 | 0.5 | 0.61 | 0.33
>> 45 | 0.5 | 0.52 | 0.281
>> 48 | 0.5 | 0.66 | 0.32
>> 49 | 0.22 | 0.69 | 0.3
>> 50 | 0.22 | 0.78 | 0.3
>> 55 | 0.22 | 0.67 | 0.292
>> 60 | 0.22 | 0.67 | 0.3293
>> 64 | 0.22 | 0.82 | 0.23
>> 70 | 0.51 | 1.1 | 0.34
>> 80 | 0.49 | 0.89 | 0.365
>> 90 | 0.225 | 0.68 | 0.33
>> 100 | 0.54 | 1.09 | 0.347
>> 110 | 0.6 | 0.98 | 0.36
>> 120 | 0.26 | 0.75 | 0.386
>> 128 | 0.266 | 1.1 | 0.289
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
>
>   remove all masked stores altogether

src/hotspot/cpu/x86/macroAssembler_x86.cpp line 5775:

> 5773:     decrement(cnt);
> 5774:     jccb(Assembler::greaterEqual, LOOP);
> 5775:     jmp(DONE);

Verified that this jmpb to jmp change is needed.
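
For readers following the jccb/jcc and jmpb/jmp exchange: the short forms encode an 8-bit signed displacement measured from the end of the two-byte branch instruction, so they only reach targets within -128..+127 bytes. When a stub grows, a previously short branch can fall out of range and has to become the long form. The check can be sketched as below; `fits_short_branch` and `kShortBranchSize` are hypothetical names for illustration, not part of HotSpot's assembler API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration; not a HotSpot function.
// Offsets are byte positions inside the same code buffer.
constexpr int kShortBranchSize = 2;  // one opcode byte + one rel8 byte

constexpr bool fits_short_branch(std::int64_t branch_offset,
                                 std::int64_t target_offset) {
  // rel8 is relative to the address of the instruction after the branch.
  const std::int64_t disp = target_offset - (branch_offset + kShortBranchSize);
  return disp >= -128 && disp <= 127;
}
```

Under these assumptions, a forward branch at offset 0 reaches a label at offset 129 but not one at offset 130.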
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2557733111 From vpaprotski at openjdk.org Mon Nov 24 21:16:03 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Mon, 24 Nov 2025 21:16:03 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v5] In-Reply-To: References: Message-ID: > - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline > - `SignatureBench.MLDSA` is 1.2x-2.2x faster > - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) > - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version > - `SignatureBench.MLDSA` is upto 5% faster, never slower > > Note on intrinsic: > - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. > - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 > > Tests and benchmarks: > - Added a fuzz test to ensure Java and intrinsic produces exactly same result > - Added benchmark to measure the performance of intrinsic itself > > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: comments from Ferenc ------------- Changes: - 
all: https://git.openjdk.org/jdk/pull/28136/files - new: https://git.openjdk.org/jdk/pull/28136/files/cefa021a..691e1dfc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28136&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28136&range=03-04 Stats: 23 lines in 1 file changed: 0 ins; 0 del; 23 mod Patch: https://git.openjdk.org/jdk/pull/28136.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28136/head:pull/28136 PR: https://git.openjdk.org/jdk/pull/28136 From sparasa at openjdk.org Mon Nov 24 21:19:26 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Mon, 24 Nov 2025 21:19:26 GMT Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v8] In-Reply-To: References: Message-ID: > The goal of this PR is to fix the performance regression in Arrays.fill() x86 stubs caused by masked AVX stores. The fix is to replace the masked AVX stores with store instructions without masks (i.e. unmasked stores). `fill32_masked()` and `fill64_masked()` stubs are replaced with `fill32_unmasked()` and `fill64_unmasked()` respectively. > > To speedup unmasked stores, array fills for sizes < 64 bytes are broken down into sequences of 32B, 16B, 8B, 4B, 2B and 1B stores, depending on the size. 
> > > ### **Performance comparison for byte array fills in a loop for 1 million times** > > > UseAVX=3 ByteArray Size | +OptimizeFill (Masked store stub) [secs] | -OptimizeFill (No stub) [secs] | --->This PR: +OptimizeFill (Unmasked store stub) [secs] > -- | -- | -- | -- > 1 | 0.46 | 0.14 | 0.185 > 2 | 0.46 | 0.16 | 0.195 > 3 | 0.46 | 0.176 | 0.199 > 4 | 0.46 | 0.244 | 0.207 > 5 | 0.46 | 0.29 | 0.32 > 10 | 0.46 | 0.58 | 0.303 > 15 | 0.46 | 0.42 | 0.271 > 16 | 0.46 | 0.46 | 0.32 > 17 | 0.21 | 0.5 | 0.299 > 20 | 0.21 | 0.37 | 0.299 > 25 | 0.21 | 0.59 | 0.282 > 31 | 0.21 | 0.53 | 0.273 > 32 | 0.21 | 0.58 | 0.199 > 35 | 0.5 | 0.77 | 0.259 > 40 | 0.5 | 0.61 | 0.33 > 45 | 0.5 | 0.52 | 0.281 > 48 | 0.5 | 0.66 | 0.32 > 49 | 0.22 | 0.69 | 0.3 > 50 | 0.22 | 0.78 | 0.3 > 55 | 0.22 | 0.67 | 0.292 > 60 | 0.22 | 0.67 | 0.3293 > 64 | 0.22 | 0.82 | 0.23 > 70 | 0.51 | 1.1 | 0.34 > 80 | 0.49 | 0.89 | 0.365 > 90 | 0.225 | 0.68 | 0.33 > 100 | 0.54 | 1.09 | 0.347 > 110 | 0.6 | 0.98 | 0.36 > 120 | 0.26 | 0.75 | 0.386 > 128 | 0.266 | 1.1 | 0.289 Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: revert to jccb in one place ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28442/files - new: https://git.openjdk.org/jdk/pull/28442/files/55e77c65..b047ac84 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28442.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28442/head:pull/28442 PR: https://git.openjdk.org/jdk/pull/28442 From dholmes at openjdk.org Mon Nov 24 21:42:51 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 24 Nov 2025 21:42:51 GMT Subject: RFR: 8372380: Make hs_err reporting more robust for unattached threads [v3] In-Reply-To: References: Message-ID: 
<4de-X2_ShDvpkpTSDDXlXq3nDsliA8QMAd6HvzS-VG4=.060f3a39-23f8-4a6a-ab3a-7aaa8458768e@github.com> On Mon, 24 Nov 2025 08:52:35 GMT, Aleksey Shipilev wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix include order > > Looks reasonable, but I have questions: Thanks for taking a look @shipilev ! > src/hotspot/share/compiler/compilationMemoryStatistic.cpp line 1013: > >> 1011: st->cr(); >> 1012: } >> 1013: if (Thread::current_or_null_safe() != nullptr) { > > Looks like `check_before_reporting()` does some pre-checks, maybe move it here, and print a helpful message about detached threads? I thought it better to print the initial message even if we can't then print any details. > src/hotspot/share/utilities/vmError.cpp line 667: > >> 665: if (MemTracker::enabled() && >> 666: NmtVirtualMemory_lock != nullptr && >> 667: _thread != nullptr && > > I do wonder if we want to do the change downstream to cover all these cases? > > > bool Mutex::owned_by_self() const { > - return owner() == Thread::current(); > + return owner() == Thread::current_or_null_safe(); > } Not really. The "safe" version is only needed if it can be called from a signal handling context. If we changed all of them then we effectively disable language/compiler based use of ThreadLocal! 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/28470#issuecomment-3572833444
PR Review Comment: https://git.openjdk.org/jdk/pull/28470#discussion_r2557794562
PR Review Comment: https://git.openjdk.org/jdk/pull/28470#discussion_r2557797435

From dholmes at openjdk.org Mon Nov 24 21:48:06 2025
From: dholmes at openjdk.org (David Holmes)
Date: Mon, 24 Nov 2025 21:48:06 GMT
Subject: RFR: 8372380: Make hs_err reporting more robust for unattached threads [v3]
In-Reply-To: <4de-X2_ShDvpkpTSDDXlXq3nDsliA8QMAd6HvzS-VG4=.060f3a39-23f8-4a6a-ab3a-7aaa8458768e@github.com>
References: <4de-X2_ShDvpkpTSDDXlXq3nDsliA8QMAd6HvzS-VG4=.060f3a39-23f8-4a6a-ab3a-7aaa8458768e@github.com>
Message-ID:

On Mon, 24 Nov 2025 21:39:00 GMT, David Holmes wrote:

>> src/hotspot/share/utilities/vmError.cpp line 667:
>>
>>> 665:     if (MemTracker::enabled() &&
>>> 666:         NmtVirtualMemory_lock != nullptr &&
>>> 667:         _thread != nullptr &&
>>
>> I do wonder if we want to do the change downstream to cover all these cases?
>>
>>
>> bool Mutex::owned_by_self() const {
>> -  return owner() == Thread::current();
>> +  return owner() == Thread::current_or_null_safe();
>> }
>
> Not really. The "safe" version is only needed if it can be called from a signal handling context. If we changed all of them then we effectively disable language/compiler based use of ThreadLocal!

Sorry that was interpreting a broader use of `current_or_null_safe`. Just using it here would require:

    Thread* current = Thread::current_or_null_safe();
    return current != nullptr && owner() == current;

But I don't think this would generally be that helpful - the code doing the check needs to be null-aware. The current code gives us the assertion failure that tells us we have been called by an unattached thread. We can then adjust the calling code to address that.
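
To make the null check in the two-line snippet above concrete, here is a minimal standalone sketch. It assumes that an unowned mutex reports a null owner and that an unattached thread sees a null "current thread"; under those assumptions a plain `owner == current` comparison would wrongly report ownership. `SketchThread`, `SketchMutex`, and `sketch_current` are hypothetical stand-ins, not HotSpot's `Thread` or `Mutex`.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not HotSpot's classes.
struct SketchThread {};

// Stays null on a thread that never attached itself.
thread_local SketchThread* sketch_current = nullptr;

SketchThread* sketch_current_or_null() { return sketch_current; }

struct SketchMutex {
  SketchThread* owner = nullptr;  // null means "not owned"

  // Naive variant: null == null makes an unattached caller look like the owner.
  bool owned_by_self_naive() const {
    return owner == sketch_current_or_null();
  }

  // Null-aware variant: an unattached caller can never be the owner.
  bool owned_by_self() const {
    SketchThread* current = sketch_current_or_null();
    return current != nullptr && owner == current;
  }
};
```

This only illustrates the shape of the check. As the message says, the existing code instead relies on an assertion failure to flag unattached callers so the calling code can be fixed.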
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28470#discussion_r2557807994 From vpaprotski at openjdk.org Mon Nov 24 22:01:17 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Mon, 24 Nov 2025 22:01:17 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v6] In-Reply-To: References: Message-ID: > - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline > - `SignatureBench.MLDSA` is 1.2x-2.2x faster > - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) > - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version > - `SignatureBench.MLDSA` is upto 5% faster, never slower > > Note on intrinsic: > - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. > - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 > > Tests and benchmarks: > - Added a fuzz test to ensure Java and intrinsic produces exactly same result > - Added benchmark to measure the performance of intrinsic itself > > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: spelling ------------- Changes: - all: 
https://git.openjdk.org/jdk/pull/28136/files - new: https://git.openjdk.org/jdk/pull/28136/files/691e1dfc..bfc16f1f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28136&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28136&range=04-05 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28136.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28136/head:pull/28136 PR: https://git.openjdk.org/jdk/pull/28136 From sviswanathan at openjdk.org Mon Nov 24 22:01:18 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Mon, 24 Nov 2025 22:01:18 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v6] In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 21:56:32 GMT, Volodymyr Paprotski wrote: >> - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline >> - `SignatureBench.MLDSA` is 1.2x-2.2x faster >> - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) >> - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version >> - `SignatureBench.MLDSA` is upto 5% faster, never slower >> >> Note on intrinsic: >> - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. 
>> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 >> >> Tests and benchmarks: >> - Added a fuzz test to ensure Java and intrinsic produces exactly same result >> - Added benchmark to measure the performance of intrinsic itself >> >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" > > Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: > > spelling Marked as reviewed by sviswanathan (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28136#pullrequestreview-3502367169 From duke at openjdk.org Mon Nov 24 22:03:39 2025 From: duke at openjdk.org (Chad Rakoczy) Date: Mon, 24 Nov 2025 22:03:39 GMT Subject: RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 16:54:42 GMT, Vladimir Kozlov wrote: > May be we should change the assert to guarantee in Relocation::pd_set_call_destination() to make sure we catch incorrect patching it product VM. I'm not opposed to changing this. Is this the main concern? > Looking on `NativeCall::set_destination_mt_safe` and `reachable` is calculated based on distance between address of call instruction and destination. Which could be different for cloned nmethod. I'm not sure I understand what you're saying here. 
I agree the offset is most likely different after the nmethod is cloned. The offset gets fixed by `trampoline_stub_Relocation::fix_relocation_after_move` since it could be out of range. Since `CallRelocation::fix_relocation_after_move` sets the destination to whatever was passed (regardless of if it is in range or not) it does not make sense to call this on the relocated nmethod which is why we skip it. I believe `Relocation::pd_set_call_destination` for aarch64 could use `set_destination_mt_safe` instead of `set_destination` which was an alternative approach in the original PR. The original discussion is [here](https://github.com/openjdk/jdk/pull/23573#discussion_r2123618495). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28241#discussion_r2557844143 From sviswanathan at openjdk.org Mon Nov 24 22:54:58 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Mon, 24 Nov 2025 22:54:58 GMT Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v8] In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 21:19:26 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to fix the performance regression in Arrays.fill() x86 stubs caused by masked AVX stores. The fix is to replace the masked AVX stores with store instructions without masks (i.e. unmasked stores). `fill32_masked()` and `fill64_masked()` stubs are replaced with `fill32_unmasked()` and `fill64_unmasked()` respectively. >> >> To speedup unmasked stores, array fills for sizes < 64 bytes are broken down into sequences of 32B, 16B, 8B, 4B, 2B and 1B stores, depending on the size. 
>>
>>
>> ### **Performance comparison for byte array fills in a loop for 1 million times**
>>
>>
>> UseAVX=3 ByteArray Size | +OptimizeFill (Masked store stub) [secs] | -OptimizeFill (No stub) [secs] | --->This PR: +OptimizeFill (Unmasked store stub) [secs]
>> -- | -- | -- | --
>> 1 | 0.46 | 0.14 | 0.185
>> 2 | 0.46 | 0.16 | 0.195
>> 3 | 0.46 | 0.176 | 0.199
>> 4 | 0.46 | 0.244 | 0.207
>> 5 | 0.46 | 0.29 | 0.32
>> 10 | 0.46 | 0.58 | 0.303
>> 15 | 0.46 | 0.42 | 0.271
>> 16 | 0.46 | 0.46 | 0.32
>> 17 | 0.21 | 0.5 | 0.299
>> 20 | 0.21 | 0.37 | 0.299
>> 25 | 0.21 | 0.59 | 0.282
>> 31 | 0.21 | 0.53 | 0.273
>> 32 | 0.21 | 0.58 | 0.199
>> 35 | 0.5 | 0.77 | 0.259
>> 40 | 0.5 | 0.61 | 0.33
>> 45 | 0.5 | 0.52 | 0.281
>> 48 | 0.5 | 0.66 | 0.32
>> 49 | 0.22 | 0.69 | 0.3
>> 50 | 0.22 | 0.78 | 0.3
>> 55 | 0.22 | 0.67 | 0.292
>> 60 | 0.22 | 0.67 | 0.3293
>> 64 | 0.22 | 0.82 | 0.23
>> 70 | 0.51 | 1.1 | 0.34
>> 80 | 0.49 | 0.89 | 0.365
>> 90 | 0.225 | 0.68 | 0.33
>> 100 | 0.54 | 1.09 | 0.347
>> 110 | 0.6 | 0.98 | 0.36
>> 120 | 0.26 | 0.75 | 0.386
>> 128 | 0.266 | 1.1 | 0.289
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
>
> revert to jccb in one place

src/hotspot/cpu/x86/macroAssembler_x86.cpp line 5639:

> 5637: addptr(base, 32);
> 5638: subptr(cnt, 4);
> 5639:

The subtraction of the cnt is being done in fill64_tail so this should move to line 5635 in the else.

src/hotspot/cpu/x86/macroAssembler_x86.cpp line 9266:

> 9264: jcc(Assembler::zero, L_done);
> 9265: movb(Address(dst, disp), temp);
> 9266:

Need subq(length, 1 >> shift) here.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2557961204
PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2557953634

From vlivanov at openjdk.org Mon Nov 24 22:59:36 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Mon, 24 Nov 2025 22:59:36 GMT
Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v4]
In-Reply-To:
References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com>
Message-ID:

On Fri, 21 Nov 2025 14:53:03 GMT, Coleen Phillimore wrote:

>> ArrayKlass doesn't set AccessFlags so don't look for them there. See CR for details.
>> Fixed SA and jvmci. @iwanowww Can you check that I changed C2 correctly (we talked about this in August).
>> Tested with tier1-4. 5-7 in progress.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>
> Reformatting compile.cpp

src/hotspot/share/opto/library_call.cpp line 4100:

> 4098: // Other types can report the actual _super.
> 4099: // (To verify this code sequence, check the asserts in JVM_IsInterface.)
> 4100: if (generate_interface_guard(kls, region) != nullptr)

BTW why did you decide to change the order of the checks?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2557971918

From missa at openjdk.org Mon Nov 24 23:18:54 2025
From: missa at openjdk.org (Mohamed Issa)
Date: Mon, 24 Nov 2025 23:18:54 GMT
Subject: RFR: 8368977: Provide clear naming for AVX10 identifiers
Message-ID: <6XYgqaHA0PPZzvnfysKOP5XGP7e_RMkVFt9PV2OT3Gk=.e5f33072-a91a-4e57-99f3-81cc4ae4d844@github.com>

This is a simple change that renames all AVX10 identifiers to explicitly show which sub-versions are in use. Right now, AVX10.2 is the only case to worry about. The JTREG tests listed below were used to verify correctness with the recommended JVM options mentioned in corresponding source files.
Each test included runs through emulation with AVX10.2 enabled and disabled to exercise all possible paths. All modifications and tests used [OpenJDK v26-b24](https://github.com/openjdk/jdk/releases/tag/jdk-26%2B24) as the baseline build.

1. `jtreg:test/hotspot/jtreg/compiler/codegen/TestByteDoubleVect.java`
2. `jtreg:test/hotspot/jtreg/compiler/codegen/TestByteFloatVect.java`
3. `jtreg:test/hotspot/jtreg/compiler/codegen/TestIntDoubleVect.java`
4. `jtreg:test/hotspot/jtreg/compiler/codegen/TestIntFloatVect.java`
5. `jtreg:test/hotspot/jtreg/compiler/codegen/TestLongDoubleVect.java`
6. `jtreg:test/hotspot/jtreg/compiler/codegen/TestLongFloatVect.java`
7. `jtreg:test/hotspot/jtreg/compiler/codegen/TestShortDoubleVect.java`
8. `jtreg:test/hotspot/jtreg/compiler/codegen/TestShortFloatVect.java`
9. `jtreg:test/hotspot/jtreg/compiler/floatingpoint/ScalarFPtoIntCastTest.java`
10. `jtreg:test/hotspot/jtreg/compiler/vectorapi/VectorFPtoIntCastTest.java`
11. `jtreg:test/hotspot/jtreg/compiler/vectorization/runner/ArrayTypeConvertTest.java`
12. `jtreg:test/jdk/jdk/incubator/vector/Double64VectorTests.java`
13. `jtreg:test/jdk/jdk/incubator/vector/Double128VectorTests.java`
14. `jtreg:test/jdk/jdk/incubator/vector/Double256VectorTests.java`
15. `jtreg:test/jdk/jdk/incubator/vector/Double512VectorTests.java`
16. `jtreg:test/jdk/jdk/incubator/vector/DoubleMaxVectorTests.java`
17. `jtreg:test/jdk/jdk/incubator/vector/Float64VectorTests.java`
18. `jtreg:test/jdk/jdk/incubator/vector/Float128VectorTests.java`
19. `jtreg:test/jdk/jdk/incubator/vector/Float256VectorTests.java`
20. `jtreg:test/jdk/jdk/incubator/vector/Float512VectorTests.java`
21. `jtreg:test/jdk/jdk/incubator/vector/FloatMaxVectorTests.java`
22. `jtreg:test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorOperations.java`
23. `jtreg:test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java`

-------------

Commit messages:
 - Fix naming issue in vector floating point cast test file
 - Rename AVX10 identifiers to AVX10_2 and use AVX10.2 in template table conversion whenever available

Changes: https://git.openjdk.org/jdk/pull/28344/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28344&range=00
 Issue: https://bugs.openjdk.org/browse/JDK-8368977
 Stats: 134 lines in 10 files changed: 16 ins; 0 del; 118 mod
 Patch: https://git.openjdk.org/jdk/pull/28344.diff
 Fetch: git fetch https://git.openjdk.org/jdk.git pull/28344/head:pull/28344

PR: https://git.openjdk.org/jdk/pull/28344

From sviswanathan at openjdk.org Tue Nov 25 00:47:00 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Tue, 25 Nov 2025 00:47:00 GMT
Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v8]
In-Reply-To:
References:
Message-ID: <7a0HilK457B1qfmgnbgVGCQ1xpJ_ZHv7csHxdktjsdg=.86f04a56-a89c-4734-889b-3fa514f3e349@github.com>

On Mon, 24 Nov 2025 22:51:42 GMT, Sandhya Viswanathan wrote:

>> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
>>
>> revert to jccb in one place
>
> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 5639:
>
>> 5637: addptr(base, 32);
>> 5638: subptr(cnt, 4);
>> 5639:
>
> The subtraction of the cnt is being done in fill64_tail so this should move to line 5635 in the else.

Please ignore this comment, didn't notice the jump to L_end at line 5626.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28442#discussion_r2558137228 From macarte at openjdk.org Tue Nov 25 01:04:37 2025 From: macarte at openjdk.org (Mat Carter) Date: Tue, 25 Nov 2025 01:04:37 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v4] In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 21:48:41 GMT, Mark Reinhold wrote: >> Mat Carter has updated the pull request incrementally with one additional commit since the last revision: >> >> Updated test based on comments > > src/jdk.management/share/classes/jdk/management/HotSpotAOTCacheMXBean.java line 78: > >> 76: * specification of the corresponding JVM command-line options, please refer >> 77: * to https://openjdk.org/jeps/483 and https://openjdk.org/jeps/514. >> 78: * > > Please don't use bare URLs. Change these to > > ... please refer to JEPs 483 and > 514. fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28010#discussion_r2558167839 From dlong at openjdk.org Tue Nov 25 01:05:01 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 25 Nov 2025 01:05:01 GMT Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v4] In-Reply-To: References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com> Message-ID: On Mon, 24 Nov 2025 22:57:18 GMT, Vladimir Ivanov wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Reformatting compile.cpp > > src/hotspot/share/opto/library_call.cpp line 4100: > >> 4098: // Other types can report the actual _super. >> 4099: // (To verify this code sequence, check the asserts in JVM_IsInterface.) >> 4100: if (generate_interface_guard(kls, region) != nullptr) > > BTW why did you decide to change the order of the checks? I noticed that too. It is necessary for correctness now. 
It is incorrect and unsafe to use generate_interface_guard() on an array after this change, because an array klass is not an InstanceKlass.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2558168496

From vlivanov at openjdk.org Tue Nov 25 01:28:47 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Tue, 25 Nov 2025 01:28:47 GMT
Subject: RFR: 8372098: Move AccessFlags to InstanceKlass [v4]
In-Reply-To:
References: <4LqPcDIgdyt_jaF-mX4LVsgXzaHDxmDYTFptWYVdVXc=.1b12674f-51a4-4f03-bf99-50245b016051@github.com>
Message-ID:

On Tue, 25 Nov 2025 01:01:17 GMT, Dean Long wrote:

>> src/hotspot/share/opto/library_call.cpp line 4100:
>>
>>> 4098: // Other types can report the actual _super.
>>> 4099: // (To verify this code sequence, check the asserts in JVM_IsInterface.)
>>> 4100: if (generate_interface_guard(kls, region) != nullptr)
>>
>> BTW why did you decide to change the order of the checks?
>
> I noticed that too. It is necessary for correctness now. It is incorrect and unsafe to use generate_interface_guard() on array after this change, because an array klass is not an InstanceKlass.

Oh, that's subtle... It deserves a comment at least.

We could also change `LibraryCallKit::generate_interface_guard()` to require `kls` to be of type `TypeInstKlassPtr`, but then we would need a cast before calling it from `LibraryCallKit::inline_native_Class_query()`.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28371#discussion_r2558217038 From fyang at openjdk.org Tue Nov 25 02:30:07 2025 From: fyang at openjdk.org (Fei Yang) Date: Tue, 25 Nov 2025 02:30:07 GMT Subject: RFR: 8371869: RISC-V: too many warnings when build on BPI-F3 SBC [v3] In-Reply-To: References: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com> Message-ID: On Mon, 24 Nov 2025 14:48:57 GMT, Feilong Jiang wrote: >> Fei Yang has updated the pull request incrementally with one additional commit since the last revision: >> >> Review > > Marked as reviewed by fjiang (Committer). @feilongjiang @Hamlin-Li : Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/28340#issuecomment-3573534156 From fyang at openjdk.org Tue Nov 25 02:32:47 2025 From: fyang at openjdk.org (Fei Yang) Date: Tue, 25 Nov 2025 02:32:47 GMT Subject: Integrated: 8371869: RISC-V: too many warnings when build on BPI-F3 SBC In-Reply-To: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com> References: <0GejwJfn48rA7xndDITOh8sdRmcxJwapeCz7zIHmXR4=.22854ffa-5105-498e-a530-12cca32ee609@github.com> Message-ID: On Sun, 16 Nov 2025 15:24:04 GMT, Fei Yang wrote: > Hi, please consider this riscv-specific change. > > I witnessed 400+ warning messages when doing a native build on BPI-F3 SBC running kernel 6.6.63: > > `OpenJDK 64-Bit Server VM warning: Cannot enable UseZvfh, it's missing dependent extension(s) v (disabled), Zfh (enabled)` > > The warning messages indicate that we won't auto-enable extensions like `Zvfh` due to lack of vector support on old kernels. > I think these warning messages could be confusing to people. It might be more reasonable to just log these messages. > This also unifies the way of logging prefering `log_info`. It doesn't seem necessary to me to use `log_debug` in this case. 
>
> After this change, the log on BPI-F3 SBC looks like:
>
> $ java -Xlog:all -version
>
> ......
> [0.011s][info][os ] Linux kernels before 6.8.5 (current 6.6.63) have a known bug when using Vector and signals.
> [0.011s][info][os ] Vector not enabled automatically via hwprobe, but can be turned on with -XX:+UseRVV.
> [0.011s][info][os,cpu ] Enabled RV64 feature "a"
> [0.011s][info][os,cpu ] Enabled RV64 feature "c"
> [0.011s][info][os,cpu ] Enabled RV64 feature "d"
> [0.011s][info][os,cpu ] Enabled RV64 feature "f"
> [0.011s][info][os,cpu ] Enabled RV64 feature "i"
> [0.011s][info][os,cpu ] Enabled RV64 feature "m"
> [0.011s][info][os,cpu ] Enabled RV64 feature "Zba"
> [0.011s][info][os,cpu ] Enabled RV64 feature "Zbb"
> [0.011s][info][os,cpu ] Enabled RV64 feature "Zbs"
> [0.011s][info][os,cpu ] Enabled RV64 feature "Zfh"
> [0.011s][info][os,cpu ] Enabled RV64 feature "Zfhmin"
> [0.011s][info][os,cpu ] Disabled RV64 feature "Zvfh" (missing dependent extension(s): v (disabled), Zfh (enabled))
> [0.011s][info][os,cpu ] Enabled RV64 feature "marchid" (-9223372035378380799)
> [0.011s][info][os,cpu ] Enabled RV64 feature "mimpid" (1152921505839391232)
> [0.011s][info][os,cpu ] Enabled RV64 feature "mvendorid" (1808)
> [0.011s][info][os,cpu ] Enabled RV64 feature "satp_mode" (39)
> [0.011s][info][os,cpu ] Enabled RV64 feature "unaligned_scalar" (3)
> [0.011s][info][os,cpu ] Enabled RV64 feature "zicboz_block_size" (64)
> [0.011s][info][os,cpu ] Zifencei not found, required by Linux, enabling.
> [0.012s][info][os,cpu ] CPU: total 8 (initial active 8) spacemit,x60 rv64 rva rvc rvd rvf rvi rvm zba zbb zbs zfh zfhmin
> ......

This pull request has now been integrated.
Changeset: dea95e65 Author: Fei Yang URL: https://git.openjdk.org/jdk/commit/dea95e65a2493b545f78243025d1a5a4957a3806 Stats: 22 lines in 2 files changed: 13 ins; 6 del; 3 mod 8371869: RISC-V: too many warnings when build on BPI-F3 SBC Reviewed-by: fjiang, mli ------------- PR: https://git.openjdk.org/jdk/pull/28340 From fyang at openjdk.org Tue Nov 25 02:42:23 2025 From: fyang at openjdk.org (Fei Yang) Date: Tue, 25 Nov 2025 02:42:23 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v6] In-Reply-To: <7kh5C9nj7bf6432cG35kDDvV6zhnKEspe8AcYetJ1do=.e1d9ebd3-d80d-4621-8c1e-c77dc721d0df@github.com> References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> <7kh5C9nj7bf6432cG35kDDvV6zhnKEspe8AcYetJ1do=.e1d9ebd3-d80d-4621-8c1e-c77dc721d0df@github.com> Message-ID: On Mon, 24 Nov 2025 11:56:26 GMT, Hamlin Li wrote: >> Hi, >> >> This pr add CMoveF/D on riscv, which enable vectorization of statement like: `op_1 bop op_2 ? res_f_d_1 : res_f_d_2 in a loop`. >> >> This pr is also a preparation for further vectorization in https://github.com/openjdk/jdk/pull/28231. >> >> Previously it's https://github.com/openjdk/jdk/pull/25341, but at that time, C2 SLP has some issue with unsigned comparison, which is now fixed, so it's good to continue the work. >> >> # Test >> ## Jtreg >> >> in progress... >> >> ## Performance >> >> Column names meanings: >> * p: with patch >> * p+v: with patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on >> * m: without patch >> * m+v: without patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on >> >> #### Average improvement >> >> NOTE: With only this PR, it brings performance benefit in case of `CMoveF+CmpF`, `CMoveD+ComD`, `CMoveF+CmpI`, `CMoveD+CmpL`. The data below is based on fullly implmenting the vectorization of `CMoveI/L/F/D+CmpI/L/F/D`, which will be achieved by https://github.com/openjdk/jdk/pull/28231. 
>> >> For details, check the performance data in https://github.com/openjdk/jdk/pull/25341 on riscv. >> >> Opt (m/p) | Opt (m+v/p+v) | Opt (p/p+v) | Opt (m/p+v) >> -- | -- | -- | -- >> 1.022782609 | 2.198717391 | 2.162673913 | 2.199 >> >> > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix is_unordered src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2141: > 2139: case BoolTest::gt: > 2140: cmov_fp_cmp_fp_gt(op1, op2, dst, src, cmp_single, cmov_single); > 2141: log_warning(jit)("Float/Double BoolTest::gt path is not tested well, please report the test case!"); My local tests show this does happen. Try this: `$ make test TEST="./test/jdk/javax/sound/midi/Gervill/SoftFilter/TestProcessAudio.java" TEST_VM_OPTS="-XX:-TieredCompilation"` I think this could be a good reference if you want to add some extra tests for the two cases here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2558363671 From jbhateja at openjdk.org Tue Nov 25 03:04:01 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 25 Nov 2025 03:04:01 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v6] In-Reply-To: References: Message-ID: <7-u4fTT6SMiqErNn-Xl7o8UTVF2NIV5m0DAhStsbsk0=.5f51025e-8ed8-4d2f-911c-1257b272f9f7@github.com> On Mon, 24 Nov 2025 22:01:17 GMT, Volodymyr Paprotski wrote: >> - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline >> - `SignatureBench.MLDSA` is 1.2x-2.2x faster >> - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) >> - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version >> - `SignatureBench.MLDSA` is upto 5% faster, never slower >> >> Note on intrinsic: >> - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack 
spill.
>> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2
>>
>> Tests and benchmarks:
>> - Added a fuzz test to ensure Java and intrinsic produces exactly same result
>> - Added benchmark to measure the performance of intrinsic itself
>>
>> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java"
>> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2"
>> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1"
>> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1"
>
> Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision:
>
> spelling

Very nice work @vpaprotsk. Please also add links to the original reference implementation in the comments.
src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 365: > 363: > 364: static void loadXmms(const XMMRegister destinationRegs[], Register source, int offset, > 365: int vector_len, MacroAssembler *_masm, int regCnt = -1, int memStep = -1) { Suggestion: int vector_len, MacroAssembler *_masm, int regCnt = -1, int memStep = -1) { src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 381: > 379: > 380: static void storeXmms(Register destination, int offset, const XMMRegister xmmRegs[], > 381: int vector_len, MacroAssembler *_masm, int regCnt = -1, int memStep = -1) { Suggestion: int vector_len, MacroAssembler *_masm, int regCnt = -1, int memStep = -1) { src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 659: > 657: // zetas (int[128*8]) = c_rarg1 > 658: static address generate_dilithiumAlmostInverseNtt_avx(StubGenerator *stubgen, > 659: int vector_len,MacroAssembler *_masm) { Fix indentation src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 718: > 716: > 717: // Constants for shuffle and montMul64 > 718: __ mov64(scratch, 0b1010101010101010); 64 bit constant suffix src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 901: > 899: // poly2 (int[256]) = c_rarg2 > 900: static address generate_dilithiumNttMult_avx(StubGenerator *stubgen, > 901: int vector_len, MacroAssembler *_masm) { Fix indentation src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 939: > 937: vector_len, scratch); // 2^64 mod q > 938: if (vector_len == Assembler::AVX_512bit) { > 939: __ mov64(scratch, 0b0101010101010101); Add long constant suffix src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 985: > 983: // constant (int) = c_rarg1 > 984: static address generate_dilithiumMontMulByConstant_avx(StubGenerator *stubgen, > 985: int vector_len, MacroAssembler *_masm) { Fix indentation src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 1026: > 1024: __ evpbroadcastd(constant, rConstant, Assembler::AVX_512bit); // constant multiplier > 1025: 
> 1026: __ mov64(scratch, 0b0101010101010101); //dw-mask Constant suffix ------------- PR Review: https://git.openjdk.org/jdk/pull/28136#pullrequestreview-3503056034 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2558380867 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2558381318 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2558385868 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2558390552 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2558390067 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2558370904 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2558391135 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2558371478 From amitkumar at openjdk.org Tue Nov 25 04:22:57 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Tue, 25 Nov 2025 04:22:57 GMT Subject: RFR: 8352567: [s390x] disable JFR tests requiring JFR stubs In-Reply-To: References: Message-ID: <5zDpNaj8aAQGGtEs8X0q8KN9A9szz6isGK4VVMvbE0s=.f7757a97-483e-49e2-af72-fcd74c3280e2@github.com> On Fri, 21 Nov 2025 01:48:27 GMT, Vladimir Petko wrote: > JFR stubs are not [implemented](https://github.com/openjdk/jdk/blame/06ba6cf3a137a6cdf572a876a46d18e51c248451/src/hotspot/cpu/s390/sharedRuntime_s390.cpp#L3412). > Add platform requirement to JFR tests that require JFR stubs to skip them on S390x. 
> > Testing: > - s390x: > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR SKIP > jtreg:test/hotspot/jtreg/applications/ctw/modules/jdk_jfr.java > 0 0 0 0 0 > jtreg:test/hotspot/jtreg/compiler/intrinsics/TestReturnOopSetForJFRWriteCheckpoint.java > 0 0 0 0 0 > jtreg:test/jdk/jdk/jfr 630 577 0 0 53 > ============================== > TEST SUCCESS > > > - amd64: > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR SKIP > jtreg:test/hotspot/jtreg/applications/ctw/modules/jdk_jfr.java > 1 1 0 0 0 > jtreg:test/hotspot/jtreg/compiler/intrinsics/TestReturnOopSetForJFRWriteCheckpoint.java > 1 1 0 0 0 > jtreg:test/jdk/jdk/jfr 629 622 0 0 7 > ============================== > TEST SUCCESS I would've suggested to use `@requires vm.continuations` that way test will be disabled if continuations support is not there. Or probably we can do simple problem listing, that way it will be little easy to enable the test case again. Reason: #if INCLUDE_JFR RuntimeStub* SharedRuntime::generate_jfr_write_checkpoint() { if (!Continuations::enabled()) return nullptr; Unimplemented(); return nullptr; } RuntimeStub* SharedRuntime::generate_jfr_return_lease() { if (!Continuations::enabled()) return nullptr; Unimplemented(); return nullptr; } #endif // INCLUDE_JFR we can see that we are going to return `nullptr` even if the stub is implemented in case `continuations` is disabled. But it seems once these stubs were implemented for other architectures the requirement for continuations support vanished. See aarch64 for example: https://github.com/openjdk/jdk/blob/dea95e65a2493b545f78243025d1a5a4957a3806/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp#L2870 https://github.com/openjdk/jdk/blob/dea95e65a2493b545f78243025d1a5a4957a3806/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp#L2832 @theRealAph could you suggest if this solution looks correct ? 
I see this test case failing in tier1 tests on headstream: `java/foreign/sharedclosejfr/TestSharedCloseJFR.java`. Could you check whether it fails for you as well?

-------------

Changes requested by amitkumar (Committer).

PR Review: https://git.openjdk.org/jdk/pull/28444#pullrequestreview-3503196066
PR Comment: https://git.openjdk.org/jdk/pull/28444#issuecomment-3573719023

From kbarrett at openjdk.org Tue Nov 25 07:55:36 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Tue, 25 Nov 2025 07:55:36 GMT
Subject: RFR: 8372337: clang compilation error on hardware_constructive_interference_size
Message-ID:

Please review this trivial fix for a compiler error when building with clang on Linux (and possibly in other configurations with clang). Rather than attempt to version-conditionalize the deprecating declaration of the hardware interference variables, just never make the attempt at all, since we don't know of a version of clang that will handle the declaration as expected.

Testing: Locally (linux-aarch64) built JDK with clang19.1 as the compiler.
------------- Commit messages: - don't try to deprecate hardare interference vars with any version of clang Changes: https://git.openjdk.org/jdk/pull/28484/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28484&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372337 Stats: 5 lines in 1 file changed: 0 ins; 1 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/28484.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28484/head:pull/28484 PR: https://git.openjdk.org/jdk/pull/28484 From aboldtch at openjdk.org Tue Nov 25 08:16:32 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 25 Nov 2025 08:16:32 GMT Subject: RFR: 8366671: Refactor Thread::SpinAcquire and Thread::SpinRelease [v9] In-Reply-To: <3bBZzigernOcTkARE9am0ZmHR9NWsmp3xa0ksSLYiE8=.981d0f10-5b41-40c8-a35c-f953b0d1df08@github.com> References: <3bBZzigernOcTkARE9am0ZmHR9NWsmp3xa0ksSLYiE8=.981d0f10-5b41-40c8-a35c-f953b0d1df08@github.com> Message-ID: <8aNv20VZljWCq6M_lZKe6dhlahN0nHoH91JDHEeIo8Y=.3d917870-0e49-4e40-95e2-df9dcaa3fe4d@github.com> On Fri, 21 Nov 2025 09:25:27 GMT, Anton Artemov wrote: >> Hi, >> >> please consider the following changes: >> >> In this PR `Thread::SpinAcquire()` and `Thread::SpinRelease()` methods are refactored into a utility class `SpinCriticalSection`. The motivation is to make it easier for developers to use this lightweight synchronization mechanism in the codebase. The two aforementioned methods were used in JFR to create short critical sections with a helper class, but that was not the case for the object monitor code. >> >> Tested in tiers 1 - 5. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8366671: Addressed reviewer's comments. I think this looks good, but we should add a `NoSafepointVerifier` to this `SpinCriticalSection` RAII object. Polling for a safepoint inside a SpinLock should never be allowed. 
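
As a rough illustration of the RAII shape being discussed for `SpinCriticalSection`, here is a minimal standalone sketch. It assumes a plain `std::atomic<int>` lock word and the hypothetical names `SpinLockWord` and `SpinCriticalSectionSketch`; it is not the class from the PR and does not use HotSpot's own lock representation or atomics. In HotSpot itself, the suggested `NoSafepointVerifier` would presumably be a member of the guard, which is only noted in a comment here.

```cpp
#include <atomic>

// Hypothetical stand-in for the lock word; not HotSpot's representation.
using SpinLockWord = std::atomic<int>;

// Sketch of a scoped spin lock: acquire in the constructor, release in the
// destructor, so the critical section ends on every exit path.
class SpinCriticalSectionSketch {
  SpinLockWord* const _lock;
  // In HotSpot, a NoSafepointVerifier member could live here to assert that
  // no safepoint poll happens while the lock is held (assumption, omitted).

 public:
  explicit SpinCriticalSectionSketch(SpinLockWord* lock) : _lock(lock) {
    int expected = 0;
    // Spin until the word changes from 0 (free) to 1 (held).
    while (!_lock->compare_exchange_weak(expected, 1,
                                         std::memory_order_acquire,
                                         std::memory_order_relaxed)) {
      expected = 0;  // compare_exchange_weak overwrote it with the seen value
    }
  }

  ~SpinCriticalSectionSketch() {
    _lock->store(0, std::memory_order_release);
  }

  // A guard object must not be copied.
  SpinCriticalSectionSketch(const SpinCriticalSectionSketch&) = delete;
  SpinCriticalSectionSketch& operator=(const SpinCriticalSectionSketch&) = delete;
};
```

A caller would declare the guard at the top of a short block, for example `SpinCriticalSectionSketch scs(&lock);`, and the lock is released when the block ends.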
------------- PR Review: https://git.openjdk.org/jdk/pull/28264#pullrequestreview-3503741606 From tschatzl at openjdk.org Tue Nov 25 08:19:23 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 25 Nov 2025 08:19:23 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v6] In-Reply-To: <6LdV5NiSfkvLTkYDsgV2jFyw43VFzOBzaGj2Enmgrnc=.b0ddbe6b-cd41-4913-867f-dec57ed79547@github.com> References: <6LdV5NiSfkvLTkYDsgV2jFyw43VFzOBzaGj2Enmgrnc=.b0ddbe6b-cd41-4913-867f-dec57ed79547@github.com> Message-ID: On Mon, 24 Nov 2025 09:48:52 GMT, Aleksey Shipilev wrote: >> We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. >> >> The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier1` >> - [ ] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Indenting was still off Marked as reviewed by tschatzl (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/28446#pullrequestreview-3503746974 From aboldtch at openjdk.org Tue Nov 25 08:20:55 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 25 Nov 2025 08:20:55 GMT Subject: RFR: 8372337: clang compilation error on hardware_constructive_interference_size In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 01:22:42 GMT, Kim Barrett wrote: > Please review this trivial fix for a compiler error when building with clang > on Linux (and possibly in other configurations with clang). Rathern than > attempt to version-conditionalize the deprecating declaration of the hardware > interference variables, just never make the attempt at all, since we don't > know of a version of clang that will handle the declaration as expected. > > Testing: Locally (linux-aarch64) built JDK with clang19.1 as the compiler. Looks good. Trivial. ------------- Marked as reviewed by aboldtch (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28484#pullrequestreview-3503751338 From shade at openjdk.org Tue Nov 25 08:26:40 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 25 Nov 2025 08:26:40 GMT Subject: RFR: 8372380: Make hs_err reporting more robust for unattached threads [v3] In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 02:34:39 GMT, David Holmes wrote: >> There were a number of places where the code called by hs_err reporting assumed/required an attached current thread. These would then cause secondary failures during hs_err reporting. Using a simple example of an unattached thread causing a SEGV I went through and eliminated all the problems I encountered. In some cases the thread dependency was obvious and easy to address directly, but in others we just skip that section at the top-level. 
>> >> Testing: >> - manual inspection of hs_err file, for different GCs >> - tiers 1-3 sanity >> >> Thanks > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Fix include order All right then! ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28470#pullrequestreview-3503780211 From kbarrett at openjdk.org Tue Nov 25 08:57:30 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 25 Nov 2025 08:57:30 GMT Subject: Integrated: 8372337: clang compilation error on hardware_constructive_interference_size In-Reply-To: References: Message-ID: <5nsAbxJEQSV2jPMqJRkHDbQobFBN5LnNHXz-WPMwXuw=.79dedbf1-1deb-40e1-824c-8f7366fce613@github.com> On Tue, 25 Nov 2025 01:22:42 GMT, Kim Barrett wrote: > Please review this trivial fix for a compiler error when building with clang > on Linux (and possibly in other configurations with clang). Rather than > attempt to version-conditionalize the deprecating declaration of the hardware > interference variables, just never make the attempt at all, since we don't > know of a version of clang that will handle the declaration as expected. > > Testing: Locally (linux-aarch64) built JDK with clang19.1 as the compiler. This pull request has now been integrated. 
Changeset: ba3d4c43 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/ba3d4c43118bb5a2d9fb7cea9c6cd1ec63360ccd Stats: 5 lines in 1 file changed: 0 ins; 1 del; 4 mod 8372337: clang compilation error on hardware_constructive_interference_size Reviewed-by: aboldtch ------------- PR: https://git.openjdk.org/jdk/pull/28484 From kbarrett at openjdk.org Tue Nov 25 08:57:28 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 25 Nov 2025 08:57:28 GMT Subject: RFR: 8372337: clang compilation error on hardware_constructive_interference_size In-Reply-To: References: Message-ID: <8EiynjxxkN20VKKWQ-6j1N7S8NrYaK66Ug3-REyY_9o=.4fff35e9-2b56-4e47-9a47-e96575e81cbf@github.com> On Tue, 25 Nov 2025 08:17:21 GMT, Axel Boldt-Christmas wrote: >> Please review this trivial fix for a compiler error when building with clang >> on Linux (and possibly in other configurations with clang). Rather than >> attempt to version-conditionalize the deprecating declaration of the hardware >> interference variables, just never make the attempt at all, since we don't >> know of a version of clang that will handle the declaration as expected. >> >> Testing: Locally (linux-aarch64) built JDK with clang19.1 as the compiler. > > Looks good. Trivial. Thanks for review @xmas92 ------------- PR Comment: https://git.openjdk.org/jdk/pull/28484#issuecomment-3574464927 From jkratochvil at openjdk.org Tue Nov 25 09:02:42 2025 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Tue, 25 Nov 2025 09:02:42 GMT Subject: RFR: 8372337: clang compilation error on hardware_constructive_interference_size In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 01:22:42 GMT, Kim Barrett wrote: > Please review this trivial fix for a compiler error when building with clang > on Linux (and possibly in other configurations with clang). 
Rather than > attempt to version-conditionalize the deprecating declaration of the hardware > interference variables, just never make the attempt at all, since we don't > know of a version of clang that will handle the declaration as expected. > > Testing: Locally (linux-aarch64) built JDK with clang19.1 as the compiler. I expected something better; this will compile on clang and later fail on CI/gcc. I will try to make some autoconf check to produce a declaration compatible with the system library, if possible. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28484#issuecomment-3574496187 From jbhateja at openjdk.org Tue Nov 25 09:05:54 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 25 Nov 2025 09:05:54 GMT Subject: RFR: 8370691: Add new Float16Vector type and enable intrinsification of vector operations supported by auto-vectorizer [v4] In-Reply-To: References: Message-ID: > Add a new Float16Vector type and corresponding concrete vector classes, in addition to existing primitive vector types, maintaining operation parity with the FloatVector type. > - Add necessary inline expander support. > - Enable intrinsification for a few vector operations, namely ADD/SUB/MUL/DIV/MAX/MIN/FMA. > - Use existing Float16 vector IR and backend support. > - Extended the existing VectorAPI JTREG test suite for the newly added Float16Vector operations. > > The idea here is to first be at par with Float16 auto-vectorization support before intrinsifying new operations (conversions, reduction, etc). > > The following are the performance numbers for some of the selected Float16Vector benchmarking kernels compared to equivalent auto-vectorized Float16OperationsBenchmark kernels. > > image > > Initial RFP[1] was floated on the panama-dev mailing list. > > Kindly review the draft PR and share your feedback. 
> > Best Regards, > Jatin > > [1] https://mail.openjdk.org/pipermail/panama-dev/2025-August/021100.html Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Fix failing jtreg test in CI ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28002/files - new: https://git.openjdk.org/jdk/pull/28002/files/f34d324f..aca6cc5d Webrevs: - full: Webrev is not available because diff is too large - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28002&range=02-03 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28002.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28002/head:pull/28002 PR: https://git.openjdk.org/jdk/pull/28002 From mbaesken at openjdk.org Tue Nov 25 09:08:30 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 25 Nov 2025 09:08:30 GMT Subject: RFR: 8371893: [macOS aarch64] use dead_strip linker option to reduce binary size [v4] In-Reply-To: References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com> Message-ID: On Mon, 24 Nov 2025 14:53:33 GMT, Matthias Baesken wrote: >> The dead_strip linker option on macOS removes functions and data that are unreachable by the entry point or exported symbols. 
>> Setting it can reduce the size of some binaries we generate quite a lot, for example (product build, Xcode 15 is used) : >> (before -> after setting the option) >> >> 1.4M -> 1.1M images/jdk/lib/libfontmanager.dylib >> 264K -> 248K images/jdk/lib/libjavajpeg.dylib >> 152K -> 132K images/jdk/lib/libjli.dylib >> 388K -> 296K images/jdk/lib/liblcms.dylib >> 164K -> 128K images/jdk/lib/libzip.dylib >> >> >> and libjvm : >> >> 20M -> 18M images/jdk/lib/server/libjvm.dylib >> 146M -> 137M images/jdk/lib/server/libjvm.dylib.dSYM > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Set the -dead_strip linker option only for the JDK libs Any comments on the recent version? The dead_strip option seems to be rather similar to the link time gc option we currently support for gcc, for which we offer this configure flag in make/autoconf/jdk-options.m4: UTIL_ARG_ENABLE(NAME: linktime-gc, DEFAULT: $LINKTIME_GC_DEFAULT, DEFAULT_DESC: [auto], RESULT: ENABLE_LINKTIME_GC, DESC: [use link time gc on unused code sections in the JDK build], CHECKING_MSG: [if linker should clean out unused code (linktime-gc)]) AC_SUBST(ENABLE_LINKTIME_GC) Should we also put dead_strip behind this configure flag and rephrase the description above a little, e.g. `DESC: [use link time gc or similar features on unused code sections in the JDK build],`? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28319#issuecomment-3574548509 From sspitsyn at openjdk.org Tue Nov 25 09:16:06 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 25 Nov 2025 09:16:06 GMT Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v18] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 08:36:02 GMT, Johan Sjölen wrote: >> Hi, >> >> This is a refactoring of the way that we store the Bootstrap method attribute in the ConstantPool class. 
We used to have a single `Array` which was divided into a section of `u4` offsets and a section which was the actual data. In this refactoring we make this split clearer, by actually allocating an `Array` to store the offsets in and an `Array` to store the data in. These arrays are then put into a `BSMAttributeEntries` class, which allows us to separate out the API from that of the rest of the `ConstantPool`. >> >> We had multiple instances of the code knowing the layout of the operands array and using this to do 'clever' ways of copying and inserting data into it. See `ConstantPool::copy_operands` and `ConstantPool::resize_operands`. I felt like we could do things in a simpler way, so I added the `start_/end_extension` protocol and added the `InsertionIterator` for this. See `ClassFileParser::parse_classfile_bootstrap_methods_attribute` for how this works. I put several relevant definitions into the inline file in hopes of encouraging the compiler to optimize these appropriately. >> >> For the Java SA code, I had to add a `U4Array` class. I also had to fix the vmstructs definitions, etc. >> >> On the whole, while this code is a bit less terse, I think it's a good API improvement and the underlying implementation of splitting up the operands array is also an improvement. >> >> Testing: Oracle Tier1-Tier5 has been run successfully multiple times. Before integration, I will merge with master and run these tiers again. > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > IDE doesn't help you with VM structs! Marked as reviewed by sspitsyn (Reviewer). 
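[Illustration, not part of the original mail.] The split described in the quoted PR text, one array of offsets plus one array of entry data, can be sketched roughly as below. This is only a minimal sketch under stated assumptions: `BsmEntriesSketch` and all its member names are hypothetical, `std::vector` stands in for HotSpot's `Array`, and the element widths and per-entry layout (method reference, argument count, arguments) are assumptions rather than the actual `BSMAttributeEntries` code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the split described above: one array of offsets
// and one array holding the entry data, instead of a single array doing both.
// Element widths (32-bit offsets, 16-bit data) are assumptions for this sketch.
class BsmEntriesSketch {
  std::vector<uint32_t> _offsets;  // _offsets[i] = start of entry i in _data
  std::vector<uint16_t> _data;     // per entry: method ref, argc, then arguments
public:
  // Append one entry at the end and return its index.
  size_t append(uint16_t method_ref, const std::vector<uint16_t>& args) {
    _offsets.push_back(static_cast<uint32_t>(_data.size()));
    _data.push_back(method_ref);
    _data.push_back(static_cast<uint16_t>(args.size()));
    _data.insert(_data.end(), args.begin(), args.end());
    return _offsets.size() - 1;
  }
  size_t number_of_entries() const { return _offsets.size(); }
  uint16_t method_ref(size_t i) const { return _data[_offsets[i]]; }
  uint16_t argument_count(size_t i) const { return _data[_offsets[i] + 1]; }
  uint16_t argument(size_t i, size_t n) const { return _data[_offsets[i] + 2 + n]; }
};
```

With this shape, callers never need to know where the offsets section ends and the data section begins, which is the kind of layout knowledge the PR text says it wanted to remove from code outside the container.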
------------- PR Review: https://git.openjdk.org/jdk/pull/27198#pullrequestreview-3504064826 From mbaesken at openjdk.org Tue Nov 25 09:26:11 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 25 Nov 2025 09:26:11 GMT Subject: RFR: 8371893: [macOS aarch64] use dead_strip linker option to reduce binary size [v4] In-Reply-To: References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com> Message-ID: On Mon, 24 Nov 2025 14:53:33 GMT, Matthias Baesken wrote: >> The dead_strip linker option on macOS removes functions and data that are unreachable by the entry point or exported symbols. >> Setting it can reduce the size of some binaries we generate quite a lot, for example (product build, Xcode 15 is used) : >> (before -> after setting the option) >> >> 1.4M -> 1.1M images/jdk/lib/libfontmanager.dylib >> 264K -> 248K images/jdk/lib/libjavajpeg.dylib >> 152K -> 132K images/jdk/lib/libjli.dylib >> 388K -> 296K images/jdk/lib/liblcms.dylib >> 164K -> 128K images/jdk/lib/libzip.dylib >> >> >> and libjvm : >> >> 20M -> 18M images/jdk/lib/server/libjvm.dylib >> 146M -> 137M images/jdk/lib/server/libjvm.dylib.dSYM > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Set the -dead_strip linker option only for the JDK libs Here are some current values (build from last evening 24th Nov.) 
for JDK native libs where I see a noticeable effect in size reduction with this PR (values in K without `->` with the flag , product build, XCode 15 used) 1216 -> 1188 /jdk/lib/libawt_lwawt.dylib 1348 -> 1036 /jdk/lib/libfontmanager.dylib 636 -> 620 /jdk/lib/libfreetype.dylib 264 -> 248 /jdk/lib/libjavajpeg.dylib 312 -> 292 /jdk/lib/libjdwp.dylib 152 -> 132 /jdk/lib/libjli.dylib 388 -> 296 /jdk/lib/liblcms.dylib 500 -> 484 /jdk/lib/libmlib_image.dylib 164 -> 128 /jdk/lib/libzip.dylib ------------- PR Comment: https://git.openjdk.org/jdk/pull/28319#issuecomment-3574624069 From mli at openjdk.org Tue Nov 25 09:42:46 2025 From: mli at openjdk.org (Hamlin Li) Date: Tue, 25 Nov 2025 09:42:46 GMT Subject: RFR: 8357551: RISC-V: support CMoveF/D vectorization [v6] In-Reply-To: References: <0errm4F59Sa9JdJZKdAGBnt9cF1DKkUUv1XmUtMmHI8=.ab9c0d54-799c-4385-b96c-d7c698ffe965@github.com> <7kh5C9nj7bf6432cG35kDDvV6zhnKEspe8AcYetJ1do=.e1d9ebd3-d80d-4621-8c1e-c77dc721d0df@github.com> Message-ID: On Tue, 25 Nov 2025 02:38:52 GMT, Fei Yang wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> fix is_unordered > > src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2141: > >> 2139: case BoolTest::gt: >> 2140: cmov_fp_cmp_fp_gt(op1, op2, dst, src, cmp_single, cmov_single); >> 2141: log_warning(jit)("Float/Double BoolTest::gt path is not tested well, please report the test case!"); > > My local tests show this does happen. Try this: > `$ make test TEST="./test/jdk/javax/sound/midi/Gervill/SoftFilter/TestProcessAudio.java" TEST_VM_OPTS="-XX:-TieredCompilation"` > > I think this could be a good reference if you want to add some extra tests for the two cases here. Thanks, I'll check it later. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2559281739 From mbaesken at openjdk.org Tue Nov 25 10:13:48 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 25 Nov 2025 10:13:48 GMT Subject: RFR: 8372324: jdk/jfr/event/gc/detailed/TestStringDeduplicationEvent.java#Parallel failed In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 12:49:44 GMT, Francesco Andreuzzi wrote: > The assertion is used to validate a precondition for the test. As long as the deduplication happens inside the `RecordingStream` scope, a `StringDeduplication` event will be recorded. Thus the assertion is not needed and can be removed. No new issues so far, I think you can go ahead and integrate! ------------- PR Comment: https://git.openjdk.org/jdk/pull/28467#issuecomment-3574842911 From azafari at openjdk.org Tue Nov 25 10:14:46 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 25 Nov 2025 10:14:46 GMT Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer [v12] In-Reply-To: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> Message-ID: > The issue happens when the HeapMinBaseAddress option gets 0 as input value. Since this option is used as an address, using 0 in pointer arithmetic is UB. > The fix is using `uintptr_t` instead of `address`/`char*`, etc. In doing that, it is found that an overflow check does not work in all cases due to checking more conditions. That overflow check is changed too. We also need to check overflow after aligning addresses and sizes of memory regions in this context. Assertions are added to check these cases. 
> > Tests: > linux-x64 tier1 Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: better type ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26955/files - new: https://git.openjdk.org/jdk/pull/26955/files/0aae9a42..d1294f6e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26955&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26955&range=10-11 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/26955.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26955/head:pull/26955 PR: https://git.openjdk.org/jdk/pull/26955 From azafari at openjdk.org Tue Nov 25 10:19:38 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 25 Nov 2025 10:19:38 GMT Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer [v13] In-Reply-To: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> Message-ID: > The issue happens when the HeapMinBaseAddress option gets 0 as input value. Since this option is used as an address, using 0 in pointer arithmetic is UB. > The fix is using `uintptr_t` instead of `address`/`char*`, etc. In doing that, it is found that an overflow check does not work in all cases due to checking more conditions. That overflow check is changed too. We also need to check overflow after aligning addresses and sizes of memory regions in this context. Assertions are added to check these cases. > > Tests: > linux-x64 tier1 Afshin Zafari has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 13 additional commits since the last revision: - Merge remote-tracking branch 'origin/master' into _8351334_ubsan_nullptr_add - better type - fix arguments.cpp for HeapMinBaseAddress type. - Merge remote-tracking branch 'origin/master' into _8351334_ubsan_nullptr_add - removed redundant check of overflow. - subtraction for checking overflow - fixed MAX2 template parameter - fixes. - uintptr_t -> uint64_t - fixes - ... and 3 more: https://git.openjdk.org/jdk/compare/489bd199...56f8b1f3 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26955/files - new: https://git.openjdk.org/jdk/pull/26955/files/d1294f6e..56f8b1f3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26955&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26955&range=11-12 Stats: 263566 lines in 2393 files changed: 177469 ins; 47924 del; 38173 mod Patch: https://git.openjdk.org/jdk/pull/26955.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26955/head:pull/26955 PR: https://git.openjdk.org/jdk/pull/26955 From azafari at openjdk.org Tue Nov 25 10:19:40 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 25 Nov 2025 10:19:40 GMT Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer [v11] In-Reply-To: References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> <1p_wEewR-A5FFkJasTnjbE4brFCQIUNp7hmP8WfhV6g=.9ab47c1c-b749-484c-b282-9ad678a06d13@github.com> Message-ID: On Thu, 20 Nov 2025 06:16:18 GMT, Axel Boldt-Christmas wrote: >> Okay but why `size_t` in places and `uintptr_t` in others? In this case `zerobased_max` seems an address not a size - similar to `highest_start` and `lowest_start` in the other part of the change. But then `OopEncodingHeapMax` is `uint64_t` so why not use that? >> >> I'm just not seeing the rules that are being applied here. 
> > Not using `uint64_t` I think was to be pragmatic because it is a different type than `uintptr_t` (on MacOS iirc). One is `unsigned long long` and the other is `unsigned long`. Causes issues with overload resolution for templated functions. (Maybe that was just the issue with the similarly typed `UnscaledOopHeapMax`) > > I think `OopEncodingHeapMax` is unfortunately typed. It might be intentional. Because we use it in two ways. > Either as the `Maximal size of heap`, or as the `zero based address: 0 + OopEncodingHeapMax` (the max end address of the `Maximal size of heap` Heap). In one case the type is more natural to be `size_t` and in the other it is `uintptr_t`. > > Right here though I agree type should be `uintptr_t`. We are using it as the max address our heap can end in. > > I would much rather we had > ```c++ > const uintptr_t zerobased_max = OopEncodingHeapMax; > > > In my opinion `UnscaledOopHeapMax` `OopEncodingHeapMax` should be typed as size_t, better named (to reflect their compressed oop nature and that they relate to the `Maximal size of heap`) and only be available in 64-bit VM (as using these in a 32-bit VM smells buggy). > And when we want to use it as the max end address we put it in a `uintptr_t` typed variable. `const uintptr_t` is used. 
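[Illustration, not part of the original mail.] To make the type discussion in this thread concrete, here is a small sketch of doing the address arithmetic on `uintptr_t` and checking for overflow by subtraction, so that no offset is ever added to a null `char*`. It is not code from the PR: the helper names `end_would_overflow` and `end_address` are made up for illustration, and it assumes `size_t` is no wider than `uintptr_t` on the target platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if base + size would wrap around uintptr_t.
// The check is done by subtraction so the addition itself never overflows.
inline bool end_would_overflow(uintptr_t base, size_t size) {
  return size > UINTPTR_MAX - base;
}

// Hypothetical helper: end address of [base, base + size) as an integer.
// Working on uintptr_t keeps base == 0 well defined, unlike applying a
// non-zero offset to a null char*.
inline uintptr_t end_address(uintptr_t base, size_t size) {
  assert(!end_would_overflow(base, size));
  return base + size;
}
```

A zero base with the offset from the bug title, `end_address(0, 1073741824)`, is then plain unsigned integer arithmetic, and the result can be converted to a pointer only at the point where a real address is needed.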
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26955#discussion_r2559404606 From azafari at openjdk.org Tue Nov 25 10:19:40 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 25 Nov 2025 10:19:40 GMT Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer [v11] In-Reply-To: References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> <1p_wEewR-A5FFkJasTnjbE4brFCQIUNp7hmP8WfhV6g=.9ab47c1c-b749-484c-b282-9ad678a06d13@github.com> Message-ID: On Tue, 25 Nov 2025 10:14:39 GMT, Afshin Zafari wrote: >> Not using `uint64_t` I think was to be pragmatic because it is a different type than `uintptr_t` (on MacOS iirc). One is `unsigned long long` and the other is `unsigned long`. Causes issues with overload resolution for templated functions. (Maybe that was just the issue with the similarly typed `UnscaledOopHeapMax`) >> >> I think `OopEncodingHeapMax` is unfortunately typed. It might be intentional. Because we use it in two ways. >> Either as the `Maximal size of heap`, or as the `zero based address: 0 + OopEncodingHeapMax` (the max end address of the `Maximal size of heap` Heap). In one case the type is more natural to be `size_t` and in the other it is `uintptr_t`. >> >> Right here though I agree type should be `uintptr_t`. We are using it as the max address our heap can end in. >> >> I would much rather we had >> ```c++ >> const uintptr_t zerobased_max = OopEncodingHeapMax; >> >> >> In my opinion `UnscaledOopHeapMax` `OopEncodingHeapMax` should be typed as size_t, better named (to reflect their compressed oop nature and that they relate to the `Maximal size of heap`) and only be available in 64-bit VM (as using these in a 32-bit VM smells buggy). >> And when we want to use it as the max end address we put it in a `uintptr_t` typed variable. > > `const uintptr_t` is used. 
For your suggested changes, I need them to be more precise and explicit to be able to implement them. TIA ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26955#discussion_r2559408192 From goetz at openjdk.org Tue Nov 25 10:21:44 2025 From: goetz at openjdk.org (Goetz Lindenmaier) Date: Tue, 25 Nov 2025 10:21:44 GMT Subject: RFR: 8371626: [linux] use icf=all for linking libraries [v3] In-Reply-To: References: Message-ID: <93M0copXchKBlrEoFUpnBjq9gC2Om_9poLUMB15ECiw=.ee14e2c6-dd31-4e1b-9662-f0d3048d4db7@github.com> On Wed, 12 Nov 2025 15:46:09 GMT, Matthias Baesken wrote: >> Identical code folding can reduce the size of some libs, especially libjvm. However not all linkers support the flag/feature so we have to limit it to some environments. >> See for example this manpage : >> https://manpages.debian.org/testing/lld-7/ld.lld-7.1 >> >> >> sizes of libjvm.so with / without -icf=all >> linux aarch64 : 25888 / 27112 K >> linux x86_64 : 27952 / 29072 K >> >> >> (for most other native libs the identical code folding has no effect, because there is nothing to fold) > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Limit icf to release builds LGTM ------------- Marked as reviewed by goetz (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28236#pullrequestreview-3504348002 From ayang at openjdk.org Tue Nov 25 11:25:03 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 25 Nov 2025 11:25:03 GMT Subject: RFR: 8372324: jdk/jfr/event/gc/detailed/TestStringDeduplicationEvent.java#Parallel failed In-Reply-To: References: Message-ID: <8gCdUiJXkclXAWJK-m6ALc8RCgzV3HE01vJIeYWqYuc=.3ddcf2f6-1853-4dbd-bca5-a2f5a2e2d96c@github.com> On Sat, 22 Nov 2025 12:49:44 GMT, Francesco Andreuzzi wrote: > The assertion is used to validate a precondition for the test. As long as the deduplication happens inside the `RecordingStream` scope, a `StringDeduplication` event will be recorded. 
Thus the assertion is not needed and can be removed. Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28467#pullrequestreview-3504614956 From kevinw at openjdk.org Tue Nov 25 12:07:18 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Tue, 25 Nov 2025 12:07:18 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v8] In-Reply-To: References: Message-ID: On Thu, 13 Nov 2025 19:55:24 GMT, Mat Carter wrote: >> Add jdk.management.AOTCacheMXBean. The interface provides a single action that when called will cause any hosted JVM currently recording AOT information to stop recording. Existing functionality is preserved: when stopped the JVM will create the required artifacts based on the execution mode. Conveniently as the application running on the JVM has not stopped (as was previously the only way to stop recording), the application will resume execution after the artifacts have been generated. >> >> The interface will return TRUE if a recording was successfully stopped, in all other cases (not recording etc.) 
will return FALSE >> >> It follows that invoking the action on a JVM that is recording, twice in succession, should (barring internal errors) produce the following two responses: >> >> TRUE >> FALSE >> >> Passes tier1 on linux (x64) and windows (x64) > > Mat Carter has updated the pull request incrementally with one additional commit since the last revision: > > Remove single whitespace src/jdk.management/share/classes/com/sun/management/internal/HotSpotAOTCacheImpl.java line 48: > 46: } > 47: > 48: public boolean endRecording(){ trivial missing space nit (){ ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28010#discussion_r2559745591 From kevinw at openjdk.org Tue Nov 25 12:13:41 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Tue, 25 Nov 2025 12:13:41 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v8] In-Reply-To: References: Message-ID: On Thu, 13 Nov 2025 19:55:24 GMT, Mat Carter wrote: >> Add jdk.management.AOTCacheMXBean. The interface provides a single action that when called will cause any hosted JVM currently recording AOT information to stop recording. Existing functionality is preserved: when stopped the JVM will create the required artifacts based on the execution mode. Conveniently as the application running on the JVM has not stopped (as was previously the only way to stop recording), the application will resume execution after the artifacts have been generated. >> >> The interface will return TRUE if a recording was successfully stopped, in all other cases (not recording etc.) 
will return FALSE >> >> It follows that invoking the action on a JVM that is recording, twice in succession, should (barring internal errors) produce the following two responses: >> >> TRUE >> FALSE >> >> Passes tier1 on linux (x64) and windows (x64) > > Mat Carter has updated the pull request incrementally with one additional commit since the last revision: > > Remove single whitespace src/jdk.management/share/classes/com/sun/management/internal/PlatformMBeanProviderImpl.java line 192: > 190: HotSpotAOTCacheMXBean impl = this.impl; > 191: if (impl == null) { > 192: this.impl = impl = new HotSpotAOTCacheImpl(ManagementFactoryHelper.getVMManagement()); This assignment is unusual. Are we trying to avoid a synchronized block? Other nameToMBeanMap() methods are like: return Collections.singletonMap(ManagementFactory.MEMORY_MXBEAN_NAME, ManagementFactoryHelper.getMemoryMXBean()); ..where the ManagementFactoryHelper.getMemoryMXBean() method is synchronized and creates the impl if needed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28010#discussion_r2559762899 From jwaters at openjdk.org Tue Nov 25 12:20:12 2025 From: jwaters at openjdk.org (Julian Waters) Date: Tue, 25 Nov 2025 12:20:12 GMT Subject: RFR: 8345265: Minor improvements for LTO across all compilers [v3] In-Reply-To: References: Message-ID: > This is a general cleanup and improvement of LTO, as well as a quick fix to remove a workaround in the Makefiles that disabled LTO for g1ParScanThreadState.cpp due to the old poisoning mechanism causing trouble. The -Wno-attribute-warning change here can be removed once Kim's new poisoning solution is integrated. > > - -fno-omit-frame-pointer is added to gcc to stop the linker from emitting code without the frame pointer > - -flto is set to $(JOBS) instead of auto to better match what the user requested > - -Gy is passed to the Microsoft compiler. 
This does not fully fix LTO under Microsoft, but prevents warnings about -LTCG:INCREMENTAL at least Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: - Revert recent changes to ClientLibraries.gmk - Revert recent changes to CompileJvm.gmk - Revert recent changes to Flags.gmk - Revert recent changes to NativeCompilation.gmk - Revert recent changes to spec.gmk.template - Revert recent changes to flags-ldflags.m4 - Revert recent changes to flags-cflags.m4 - Remove no longer needed warning disable in JvmFeatures.gmk - Merge branch 'master' into patch-16 - Merge branch 'openjdk:master' into patch-16 - ... and 11 more: https://git.openjdk.org/jdk/compare/34d6cc0d...9599d09e ------------- Changes: https://git.openjdk.org/jdk/pull/22464/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22464&range=02 Stats: 53 lines in 9 files changed: 11 ins; 38 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/22464.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22464/head:pull/22464 PR: https://git.openjdk.org/jdk/pull/22464 From alanb at openjdk.org Tue Nov 25 12:26:32 2025 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 25 Nov 2025 12:26:32 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v8] In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 12:10:02 GMT, Kevin Walls wrote: >> Mat Carter has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove single whitespace > > src/jdk.management/share/classes/com/sun/management/internal/PlatformMBeanProviderImpl.java line 192: > >> 190: HotSpotAOTCacheMXBean impl = this.impl; >> 191: if (impl == null) { >> 192: this.impl = impl = new HotSpotAOTCacheImpl(ManagementFactoryHelper.getVMManagement()); > > This assignment is unusual. Are we trying to avoid a synchronized block? 
Other nameToMBeanMap() methods are like: > return Collections.singletonMap(ManagementFactory.MEMORY_MXBEAN_NAME, ManagementFactoryHelper.getMemoryMXBean()); > > ..where the ManagementFactoryHelper.getMemoryMXBean() method is synchronized and creates the impl if needed. I don't see a correctness issue with this. Maybe in the future we will be able to use LazyConstant here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28010#discussion_r2559800006 From aph at openjdk.org Tue Nov 25 12:32:51 2025 From: aph at openjdk.org (Andrew Haley) Date: Tue, 25 Nov 2025 12:32:51 GMT Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub() In-Reply-To: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> Message-ID: On Tue, 5 Aug 2025 10:30:13 GMT, Fei Gao wrote: > In the existing implementation, the static call stub typically emits a sequence like: > `isb; movk; movz; movz; movk; movz; movz; br`. > > This patch reimplements it using a more compact and patch-friendly sequence: > > ldr x12, Label_data > ldr x8, Label_entry > br x8 > Label_data: > 0x00000000 > 0x00000000 > Label_entry: > 0x00000000 > 0x00000000 > > The new approach places the target addresses adjacent to the code and loads them dynamically. This allows us to update the call target by modifying only the data in memory, without changing any instructions. This avoids the need for I-cache flushes or issuing an `isb`[1], which are both relatively expensive operations. > > While emitting direct branches in static stubs for small code caches can save 2 instructions compared to the new implementation, modifying those branches still requires I-cache flushes or an `isb`. This patch unifies the code generation by emitting the same static stubs for both small and large code caches. 
> > A microbenchmark (StaticCallStub.java) demonstrates a performance uplift of approximately 43%. > > > Benchmark (length) Mode Cnt Master Patch Units > StaticCallStubFar.callCompiled 1000 avgt 5 39.346 22.474 us/op > StaticCallStubFar.callCompiled 10000 avgt 5 390.05 218.478 us/op > StaticCallStubFar.callCompiled 100000 avgt 5 3869.264 2174.001 us/op > StaticCallStubNear.callCompiled 1000 avgt 5 39.093 22.582 us/op > StaticCallStubNear.callCompiled 10000 avgt 5 387.319 217.398 us/op > StaticCallStubNear.callCompiled 100000 avgt 5 3855.825 2206.923 us/op > > > All tests in Tier1 to Tier3, under both release and debug builds, have passed. > > [1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-working-with-threads This one has gone very quiet. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26638#issuecomment-3575416105 From roland at openjdk.org Tue Nov 25 12:52:35 2025 From: roland at openjdk.org (Roland Westrelin) Date: Tue, 25 Nov 2025 12:52:35 GMT Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v4] In-Reply-To: References: Message-ID: <6qShqR-Ohv7vamoJ_B4Ev-poU8SB96eTBo4HFJrylcI=.dac5a26f-c9f0-445b-8f1c-a7c719fa27ae@github.com> > This is a variant of 8332827. In 8332827, an array access becomes > dependent on a range check `CastII` for another array access. When, > after loop opts are over, that RC `CastII` was removed, the array > access could float and an out of bound access happened. With the fix > for 8332827, RC `CastII`s are no longer removed. > > With this one what happens is that some transformations applied after > loop opts are over widen the type of the RC `CastII`. As a result, the > type of the RC `CastII` is no longer narrower than that of its input, > the `CastII` is removed and the dependency is lost. 
> > There are 2 transformations that cause this to happen: > > - after loop opts are over, the types of the `CastII` nodes are widened > so nodes that have the same inputs but a slightly different type can > common. > > - When pushing a `CastII` through an `Add`, if the types of both inputs > of the `Add` are non-constant, then we end up widening the type > (the resulting `Add` has a type that's wider than that of the > initial `CastII`). > > There are already 3 types of `Cast` nodes depending on the > optimizations that are allowed. Either the `Cast` is floating > (`depends_only_on_test()` returns `true`) or pinned. Either the `Cast` > can be removed if it no longer narrows the type of its input or > not. We already have variants of the `CastII`: > > - if the Cast can float and be removed when it doesn't narrow the type > of its input. > > - if the Cast is pinned and can be removed when it doesn't narrow the type > of its input. > > - if the Cast is pinned and can't be removed when it doesn't narrow > the type of its input. > > What we need here, I think, is the 4th combination: > > - if the Cast can float and can't be removed when it doesn't narrow > the type of its input. > > Anyway, things are becoming confusing with all these different > variants named in ways that don't always help figure out what > constraints each of them operates under. So I refactored this and that's > the biggest part of this change. The fix consists in marking `Cast` > nodes when their type is widened in a way that prevents them from being > optimized out. > > Tobias ran performance testing with a slightly different version of > this change and there was no regression. Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains nine commits: - review - review - Merge branch 'master' into JDK-8354282 - review - infinite loop in gvn fix - renaming - merge - Merge branch 'master' into JDK-8354282 - fix & test ------------- Changes: https://git.openjdk.org/jdk/pull/24575/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24575&range=03 Stats: 353 lines in 13 files changed: 252 ins; 27 del; 74 mod Patch: https://git.openjdk.org/jdk/pull/24575.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24575/head:pull/24575 PR: https://git.openjdk.org/jdk/pull/24575 From mbaesken at openjdk.org Tue Nov 25 13:00:32 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 25 Nov 2025 13:00:32 GMT Subject: RFR: 8371626: [linux] use icf=all for linking libraries [v3] In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 15:46:09 GMT, Matthias Baesken wrote: >> Identical code folding can reduce the size of some libs, especially libjvm. However not all linkers support the flag/feature so we have to limit it to some environments. >> See for example this manpage : >> https://manpages.debian.org/testing/lld-7/ld.lld-7.1 >> >> >> sizes of libjvm.so with / without -icf=all >> linux aarch64 : 25888 / 27112 K >> linux x86_64 : 27952 / 29072 K >> >> >> (for most other native libs the identical code folding has no effect, because there is nothing to fold) > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Limit icf to release builds Thanks for the review! Maybe someone from build-dev should review too?
> (for most other native libs the identical code folding has no effect, because there is nothing to fold) We could limit the usage of the flag to libjvm, where it shows an effect; this would be similar to Windows, where we also set a similar flag only for the JVM lib `BASIC_LDFLAGS_JVM_ONLY="-opt:icf,8 -subsystem:windows"` ------------- PR Comment: https://git.openjdk.org/jdk/pull/28236#issuecomment-3575519268 From aph at openjdk.org Tue Nov 25 13:08:47 2025 From: aph at openjdk.org (Andrew Haley) Date: Tue, 25 Nov 2025 13:08:47 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v3] In-Reply-To: References: Message-ID: <-cnMy4YHNCrKRqt_2Kkh9ksi-qE8ndZLB5yoyKkS3gM=.3f328f98-15a2-4736-9a6c-f9ab0705b830@github.com> On Mon, 24 Nov 2025 20:56:32 GMT, Erik Österlund wrote: > It's a bit surprising to me if they invalidate all TLB entries, effectively ripping out the entire virtual address space, even when a range is passed in. If so, "Because the cache-maintenance wasn't needed, we can do the TLBI instead. In fact, the I-Cache line-size isn't relevant anymore, we can reduce the number of traps by producing a fake value. "For user-space, the kernel's work is now to trap CTR_EL0 to hide DIC, and produce a fake IminLine. EL3 traps the now-necessary I-Cache maintenance and performs the inner-shareable-TLBI that makes everything better." My interpretation of this is that we only need to do the synchronization dance once, at the end of the patching. But I guess we don't know exactly if we have an affected core or if the kernel workaround is in action.
------------- PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3575547686 From mbaesken at openjdk.org Tue Nov 25 13:12:06 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 25 Nov 2025 13:12:06 GMT Subject: RFR: 8371893: [macOS aarch64] use dead_strip linker option to reduce binary size [v5] In-Reply-To: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com> References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com> Message-ID: > The dead_strip linker option on macOS removes functions and data that are unreachable by the entry point or exported symbols. > Setting it can reduce the size of some binaries we generate quite a lot, for example (product build, Xcode 15 is used) : > (before -> after setting the option) > > 1.4M -> 1.1M images/jdk/lib/libfontmanager.dylib > 264K -> 248K images/jdk/lib/libjavajpeg.dylib > 152K -> 132K images/jdk/lib/libjli.dylib > 388K -> 296K images/jdk/lib/liblcms.dylib > 164K -> 128K images/jdk/lib/libzip.dylib > > > and libjvm : > > 20M -> 18M images/jdk/lib/server/libjvm.dylib > 146M -> 137M images/jdk/lib/server/libjvm.dylib.dSYM Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Use dead_strip on macOS arrch64 AND x86_64 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28319/files - new: https://git.openjdk.org/jdk/pull/28319/files/b63b9ca8..c158747f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28319&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28319&range=03-04 Stats: 4 lines in 1 file changed: 0 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28319.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28319/head:pull/28319 PR: https://git.openjdk.org/jdk/pull/28319 From mbaesken at openjdk.org Tue Nov 25 13:12:08 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 25 Nov 2025 13:12:08 GMT 
Subject: RFR: 8371893: [macOS aarch64] use dead_strip linker option to reduce binary size [v4] In-Reply-To: References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com> Message-ID: <5_ovGdYHonxBCe-7ftDT2kErYBHe-8HPv0cwme_2Mj8=.1479051a-0fa5-455f-89c3-1312f2d16017@github.com> On Mon, 24 Nov 2025 14:53:33 GMT, Matthias Baesken wrote: >> The dead_strip linker option on macOS removes functions and data that are unreachable by the entry point or exported symbols. >> Setting it can reduce the size of some binaries we generate quite a lot, for example (product build, Xcode 15 is used) : >> (before -> after setting the option) >> >> 1.4M -> 1.1M images/jdk/lib/libfontmanager.dylib >> 264K -> 248K images/jdk/lib/libjavajpeg.dylib >> 152K -> 132K images/jdk/lib/libjli.dylib >> 388K -> 296K images/jdk/lib/liblcms.dylib >> 164K -> 128K images/jdk/lib/libzip.dylib >> >> >> and libjvm : >> >> 20M -> 18M images/jdk/lib/server/libjvm.dylib >> 146M -> 137M images/jdk/lib/server/libjvm.dylib.dSYM > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Set the -dead_strip linker option only for the JDK libs Now we only set the flag for the JDK native libs, so I enabled it also for macOS x86_64 . ------------- PR Comment: https://git.openjdk.org/jdk/pull/28319#issuecomment-3575562810 From aph at openjdk.org Tue Nov 25 13:31:28 2025 From: aph at openjdk.org (Andrew Haley) Date: Tue, 25 Nov 2025 13:31:28 GMT Subject: RFR: 8365991: AArch64: Ignore BlockZeroingLowLimit when UseBlockZeroing is false [v7] In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 07:40:19 GMT, Patrick Zhang wrote: > I wondered why `MacroAssembler::zero_words` uses 16 words to do `stp` unrolling, while `generate_zero_blocks()` 8 words (`const int MacroAssembler::zero_words_block_size = 8;`), so defined this variable to compare `8 vs 16` but did not find obvious performance difference. 
> > Regarding the var name `block_size`, could `unroll` or `unroll_words` be better? What's wrong with 16? I'm asking not from a "my teachers said always name constants" point of view, but from a reader's understanding point of view. Named constants are all well and good if the constant has some meaning, but this one is just two words. Perhaps `2 * WordSize` would do. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26917#discussion_r2560000825 From jsjolen at openjdk.org Tue Nov 25 13:43:31 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 25 Nov 2025 13:43:31 GMT Subject: RFR: 8367656: Refactor Constantpool's operand array into two [v18] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 08:36:02 GMT, Johan Sjölen wrote: >> Hi, >> >> This is a refactoring of the way that we store the Bootstrap method attribute in the ConstantPool class. We used to have a single `Array` which was divided into a section of `u4` offsets and a section which was the actual data. In this refactoring we make this split more clear, by actually allocating an `Array` to store the offsets in and an `Array` to store the data in. These arrays are then put into a `BSMAttributeEntries` class, which allows us to separate out the API from that of the rest of the `ConstantPool`. >> >> We had multiple instances of the code knowing the layout of the operands array and using this to do 'clever' ways of copying and inserting data into it. See `ConstantPool::copy_operands` and `ConstantPool::resize_operands`. I felt like we could do things in a simpler way, so I added the `start_/end_extension` protocol and added the `InsertionIterator` for this. See `ClassFileParser::parse_classfile_bootstrap_methods_attribute` for how this works. I put several relevant definitions into the inline file in hopes of encouraging the compiler to optimize these appropriately. >> >> For the Java SA code, I had to add a `U4Array` class.
I also had to fix the vmstructs definitions, etc. >> >> On the whole, while this code is a bit less terse, I think it's a good API improvement and the underlying implementation of splitting up the operands array is also an improvement. >> >> Testing: Oracle Tier1-Tier5 has been run successfully multiple times. Before integration, I will merge with master and run these tiers again. > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > IDE doesn't help you with VM structs! Thank you all for your reviews :-). ------------- PR Comment: https://git.openjdk.org/jdk/pull/27198#issuecomment-3575718289 From jsjolen at openjdk.org Tue Nov 25 13:46:41 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 25 Nov 2025 13:46:41 GMT Subject: Integrated: 8367656: Refactor Constantpool's operand array into two In-Reply-To: References: Message-ID: On Wed, 10 Sep 2025 16:19:17 GMT, Johan Sjölen wrote: > Hi, > > This is a refactoring of the way that we store the Bootstrap method attribute in the ConstantPool class. We used to have a single `Array` which was divided into a section of `u4` offsets and a section which was the actual data. In this refactoring we make this split more clear, by actually allocating an `Array` to store the offsets in and an `Array` to store the data in. These arrays are then put into a `BSMAttributeEntries` class, which allows us to separate out the API from that of the rest of the `ConstantPool`. > > We had multiple instances of the code knowing the layout of the operands array and using this to do 'clever' ways of copying and inserting data into it. See `ConstantPool::copy_operands` and `ConstantPool::resize_operands`. I felt like we could do things in a simpler way, so I added the `start_/end_extension` protocol and added the `InsertionIterator` for this. See `ClassFileParser::parse_classfile_bootstrap_methods_attribute` for how this works.
I put several relevant definitions into the inline file in hopes of encouraging the compiler to optimize these appropriately. > > For the Java SA code, I had to add a `U4Array` class. I also had to fix the vmstructs definitions, etc. > > On the whole, while this code is a bit less terse, I think it's a good API improvement and the underlying implementation of splitting up the operands array is also an improvement. > > Testing: Oracle Tier1-Tier5 has been run successfully multiple times. Before integration, I will merge with master and run these tiers again. This pull request has now been integrated. Changeset: d94c52cc Author: Johan Sjölen URL: https://git.openjdk.org/jdk/commit/d94c52ccf2fed3fc66d25a34254c9b581c175fa1 Stats: 895 lines in 12 files changed: 436 ins; 282 del; 177 mod 8367656: Refactor Constantpool's operand array into two Reviewed-by: coleenp, sspitsyn, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/27198 From aboldtch at openjdk.org Tue Nov 25 13:53:39 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 25 Nov 2025 13:53:39 GMT Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer [v13] In-Reply-To: References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> Message-ID: <74oLeX1HPHJRDQAkhU6RcgEdMGm0GIZVHg9qN5zYFJU=.e398fead-e46e-4486-98f5-f351cb748452@github.com> On Tue, 25 Nov 2025 10:19:38 GMT, Afshin Zafari wrote: >> The issue happens when the HeapMinBaseAddress option gets 0 as input value. Since this option is used as an address, using 0 in pointer arithmetic is UB. >> The fix is using `uintptr_t` instead of `address`/`char*`, etc. In doing that, it was found that an overflow check does not work in all cases due to checking more conditions. That overflow check is changed too. We also need to check overflow after aligning addresses and sizes of memory regions in this context.
Assertions are added to check these cases. >> >> Tests: >> linux-x64 tier1 > > Afshin Zafari has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 13 additional commits since the last revision: > > - Merge remote-tracking branch 'origin/master' into _8351334_ubsan_nullptr_add > - better type > - fix arguments.cpp for HeapMinBaseAddress type. > - Merge remote-tracking branch 'origin/master' into _8351334_ubsan_nullptr_add > - removed redundant check of overflow. > - subtraction for checking overflow > - fixed MAX2 template parameter > - fixes. > - uintptr_t -> uint64_t > - fixes > - ... and 3 more: https://git.openjdk.org/jdk/compare/fd5569b7...56f8b1f3 Marked as reviewed by aboldtch (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/26955#pullrequestreview-3505172767 From erikj at openjdk.org Tue Nov 25 14:00:45 2025 From: erikj at openjdk.org (Erik Joelsson) Date: Tue, 25 Nov 2025 14:00:45 GMT Subject: RFR: 8371626: [linux] use icf=all for linking libraries [v3] In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 15:46:09 GMT, Matthias Baesken wrote: >> Identical code folding can reduce the size of some libs, especially libjvm. However not all linkers support the flag/feature so we have to limit it to some environments. >> See for example this manpage : >> https://manpages.debian.org/testing/lld-7/ld.lld-7.1 >> >> >> sizes of libjvm.so with / without -icf=all >> linux aarch64 : 25888 / 27112 K >> linux x86_64 : 27952 / 29072 K >> >> >> (for most other native libs the identical code folding has no effect, because there is nothing to fold) > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Limit icf to release builds If hotspot is ok with this, then it looks fine to me. ------------- Marked as reviewed by erikj (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28236#pullrequestreview-3505209965 From erikj at openjdk.org Tue Nov 25 14:03:31 2025 From: erikj at openjdk.org (Erik Joelsson) Date: Tue, 25 Nov 2025 14:03:31 GMT Subject: RFR: 8371893: [macOS aarch64] use dead_strip linker option to reduce binary size [v5] In-Reply-To: References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com> Message-ID: On Tue, 25 Nov 2025 13:12:06 GMT, Matthias Baesken wrote: >> The dead_strip linker option on macOS removes functions and data that are unreachable by the entry point or exported symbols. >> Setting it can reduce the size of some binaries we generate quite a lot, for example (product build, Xcode 15 is used) : >> (before -> after setting the option) >> >> 1.4M -> 1.1M images/jdk/lib/libfontmanager.dylib >> 264K -> 248K images/jdk/lib/libjavajpeg.dylib >> 152K -> 132K images/jdk/lib/libjli.dylib >> 388K -> 296K images/jdk/lib/liblcms.dylib >> 164K -> 128K images/jdk/lib/libzip.dylib >> >> >> and libjvm : >> >> 20M -> 18M images/jdk/lib/server/libjvm.dylib >> 146M -> 137M images/jdk/lib/server/libjvm.dylib.dSYM > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Use dead_strip on macOS arrch64 AND x86_64 Marked as reviewed by erikj (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28319#pullrequestreview-3505215274 From stefank at openjdk.org Tue Nov 25 14:29:55 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 25 Nov 2025 14:29:55 GMT Subject: RFR: 8372513: Shenandoah: ShenandoahMaxRegionSize can produce an unaligned heap alignment Message-ID: While rewriting some of the heap size initialization code we hit a corner-case where the setting of `ShenandoahMaxRegionSize` to something that isn't a power-of-2 will hit an assert in `max_heap_for_compressed_oops`. 
When running with: java -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahMaxRegionSize=33m -version The following code: size_t displacement_due_to_null_page = align_up(os::vm_page_size(), _conservative_max_heap_alignment) triggers: # assert(is_power_of_2(alignment)) failed: must be a power of 2: 34603008 because `_conservative_max_heap_alignment` is not a power-of-2. This happens because Shenandoah's `conservative_max_heap_alignment()` returns a potentially unaligned `ShenandoahMaxRegionSize` value. size_t ShenandoahArguments::conservative_max_heap_alignment() { size_t align = ShenandoahMaxRegionSize; if (UseLargePages) { align = MAX2(align, os::large_page_size()); } return align; } I propose a small fix to adjust `align` to be a power-of-2. I've also added an earlier assert about this in `set_conservative_max_heap_alignment` and added an additional test-case in TestRegionSizeArgs.java WDYT, is this an OK fix for this corner-case? ------------- Commit messages: - 8372513: Shenandoah: ShenandoahMaxRegionSize can produce an unaligned heap alignment Changes: https://git.openjdk.org/jdk/pull/28492/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28492&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372513 Stats: 14 lines in 3 files changed: 13 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28492.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28492/head:pull/28492 PR: https://git.openjdk.org/jdk/pull/28492 From cnorrbin at openjdk.org Tue Nov 25 14:40:48 2025 From: cnorrbin at openjdk.org (Casper Norrbin) Date: Tue, 25 Nov 2025 14:40:48 GMT Subject: RFR: 8253683: Clean up and clarify uses of os::vm_allocation_granularity Message-ID: Hi everyone, `os::vm_allocation_granularity()` is meant to describe the alignment restrictions of the operating system when we reserve memory. That is 64 KiB on Windows (`VirtualAlloc`) and 256 MiB on AIX (with `shmat`). On every other platform it happens to match the page size. 
The page size (available via `os::vm_page_size()`) is what matters when we later commit or protect the reserved pages. Because the functions are poorly documented and the two numbers are identical on most systems, they have gradually been used more and more interchangeably. We now have many code paths that round **sizes** up to `os::vm_allocation_granularity()` or assert that a size is a multiple of it. That is wrong. Only addresses need that alignment, sizes merely have to be page-aligned. Places that round sizes should instead use `os::vm_page_size()` as they are unrelated to attach alignment. For this change I have gone over the call sites of `os::vm_allocation_granularity()` and where it was being used to pad or sanity-check a size I have instead replaced it with `os::vm_page_size()`. The calls that genuinely deal with an attach address are left untouched. Testing: - Oracle tiers 1-8 ------------- Commit messages: - Changed allocation_granularity for page_size for aligning sizes Changes: https://git.openjdk.org/jdk/pull/28493/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28493&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8253683 Stats: 58 lines in 17 files changed: 3 ins; 4 del; 51 mod Patch: https://git.openjdk.org/jdk/pull/28493.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28493/head:pull/28493 PR: https://git.openjdk.org/jdk/pull/28493 From jsikstro at openjdk.org Tue Nov 25 15:11:41 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Tue, 25 Nov 2025 15:11:41 GMT Subject: RFR: 8372513: Shenandoah: ShenandoahMaxRegionSize can produce an unaligned heap alignment In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 14:19:53 GMT, Stefan Karlsson wrote: > While rewriting some of the heap size initialization code we hit a corner-case where the setting of `ShenandoahMaxRegionSize` to something that isn't a power-of-2 will hit an assert in `max_heap_for_compressed_oops`.
> > When running with: > > java -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahMaxRegionSize=33m -version > > > The following code: > > size_t displacement_due_to_null_page = align_up(os::vm_page_size(), > _conservative_max_heap_alignment) > > triggers: > > # assert(is_power_of_2(alignment)) failed: must be a power of 2: 34603008 > > because `_conservative_max_heap_alignment` is not a power-of-2. > > This happens because Shenandoah's `conservative_max_heap_alignment()` > returns a potentially unaligned `ShenandoahMaxRegionSize` value. > > > size_t ShenandoahArguments::conservative_max_heap_alignment() { > size_t align = ShenandoahMaxRegionSize; > if (UseLargePages) { > align = MAX2(align, os::large_page_size()); > } > return align; > } > > > I propose a small fix to adjust `align` to be a power-of-2. I've also added an earlier assert about this in `set_conservative_max_heap_alignment` and added an additional test-case in TestRegionSizeArgs.java > > WDYT, is this an OK fix for this corner-case? I think this looks good. ------------- Marked as reviewed by jsikstro (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28492#pullrequestreview-3505546339 From fandreuzzi at openjdk.org Tue Nov 25 15:22:07 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Tue, 25 Nov 2025 15:22:07 GMT Subject: RFR: 8372324: jdk/jfr/event/gc/detailed/TestStringDeduplicationEvent.java#Parallel failed In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 10:11:45 GMT, Matthias Baesken wrote: >> The assertion is used to validate a precondition for the test. As long as the deduplication happens inside the `RecordingStream` scope, a `StringDeduplication ` event will be recorded. Thus the assertion is not needed and can be removed. > > No new issues so far, I think you can go ahead and integrate ! Thank you for the review @MBaesken and @albertnetymk ! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28467#issuecomment-3576186925 From duke at openjdk.org Tue Nov 25 15:22:09 2025 From: duke at openjdk.org (duke) Date: Tue, 25 Nov 2025 15:22:09 GMT Subject: RFR: 8372324: jdk/jfr/event/gc/detailed/TestStringDeduplicationEvent.java#Parallel failed In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 12:49:44 GMT, Francesco Andreuzzi wrote: > The assertion is used to validate a precondition for the test. As long as the deduplication happens inside the `RecordingStream` scope, a `StringDeduplication` event will be recorded. Thus the assertion is not needed and can be removed. @fandreuz Your change (at version bd6d8ff582f23c9a68b0d241821eae817f738d4a) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28467#issuecomment-3576194657 From eosterlund at openjdk.org Tue Nov 25 15:34:31 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 25 Nov 2025 15:34:31 GMT Subject: RFR: 8372513: Shenandoah: ShenandoahMaxRegionSize can produce an unaligned heap alignment In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 14:19:53 GMT, Stefan Karlsson wrote: > While rewriting some of the heap size initialization code we hit a corner-case where the setting of `ShenandoahMaxRegionSize` to something that isn't a power-of-2 will hit an assert in `max_heap_for_compressed_oops`. > > When running with: > > java -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahMaxRegionSize=33m -version > > > The following code: > > size_t displacement_due_to_null_page = align_up(os::vm_page_size(), > _conservative_max_heap_alignment) > > triggers: > > # assert(is_power_of_2(alignment)) failed: must be a power of 2: 34603008 > > because `_conservative_max_heap_alignment` is not a power-of-2. > > This happens because Shenandoah's `conservative_max_heap_alignment()` > returns a potentially unaligned `ShenandoahMaxRegionSize` value.
> > > size_t ShenandoahArguments::conservative_max_heap_alignment() { > size_t align = ShenandoahMaxRegionSize; > if (UseLargePages) { > align = MAX2(align, os::large_page_size()); > } > return align; > } > > > I propose a small fix to adjust `align` to be a power-of-2. I've also added an earlier assert about this in `set_conservative_max_heap_alignment` and added an additional test-case in TestRegionSizeArgs.java > > WDYT, is this an OK fix for this corner-case? Marked as reviewed by eosterlund (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28492#pullrequestreview-3505661687 From stefank at openjdk.org Tue Nov 25 15:51:25 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 25 Nov 2025 15:51:25 GMT Subject: RFR: 8372513: Shenandoah: ShenandoahMaxRegionSize can produce an unaligned heap alignment In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 14:19:53 GMT, Stefan Karlsson wrote: > While rewriting some of the heap size initialization code we hit a corner-case where the setting of `ShenandoahMaxRegionSize` to something that isn't a power-of-2 will hit an assert in `max_heap_for_compressed_oops`. > > When running with: > > java -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahMaxRegionSize=33m -version > > > The following code: > > size_t displacement_due_to_null_page = align_up(os::vm_page_size(), > _conservative_max_heap_alignment) > > triggers: > > # assert(is_power_of_2(alignment)) failed: must be a power of 2: 34603008 > > because `_conservative_max_heap_alignment` is not a power-of-2. > > This happens because Shenandoah's `conservative_max_heap_alignment()` > returns a potentially unaligned `ShenandoahMaxRegionSize` value. > > > size_t ShenandoahArguments::conservative_max_heap_alignment() { > size_t align = ShenandoahMaxRegionSize; > if (UseLargePages) { > align = MAX2(align, os::large_page_size()); > } > return align; > } > > > I propose a small fix to adjust `align` to be a power-of-2. 
I've also added an earlier assert about this in `set_conservative_max_heap_alignment` and added an additional test-case in TestRegionSizeArgs.java > > WDYT, is this an OK fix for this corner-case? I realized that having an unconstrained flag like this can cause overflows and asserts because of that. If we want to fix that we could do something like the following: https://github.com/stefank/jdk/compare/8372513_shenandoah_max_region_size...stefank:jdk:8372513_shenandoah_max_region_size_constraints Then running java -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahMaxRegionSize=9223372036854775808 -version Will give this error message instead of an assert: $ java -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahMaxRegionSize=9223372036854775808 -version ShenandoahMaxRegionSize 8589934592G should be lower than (8589934592G). Improperly specified VM option 'ShenandoahMaxRegionSize=9223372036854775808' Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28492#issuecomment-3576326172 From shade at openjdk.org Tue Nov 25 15:53:31 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 25 Nov 2025 15:53:31 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v4] In-Reply-To: References: Message-ID: <4jSftGzOMv1vhmjj010ggpWeOaiPTUBnbQulEGBd2Fs=.efccffa3-5f83-49fa-8afd-3f296a5c07cd@github.com> On Fri, 21 Nov 2025 18:29:57 GMT, Vladimir Kozlov wrote: >> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 10 additional commits since the last revision: >> >> - Adjust label name >> - Merge branch 'master' into JDK-8372285-g1-barrier-micro >> - Make some backward branches explicitly short >> - Comment >> - Shorten a few more branches >> - Also reflow generate_pre_barrier_slow_path, so it is obvious the branches are short >> - More touchups >> - Also optimize queue insertion >> - Touchups >> - WIP > > Comments. @vnkozlov, are you happy with this version? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28446#issuecomment-3576335343 From shade at openjdk.org Tue Nov 25 15:57:39 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 25 Nov 2025 15:57:39 GMT Subject: RFR: 8372513: Shenandoah: ShenandoahMaxRegionSize can produce an unaligned heap alignment In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 14:19:53 GMT, Stefan Karlsson wrote: > While rewriting some of the heap size initialization code we hit a corner-case where the setting of `ShenandoahMaxRegionSize` to something that isn't a power-of-2 will hit an assert in `max_heap_for_compressed_oops`. > > When running with: > > java -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahMaxRegionSize=33m -version > > > The following code: > > size_t displacement_due_to_null_page = align_up(os::vm_page_size(), > _conservative_max_heap_alignment) > > triggers: > > # assert(is_power_of_2(alignment)) failed: must be a power of 2: 34603008 > > because `_conservative_max_heap_alignment` is not a power-of-2. > > This happens because Shenandoah's `conservative_max_heap_alignment()` > returns a potentially unaligned `ShenandoahMaxRegionSize` value. > > > size_t ShenandoahArguments::conservative_max_heap_alignment() { > size_t align = ShenandoahMaxRegionSize; > if (UseLargePages) { > align = MAX2(align, os::large_page_size()); > } > return align; > } > > > I propose a small fix to adjust `align` to be a power-of-2. 
I've also added an earlier assert about this in `set_conservative_max_heap_alignment` and added an additional test-case in TestRegionSizeArgs.java > > WDYT, is this an OK fix for this corner-case? Yes, this is the correct fix. Thanks for handling this! ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28492#pullrequestreview-3505760160 From duke at openjdk.org Tue Nov 25 16:33:46 2025 From: duke at openjdk.org (Zihao Lin) Date: Tue, 25 Nov 2025 16:33:46 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v8] In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 13:12:59 GMT, Roland Westrelin wrote: >> Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains nine additional commits since the last revision: >> >> - fix assert >> - add more assert >> - rid of access.addr().type() >> - Merge branch 'openjdk:master' into 8344116 >> - Merge branch 'openjdk:master' into 8344116 >> - Merge branch 'openjdk:master' into 8344116 >> - Fix build >> - Fix test failed >> - 8344116: C2: remove slice parameter from LoadNode::make > > src/hotspot/share/opto/callnode.cpp line 1740: > >> 1738: Node* klass_node = in(AllocateNode::KlassNode); >> 1739: Node* proto_adr = phase->transform(new AddPNode(klass_node, klass_node, phase->MakeConX(in_bytes(Klass::prototype_header_offset())))); >> 1740: mark_node = LoadNode::make(*phase, control, mem, proto_adr, TypeX_X, TypeX_X->basic_type(), MemNode::unordered); > > We could assert that C->get_alias_index(kit->type(card_adr) == Compile::AliasIdxRaw I give it a try, but it won't pass the test. Is it possible the original version is wrong? The class mark will not be `TypeRawPtr::BOTTOM`, it should equal to Klass slice index. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2560657848 From eastigeevich at openjdk.org Tue Nov 25 16:38:30 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 25 Nov 2025 16:38:30 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v4] In-Reply-To: References: Message-ID: <5WnunfPY0pMpXLI5rM4Sz321j-ZQU1Hw_L1UDdZwXRo=.b20e8695-0243-408f-886e-c848d583c60a@github.com> > Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1. > > Neoverse-N1 implementations mitigate erratum 1542419 with a workaround: > - Disable coherent icache. > - Trap IC IVAU instructions. > - Execute: > - `tlbi vae3is, xzr` > - `dsb sy` > > `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address). It waits for all memory accesses using in-scope old translation information to complete before it is considered complete. > > As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests: > > "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround." > > This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions. > > Changes include: > > * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. 
The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization. > * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address. > * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures. > * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk. > > Benchmarking results: Neoverse-N1 r3p1 (Graviton 2) > > - Baseline > > $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGC... 
Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: Add inline assembly to ICacheInvalidationContext::pd_invalidate_icache not using nmethod ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28328/files - new: https://git.openjdk.org/jdk/pull/28328/files/20480771..79297bd4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=02-03 Stats: 56 lines in 9 files changed: 22 ins; 6 del; 28 mod Patch: https://git.openjdk.org/jdk/pull/28328.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28328/head:pull/28328 PR: https://git.openjdk.org/jdk/pull/28328 From eastigeevich at openjdk.org Tue Nov 25 16:44:38 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 25 Nov 2025 16:44:38 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v4] In-Reply-To: References: <85dBRXdwqMOffQvXGI9J_zhfLnwZ0LrY_Wj4w0nrpbM=.88de1041-c752-49aa-8ab2-600b92f8559d@github.com> Message-ID: On Thu, 20 Nov 2025 16:35:17 GMT, Evgeny Astigeevich wrote: >> We cannot execute `tlbi vae3is` here because it requires EL3. We are at EL0. > > Or you mean `IC IVAU`?` I replaced the call of `ICache::invalidate_word()` with: asm volatile("dsb ish \n" "ic ivau, xzr \n" "isb \n" : : : "memory"); The code executed in `ICache::invalidate_word()` when all checks are done: dsb ish ic ivau dsb ish isb I use `xzr` in `ic ivau` because an address in it does not matter. The instruction is trapped and ignored. I think we don't need the second `dsb` because we will have `dsb sy` in the trap handler. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2560695024 From eastigeevich at openjdk.org Tue Nov 25 16:54:33 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 25 Nov 2025 16:54:33 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v5] In-Reply-To: References: Message-ID: > Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1. > > Neoverse-N1 implementations mitigate erratum 1542419 with a workaround: > - Disable coherent icache. > - Trap IC IVAU instructions. > - Execute: > - `tlbi vae3is, xzr` > - `dsb sy` > > `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address). It waits for all memory accesses using in-scope old translation information to complete before it is considered complete. > > As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests: > > "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround." > > This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions. > > Changes include: > > * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization. 
> * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address. > * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures. > * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk. > > Benchmarking results: Neoverse-N1 r3p1 (Graviton 2) > > - Baseline > > $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGC... Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: Fix build issue on non aarch64 platforms ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28328/files - new: https://git.openjdk.org/jdk/pull/28328/files/79297bd4..e774cda1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28328.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28328/head:pull/28328 PR: https://git.openjdk.org/jdk/pull/28328 From egahlin at openjdk.org Tue Nov 25 17:34:34 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Tue, 25 Nov 2025 17:34:34 GMT Subject: RFR: 8372324: jdk/jfr/event/gc/detailed/TestStringDeduplicationEvent.java#Parallel failed In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 12:49:44 GMT, Francesco Andreuzzi wrote: > The assertion is used 
to validate a precondition for the test. As long as the deduplication happens inside the `RecordingStream` scope, a `StringDeduplication ` event will be recorded. Thus the assertion is not needed and can be removed. Looks like getValue() is never used. Can it be removed, and the reflection code? Do we really need the fields to be volatile? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28467#issuecomment-3576632211 From eastigeevich at openjdk.org Tue Nov 25 17:37:38 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 25 Nov 2025 17:37:38 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v6] In-Reply-To: References: Message-ID: > Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1. > > Neoverse-N1 implementations mitigate erratum 1542419 with a workaround: > - Disable coherent icache. > - Trap IC IVAU instructions. > - Execute: > - `tlbi vae3is, xzr` > - `dsb sy` > > `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address). It waits for all memory accesses using in-scope old translation information to complete before it is considered complete. > > As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests: > > "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround." > > This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. 
The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions. > > Changes include: > > * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization. > * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address. > * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures. > * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk. > > Benchmarking results: Neoverse-N1 r3p1 (Graviton 2) > > - Baseline > > $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGC... 
Evgeny Astigeevich has updated the pull request incrementally with two additional commits since the last revision: - Remove redundant include - Move ICacheInvalidationContext::pd_ to icache_linux_aarch64 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28328/files - new: https://git.openjdk.org/jdk/pull/28328/files/e774cda1..42745e56 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=04-05 Stats: 155 lines in 5 files changed: 88 ins; 67 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28328.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28328/head:pull/28328 PR: https://git.openjdk.org/jdk/pull/28328 From mablakatov at openjdk.org Tue Nov 25 18:00:10 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Tue, 25 Nov 2025 18:00:10 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v9] In-Reply-To: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Message-ID: > Modify the C2 compiler to share trampoline stubs between static calls that resolve to the same callee method. Since the relocation target for all static calls is initially set to the static call resolver stub, the call's target alone cannot be used to distinguish between different static method calls. Instead, trampoline stubs should be shared based on the actual callee. > > The `SharedTrampolineTest.java` was designed to verify the sharing of trampolines among static calls. However, due to imprecise log analysis, the test currently passes even when trampolines are not shared. Additionally, comments within the test suggest ambiguity regarding whether it was intended to assess trampoline sharing for static calls or runtime calls. 
To address these issues and eliminate ambiguity, this patch renames and updates the existing test. Furthermore, a new test is introduced, using the existing one as a foundation, to accurately evaluate trampoline sharing for both static and runtime calls. > > This has passed tier1-3 and jcstress testing on AArch64. Mikhail Ablakatov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 20 commits: - Merge commit '8bafc2f0aecbbe548573712a9dc31c9764f82f71' into 8359359 - cleanup: make tests source code more readable - Merge commit 'f6f87bb6759c86d941453a1776e8abfdffc48183' into 8359359 - the only trampoline in ArrayCopyStub is never shared - fixup: a shared trampoline must branch to a statically bound method - share static call trampolines generated by C1 as well - assert callee is nullptr for runtime calls - assert that call sites offsets aren't missing - cleanup: rephrase comments in macroAssembler_aarch64.hpp - Merge commit 'fd29677479797956e0d205b5ce6e7cb9ad407bd1' into 8359359 - ... and 10 more: https://git.openjdk.org/jdk/compare/8bafc2f0...bd065434 ------------- Changes: https://git.openjdk.org/jdk/pull/25954/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25954&range=08 Stats: 461 lines in 12 files changed: 334 ins; 114 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/25954.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25954/head:pull/25954 PR: https://git.openjdk.org/jdk/pull/25954 From duke at openjdk.org Tue Nov 25 18:02:44 2025 From: duke at openjdk.org (duke) Date: Tue, 25 Nov 2025 18:02:44 GMT Subject: RFR: 8363943: ARM32: Represent Registers as values [v2] In-Reply-To: References: Message-ID: <07JZI815ZVIBspZkeP2djon6P2kBSS5HF7D_FpS1UwQ=.17697ead-dac8-4262-ae55-9f4d04b4331e@github.com> On Tue, 11 Nov 2025 22:09:24 GMT, Ivan wrote: >> Migrate away from pointer-based representation of Register values. 
>> >> It improves compile-time checking by forbidding implicit conversions between integrals and pointers. >> >> [JDK-8363943](https://bugs.openjdk.org/browse/JDK-8363943) > > Ivan has updated the pull request incrementally with one additional commit since the last revision: > > Proposed review changes were applied @iv-sukhanov Your change (at version 962b01b602c0f42b95f8a3ad4f58d84b17db3c6f) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26525#issuecomment-3576859150 From duke at openjdk.org Tue Nov 25 18:17:17 2025 From: duke at openjdk.org (Ivan) Date: Tue, 25 Nov 2025 18:17:17 GMT Subject: Integrated: 8363943: ARM32: Represent Registers as values In-Reply-To: References: Message-ID: On Tue, 29 Jul 2025 06:46:56 GMT, Ivan wrote: > Migrate away from pointer-based representation of Register values. > > It improves compile-time checking by forbidding implicit conversions between integrals and pointers. > > [JDK-8363943](https://bugs.openjdk.org/browse/JDK-8363943) This pull request has now been integrated. Changeset: c1230068 Author: Ivan Sukhanov Committer: Alexey Bakhtin URL: https://git.openjdk.org/jdk/commit/c1230068dc4501c52999ac0bbb3a2e5933453f09 Stats: 428 lines in 12 files changed: 144 ins; 53 del; 231 mod 8363943: ARM32: Represent Registers as values Reviewed-by: shade, bulasevich ------------- PR: https://git.openjdk.org/jdk/pull/26525 From aboldtch at openjdk.org Tue Nov 25 18:43:30 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 25 Nov 2025 18:43:30 GMT Subject: RFR: 8372528: Unify atomic exchange and compare exchange Message-ID: AtomicAccess::xchg is only required to support `4` bytes and `sizeof(intptr_t)` size. This restriction added a lot of extra logic to the Atomic implementation because we have a set of features we must support (including compare exchange) for `1`, `4` and `8` byte atomics on all platforms. 
We have some checks for unsupported `8` byte compare exchange (`VM_Version::supports_cx8()`), but the Atomic class does not try to handle these when generating its supported functions. On such a platform we would more than likely get a linking error. I propose we change the requirement for exchange to `1`, `4` and `8` bytes to achieve parity with compare exchange, initially by implementing exchange via `AtomicAccess::XchgUsingCmpxch`, with follow-up RFEs for each applicable platform where we specialize `AtomicAccess::PlatformXchg<1>`. This enhancement both simplifies the Atomic implementation and provides exchange capabilities for types like `bool` and enums represented by a byte. _It is a little unclear how we deal with `VM_Version::supports_cx8()`. Its existence makes it impossible to use `compare_exchange` on `int64_t` in general code. Currently the `Atomic` implementation assumes that `exchange` can always be used on `8` byte integers (at least going by the gtest), even though `AtomicAccess` only specifies `4` bytes and the platform size. This PR changes this to `1`, `4` and `8` bytes. But I am not sure whether the previous behaviour / implicit requirement is an oversight, or whether a property similar to `VM_Version::supports_cx8()` should apply here for `exchange`._ * Testing * Extended gtest (no other users of Atomic byte with exchange exist)
* GHA * Running Tier 1-5 on Oracle supported platforms ------------- Commit messages: - Unify atomic exchange and compare exchange Changes: https://git.openjdk.org/jdk/pull/28498/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28498&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372528 Stats: 250 lines in 15 files changed: 108 ins; 113 del; 29 mod Patch: https://git.openjdk.org/jdk/pull/28498.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28498/head:pull/28498 PR: https://git.openjdk.org/jdk/pull/28498 From pchilanomate at openjdk.org Tue Nov 25 19:58:40 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 19:58:40 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v3] In-Reply-To: References: Message-ID: <-MtOiSQVDvlQD7sbfeBiqF00_ZN9_aNt3zd2LZLljyo=.eeabb717-359d-4420-89aa-ed1b305beee5@github.com> On Thu, 20 Nov 2025 22:17:53 GMT, David Holmes wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Add Alan's comment in VirtualThread > > src/hotspot/share/classfile/javaClasses.cpp line 1757: > >> 1755: jint* addr = java_thread->field_addr(_VTMS_transition_disable_count_offset); >> 1756: int val = AtomicAccess::load(addr); >> 1757: AtomicAccess::store(addr, val + 1); > > Suggestion: > > AtomicAccess::inc(addr); Same here. > src/hotspot/share/classfile/javaClasses.cpp line 1764: > >> 1762: jint* addr = java_thread->field_addr(_VTMS_transition_disable_count_offset); >> 1763: int val = AtomicAccess::load(addr); >> 1764: AtomicAccess::store(addr, val - 1); > > Suggestion: > > AtomicAccess::dec(addr); I'd prefer to leave it as a plain store to avoid the unnecessary extra fence. > src/hotspot/share/opto/runtime.hpp line 740: > >> 738: return vthread_transition_Type(); >> 739: } >> 740: > > I do not know C2 but this looks really strange - 4 different functions all return the same thing. ???
We need to define them because the `GEN_C2_STUB` macro will look for the type of the C function based on its name (`C2_STUB_TYPEFUNC(name)`), otherwise we get a compilation failure. The four C functions have the same type though so they all return `_vthread_transition_Type`. > src/hotspot/share/runtime/handshake.cpp line 374: > >> 372: JavaThread* target = java_lang_Thread::thread(carrier_thread); >> 373: assert(target != nullptr, ""); >> 374: // Technically there is need for a ThreadsListHandle since the target > > Suggestion: > > // Technically there is no need for a ThreadsListHandle since the target > > ? Yes, fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561198741 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561198549 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561200538 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561202212 From pchilanomate at openjdk.org Tue Nov 25 19:58:34 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 19:58:34 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v5] In-Reply-To: References: Message-ID: <8QdmTglMpQwGWG0QeQLbeduPrF1qZkah-9RzQwSOQuY=.fa030def-7674-48e3-bc25-23358009ed87@github.com> > When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. > > This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`.
The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: > > - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. > An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so for now I kept this simpler version. > > - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we are still required to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. > > - The code was previously structured in terms of mount and un...
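The Dekker-style handshake described in the quoted summary can be sketched in portable C++ as follows. This is only an illustration under stated assumptions: `TransitionState`, `start_transition_sees_disabler`, `disabler_sees_transition` and `race_once` are hypothetical names, `std::atomic` stands in for HotSpot's `AtomicAccess`/`OrderAccess`, and a single flag and counter replace the per-carrier, per-vthread and global state of the real patch.

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-ins for the "in transition" bit and the
// "transition disabled" counter; not the HotSpot fields themselves.
struct TransitionState {
  std::atomic<int> in_transition{0};
  std::atomic<int> disable_count{0};
};

// Virtual-thread side: publish "in transition" first, then read the counter.
// Returns true if this side observed that transitions are disabled.
inline bool start_transition_sees_disabler(TransitionState& s) {
  s.in_transition.store(1, std::memory_order_relaxed);
  // Full fence: the store above may not be reordered after the load below.
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return s.disable_count.load(std::memory_order_relaxed) != 0;
}

// Disabler side: bump the counter first, then read the "in transition" bit.
// Returns true if this side observed a transition in progress.
inline bool disabler_sees_transition(TransitionState& s) {
  s.disable_count.store(1, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return s.in_transition.load(std::memory_order_relaxed) != 0;
}

// Runs both sides concurrently once. With a full fence on each side,
// at least one of the two must observe the other's store.
inline bool race_once() {
  TransitionState s;
  bool vthread_saw = false;
  bool disabler_saw = false;
  std::thread a([&] { vthread_saw = start_transition_sees_disabler(s); });
  std::thread b([&] { disabler_saw = disabler_sees_transition(s); });
  a.join();
  b.join();
  return vthread_saw || disabler_saw;
}
```

The property that matters is that both sides can never miss each other, which is what lets the disabler conclude that no transition slipped past it. Dropping either fence would allow the store and load on that side to be reordered and break the guarantee.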
Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: David's comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28361/files - new: https://git.openjdk.org/jdk/pull/28361/files/10534b33..b54594c4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=03-04 Stats: 46 lines in 4 files changed: 18 ins; 1 del; 27 mod Patch: https://git.openjdk.org/jdk/pull/28361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361 PR: https://git.openjdk.org/jdk/pull/28361 From pchilanomate at openjdk.org Tue Nov 25 19:58:43 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 19:58:43 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v5] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 22:26:26 GMT, David Holmes wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> David's comments > > src/hotspot/share/prims/jvm.cpp line 3668: > >> 3666: if (!DoJVMTIVirtualThreadTransitions) { >> 3667: assert(!JvmtiExport::can_support_virtual_threads(), "sanity check"); >> 3668: return; > > Does this not still need checking somewhere? The check for `DoJVMTIVirtualThreadTransitions` was moved to the `JVMTIStartTransition`/`JVMTIEndTransition` classes, but I guess you are referring to the assert, which I forgot to move. Added now too.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561201839 From pchilanomate at openjdk.org Tue Nov 25 19:58:45 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 19:58:45 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v3] In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 08:37:39 GMT, Serguei Spitsyn wrote: >> src/hotspot/share/runtime/mountUnmountDisabler.cpp line 147: >> >>> 145: MonitorLocker ml(VTMSTransition_lock); >>> 146: while (is_start_transition_disabled(current, vth())) { >>> 147: ml.wait(200); >> >> I see a lot of timed-waits throughout this code. Is that because we poll rather than synchronizing properly? All this potential busy-waiting is surely going to cause performance glitches. > > The timeouts are for reliability purposes only. Technically, they are not needed and can be removed after this code becomes stable. The `wait()` calls are inside while loop which rechecks the loop-ending conditions. I tried to minimize the changes with respect to the current code so I kept the timed-waits. As Serguei points out we should be able to remove this particular one. As for the ones executed by the disablers, we could make them poll for the transition bits in a loop with backoff, similar to how we do it in safepoint and handshake cases. But I agree with Serguei we should do it in a separate bug once the code is stable. 
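The point that a timed wait inside a `while` loop is a safety net rather than polling can be illustrated with a small standard C++ sketch. It assumes a hypothetical `TransitionGate` class with a `std::condition_variable` standing in for the HotSpot `MonitorLocker`; the 200 ms timeout mirrors the quoted `ml.wait(200)`.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical gate: transitions may start only while disable_count_ is zero.
class TransitionGate {
 public:
  void disable() {
    std::lock_guard<std::mutex> guard(lock_);
    ++disable_count_;
  }

  void enable() {
    {
      std::lock_guard<std::mutex> guard(lock_);
      --disable_count_;
    }
    cv_.notify_all();  // normal wake-up path; the timeout is only a backstop
  }

  // Blocks while transitions are disabled. Returns how many times the waiter
  // woke up, whether by notification, timeout or spurious wake-up.
  int wait_until_enabled() {
    std::unique_lock<std::mutex> ml(lock_);
    int wakeups = 0;
    while (disable_count_ > 0) {
      // The loop condition is rechecked after every wake-up, so a timeout
      // never lets the caller proceed early.
      cv_.wait_for(ml, std::chrono::milliseconds(200));
      ++wakeups;
    }
    return wakeups;
  }

 private:
  std::mutex lock_;
  std::condition_variable cv_;
  int disable_count_ = 0;
};

// Demo: one thread re-enables transitions shortly after the waiter blocks.
inline int wait_across_threads() {
  TransitionGate gate;
  gate.disable();
  std::thread enabler([&gate] {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    gate.enable();
  });
  int wakeups = gate.wait_until_enabled();
  enabler.join();
  return wakeups;
}
```

Replacing `wait_for` with an untimed `wait` would leave the behaviour unchanged as long as every `enable()` notifies, which matches the remark that the timeouts can be removed once the code is stable.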
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561204423 From pchilanomate at openjdk.org Tue Nov 25 19:58:47 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 19:58:47 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 01:01:48 GMT, David Holmes wrote: >> src/hotspot/share/runtime/mountUnmountDisabler.cpp line 162: >> >>> 160: // be executed once we go back to Java. If this is an unmount, the handshake that the >>> 161: // disabler executed against this carrier thread already provided the needed synchronization. >>> 162: // This matches the release fence in xx_enable_for_one()/xx_enable_for_all(). >> >> Subtle. Do we have comments where the fences are to ensure people realize the fence is serving this purpose? > > I also forgot to suggest a wording change: say "pairs with" rather than "matches". Reading back through I realize now I have misunderstood many of these comments. Changed to `pairs with`. I rewrote the comments so hopefully they are clearer now. I also added a comment in `VirtualThread.mount/unmount` where the memory barriers should be. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561205763 From pchilanomate at openjdk.org Tue Nov 25 19:58:53 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 19:58:53 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 00:42:32 GMT, David Holmes wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename VM methods for endFirstTransition/startFinalTransition > > src/hotspot/share/runtime/mountUnmountDisabler.cpp line 277: > >> 275: >> 276: // Start of the critical region.
Prevent future memory >> 277: // operations to be ordered before we read the transition flag. > > Does this refer to `java_lang_Thread::is_in_VTMS_transition(_vthread())`? If so perhaps that should internally perform the `load_acquire`? Yes, but that would also call for doing the same with `JavaThread::_is_in_VTMS_transition` for the `VTMS_transition_disable_for_all` case, and also have the pairing release stores by the virtual thread in `end_transition` on those same addresses, otherwise it would be confusing. And same with the other side, i.e doing load_acquire by the virtual thread of `_VTMS_transition_disable_count` and `_global_start_transition_disable_count` on `start_transition` and release store by the disabler when enabling transitions again. But I wanted to avoid unnecessary barriers in the virtual thread transition side, so I kept them as plain load/stores with separate memory barriers when necessary. > src/hotspot/share/runtime/mountUnmountDisabler.cpp line 307: > >> 305: // Block while some mount/unmount transitions are in progress. >> 306: // Debug version fails and prints diagnostic information. >> 307: for (JavaThreadIteratorWithHandle jtiwh; JavaThread *jt = jtiwh.next(); ) { > > This looks very odd, having an assignment in the loop condition check and no actual loop-update expression. Yes, from what I see this same construct is used in many places. Seems this is valid because a pointer used in a boolean context evaluates to false if nullptr and true if non-null. :) > src/hotspot/share/runtime/mountUnmountDisabler.cpp line 316: > >> 314: // operations to be ordered before we read the transition flags. >> 315: // This matches the release fence in end_transition(). >> 316: OrderAccess::acquire(); > > Surely the use of the iterator already provides the necessary ordering guarantee here as well. ? We still need it because we need to prevent reordering of loads from the critical section with loads of `jt->is_in_VTMS_transition()`. 
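The design described above (plain loads and stores combined with separate acquire and release barriers, rather than `load_acquire`/`release_store` on each access) can be sketched in portable C++ as follows. This is an analogy under stated assumptions, not the HotSpot implementation: `Channel`, `finish_transition`, `wait_then_read`, and `run_pairing_demo` are hypothetical names, and `std::atomic_thread_fence` stands in for `OrderAccess::release()` and `OrderAccess::acquire()`.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical pairing of a transition flag with data written before it is cleared.
struct Channel {
  std::atomic<bool> in_transition{true};
  int payload = 0;  // plain (non-atomic) data published by the writer
};

// Writer side: the release fence keeps the payload write ordered before the
// relaxed store that clears the flag.
void finish_transition(Channel& c, int value) {
  c.payload = value;
  std::atomic_thread_fence(std::memory_order_release);
  c.in_transition.store(false, std::memory_order_relaxed);
}

// Reader side: the acquire fence keeps later loads from being ordered before
// the relaxed load that observed the cleared flag. It pairs with the release
// fence in finish_transition().
int wait_then_read(Channel& c) {
  while (c.in_transition.load(std::memory_order_relaxed)) {
    std::this_thread::yield();
  }
  std::atomic_thread_fence(std::memory_order_acquire);
  return c.payload;
}

// Runs one reader and one writer and returns the value the reader saw.
int run_pairing_demo() {
  Channel c;
  int seen = -1;
  std::thread reader([&] { seen = wait_then_read(c); });
  finish_transition(c, 42);
  reader.join();
  return seen;
}
```

The fences are only paid where ordering is actually required, so the flag accesses themselves stay as cheap relaxed operations on the hot path.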
> src/hotspot/share/runtime/mountUnmountDisabler.cpp line 327: > >> 325: // End of the critical section. Prevent previous memory operations to >> 326: // be ordered after we clear the clear the disable transition flag. >> 327: // This matches the equivalent acquire fence in start_transition(). > > Suggestion: > > // This pairs with the acquire in start_transition(). > > I just realized you are using "fence" to describe release and acquire memory barrier semantics. Given we have an operation `fence` I find this confusing for the reader - especially when we also have a `release_store_fence` operation which might be confused with "release fence". Right, I changed it now to use the terms acquire and release barrier respectively. > src/hotspot/share/runtime/mountUnmountDisabler.cpp line 370: > >> 368: assert(VTMSTransition_lock->owned_by_self() || SafepointSynchronize::is_at_safepoint(), "Must be locked"); >> 369: assert(_global_start_transition_disable_count >= 0, ""); >> 370: AtomicAccess::store(&_global_start_transition_disable_count, _global_start_transition_disable_count + 1); > > Suggestion: > > AtomicAccess::inc(&_global_start_transition_disable_count); I'd prefer to leave it as a plain store to avoid the unnecessary extra fence. > src/hotspot/share/runtime/mountUnmountDisabler.cpp line 376: > >> 374: assert(VTMSTransition_lock->owned_by_self() || SafepointSynchronize::is_at_safepoint(), "Must be locked"); >> 375: assert(_global_start_transition_disable_count > 0, ""); >> 376: AtomicAccess::store(&_global_start_transition_disable_count, _global_start_transition_disable_count - 1); > > Suggestion: > > AtomicAccess::dec(&_global_start_transition_disable_count); Same here. 
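The reasoning behind keeping a plain store in the last two replies can be sketched as follows, assuming (as the asserts in the quoted code indicate) that every writer holds the same lock. This is a standalone C++ analogy, not HotSpot code: `DisableCount`, `peek`, and `run_writers` are hypothetical names, and relaxed `std::atomic` operations stand in for `AtomicAccess::store`.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a counter such as _global_start_transition_disable_count.
class DisableCount {
  std::mutex _lock;            // plays the role of the lock that all writers hold
  std::atomic<int> _count{0};  // read without the lock by other threads

 public:
  void increment() {
    std::lock_guard<std::mutex> guard(_lock);
    // Writers are serialized by _lock, so a relaxed load followed by a relaxed
    // store cannot lose an update; no atomic read-modify-write is needed.
    _count.store(_count.load(std::memory_order_relaxed) + 1, std::memory_order_relaxed);
  }

  void decrement() {
    std::lock_guard<std::mutex> guard(_lock);
    assert(_count.load(std::memory_order_relaxed) > 0);
    _count.store(_count.load(std::memory_order_relaxed) - 1, std::memory_order_relaxed);
  }

  // Lock-free read, as a transitioning thread might perform it.
  int peek() const { return _count.load(std::memory_order_relaxed); }
};

// Starts `threads` writers that each increment `per_thread` times; returns the final count.
int run_writers(int threads, int per_thread) {
  DisableCount counter;
  std::vector<std::thread> pool;
  for (int t = 0; t < threads; ++t) {
    pool.emplace_back([&counter, per_thread] {
      for (int i = 0; i < per_thread; ++i) {
        counter.increment();
      }
    });
  }
  for (std::thread& th : pool) {
    th.join();
  }
  return counter.peek();
}
```

The store still has to be atomic because other threads read the counter without taking the lock; what the lock makes unnecessary is the read-modify-write and the extra ordering the reply wants to avoid.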
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561208616 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561210899 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561216984 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561219344 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561219842 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561220292 From pchilanomate at openjdk.org Tue Nov 25 19:59:38 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 19:59:38 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 01:23:17 GMT, David Holmes wrote: > > we follow the classic Dekker pattern for the required synchronization. > > My understanding is that Dekker requires a "full fence" between the accesses, not just ordering memory barriers. The two variables involved must be published to all readers for the algorithm to work. > No need to argue too much about this one because `StoreLoad` is implemented as a full fence so we can easily change it, but from reading the definitions in `OrderAccess` my understanding was that technically it should be enough. The `StoreStore` comment clarifies the meaning of word `completes` (used later in `StoreLoad`) as `the effect on memory of Store1 is made visible to other processors`. 
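For readers following the Dekker discussion above, here is a minimal standalone sketch of the store-then-load handshake. It uses a sequentially consistent fence between the store and the load on each side, which is the conservative choice; it does not settle whether a bare `StoreLoad` barrier would suffice in HotSpot. All names (`DekkerPair`, `vthread_side`, `disabler_side`, `count_both_missed`) are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Two flags, each written by one side and read by the other.
struct DekkerPair {
  std::atomic<int> in_transition{0};  // set by the transitioning thread
  std::atomic<int> disable_count{0};  // set by the thread disabling transitions
};

// Transitioning side: publish "in transition", then check for a disable request.
// Returns true if no disable request was observed.
bool vthread_side(DekkerPair& p) {
  p.in_transition.store(1, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);  // store must precede the load
  return p.disable_count.load(std::memory_order_relaxed) == 0;
}

// Disabling side: publish the disable request, then check for a transition.
// Returns true if no transition was observed.
bool disabler_side(DekkerPair& p) {
  p.disable_count.store(1, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);  // store must precede the load
  return p.in_transition.load(std::memory_order_relaxed) == 0;
}

// Counts the rounds in which both sides missed each other's store. With the
// fences in place, the C++ memory model rules that outcome out.
int count_both_missed(int rounds) {
  int both = 0;
  for (int i = 0; i < rounds; ++i) {
    DekkerPair p;
    bool a = false;
    bool b = false;
    std::thread t1([&] { a = vthread_side(p); });
    std::thread t2([&] { b = disabler_side(p); });
    t1.join();
    t2.join();
    if (a && b) {
      ++both;
    }
  }
  return both;
}
```

At least one side always sees the other's store, so either the transitioning thread takes the slow path or the disabler sees the transition in progress and waits for it.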
------------- PR Comment: https://git.openjdk.org/jdk/pull/28361#issuecomment-3577347289 From vpetko at openjdk.org Tue Nov 25 20:04:19 2025 From: vpetko at openjdk.org (Vladimir Petko) Date: Tue, 25 Nov 2025 20:04:19 GMT Subject: RFR: 8352567: [s390x] disable JFR tests requiring JFR stubs In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 01:48:27 GMT, Vladimir Petko wrote: > JFR stubs are not [implemented](https://github.com/openjdk/jdk/blame/06ba6cf3a137a6cdf572a876a46d18e51c248451/src/hotspot/cpu/s390/sharedRuntime_s390.cpp#L3412). > Add platform requirement to JFR tests that require JFR stubs to skip them on S390x. > > Testing: > - s390x: > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR SKIP > jtreg:test/hotspot/jtreg/applications/ctw/modules/jdk_jfr.java > 0 0 0 0 0 > jtreg:test/hotspot/jtreg/compiler/intrinsics/TestReturnOopSetForJFRWriteCheckpoint.java > 0 0 0 0 0 > jtreg:test/jdk/jdk/jfr 630 577 0 0 53 > ============================== > TEST SUCCESS > > > - amd64: > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR SKIP > jtreg:test/hotspot/jtreg/applications/ctw/modules/jdk_jfr.java > 1 1 0 0 0 > jtreg:test/hotspot/jtreg/compiler/intrinsics/TestReturnOopSetForJFRWriteCheckpoint.java > 1 1 0 0 0 > jtreg:test/jdk/jdk/jfr 629 622 0 0 7 > ============================== > TEST SUCCESS > I would've suggested to use `@requires vm.continuations` that way test will be disabled if continuations support is not there. Or probably we can do simple problem listing, that way it will be little easy to enable the test case again. > ... > But it seems once these stubs were implemented for other architectures the requirement for continuations support vanished. See aarch64 for example: Yes, probably problemlist is a good alternative - could you please provide a bug number that I should use? 
I did not do `@requires vm.continuations` since you *can* enable the feature on s390x, but since it is not implemented, the tests would crash. > I see this testcase failing in tier1 tests on headstream: java/foreign/sharedclosejfr/TestSharedCloseJFR.java could you check if it's the same case for you as well ? Absolutely! I will update the PR, or raise an issue if the cause turns out to be a separate one. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28444#issuecomment-3577371150 From pchilanomate at openjdk.org Tue Nov 25 20:10:56 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 20:10:56 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v6] In-Reply-To: References: Message-ID: > When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. > > This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: > > - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e the per-vthread counter and the global counter. 
Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. > An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis show no difference with current mainline so for now I kept this simpler version. > > - Ending the transition doesn't need to check if transitions are disabled (equivalent to not need to poll when transitioning from unsafe to safe safepoint state). But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. > > - The code was previously structured in terms of mount and un... 
Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: Remove INCLUDE_JVMTI macro ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28361/files - new: https://git.openjdk.org/jdk/pull/28361/files/b54594c4..4c598ad4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=04-05 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361 PR: https://git.openjdk.org/jdk/pull/28361 From pchilanomate at openjdk.org Tue Nov 25 20:11:00 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 20:11:00 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: <6WDKeD8iQTXDPhR0ohPvA6KVufW0uXHBmyZ5oOWfYWI=.44d63e7f-fbdb-47b6-8b36-f0d0cb35fb91@github.com> References: <6WDKeD8iQTXDPhR0ohPvA6KVufW0uXHBmyZ5oOWfYWI=.44d63e7f-fbdb-47b6-8b36-f0d0cb35fb91@github.com> Message-ID: On Sat, 22 Nov 2025 07:52:40 GMT, Serguei Spitsyn wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename VM methods for endFirstTransition/startFinalTransition > > src/hotspot/share/runtime/javaThread.cpp line 1173: > >> 1171: bool JavaThread::java_suspend(bool register_vthread_SR) { >> 1172: #if INCLUDE_JVMTI >> 1173: // Suspending a JavaThread in VTMS transition or disabling VTMS transitions can cause deadlocks. > > Q: I wonder if the `#if INCLUDE_JVMTI` and `#endif` can be removed here. Yes, removed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561254972 From pchilanomate at openjdk.org Tue Nov 25 20:11:03 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 20:11:03 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 08:22:34 GMT, Serguei Spitsyn wrote: >> src/hotspot/share/runtime/mountUnmountDisabler.hpp line 52: >> >>> 50: // parameter is_SR: suspender or resumer >>> 51: MountUnmountDisabler(bool exlusive = false); >>> 52: MountUnmountDisabler(oop thread_oop); >> >> What does the comment mean here? > > This comment is stale now and must be removed. The parameter `is_SR` is being replaced with the `exclusive`. Right, removed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561250374 From vpaprotski at openjdk.org Tue Nov 25 20:12:26 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Tue, 25 Nov 2025 20:12:26 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v7] In-Reply-To: References: Message-ID: > - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline > - `SignatureBench.MLDSA` is 1.2x-2.2x faster > - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) > - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version > - `SignatureBench.MLDSA` is upto 5% faster, never slower > > Note on intrinsic: > - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. 
> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 > > Tests and benchmarks: > - Added a fuzz test to ensure Java and intrinsic produces exactly same result > - Added benchmark to measure the performance of intrinsic itself > > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: comments from Jatin ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28136/files - new: https://git.openjdk.org/jdk/pull/28136/files/bfc16f1f..094051e0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28136&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28136&range=05-06 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/28136.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28136/head:pull/28136 PR: https://git.openjdk.org/jdk/pull/28136 From vpaprotski at openjdk.org Tue Nov 25 20:12:28 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Tue, 25 Nov 2025 20:12:28 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v6] In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 22:01:17 GMT, Volodymyr Paprotski wrote: >> - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline >> - `SignatureBench.MLDSA` is 1.2x-2.2x faster >> - Note: there 
is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) >> - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version >> - `SignatureBench.MLDSA` is upto 5% faster, never slower >> >> Note on intrinsic: >> - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. >> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 >> >> Tests and benchmarks: >> - Added a fuzz test to ensure Java and intrinsic produces exactly same result >> - Added benchmark to measure the performance of intrinsic itself >> >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" > > Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: > > spelling Thanks for the review Jatin. Re links/references.. This was original work, apart from the base from Ferenc.. I did have a look at the original reference from IBM but Ferenc's multiply was already better. 
------------- PR Review: https://git.openjdk.org/jdk/pull/28136#pullrequestreview-3505809840 From sviswanathan at openjdk.org Tue Nov 25 20:12:29 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Tue, 25 Nov 2025 20:12:29 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v7] In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 20:09:36 GMT, Volodymyr Paprotski wrote: >> - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline >> - `SignatureBench.MLDSA` is 1.2x-2.2x faster >> - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) >> - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version >> - `SignatureBench.MLDSA` is upto 5% faster, never slower >> >> Note on intrinsic: >> - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. 
>> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 >> >> Tests and benchmarks: >> - Added a fuzz test to ensure Java and intrinsic produces exactly same result >> - Added benchmark to measure the performance of intrinsic itself >> >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" > > Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: > > comments from Jatin Marked as reviewed by sviswanathan (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/28136#pullrequestreview-3506642227 From vpaprotski at openjdk.org Tue Nov 25 20:12:34 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Tue, 25 Nov 2025 20:12:34 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v6] In-Reply-To: <7-u4fTT6SMiqErNn-Xl7o8UTVF2NIV5m0DAhStsbsk0=.5f51025e-8ed8-4d2f-911c-1257b272f9f7@github.com> References: <7-u4fTT6SMiqErNn-Xl7o8UTVF2NIV5m0DAhStsbsk0=.5f51025e-8ed8-4d2f-911c-1257b272f9f7@github.com> Message-ID: On Tue, 25 Nov 2025 02:50:41 GMT, Jatin Bhateja wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: >> >> spelling > > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 365: > >> 363: >> 364: static void loadXmms(const XMMRegister destinationRegs[], Register source, int offset, >> 365: int vector_len, MacroAssembler *_masm, int regCnt = -1, int memStep = -1) { > > Suggestion: > > int vector_len, MacroAssembler *_masm, int regCnt = -1, int memStep = -1) { done > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 381: > >> 379: >> 380: static void storeXmms(Register destination, int offset, const XMMRegister xmmRegs[], >> 381: int vector_len, MacroAssembler *_masm, int regCnt = -1, int memStep = -1) { > > Suggestion: > > int vector_len, MacroAssembler *_masm, int regCnt = -1, int memStep = -1) { done > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 659: > >> 657: // zetas (int[128*8]) = c_rarg1 >> 658: static address generate_dilithiumAlmostInverseNtt_avx(StubGenerator *stubgen, >> 659: int vector_len,MacroAssembler *_masm) { > > Fix indentation I dont think this is any better: static address generate_dilithiumAlmostInverseNtt_avx(StubGenerator *stubgen, int vector_len, MacroAssembler *_masm) { I prefer more lines on the screen instead. 
I also didn't see anything in hotspot-style.md specifically on function declaration style so I figure it is up to me. Did add a space after the comma. > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 718: > >> 716: >> 717: // Constants for shuffle and montMul64 >> 718: __ mov64(scratch, 0b1010101010101010); > > 64 bit constant suffix Note the `0b` prefix. `0b0000000000000000000000000000000000000000000000000101010101010101UL` is worse. And the very next line is using the constant as a 16-bit value > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 901: > >> 899: // poly2 (int[256]) = c_rarg2 >> 900: static address generate_dilithiumNttMult_avx(StubGenerator *stubgen, >> 901: int vector_len, MacroAssembler *_masm) { > > Fix indentation as above > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 939: > >> 937: vector_len, scratch); // 2^64 mod q >> 938: if (vector_len == Assembler::AVX_512bit) { >> 939: __ mov64(scratch, 0b0101010101010101); > > Add long constant suffix as above > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 985: > >> 983: // constant (int) = c_rarg1 >> 984: static address generate_dilithiumMontMulByConstant_avx(StubGenerator *stubgen, >> 985: int vector_len, MacroAssembler *_masm) { > > Fix indentation as above > src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 1026: > >> 1024: __ evpbroadcastd(constant, rConstant, Assembler::AVX_512bit); // constant multiplier >> 1025: >> 1026: __ mov64(scratch, 0b0101010101010101); //dw-mask > > Constant suffix as above ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2561127351 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2561128486 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2560573897 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2560581463 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2560583718 PR Review Comment: 
https://git.openjdk.org/jdk/pull/28136#discussion_r2560585705 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2560635291 PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2561171198 From dlong at openjdk.org Tue Nov 25 20:19:27 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 25 Nov 2025 20:19:27 GMT Subject: RFR: 8371306: JDK-8367002 behavior might not match existing HotSpot behavior. Message-ID: <6Rlv3LqIljnMFAoDX1kgGT1hWCVGgU-UNS-UOYIpNrU=.760472eb-f9ab-415f-8618-e596b0d50c7b@github.com> When deoptimizing to the interpreter, we need to restore the thrown exception to the original, otherwise it might be caught by the wrong handler. In the test case, that means restoring the original ArithmeticException instead of keeping the new/recursive IllegalAccessError. ------------- Commit messages: - Merge branch 'openjdk:master' into 8371306 - expose regression from past behavior - restore original exception Changes: https://git.openjdk.org/jdk/pull/28497/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28497&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371306 Stats: 13 lines in 3 files changed: 10 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/28497.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28497/head:pull/28497 PR: https://git.openjdk.org/jdk/pull/28497 From amenkov at openjdk.org Tue Nov 25 20:36:15 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 25 Nov 2025 20:36:15 GMT Subject: RFR: 6960970: Debugger very slow during stepping [v3] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 19:45:04 GMT, Serguei Spitsyn wrote: >> This change fixes a long standing performance issue related to the debugger single stepping that is using JVMTI `FramePop` events as a part of step over handling. 
The performance issue is that the target thread continues its execution in very slow `interp-only` mode in a context of frame marked for `FramePop` notification with the JVMTI `NotifyFramePop`. It includes other method calls recursively upon a return from the frame. >> >> This fix is to avoid enforcing the `interp-only` execution mode for threads when `FramePop` events are enabled with the JVMTI `SetEventNotificationMode()`. Instead, the target frame has been deoptimized and kept interpreted by disabling `OSR` optimization by the function `InterpreterRuntime::frequency_counter_overflow_inner()`. (Big thanks to @fisk for this suggestion!) Additionally, some tweaks are applied in several places where the `java_thread->is_interp_only_mode()` is checked. >> The other details will be provided in the first PR request comment. >> It is considered to file a SCR for this update a `FramePop` events do not enforce the `interp-only` mode for a target thread anymore which might break some expectations (the behavior has been changed). >> >> Testing: >> - test `serviceability/jvmti/vthread/ThreadStateTest` was updated to provide some extra test coverage >> - submitted mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: fix typo in a EATests.java comment src/hotspot/share/prims/jvmtiThreadState.cpp line 707: > 705: for (int idx = 0; idx < deopts->length(); idx++) { > 706: int frame_number = deopts->at(idx); > 707: deopts->remove_at(idx); The code forward iterates the array removing the entries? 
it will skip every other element (indexes change after removal) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28407#discussion_r2561326190 From fandreuzzi at openjdk.org Tue Nov 25 21:12:48 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Tue, 25 Nov 2025 21:12:48 GMT Subject: RFR: 8372324: jdk/jfr/event/gc/detailed/TestStringDeduplicationEvent.java#Parallel failed [v2] In-Reply-To: References: Message-ID: > The assertion is used to validate a precondition for the test. As long as the deduplication happens inside the `RecordingStream` scope, a `StringDeduplication ` event will be recorded. Thus the assertion is not needed and can be removed. Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: nn ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28467/files - new: https://git.openjdk.org/jdk/pull/28467/files/bd6d8ff5..7c3a2e89 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28467&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28467&range=00-01 Stats: 20 lines in 1 file changed: 0 ins; 20 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28467.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28467/head:pull/28467 PR: https://git.openjdk.org/jdk/pull/28467 From fandreuzzi at openjdk.org Tue Nov 25 21:12:49 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Tue, 25 Nov 2025 21:12:49 GMT Subject: RFR: 8372324: jdk/jfr/event/gc/detailed/TestStringDeduplicationEvent.java#Parallel failed In-Reply-To: References: Message-ID: <4NUT06_oTs40Z3f3IvH01FlPi-voEdgWjgepam5X9O8=.c8fd3a54-5953-455e-b4de-6d46ccda0888@github.com> On Tue, 25 Nov 2025 16:59:18 GMT, Erik Gahlin wrote: > Looks like getValue() is never used. Can it be removed, and the reflection code? Thanks, it can be removed indeed. > Do we really need the fields to be volatile? 
My reasoning there is that the strings I create should be used somehow to make sure no unpredictable optimizations tamper with the test. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28467#issuecomment-3577594916 From egahlin at openjdk.org Tue Nov 25 22:02:33 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Tue, 25 Nov 2025 22:02:33 GMT Subject: RFR: 8372324: jdk/jfr/event/gc/detailed/TestStringDeduplicationEvent.java#Parallel failed In-Reply-To: <4NUT06_oTs40Z3f3IvH01FlPi-voEdgWjgepam5X9O8=.c8fd3a54-5953-455e-b4de-6d46ccda0888@github.com> References: <4NUT06_oTs40Z3f3IvH01FlPi-voEdgWjgepam5X9O8=.c8fd3a54-5953-455e-b4de-6d46ccda0888@github.com> Message-ID: On Tue, 25 Nov 2025 21:07:24 GMT, Francesco Andreuzzi wrote: > My reasoning there is that the strings I create should be used somehow to make sure no unpredictable optimizations tamper with the test. We have made fields public in other JFR tests, but volatile works as well. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28467#issuecomment-3577750629 From egahlin at openjdk.org Tue Nov 25 22:11:16 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Tue, 25 Nov 2025 22:11:16 GMT Subject: RFR: 8372324: jdk/jfr/event/gc/detailed/TestStringDeduplicationEvent.java#Parallel failed [v2] In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 21:12:48 GMT, Francesco Andreuzzi wrote: >> The assertion is used to validate a precondition for the test. As long as the deduplication happens inside the `RecordingStream` scope, a `StringDeduplication ` event will be recorded. Thus the assertion is not needed and can be removed. > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > nn Marked as reviewed by egahlin (Reviewer). I can sponsor this. 
------------- PR Review: https://git.openjdk.org/jdk/pull/28467#pullrequestreview-3507061146 PR Comment: https://git.openjdk.org/jdk/pull/28467#issuecomment-3577782859 From dhanalla at openjdk.org Tue Nov 25 22:23:36 2025 From: dhanalla at openjdk.org (Dhamoder Nalla) Date: Tue, 25 Nov 2025 22:23:36 GMT Subject: RFR: 8308776: [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 [v2] In-Reply-To: References: Message-ID: > This PR introduces an optimized AArch64 intrinsic for Math.log using reciprocal refinement and a table-driven polynomial. > Improves throughput for double logarithms while preserving IEEE-754 corner case behavior (±0, subnormals, negatives, NaN). > > The micro-benchmark results from MathBench and StrictMathBench below show the performance improvement of Math.log: > > **Before change**
> Benchmark | Mode | Cnt | Score | Error | Units
> -- | -- | -- | -- | -- | --
> MathBench.logDouble | thrpt | 10 | **15549.705** | ±357.439 | ops/ms
> StrictMathBench.logDouble | thrpt | 10 | 219408.158 | ±16484.680 | ops/ms
    > > > > > > > > > **After adding Math.log intrinsic** > > xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" > xmlns="http://www.w3.org/TR/REC-html40"> > > > > > > > > > > >
    > >
    > >
    > >
    > > Benchmark | Mode | Cnt | Score | Error | Units > -- | -- | -- | -- | -- | -- > MathBench.logDouble | thrpt | 10 | **300086.773** | ?6675.936 | ops/ms > StrictMathBench.logDouble | thrpt | 10 | 226521.817 | ?4038.975 | ops/ms > > >
    > >
    > >
    > >
    > > > > > Dhamoder Nalla has updated the pull request incrementally with one additional commit since the last revision: [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28306/files - new: https://git.openjdk.org/jdk/pull/28306/files/e8fac776..06b3dd4d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28306&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28306&range=00-01 Stats: 184 lines in 4 files changed: 66 ins; 114 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/28306.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28306/head:pull/28306 PR: https://git.openjdk.org/jdk/pull/28306 From vpaprotski at openjdk.org Tue Nov 25 22:45:57 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Tue, 25 Nov 2025 22:45:57 GMT Subject: Integrated: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements In-Reply-To: References: Message-ID: On Tue, 4 Nov 2025 16:38:49 GMT, Volodymyr Paprotski wrote: > - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline > - `SignatureBench.MLDSA` is 1.2x-2.2x faster > - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) > - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version > - `SignatureBench.MLDSA` is upto 5% faster, never slower > > Note on intrinsic: > - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. 
> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 > > Tests and benchmarks: > - Added a fuzz test to ensure Java and intrinsic produces exactly same result > - Added benchmark to measure the performance of intrinsic itself > > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" > make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" > make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" This pull request has now been integrated. Changeset: b36b6947 Author: Volodymyr Paprotski URL: https://git.openjdk.org/jdk/commit/b36b69470968b1578877cfe9658892a5fe44e38e Stats: 1827 lines in 6 files changed: 1124 ins; 255 del; 448 mod 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements Reviewed-by: sviswanathan, mpowers, ascarpino ------------- PR: https://git.openjdk.org/jdk/pull/28136 From duke at openjdk.org Tue Nov 25 22:53:17 2025 From: duke at openjdk.org (duke) Date: Tue, 25 Nov 2025 22:53:17 GMT Subject: RFR: 8372324: jdk/jfr/event/gc/detailed/TestStringDeduplicationEvent.java#Parallel failed [v2] In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 21:12:48 GMT, Francesco Andreuzzi wrote: >> The assertion is used to validate a precondition for the test. As long as the deduplication happens inside the `RecordingStream` scope, a `StringDeduplication ` event will be recorded. Thus the assertion is not needed and can be removed. 
> > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > nn @fandreuz Your change (at version 7c3a2e89c77b633898dcaf3f691e0d83ab77a698) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28467#issuecomment-3577922529 From sspitsyn at openjdk.org Tue Nov 25 22:57:29 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 25 Nov 2025 22:57:29 GMT Subject: RFR: 6960970: Debugger very slow during stepping [v4] In-Reply-To: References: Message-ID: > This change fixes a long standing performance issue related to the debugger single stepping that is using JVMTI `FramePop` events as a part of step over handling. The performance issue is that the target thread continues its execution in very slow `interp-only` mode in a context of frame marked for `FramePop` notification with the JVMTI `NotifyFramePop`. It includes other method calls recursively upon a return from the frame. > > This fix is to avoid enforcing the `interp-only` execution mode for threads when `FramePop` events are enabled with the JVMTI `SetEventNotificationMode()`. Instead, the target frame has been deoptimized and kept interpreted by disabling `OSR` optimization by the function `InterpreterRuntime::frequency_counter_overflow_inner()`. (Big thanks to @fisk for this suggestion!) Additionally, some tweaks are applied in several places where the `java_thread->is_interp_only_mode()` is checked. > The other details will be provided in the first PR request comment. > It is considered to file a SCR for this update a `FramePop` events do not enforce the `interp-only` mode for a target thread anymore which might break some expectations (the behavior has been changed). 
> > Testing: > - test `serviceability/jvmti/vthread/ThreadStateTest` was updated to provide some extra test coverage > - submitted mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: fix iteration order in process_vthread_pending_deopts as it uses remove_at(idx) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28407/files - new: https://git.openjdk.org/jdk/pull/28407/files/5989906c..1224d1e2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28407&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28407&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28407.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28407/head:pull/28407 PR: https://git.openjdk.org/jdk/pull/28407 From sspitsyn at openjdk.org Tue Nov 25 22:57:32 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 25 Nov 2025 22:57:32 GMT Subject: RFR: 6960970: Debugger very slow during stepping [v3] In-Reply-To: References: Message-ID: <1_e88Ls6a_9rmKaG2GzfU25i0IibBZrWVKILTznSLdI=.ad699183-7c78-4110-9212-7e11bdfa1332@github.com> On Tue, 25 Nov 2025 20:33:48 GMT, Alex Menkov wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: fix typo in a EATests.java comment > > src/hotspot/share/prims/jvmtiThreadState.cpp line 707: > >> 705: for (int idx = 0; idx < deopts->length(); idx++) { >> 706: int frame_number = deopts->at(idx); >> 707: deopts->remove_at(idx); > > The code forward iterates the array removing the entries? it will skip every other element (indexes change after removal) Nice catch, thanks! It is easy to forget about it. Fixed now. 
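To make the pitfall above concrete, here is a minimal sketch of what goes wrong when a forward loop removes the entry at the current index, and one way to avoid it. It is only an illustration under assumptions: `std::vector` stands in for the HotSpot array type, and the names `Pending`, `drain_forward_buggy` and `drain_backward` are made up here. None of this is the actual `process_vthread_pending_deopts` code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the list of pending frame numbers. std::vector is used
// purely for illustration; it is not the HotSpot array type.
using Pending = std::vector<int>;

// Buggy shape: erasing slot idx shifts every later element down by one,
// and the idx++ that follows steps over the element that just moved in.
std::vector<int> drain_forward_buggy(Pending& deopts) {
  std::vector<int> processed;
  for (std::size_t idx = 0; idx < deopts.size(); idx++) {
    processed.push_back(deopts[idx]);
    deopts.erase(deopts.begin() + idx);
  }
  return processed;
}

// One possible fix: walk from the back, so a removal never shifts an
// element that has not been visited yet.
std::vector<int> drain_backward(Pending& deopts) {
  std::vector<int> processed;
  for (std::size_t idx = deopts.size(); idx-- > 0; ) {
    processed.push_back(deopts[idx]);
    deopts.erase(deopts.begin() + idx);
  }
  return processed;
}
```

With four entries, the forward version visits only the first and third and leaves the other two behind, while the backward version visits all four and leaves the list empty.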
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28407#discussion_r2561806700 From pchilanomate at openjdk.org Tue Nov 25 22:59:59 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 25 Nov 2025 22:59:59 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v7] In-Reply-To: References: Message-ID: > When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. > > This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: > > - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters.
The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. > An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis show no difference with current mainline so for now I kept this simpler version. > > - Ending the transition doesn't need to check if transitions are disabled (equivalent to not need to poll when transitioning from unsafe to safe safepoint state). But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. > > - The code was previously structured in terms of mount and un... Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: keep preexisting rebind order for mount ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28361/files - new: https://git.openjdk.org/jdk/pull/28361/files/4c598ad4..dee2b843 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=05-06 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361 PR: https://git.openjdk.org/jdk/pull/28361 From coleenp at openjdk.org Wed Nov 26 00:01:00 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 26 Nov 2025 00:01:00 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v7] In-Reply-To: References: Message-ID: <0e65HV5RscFZN_q4JGzXA7k5jlT7gw7klerMqbfz4GU=.598cedb2-fd53-458b-9047-4d712661cbe4@github.com>
On Tue, 25 Nov 2025 22:59:59 GMT, Patricio Chilano Mateo wrote: >> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. >> >> This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes: >> >> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits. >> An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier).
Performance analysis show no difference with current mainline so for now I kept this simpler version. >> >> - Ending the transition doesn't need to check if transitions are disabled (equivalent to not need to poll when transitioning from unsafe to safe safepoint state). But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`. >> >> - The code was previously structured in t... > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > keep preexisting rebind order for mount First read through, mostly questions and plea for comments. This is a nice refactoring and cleanup of some very difficult code. You don't have to do the renaming that I requested now if you don't want to. src/hotspot/share/classfile/javaClasses.cpp line 1688: > 1686: int java_lang_Thread::_jvmti_thread_state_offset; > 1687: int java_lang_Thread::_VTMS_transition_disable_count_offset; > 1688: int java_lang_Thread::_is_in_VTMS_transition_offset; Since you're renaming these anyway, can we drop the VTMS part? Just call it vthread_transition_disable_count_offset and is_in_vthread_transition_offset? There are other VTMS named things that aren't these flags but they can stay. Maybe migrate other names at some future point. src/hotspot/share/opto/library_call.cpp line 3046: > 3044: } > 3045: > 3046: bool LibraryCallKit::inline_native_vthread_start_transition(address funcAddr, const char* funcName, bool is_final_transition) { Would it be helpful to add a comment above this to say what this does? This is supposed to match some non-intrinsic code and might be helpful if you referenced that here.
src/hotspot/share/prims/jvm.cpp line 3671: > 3669: > 3670: JVM_ENTRY(void, JVM_VirtualThreadStartFinalTransition(JNIEnv* env, jobject vthread)) > 3671: oop vt = JNIHandles::resolve_external_guard(vthread); Why do the opto runtime versions set is_in_VTMS_transition in both the java.lang.Thread and JavaThread and these don't? src/hotspot/share/prims/jvmtiEnv.cpp line 1827: > 1825: JvmtiEnv::ClearAllFramePops(jthread thread) { > 1826: ResourceMark rm; > 1827: MountUnmountDisabler disabler(thread); Not for this change but I thought JVMTI had some xml code that generated prefixes for these functions. This seems like something that could be unified somewhere tbd. src/hotspot/share/prims/jvmtiEnvBase.cpp line 1772: > 1770: > 1771: assert(java_thread != nullptr, "sanity check"); > 1772: assert(!java_thread->is_in_VTMS_transition(), "sanity check"); Why don't you need these asserts anymore? src/hotspot/share/runtime/javaThread.cpp line 1152: > 1150: bool JavaThread::is_in_VTMS_transition() const { > 1151: return AtomicAccess::load(&_is_in_VTMS_transition); > 1152: } Is the JavaThread version always the same as the java_lang_Thread::is_in_VTMS_transition(threadOop()) value? src/hotspot/share/runtime/mountUnmountDisabler.hpp line 34: > 32: > 33: class MountUnmountDisabler : public AnyObj { > 34: static volatile int _global_start_transition_disable_count; Can you describe this variable - when is it set and why is there a global disabler? What does it mean to have 'n' active disablers? A comment at the beginning of MountUnmountDisabler to say something to the effect that during virtual thread mounting and unmounting, JVMTI and operations that need to examine thread state need to be disabled. Or is it the converse? During JVMTI and operations that examine the state of threads, virtual thread mounting and unmounting must wait until these operations are complete. This class is for the latter right?
src/hotspot/share/runtime/mutexLocker.cpp line 52: > 50: Mutex* JvmtiThreadState_lock = nullptr; > 51: Monitor* EscapeBarrier_lock = nullptr; > 52: Monitor* VTMSTransition_lock = nullptr; oh you could drop the name VTMS and call it VThreadTransitionLock can't you? ------------- PR Review: https://git.openjdk.org/jdk/pull/28361#pullrequestreview-3507302896 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561864174 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561876549 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561897865 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561904709 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561910057 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561926510 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561943253 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561945493 From coleenp at openjdk.org Wed Nov 26 00:01:01 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 26 Nov 2025 00:01:01 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v7] In-Reply-To: <0e65HV5RscFZN_q4JGzXA7k5jlT7gw7klerMqbfz4GU=.598cedb2-fd53-458b-9047-4d712661cbe4@github.com> References: <0e65HV5RscFZN_q4JGzXA7k5jlT7gw7klerMqbfz4GU=.598cedb2-fd53-458b-9047-4d712661cbe4@github.com> Message-ID: On Tue, 25 Nov 2025 23:32:40 GMT, Coleen Phillimore wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> keep preexisting rebind order for mount > > src/hotspot/share/runtime/javaThread.cpp line 1152: > >> 1150: bool JavaThread::is_in_VTMS_transition() const { >> 1151: return AtomicAccess::load(&_is_in_VTMS_transition); >> 1152: } > > Is the JavaThread version always the same as the java_lang_Thread::is_in_VTMS_transition(threadOop()) value? 
Why is there the same flag with the same name in both the Java class and C++ JavaThread? Might be an efficient cache, so something should say that (if true). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561962218 From coleenp at openjdk.org Wed Nov 26 00:01:03 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 26 Nov 2025 00:01:03 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: <6WDKeD8iQTXDPhR0ohPvA6KVufW0uXHBmyZ5oOWfYWI=.44d63e7f-fbdb-47b6-8b36-f0d0cb35fb91@github.com> References: <6WDKeD8iQTXDPhR0ohPvA6KVufW0uXHBmyZ5oOWfYWI=.44d63e7f-fbdb-47b6-8b36-f0d0cb35fb91@github.com> Message-ID: On Sat, 22 Nov 2025 08:43:07 GMT, Serguei Spitsyn wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename VM methods for endFirstTransition/startFinalTransition > > src/hotspot/share/runtime/mountUnmountDisabler.cpp line 126: > >> 124: || global_start_transition_disable_count() > base_disable_count >> 125: JVMTI_ONLY(|| (JvmtiVTSuspender::is_vthread_suspended(java_lang_Thread::thread_id(vthread)) || thread->is_suspended())); >> 126: } > > I like this approach with the JVMTIStartTransition and JVMTIEndTransition helper classes. It is a nice way to decouple the JVMTI part of the protocol. Introducing the `is_start_transition_disabled()` function was also long desired. Also, I like the functions `start_transition()` and `end_transition()` became pretty simple and clean! This is the function that needs a comment why you're testing all these things (and why base_disable_count is one for JVMTI). It's nice as a function that tests all the different values. 
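For readers who want the shape of the Dekker-style handshake from the PR description quoted earlier in this thread, the following is a minimal sketch and not the HotSpot implementation. It assumes a single flag and a single counter, uses `std::atomic` with the default sequentially consistent ordering in place of HotSpot's own atomics and fences, and every name in it (`TransitionState`, `try_start_transition`, `try_disable`, and so on) is hypothetical. The only point it tries to show is that each side writes its own variable before reading the other side's.

```cpp
#include <cassert>
#include <atomic>

// Hypothetical state: one "in transition" flag and one disable counter.
// The real code has per-vthread and global counters; this sketch folds
// them into one to keep the ordering visible.
struct TransitionState {
  std::atomic<bool> in_transition{false};
  std::atomic<int>  disable_count{0};
};

// Virtual thread side: publish "in transition" first, then read the
// counter. Returns true if the fast path may proceed; otherwise the flag
// is cleared again so the caller can fall back to a slow path.
bool try_start_transition(TransitionState& s) {
  s.in_transition.store(true);          // seq_cst store
  if (s.disable_count.load() > 0) {     // seq_cst load after the store
    s.in_transition.store(false);
    return false;
  }
  return true;
}

void end_transition(TransitionState& s) {
  s.in_transition.store(false);
}

// Disabler side: bump the counter first, then read the flag. Returns
// true if no transition is in progress; otherwise the caller has to wait
// for the transition to end (waiting is left out of this sketch).
bool try_disable(TransitionState& s) {
  s.disable_count.fetch_add(1);         // seq_cst read-modify-write
  return !s.in_transition.load();       // seq_cst load after the increment
}

void enable(TransitionState& s) {
  s.disable_count.fetch_sub(1);
}
```

Because both sides store before they load, and all four operations are sequentially consistent, at least one of the two threads observes the other's write. Under these assumptions a transition and a disable request cannot both take their fast path at the same time.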
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561977191 From coleenp at openjdk.org Wed Nov 26 00:01:06 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 26 Nov 2025 00:01:06 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: References: Message-ID: <6kxaoFZTU2CYGKZpONDliyxGikpxbLMaxUtuqENnlq4=.4e48b44a-522f-4568-b4da-96b0184e5afc@github.com> On Tue, 25 Nov 2025 19:50:06 GMT, Patricio Chilano Mateo wrote: >> src/hotspot/share/runtime/mountUnmountDisabler.cpp line 307: >> >>> 305: // Block while some mount/unmount transitions are in progress. >>> 306: // Debug version fails and prints diagnostic information. >>> 307: for (JavaThreadIteratorWithHandle jtiwh; JavaThread *jt = jtiwh.next(); ) { >> >> This looks very odd, having an assignment in the loop condition check and no actual loop-update expression. > > Yes, from what I see this same construct is used in many places. Seems this is valid because a pointer used in a boolean context evaluates to false if nullptr and true if non-null. :) This could be a simple cleanup of all these occurrences later. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2561991623 From sparasa at openjdk.org Wed Nov 26 00:20:16 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Wed, 26 Nov 2025 00:20:16 GMT Subject: RFR: 8349452: Fix performance regression for Arrays.fill() with AVX512 [v9] In-Reply-To: References: Message-ID: > The goal of this PR is to fix the performance regression in Arrays.fill() x86 stubs caused by masked AVX stores. The fix is to replace the masked AVX stores with store instructions without masks (i.e. unmasked stores). `fill32_masked()` and `fill64_masked()` stubs are replaced with `fill32_unmasked()` and `fill64_unmasked()` respectively. 
> > To speedup unmasked stores, array fills for sizes < 64 bytes are broken down into sequences of 32B, 16B, 8B, 4B, 2B and 1B stores, depending on the size. > > > ### **Performance comparison for byte array fills in a loop for 1 million times** > > > UseAVX=3 ByteArray Size | +OptimizeFill (Masked store stub) [secs] | -OptimizeFill (No stub) [secs] | --->This PR: +OptimizeFill (Unmasked store stub) [secs] > -- | -- | -- | -- > 1 | 0.46 | 0.14 | 0.185 > 2 | 0.46 | 0.16 | 0.195 > 3 | 0.46 | 0.176 | 0.199 > 4 | 0.46 | 0.244 | 0.207 > 5 | 0.46 | 0.29 | 0.32 > 10 | 0.46 | 0.58 | 0.303 > 15 | 0.46 | 0.42 | 0.271 > 16 | 0.46 | 0.46 | 0.32 > 17 | 0.21 | 0.5 | 0.299 > 20 | 0.21 | 0.37 | 0.299 > 25 | 0.21 | 0.59 | 0.282 > 31 | 0.21 | 0.53 | 0.273 > 32 | 0.21 | 0.58 | 0.199 > 35 | 0.5 | 0.77 | 0.259 > 40 | 0.5 | 0.61 | 0.33 > 45 | 0.5 | 0.52 | 0.281 > 48 | 0.5 | 0.66 | 0.32 > 49 | 0.22 | 0.69 | 0.3 > 50 | 0.22 | 0.78 | 0.3 > 55 | 0.22 | 0.67 | 0.292 > 60 | 0.22 | 0.67 | 0.3293 > 64 | 0.22 | 0.82 | 0.23 > 70 | 0.51 | 1.1 | 0.34 > 80 | 0.49 | 0.89 | 0.365 > 90 | 0.225 | 0.68 | 0.33 > 100 | 0.54 | 1.09 | 0.347 > 110 | 0.6 | 0.98 | 0.36 > 120 | 0.26 | 0.75 | 0.386 > 128 | 0.266 | 1.1 | 0.289 Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: fix missing array length updates for size=1 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28442/files - new: https://git.openjdk.org/jdk/pull/28442/files/b047ac84..d3724b88 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28442&range=07-08 Stats: 5 lines in 1 file changed: 2 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28442.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28442/head:pull/28442 PR: https://git.openjdk.org/jdk/pull/28442 From dhanalla at openjdk.org Wed Nov 26 00:54:53 2025 From: dhanalla at openjdk.org (Dhamoder Nalla) Date: Wed, 26 Nov 2025 
00:54:53 GMT Subject: RFR: 8308776: [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 [v2] In-Reply-To: <9s7p49-bYB_amcD0q2XuEpeNPy4Ud1p38pE-UHEyB7c=.b9ef99f0-2381-4a83-856a-bc0dd273f4f1@github.com> References: <9s7p49-bYB_amcD0q2XuEpeNPy4Ud1p38pE-UHEyB7c=.b9ef99f0-2381-4a83-856a-bc0dd273f4f1@github.com> Message-ID: On Tue, 18 Nov 2025 20:55:12 GMT, Emanuel Peter wrote: > Drive-by, cannot promise a full review. But I'm interested ;) > > Mostly, I have questions about testing. Are there already tests for accuracy somewhere? > > Do you have any benchmark results to support this PR? It would be good if we had a way to prove that performance is good for all sorts of inputs. I suppose we don't have any loops here, so we should just make sure to benchmark cases so that all possible paths of the intrinsic are covered, right? Thanks @eme64, Updated the PR description with MathBench.logDouble results. > test/jdk/java/lang/Math/TestLogMonotonicity.java line 61: > >> 59: // Powers of two 2^1 .. 2^16 >> 60: for (int i = 1; i <= 16; i++) { >> 61: list.add(Math.pow(2.0, i)); > > It seems you now only cover powers of 2, right? Is this sufficient? I don't know what other tests already exist, so maybe this is already covered elsewhere? Updated the existing LogTests.java with similar monotonicity test in Log1pTests.java. 
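As an illustration of the concern raised in this review about covering only powers of two, a monotonicity check can instead walk over adjacent floating-point values. The sketch below is written in C++ rather than Java and is not taken from `LogTests.java` or `Log1pTests.java`. The helper name `non_decreasing_on_successive` and its parameters are assumptions made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: returns true if f never decreases across `steps`
// adjacent doubles, starting at `start` and moving up one representable
// value at a time.
template <typename F>
bool non_decreasing_on_successive(F f, double start, int steps) {
  double x = start;
  double prev = f(x);
  for (int i = 0; i < steps; i++) {
    // Next representable double above x.
    x = std::nextafter(x, std::numeric_limits<double>::infinity());
    double cur = f(x);
    if (cur < prev) {
      return false;  // f went down between two neighbouring inputs
    }
    prev = cur;
  }
  return true;
}
```

A logarithm could be checked by passing something like `[](double v) { return std::log(v); }` together with a starting point such as `std::numeric_limits<double>::denorm_min()` or a value just below a table boundary of the implementation under test.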
------------- PR Comment: https://git.openjdk.org/jdk/pull/28306#issuecomment-3578261545 PR Review Comment: https://git.openjdk.org/jdk/pull/28306#discussion_r2562201697 From dhanalla at openjdk.org Wed Nov 26 00:54:55 2025 From: dhanalla at openjdk.org (Dhamoder Nalla) Date: Wed, 26 Nov 2025 00:54:55 GMT Subject: RFR: 8308776: [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 [v2] In-Reply-To: References: <9s7p49-bYB_amcD0q2XuEpeNPy4Ud1p38pE-UHEyB7c=.b9ef99f0-2381-4a83-856a-bc0dd273f4f1@github.com> Message-ID: On Tue, 18 Nov 2025 20:51:57 GMT, Emanuel Peter wrote: >> test/jdk/java/lang/Math/TestLogMinValue.java line 28: >> >>> 26: * @bug 8308776 >>> 27: * @build Tests >>> 28: * @summary Compare Math.log and StrictMath.log for Double.MIN_VALUE (denormal smallest positive) to ensure consistency. >> >> Are there tests that check for consistency of the other values? > > Do we have tests that already check for sufficient accuracy? Updated the existing LogTests with these new test scenarios. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28306#discussion_r2562194013 From dhanalla at openjdk.org Wed Nov 26 00:54:57 2025 From: dhanalla at openjdk.org (Dhamoder Nalla) Date: Wed, 26 Nov 2025 00:54:57 GMT Subject: RFR: 8308776: [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 [v2] In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 21:18:55 GMT, Joe Darcy wrote: >> Dhamoder Nalla has updated the pull request incrementally with one additional commit since the last revision: >> >> [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 > > test/jdk/java/lang/Math/TestLogMonotonicity.java line 29: > >> 27: * @run main TestLogMonotonicity >> 28: */ >> 29: public class TestLogMonotonicity { > > So the test is checking for monotonicity over value that are 2X the previous value? > > That is a very weak test. > > In other math library regression tests we test for monotonicity on successive values. 
Thanks @jddarcy, Updated the existing LogTests.java with similar monotonicity test in Log1pTests.java. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28306#discussion_r2562205428 From dholmes at openjdk.org Wed Nov 26 00:57:59 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 26 Nov 2025 00:57:59 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v7] In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 20:12:26 GMT, Volodymyr Paprotski wrote: >> - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline >> - `SignatureBench.MLDSA` is 1.2x-2.2x faster >> - Note: there is no AVX2-SHA3 intrinsics yet (Being reviewed https://github.com/vpaprotsk/jdk/pull/7) >> - AVX512 intrinsic improvements are 1.24x-1.5x faster then current version >> - `SignatureBench.MLDSA` is upto 5% faster, never slower >> >> Note on intrinsic: >> - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of NTT loop and stack spill. 
>> - Code was refactored to allow reuse of same assembler (as possible) for AVX512 and AVX2 >> >> Tests and benchmarks: >> - Added a fuzz test to ensure Java and intrinsic produces exactly same result >> - Added benchmark to measure the performance of intrinsic itself >> >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" >> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1" >> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1" > > Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: > > comments from Jatin The new test can only run on x86 but it is not restricted to x86, thus it fails when run on Aarch64. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3578268951 From fandreuzzi at openjdk.org Wed Nov 26 01:22:03 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Wed, 26 Nov 2025 01:22:03 GMT Subject: Integrated: 8372324: jdk/jfr/event/gc/detailed/TestStringDeduplicationEvent.java#Parallel failed In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 12:49:44 GMT, Francesco Andreuzzi wrote: > The assertion is used to validate a precondition for the test. As long as the deduplication happens inside the `RecordingStream` scope, a `StringDeduplication ` event will be recorded. Thus the assertion is not needed and can be removed. This pull request has now been integrated. 
Changeset: d9b6c314 Author: Francesco Andreuzzi Committer: Erik Gahlin URL: https://git.openjdk.org/jdk/commit/d9b6c314872ee626c725d119023179ae93639f54 Stats: 22 lines in 1 file changed: 0 ins; 18 del; 4 mod 8372324: jdk/jfr/event/gc/detailed/TestStringDeduplicationEvent.java#Parallel failed Reviewed-by: egahlin, mbaesken, ayang ------------- PR: https://git.openjdk.org/jdk/pull/28467 From lmesnik at openjdk.org Wed Nov 26 04:48:17 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 26 Nov 2025 04:48:17 GMT Subject: RFR: 8372552: unhandled oop in the JvmtiEventController::set_user_enabled Message-ID: The issue reproduced by running test vmTestbase/nsk/jvmti/AttachOnDemand/attach022[1]/TestDescription.java with `-XX:+CheckUnhandledOops`. No need to flush object free events during VM init. So it is fine to move it after handling the oop. Testing with tier1-5. ------------- Commit messages: - fixed handling Changes: https://git.openjdk.org/jdk/pull/28500/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28500&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372552 Stats: 10 lines in 1 file changed: 5 ins; 4 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28500.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28500/head:pull/28500 PR: https://git.openjdk.org/jdk/pull/28500 From sspitsyn at openjdk.org Wed Nov 26 05:05:35 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 26 Nov 2025 05:05:35 GMT Subject: RFR: 6960970: Debugger very slow during stepping [v5] In-Reply-To: References: Message-ID: <5i03hZ66IqItuQfklwFP7CxI-LiKvLJ8iOTSLoIhrbo=.02aa02b4-99f3-49f1-976b-2c89781e9199@github.com> > This change fixes a long standing performance issue related to the debugger single stepping that is using JVMTI `FramePop` events as a part of step over handling. 
The performance issue is that the target thread continues its execution in very slow `interp-only` mode in a context of frame marked for `FramePop` notification with the JVMTI `NotifyFramePop`. It includes other method calls recursively upon a return from the frame. > > This fix is to avoid enforcing the `interp-only` execution mode for threads when `FramePop` events are enabled with the JVMTI `SetEventNotificationMode()`. Instead, the target frame is deoptimized and kept interpreted by disabling `OSR` optimization in the function `InterpreterRuntime::frequency_counter_overflow_inner()`. (Big thanks to @fisk for this suggestion!) Additionally, some tweaks are applied in several places where `java_thread->is_interp_only_mode()` is checked. > Other details will be provided in the first PR comment. > Filing a CSR for this update is being considered, as `FramePop` events no longer enforce the `interp-only` mode for a target thread, which might break some expectations (the behavior has changed).
> > Testing: > - test `serviceability/jvmti/vthread/ThreadStateTest` was updated to provide some extra test coverage > - submitted mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: renamed two functions; FRAME_POP_BIT removed from INTERP_EVENT_BITS ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28407/files - new: https://git.openjdk.org/jdk/pull/28407/files/1224d1e2..d82e4efe Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28407&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28407&range=03-04 Stats: 10 lines in 4 files changed: 0 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/28407.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28407/head:pull/28407 PR: https://git.openjdk.org/jdk/pull/28407 From sspitsyn at openjdk.org Wed Nov 26 05:30:49 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 26 Nov 2025 05:30:49 GMT Subject: RFR: 6960970: Debugger very slow during stepping [v5] In-Reply-To: <5i03hZ66IqItuQfklwFP7CxI-LiKvLJ8iOTSLoIhrbo=.02aa02b4-99f3-49f1-976b-2c89781e9199@github.com> References: <5i03hZ66IqItuQfklwFP7CxI-LiKvLJ8iOTSLoIhrbo=.02aa02b4-99f3-49f1-976b-2c89781e9199@github.com> Message-ID: <2fv251w2Pu7JpU4LNTMbqqNKVf2RL7q-_bJv7eiWMOU=.3f57aac0-58dc-4f1d-9e65-d13b448e58b8@github.com> On Wed, 26 Nov 2025 05:05:35 GMT, Serguei Spitsyn wrote: >> This change fixes a long standing performance issue related to the debugger single stepping that is using JVMTI `FramePop` events as a part of step over handling. The performance issue is that the target thread continues its execution in very slow `interp-only` mode in a context of frame marked for `FramePop` notification with the JVMTI `NotifyFramePop`. It includes other method calls recursively upon a return from the frame. 
>> >> This fix is to avoid enforcing the `interp-only` execution mode for threads when `FramePop` events are enabled with the JVMTI `SetEventNotificationMode()`. Instead, the target frame is deoptimized and kept interpreted by disabling `OSR` optimization in the function `InterpreterRuntime::frequency_counter_overflow_inner()`. (Big thanks to @fisk for this suggestion!) Additionally, some tweaks are applied in several places where `java_thread->is_interp_only_mode()` is checked. >> Other details will be provided in the first PR comment. >> Filing a CSR for this update is being considered, as `FramePop` events no longer enforce the `interp-only` mode for a target thread, which might break some expectations (the behavior has changed). >> >> Testing: >> - test `serviceability/jvmti/vthread/ThreadStateTest` was updated to provide some extra test coverage >> - submitted mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: renamed two functions; FRAME_POP_BIT removed from INTERP_EVENT_BITS

We walked through the changes with @alexmenkov, @plummercj and @lmesnik and identified several issues:
- New function `JvmtiExport::has_frame_pop_for_top_frame()` needs a minor performance tweak
- The changes in `post_method_exit()` are not as precise and correct as needed:
  - the call to `get_jvmti_thread_state()` may create a `JvmtiThreadState` object when it is not needed
  - there can be some unreasonable performance overhead
  - there is a concern about possibly incorrect handling of `cur_stack_depth` (need to double check)
- Need to remove the `FRAME_POP_BIT` from the `INTERP_EVENT_BITS` bit mask (**DONE**)
- Decided to rename a couple of new functions (**DONE**):
  - s/`check_and_clear_vthread_pending_deopts`/`clear_vthread_pending_deopts`/g
  - s/`get_vthread_pending_deopts`/`vthread_pending_deopts`/g
- Need some additional test coverage:
  - for multiple
`FramePop` requests handled by `process_vthread_pending_deopts()`
  - a test showing a performance improvement with this PR update

------------- PR Comment: https://git.openjdk.org/jdk/pull/28407#issuecomment-3579240863 From amitkumar at openjdk.org Wed Nov 26 07:22:48 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Wed, 26 Nov 2025 07:22:48 GMT Subject: RFR: 8352567: [s390x] disable JFR tests requiring JFR stubs In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 20:01:19 GMT, Vladimir Petko wrote: > I did not do `@requires vm.continuations` since you can enable the feature on s390x, but since it is not implemented it will be crashing in the tests. Yeah that makes sense. > I will update PR or raise an issue if it is a separate reason I guess updating this PR and issue is good enough. We can even make it subtask of [JDK-8286300](https://bugs.openjdk.org/browse/JDK-8286300), that would make things easier to keep track of disabled testcases. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28444#issuecomment-3579780224 From dholmes at openjdk.org Wed Nov 26 07:32:52 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 26 Nov 2025 07:32:52 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v3] In-Reply-To: <-MtOiSQVDvlQD7sbfeBiqF00_ZN9_aNt3zd2LZLljyo=.eeabb717-359d-4420-89aa-ed1b305beee5@github.com> References: <-MtOiSQVDvlQD7sbfeBiqF00_ZN9_aNt3zd2LZLljyo=.eeabb717-359d-4420-89aa-ed1b305beee5@github.com> Message-ID: On Tue, 25 Nov 2025 19:45:08 GMT, Patricio Chilano Mateo wrote: >> src/hotspot/share/classfile/javaClasses.cpp line 1764: >> >>> 1762: jint* addr = java_thread->field_addr(_VTMS_transition_disable_count_offset); >>> 1763: int val = AtomicAccess::load(addr); >>> 1764: AtomicAccess::store(addr, val - 1); >> >> Suggestion: >> >> AtomicAccess::dec(addr); > > I'd prefer to leave it as a plain store to avoid the unnecessary extra fence. But it isn't then an atomic update.
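
To illustrate the distinction being discussed, here is a minimal standalone sketch. It uses `std::atomic` rather than HotSpot's `AtomicAccess`, and every function name in it is hypothetical. It simulates two writers that both load before either stores, which shows how a load followed by a store can lose a decrement that a single read-modify-write would keep. Whether that matters for the field in question depends on whether concurrent writers can exist, which the exchange above does not settle.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical model, not HotSpot code: decrement written as a separate
// load and store. Each access is atomic on its own, but the pair is not.
inline int interleaved_load_store(int start) {
  std::atomic<int> counter{start};
  int a = counter.load(std::memory_order_relaxed);    // writer A reads
  int b = counter.load(std::memory_order_relaxed);    // writer B reads the same value
  counter.store(a - 1, std::memory_order_relaxed);    // A writes start - 1
  counter.store(b - 1, std::memory_order_relaxed);    // B overwrites with start - 1
  return counter.load(std::memory_order_relaxed);     // one decrement was lost
}

// The same two decrements as single atomic read-modify-write operations.
inline int two_rmw_decrements(int start) {
  assert(start >= 2);
  std::atomic<int> counter{start};
  counter.fetch_sub(1);  // default ordering is seq_cst, which is where a fence cost may come from
  counter.fetch_sub(1);
  return counter.load();
}
```

With a starting value of 2, the first function ends at 1 and the second at 0. If only one thread can ever write the counter, the load/store form loses nothing and avoids the stronger ordering, which appears to be the trade-off under discussion.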
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2563699480 From dholmes at openjdk.org Wed Nov 26 07:35:49 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 26 Nov 2025 07:35:49 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4] In-Reply-To: <6kxaoFZTU2CYGKZpONDliyxGikpxbLMaxUtuqENnlq4=.4e48b44a-522f-4568-b4da-96b0184e5afc@github.com> References: <6kxaoFZTU2CYGKZpONDliyxGikpxbLMaxUtuqENnlq4=.4e48b44a-522f-4568-b4da-96b0184e5afc@github.com> Message-ID: On Tue, 25 Nov 2025 23:53:40 GMT, Coleen Phillimore wrote: >> Yes, from what I see this same construct is used in many places. Seems this is valid because a pointer used in a boolean context evaluates to false if nullptr and true if non-null. :) > > This could be a simple cleanup of all these occurrences later. Yes this is terribly obscure (doing the assignment in the loop condition check - surprised that is even allowed) and also violates the style-guide in relation to implicit booleans. But frankly it is an awful use of a for-loop in my opinion. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2563714065 From aboldtch at openjdk.org Wed Nov 26 07:57:04 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 26 Nov 2025 07:57:04 GMT Subject: RFR: 8372380: Make hs_err reporting more robust for unattached threads [v3] In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 02:34:39 GMT, David Holmes wrote: >> There were a number of places where the code called by hs_err reporting assumed/required an attached current thread. These would then cause secondary failures during hs_err reporting. Using a simple example of an unattached thread causing a SEGV I went through and eliminated all the problems I encountered. In some cases the thread dependency was obvious and easy to address directly, but in others we just skip that section at the top-level. 
>> >> Testing: >> - manual inspection of hs_err file, for different GCs >> - tiers 1-3 sanity >> >> Thanks > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Fix include order lgtm. ------------- Marked as reviewed by aboldtch (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28470#pullrequestreview-3509494275 From shade at openjdk.org Wed Nov 26 08:03:56 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 26 Nov 2025 08:03:56 GMT Subject: RFR: 8372285: G1: Micro-optimize x86 barrier code [v6] In-Reply-To: <6LdV5NiSfkvLTkYDsgV2jFyw43VFzOBzaGj2Enmgrnc=.b0ddbe6b-cd41-4913-867f-dec57ed79547@github.com> References: <6LdV5NiSfkvLTkYDsgV2jFyw43VFzOBzaGj2Enmgrnc=.b0ddbe6b-cd41-4913-867f-dec57ed79547@github.com> Message-ID: <43eyCtsx-zK2TdNOZluSEsNPSQCoDzVZS4JfaRYPLLI=.56c0fc37-8fdc-4bf6-9275-068b7bde12cc@github.com> On Mon, 24 Nov 2025 09:48:52 GMT, Aleksey Shipilev wrote: >> We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. >> >> The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `tier1` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Indenting was still off I am guessing Vladimir is already in holiday mode. I believe I have addressed his comments, yielding to Vladimir's suggestions. 
Since there are other approvals, and the testing looks green, I will integrate shortly to unblock other work. Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/28446#issuecomment-3579960651 From stefank at openjdk.org Wed Nov 26 08:32:58 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 26 Nov 2025 08:32:58 GMT Subject: RFR: 8372513: Shenandoah: ShenandoahMaxRegionSize can produce an unaligned heap alignment In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 14:19:53 GMT, Stefan Karlsson wrote: > While rewriting some of the heap size initialization code we hit a corner-case where the setting of `ShenandoahMaxRegionSize` to something that isn't a power-of-2 will hit an assert in `max_heap_for_compressed_oops`. > > When running with: > > java -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahMaxRegionSize=33m -version > > > The following code: > > size_t displacement_due_to_null_page = align_up(os::vm_page_size(), > _conservative_max_heap_alignment) > > triggers: > > # assert(is_power_of_2(alignment)) failed: must be a power of 2: 34603008 > > because `_conservative_max_heap_alignment` is not a power-of-2. > > This happens because Shenandoah's `conservative_max_heap_alignment()` > returns a potentially unaligned `ShenandoahMaxRegionSize` value. > > > size_t ShenandoahArguments::conservative_max_heap_alignment() { > size_t align = ShenandoahMaxRegionSize; > if (UseLargePages) { > align = MAX2(align, os::large_page_size()); > } > return align; > } > > > I propose a small fix to adjust `align` to be a power-of-2. I've also added an earlier assert about this in `set_conservative_max_heap_alignment` and added an additional test-case in TestRegionSizeArgs.java > > WDYT, is this an OK fix for this corner-case? Thanks, Aleksey! What do you think about the issue with an extreme user-specified value as described in: https://github.com/openjdk/jdk/pull/28492#issuecomment-3576326172 Do you want that change or not? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28492#issuecomment-3580098958 From duke at openjdk.org Wed Nov 26 08:36:09 2025 From: duke at openjdk.org (Zihao Lin) Date: Wed, 26 Nov 2025 08:36:09 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v8] In-Reply-To: References: Message-ID: On Wed, 5 Nov 2025 13:20:26 GMT, Roland Westrelin wrote: >> Zihao Lin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains nine additional commits since the last revision: >> >> - fix assert >> - add more assert >> - rid of access.addr().type() >> - Merge branch 'openjdk:master' into 8344116 >> - Merge branch 'openjdk:master' into 8344116 >> - Merge branch 'openjdk:master' into 8344116 >> - Fix build >> - Fix test failed >> - 8344116: C2: remove slice parameter from LoadNode::make > > Can we remove `C2AccessValuePtr` entirely and use: > > Node* _addr; > > where, currently, there's: > > C2AccessValuePtr& _addr; > > ? Hi @rwestrel , I removed C2AccessValuePtr, Could you please take a look, thank you. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24258#issuecomment-3580115736 From duke at openjdk.org Wed Nov 26 08:36:12 2025 From: duke at openjdk.org (Zihao Lin) Date: Wed, 26 Nov 2025 08:36:12 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v8] In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 16:30:09 GMT, Zihao Lin wrote: >> src/hotspot/share/opto/callnode.cpp line 1740: >> >>> 1738: Node* klass_node = in(AllocateNode::KlassNode); >>> 1739: Node* proto_adr = phase->transform(new AddPNode(klass_node, klass_node, phase->MakeConX(in_bytes(Klass::prototype_header_offset())))); >>> 1740: mark_node = LoadNode::make(*phase, control, mem, proto_adr, TypeX_X, TypeX_X->basic_type(), MemNode::unordered); >> >> We could assert that C->get_alias_index(kit->type(card_adr) == Compile::AliasIdxRaw > > Hi, I give it a try, but it failed pass the test. Is it possible the original version is wrong? > The mark word will not be `TypeRawPtr::BOTTOM`, it should equal to Klass slice index. One dump is ` 1368 AddP === _ 196 196 1367 [[ ]] Klass:precise java/util/LinkedHashMap$Entry: 0x0000000918349ca0 (java/util/Map$Entry):Constant:exact+168 *` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2563948581 From shade at openjdk.org Wed Nov 26 08:38:02 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 26 Nov 2025 08:38:02 GMT Subject: RFR: 8372513: Shenandoah: ShenandoahMaxRegionSize can produce an unaligned heap alignment In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 14:19:53 GMT, Stefan Karlsson wrote: > While rewriting some of the heap size initialization code we hit a corner-case where the setting of `ShenandoahMaxRegionSize` to something that isn't a power-of-2 will hit an assert in `max_heap_for_compressed_oops`. 
> > When running with: > > java -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahMaxRegionSize=33m -version > > > The following code: > > size_t displacement_due_to_null_page = align_up(os::vm_page_size(), > _conservative_max_heap_alignment) > > triggers: > > # assert(is_power_of_2(alignment)) failed: must be a power of 2: 34603008 > > because `_conservative_max_heap_alignment` is not a power-of-2. > > This happens because Shenandoah's `conservative_max_heap_alignment()` > returns a potentially unaligned `ShenandoahMaxRegionSize` value. > > > size_t ShenandoahArguments::conservative_max_heap_alignment() { > size_t align = ShenandoahMaxRegionSize; > if (UseLargePages) { > align = MAX2(align, os::large_page_size()); > } > return align; > } > > > I propose a small fix to adjust `align` to be a power-of-2. I've also added an earlier assert about this in `set_conservative_max_heap_alignment` and added an additional test-case in TestRegionSizeArgs.java > > WDYT, is this an OK fix for this corner-case? > What do you think about the issue with an extreme user-specified value as described in: [#28492 (comment)](https://github.com/openjdk/jdk/pull/28492#issuecomment-3576326172) > Do you want that change or not? Feel free to submit a bug and let Shenandoah folks handle it. This PR stands on its own, and I think it unblocks some of the pending work, so ship it. 
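
As an illustration of the corner case described in the quoted PR text, the following is a minimal standalone sketch. The helper names are hypothetical and are not HotSpot's own utilities, and the rounding direction (up) is an assumption, since the thread only says the value is adjusted to be a power of 2. It shows why a mask-based align-up needs a power-of-2 alignment and what rounding `33m` up would give.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers for illustration only.
inline bool is_pow2(std::size_t v) {
  return v != 0 && (v & (v - 1)) == 0;
}

// Smallest power of 2 that is >= v.
// Assumes v is small enough that the shift cannot overflow.
inline std::size_t round_up_pow2(std::size_t v) {
  std::size_t p = 1;
  while (p < v) {
    p <<= 1;
  }
  return p;
}

// Mask-based align-up; only meaningful when alignment is a power of 2,
// hence the assert, which mirrors the failure quoted above.
inline std::size_t align_up_pow2(std::size_t value, std::size_t alignment) {
  assert(is_pow2(alignment));
  return (value + alignment - 1) & ~(alignment - 1);
}
```

Here `33m` is 34603008 bytes, which is not a power of 2; rounding it up gives 67108864 (`64m`), which the mask-based helper accepts. The overflow assumption in `round_up_pow2` would need care for very large inputs.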
------------- PR Comment: https://git.openjdk.org/jdk/pull/28492#issuecomment-3580126200 From roland at openjdk.org Wed Nov 26 08:38:05 2025 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 26 Nov 2025 08:38:05 GMT Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v3] In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 08:30:18 GMT, Emanuel Peter wrote: >> Roland Westrelin has updated the pull request incrementally with three additional commits since the last revision: >> >> - review >> - infinite loop in gvn fix >> - renaming > > @rwestrel Sorry I dropped the review on this one for a long time :/ > > I left quite a few comments. But on the whole I'm really happy with the direction you are taking. It's getting much clearer. I would still see some more clear explanations/comments. That way, we can make our previously implicit assumptions even more explicit :) @eme64 updated change should address your comments ------------- PR Comment: https://git.openjdk.org/jdk/pull/24575#issuecomment-3580124357 From aboldtch at openjdk.org Wed Nov 26 09:06:59 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 26 Nov 2025 09:06:59 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v6] In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 17:37:38 GMT, Evgeny Astigeevich wrote: >> Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1. >> >> Neoverse-N1 implementations mitigate erratum 1542419 with a workaround: >> - Disable coherent icache. >> - Trap IC IVAU instructions. >> - Execute: >> - `tlbi vae3is, xzr` >> - `dsb sy` >> >> `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address). 
It waits for all memory accesses using in-scope old translation information to complete before it is considered complete. >> >> As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests: >> >> "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround." >> >> This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions. >> >> Changes include: >> >> * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization. >> * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address. >> * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures. >> * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk. 
>> >> Benchmarking results: Neoverse-N1 r3p1 (Graviton 2) >> >> - Baseline >> >> $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1... > > Evgeny Astigeevich has updated the pull request incrementally with two additional commits since the last revision: > > - Remove redundant include > - Move ICacheInvalidationContext::pd_ to icache_linux_aarch64

Some style comments.

src/hotspot/os_cpu/linux_aarch64/icache_linux_aarch64.hpp line 77:

> 75: assert(((cache_info >> CTR_IDC_SHIFT) & 0x1) != 0x0, "Expect CTR_EL0.IDC to be enabled");
> 76: assert(((cache_info >> CTR_DIC_SHIFT) & 0x1) == 0x0, "Expect CTR_EL0.DIC to be disabled");
> 77: #endif

Not sure if this should be `#ifndef PRODUCT` or `#ifdef ASSERT`. But regardless, `#ifndef PRODUCT` should be paired with `guarantees` and `#ifdef ASSERT` should be paired with `asserts`.

src/hotspot/share/gc/z/zGeneration.cpp line 1439:

> 1437: if (_bs_nm->is_armed(nm)) {
> 1438: {
> 1439: ICacheInvalidationContext icic;

Style. Suggestion: ICacheInvalidationContext icic;

src/hotspot/share/gc/z/zMark.cpp line 723:

> 721: if (_bs_nm->is_armed(nm)) {
> 722: {
> 723: ICacheInvalidationContext icic;

Style. Suggestion: ICacheInvalidationContext icic;

src/hotspot/share/gc/z/zNMethod.cpp line 375:

> 373:
> 374: {
> 375: ICacheInvalidationContext icic;

Style. Suggestion: ICacheInvalidationContext icic;

src/hotspot/share/gc/z/zUnload.cpp line 85:

> 83: }
> 84: ZIsUnloadingOopClosure cl(nm);
> 85: ICacheInvalidationContext icic;

Style. Could you reorder these two lines too?

```c++
ICacheInvalidationContext icic;
ZIsUnloadingOopClosure cl(nm);
```

src/hotspot/share/runtime/icache.hpp line 77:

> 75: NONCOPYABLE(ICacheInvalidationContext);
> 76:
> 77: private:

Style.
Suggestion: private: NONCOPYABLE(ICacheInvalidationContext); ------------- PR Review: https://git.openjdk.org/jdk/pull/28328#pullrequestreview-3509739850 PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2564006359 PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2564012438 PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2564014015 PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2564015380 PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2564028679 PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2564038347 From adinn at openjdk.org Wed Nov 26 09:13:50 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Wed, 26 Nov 2025 09:13:50 GMT Subject: RFR: 8308776: [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 [v2] In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 22:23:36 GMT, Dhamoder Nalla wrote: >> This PR introduces an optimized AArch64 intrinsic for Math.log using reciprocal refinement and a table-driven polynomial. >> It improves throughput for double logarithms while preserving IEEE-754 corner case behavior (±0, subnormals, negatives, NaN). >> >> The micro-benchmark results from MathBench and StrictMathBench below show the performance improvement of Math.log: >> >> **Before change** >>
>> Benchmark | Mode | Cnt | Score | Error | Units
>> -- | -- | -- | -- | -- | --
>> MathBench.logDouble | thrpt | 10 | **15549.705** | ±357.439 | ops/ms
>> StrictMathBench.logDouble | thrpt | 10 | 219408.158 | ±16484.680 | ops/ms
>>
>> **After adding Math.log intrinsic**
>>
>> Benchmark | Mode | Cnt | Score | Error | Units
>> -- | -- | -- | -- | -- | --
>> MathBench.logDouble | thrpt | 10 | **300086.773** | ±6675.936 | ops/ms
>> StrictMathBench.logDouble | thrpt | 10 | 226521.817 | ±4038.975 | ops/ms
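> > Dhamoder Nalla has updated the pull request incrementally with one additional commit since the last revision: > > [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 8812: > 8810: address generate_dlog() { > 8811: __ align(CodeEntryAlignment); > 8812: StubCodeMark mark(this, "StubRoutines", "dlog"); This StubCodeMark needs to be declared with a StubId as argument. See other stub generators in this file or the equivalent code in `cpu/x86/stubGenerator_x86_64_log.cpp` for an example of what it should look like. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28306#discussion_r2564133696 From kbarrett at openjdk.org Wed Nov 26 09:24:46 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 26 Nov 2025 09:24:46 GMT Subject: RFR: 8372528: Unify atomic exchange and compare exchange In-Reply-To: References: Message-ID: <9WmhQ889qBk7gNvsBKMNESdi1A79MFh98EP-WMVH5tc=.d6e81a56-b9b9-46d7-89f7-8e0fdb1678e9@github.com> On Tue, 25 Nov 2025 18:28:19 GMT, Axel Boldt-Christmas wrote: > AtomicAccess::xchg is only required to support `4` bytes and `sizeof(intptr_t)` size. > This restriction added a lot of extra logic to the Atomic implementation because > we have a set of features we must support (including compare exchange) for `1`, `4` and `8` byte atomics on all platforms. We have some checks for unsupported `8` byte compare exchange (`VM_Version::supports_cx8()`), but the Atomic class does not try to handle these for generating its supported functions. On such a platform we would more than likely get a linking error. > > I propose we change requirement for exchange to `1`, `4` and `8` bytes to achieve parity with compare exchange. Initially by implementing exchange via the `AtomicAccess::XchgUsingCmpxch`. And have follow up RFEs for each applicable platform where we specialize `AtomicAccess::PlatformXchg<1>`. > > This enhancement both simplifies the Atomic implementation and provides exchange capabilities for types like `bool` and enums represented by a byte. > > _It is a little unclear how we deal with `VM_Version::supports_cx8()`. Its existence makes it impossible to use `compare_exchange` on `int64_t` in general code. Currently the `Atomic` implementation assumes that `exchange` can always be used on `8` byte integers (at least going by the gtest). Even though `AtomicAccess` only specifies `4` bytes and the platform size. This PR changes this to `1`, `4` and `8` bytes. But not sure if the previous behaviour / implicit requirements is an oversight a similar property to `VM_Version::supports_cx8()` should apply here for `exchange`._ > > * Testing > * Extended gtest / (no other users of Atomic byte with exchange exists. > * GHA > * Running Tier 1-5 on Oracle supported platforms `VM_Version::supports_cx8()` is vestigial, and is required to return true on all platforms: https://github.com/openjdk/jdk/blob/275cb9f28799081878e0a7c53ce1c0450f4e963e/src/hotspot/share/runtime/vm_version.cpp#L32 https://github.com/openjdk/jdk/blob/275cb9f28799081878e0a7c53ce1c0450f4e963e/src/hotspot/share/runtime/atomicAccess.hpp#L58-L64 See https://bugs.openjdk.org/browse/JDK-8318776 "Require `supports_cx8` to always be true" I'm not sure why we haven't nuked `supports_cx8()`; maybe because nobody has collected sufficient 'tuits. @dholmes-ora might remember. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28498#issuecomment-3580388565 From stefank at openjdk.org Wed Nov 26 09:24:48 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 26 Nov 2025 09:24:48 GMT Subject: RFR: 8372528: Unify atomic exchange and compare exchange In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 18:28:19 GMT, Axel Boldt-Christmas wrote: > AtomicAccess::xchg is only required to support `4` bytes and `sizeof(intptr_t)` size.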
> > This enhancement both simplifies the Atomic implementation and provides exchange capabilities for types like `bool` and enums represented by a byte. > > _It is a little unclear how we deal with `VM_Version::supports_cx8()`. Its existence makes it impossible to use `compare_exchange` on `int64_t` in general code. Currently the `Atomic` implementation assumes that `exchange` can always be used on `8` byte integers (at least going by the gtest). Even though `AtomicAccess` only specifies `4` bytes and the platform size. This PR changes this to `1`, `4` and `8` bytes. But not sure if the previous behaviour / implicit requirements is an oversight a similar property to `VM_Version::supports_cx8()` should apply here for `exchange`._ > > * Testing > * Extended gtest / (no other users of Atomic byte with exchange exists. > * GHA > * Running Tier 1-5 on Oracle supported platforms `VM_Version::supports_cx8()` is vestigial, and is required to return true on all platforms: https://github.com/openjdk/jdk/blob/275cb9f28799081878e0a7c53ce1c0450f4e963e/src/hotspot/share/runtime/vm_version.cpp#L32 https://github.com/openjdk/jdk/blob/275cb9f28799081878e0a7c53ce1c0450f4e963e/src/hotspot/share/runtime/atomicAccess.hpp#L58-L64 See https://bugs.openjdk.org/browse/JDK-8318776 "Require `supports_cx8` to always be true" I'm not sure why we haven't nuked `supports_cx8()`; maybe because nobody has collected sufficient 'tuits. @dholmes-ora might remember. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28498#issuecomment-3580388565 From stefank at openjdk.org Wed Nov 26 09:24:48 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 26 Nov 2025 09:24:48 GMT Subject: RFR: 8372528: Unify atomic exchange and compare exchange In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 18:28:19 GMT, Axel Boldt-Christmas wrote: > AtomicAccess::xchg is only required to support `4` bytes and `sizeof(intptr_t)` size. 
> This restriction added a lot of extra logic to the Atomic implementation because > we have a set of features we must support (including compare exchange) for `1`, `4` and `8` byte atomics on all platforms. We have some checks for unsupported `8` byte compare exchange (`VM_Version::supports_cx8()`), but the Atomic class does not try to handle these for generating its supported functions. On such a platform we would more than likely get a linking error. > > I propose we change requirement for exchange to `1`, `4` and `8` bytes to achieve parity with compare exchange. Initially by implementing exchange via the `AtomicAccess::XchgUsingCmpxch`. And have follow up RFEs for each applicable platform where we specialize `AtomicAccess::PlatformXchg<1>`. > > This enhancement both simplifies the Atomic implementation and provides exchange capabilities for types like `bool` and enums represented by a byte. > > _It is a little unclear how we deal with `VM_Version::supports_cx8()`. Its existence makes it impossible to use `compare_exchange` on `int64_t` in general code. Currently the `Atomic` implementation assumes that `exchange` can always be used on `8` byte integers (at least going by the gtest). Even though `AtomicAccess` only specifies `4` bytes and the platform size. This PR changes this to `1`, `4` and `8` bytes. But not sure if the previous behaviour / implicit requirements is an oversight a similar property to `VM_Version::supports_cx8()` should apply here for `exchange`._ > > * Testing > * Extended gtest / (no other users of Atomic byte with exchange exists. 
> * GHA > * Running Tier 1-5 on Oracle supported platforms test/hotspot/gtest/runtime/test_atomic.cpp line 211: > 209: } > 210: > 211: Suggestion: ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28498#discussion_r2564193291 From eosterlund at openjdk.org Wed Nov 26 09:30:07 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Wed, 26 Nov 2025 09:30:07 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v6] In-Reply-To: References: <85dBRXdwqMOffQvXGI9J_zhfLnwZ0LrY_Wj4w0nrpbM=.88de1041-c752-49aa-8ab2-600b92f8559d@github.com> Message-ID: On Tue, 25 Nov 2025 16:41:35 GMT, Evgeny Astigeevich wrote: >> Or you mean `IC IVAU`? > > I replaced the call of `ICache::invalidate_word()` with: > > asm volatile("dsb ish \n" > "ic ivau, xzr \n" > "isb \n" > : : : "memory"); > > > The code executed in `ICache::invalidate_word()` when all checks are done: > > dsb ish > ic ivau > dsb ish > isb > > > I use `xzr` in `ic ivau` because an address in it does not matter. The instruction is trapped and ignored. > I think we don't need the second `dsb` because we will have `dsb sy` in the trap handler. I don't know if we want to jeopardize the correctness of the JVM code based on the exact instructions that are *currently* used to mitigate this issue in the kernel. Eliding the trailing `dsb ish` because we know the kernel mitigation runs it seems unnecessarily fragile to me; if the kernel comes up with some smarter and cheaper way of mitigating this in the future, using some other magic incantation, then I don't want to have a correctness issue because of that implicit assumption. Is it noticeably expensive to run the trailing dsb again?
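
For readers following the deferred-invalidation idea from the PR description quoted earlier in this thread, here is a minimal standalone model of the batching pattern. All names are hypothetical and this is not the PR's `ICacheInvalidationContext`; the real class issues barrier and cache maintenance instructions, whereas this sketch only counts how many invalidations would be issued.

```cpp
#include <cassert>

// Hypothetical stand-in for the instruction cache. invalidate() models the
// expensive, trapped invalidation sequence by counting calls.
struct FakeICache {
  int invalidations = 0;
  void invalidate() { invalidations++; }
};

// Scoped context: patch sites only record that an invalidation is needed,
// and a single invalidation is performed when the scope ends.
class DeferredInvalidation {
  FakeICache& _icache;
  bool _pending = false;

 public:
  explicit DeferredInvalidation(FakeICache& icache) : _icache(icache) {}
  DeferredInvalidation(const DeferredInvalidation&) = delete;
  DeferredInvalidation& operator=(const DeferredInvalidation&) = delete;

  void request() { _pending = true; }  // record only, no invalidation yet

  ~DeferredInvalidation() {
    if (_pending) {
      _icache.invalidate();  // one invalidation for the whole batch
    }
  }
};

// Patches `patches` instructions inside one scope and returns how many
// invalidations the fake cache has seen so far.
inline int patch_many(FakeICache& icache, int patches) {
  assert(patches >= 0);
  {
    DeferredInvalidation ctx(icache);
    for (int i = 0; i < patches; i++) {
      // ... patch one instruction here ...
      ctx.request();
    }
  }  // destructor runs here
  return icache.invalidations;
}
```

In this model, a hundred patches in one scope cost one invalidation instead of a hundred, and a scope with no patches costs none. The question raised above about the trailing `dsb` concerns what the single invalidation itself must contain, which this sketch deliberately leaves out.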
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2564211770 From stefank at openjdk.org Wed Nov 26 09:30:08 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 26 Nov 2025 09:30:08 GMT Subject: RFR: 8372513: Shenandoah: ShenandoahMaxRegionSize can produce an unaligned heap alignment In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 14:19:53 GMT, Stefan Karlsson wrote: > While rewriting some of the heap size initialization code we hit a corner-case where the setting of `ShenandoahMaxRegionSize` to something that isn't a power-of-2 will hit an assert in `max_heap_for_compressed_oops`. > > When running with: > > java -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahMaxRegionSize=33m -version > > > The following code: > > size_t displacement_due_to_null_page = align_up(os::vm_page_size(), > _conservative_max_heap_alignment) > > triggers: > > # assert(is_power_of_2(alignment)) failed: must be a power of 2: 34603008 > > because `_conservative_max_heap_alignment` is not a power-of-2. > > This happens because Shenandoah's `conservative_max_heap_alignment()` > returns a potentially unaligned `ShenandoahMaxRegionSize` value. > > > size_t ShenandoahArguments::conservative_max_heap_alignment() { > size_t align = ShenandoahMaxRegionSize; > if (UseLargePages) { > align = MAX2(align, os::large_page_size()); > } > return align; > } > > > I propose a small fix to adjust `align` to be a power-of-2. I've also added an earlier assert about this in `set_conservative_max_heap_alignment` and added an additional test-case in TestRegionSizeArgs.java > > WDYT, is this an OK fix for this corner-case? Sounds good. Thanks again! 
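
The power-of-2 adjustment discussed above can be illustrated with a minimal standalone sketch. It is not the integrated patch: the helper name `round_up_power_of_2_sketch` is hypothetical, rounding *up* (rather than down) is an assumption, and the thread does not say which direction the real fix takes. It only shows why `-XX:ShenandoahMaxRegionSize=33m` (34603008 bytes) trips the assert and what a power-of-2 value near it looks like.

```cpp
#include <cassert>
#include <cstdint>

// True if exactly one bit is set in value.
inline bool is_power_of_2_sketch(uint64_t value) {
  return value != 0 && (value & (value - 1)) == 0;
}

// Hypothetical helper: smallest power of two that is >= value.
// Requires 1 <= value <= 2^63 so that the result fits in 64 bits.
inline uint64_t round_up_power_of_2_sketch(uint64_t value) {
  assert(value != 0 && value <= (uint64_t{1} << 63));
  uint64_t result = 1;
  while (result < value) {
    result <<= 1;  // double until we reach or pass value
  }
  return result;
}
```

With the value from the report, `is_power_of_2_sketch(34603008)` is false, and rounding up gives 67108864 (64M), which an alignment check of this kind would accept.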
------------- PR Comment: https://git.openjdk.org/jdk/pull/28492#issuecomment-3580412150 From aboldtch at openjdk.org Wed Nov 26 09:31:34 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 26 Nov 2025 09:31:34 GMT Subject: RFR: 8372528: Unify atomic exchange and compare exchange [v2] In-Reply-To: References: Message-ID: > AtomicAccess::xchg is only required to support `4` bytes and `sizeof(intptr_t)` size. > This restriction added a lot of extra logic to the Atomic implementation because > we have a set of features we must support (including compare exchange) for `1`, `4` and `8` byte atomics on all platforms. We have some checks for unsupported `8` byte compare exchange (`VM_Version::supports_cx8()`), but the Atomic class does not try to handle these for generating its supported functions. On such a platform we would more than likely get a linking error. > > I propose we change requirement for exchange to `1`, `4` and `8` bytes to achieve parity with compare exchange. Initially by implementing exchange via the `AtomicAccess::XchgUsingCmpxch`. And have follow up RFEs for each applicable platform where we specialize `AtomicAccess::PlatformXchg<1>`. > > This enhancement both simplifies the Atomic implementation and provides exchange capabilities for types like `bool` and enums represented by a byte. > > _It is a little unclear how we deal with `VM_Version::supports_cx8()`. Its existence makes it impossible to use `compare_exchange` on `int64_t` in general code. Currently the `Atomic` implementation assumes that `exchange` can always be used on `8` byte integers (at least going by the gtest). Even though `AtomicAccess` only specifies `4` bytes and the platform size. This PR changes this to `1`, `4` and `8` bytes. 
But not sure if the previous behaviour / implicit requirements is an oversight a similar property to `VM_Version::supports_cx8()` should apply here for `exchange`._ > > * Testing > * Extended gtest / (no other users of Atomic byte with exchange exists. > * GHA > * Running Tier 1-5 on Oracle supported platforms Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: Update test/hotspot/gtest/runtime/test_atomic.cpp Co-authored-by: Stefan Karlsson ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28498/files - new: https://git.openjdk.org/jdk/pull/28498/files/b9b70050..31993dd8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28498&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28498&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28498.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28498/head:pull/28498 PR: https://git.openjdk.org/jdk/pull/28498 From stefank at openjdk.org Wed Nov 26 09:33:19 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 26 Nov 2025 09:33:19 GMT Subject: Integrated: 8372513: Shenandoah: ShenandoahMaxRegionSize can produce an unaligned heap alignment In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 14:19:53 GMT, Stefan Karlsson wrote: > While rewriting some of the heap size initialization code we hit a corner-case where the setting of `ShenandoahMaxRegionSize` to something that isn't a power-of-2 will hit an assert in `max_heap_for_compressed_oops`. > > When running with: > > java -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahMaxRegionSize=33m -version > > > The following code: > > size_t displacement_due_to_null_page = align_up(os::vm_page_size(), > _conservative_max_heap_alignment) > > triggers: > > # assert(is_power_of_2(alignment)) failed: must be a power of 2: 34603008 > > because `_conservative_max_heap_alignment` is not a power-of-2. 
> > This happens because Shenandoah's `conservative_max_heap_alignment()` > returns a potentially unaligned `ShenandoahMaxRegionSize` value. > > > size_t ShenandoahArguments::conservative_max_heap_alignment() { > size_t align = ShenandoahMaxRegionSize; > if (UseLargePages) { > align = MAX2(align, os::large_page_size()); > } > return align; > } > > > I propose a small fix to adjust `align` to be a power-of-2. I've also added an earlier assert about this in `set_conservative_max_heap_alignment` and added an additional test-case in TestRegionSizeArgs.java > > WDYT, is this an OK fix for this corner-case? This pull request has now been integrated. Changeset: 5291e1c1 Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/5291e1c1e1ddc19d814dbdb3a981049fe40575ea Stats: 14 lines in 3 files changed: 13 ins; 0 del; 1 mod 8372513: Shenandoah: ShenandoahMaxRegionSize can produce an unaligned heap alignment Reviewed-by: jsikstro, eosterlund, shade ------------- PR: https://git.openjdk.org/jdk/pull/28492 From jsikstro at openjdk.org Wed Nov 26 09:37:48 2025 From: jsikstro at openjdk.org (Joel =?UTF-8?B?U2lrc3Ryw7Zt?=) Date: Wed, 26 Nov 2025 09:37:48 GMT Subject: RFR: 8372150: Parallel: Tighten requirements around heap sizes with NUMA and Large Pages [v4] In-Reply-To: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com> References: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com> Message-ID: > Hello, > > Today, Parallel decides to opt out of using Large pages if the heap size, either minimum, initial or maximum, does not cover enough Large pages for all spaces. Additionally, if we don't get enough heap size for at least one OS page per MutableNUMASpace (one per NUMA-node), Parallel decides to run in a NUMA-degraded mode, where it skips allocating memory locally for some NUMA-nodes. 
Both of these issues are problematic if we want to start the JVM with a default initial heap size that is equal to the minimum heap size (see [JDK-8371986](https://bugs.openjdk.org/browse/JDK-8371986)). To solve this, we should consider making sure that the minimum heap size is always enough to cover precisely one page per space, where the page size may be Large or not.
> 
> For completeness, when user-provided settings for UseNUMA, UseLargePages and heap sizes can't be satisfied at the same time, one must be prioritised over others. Today, we prioritise heap size settings over both UseNUMA and UseLargePages. This change suggests shifting the (primary) priority to UseNUMA and UseLargePages, by bumping MinHeapSize, InitialHeapSize and MaxHeapSize to an adequate number, if not already enough. By bumping the minimum heap size to an adequate number, we are also bumping the lower limit for the initial heap size and maximum heap size, which must be equal to or greater than the minimum heap size.
> 
> However, a problem with this approach is that if the Large page size is very large (e.g., 512MB or 1GB), the minimum, initial, and maybe the maximum heap size will be bumped to a very large number as well. To mitigate this impact, we look at what Large page size can be used based on the maximum heap size instead. This is because, running the JVM in default configuration, the maximum heap size will almost always be large enough to cover enough Large pages, so we bump the minimum and initial to that value instead. But, if the maximum heap size is not enough, we opt out of using Large pages, which is consistent with the old behavior.
> 
> Testing:
> * Oracle's tier1-4
> * tier1-3 with the flags `-XX:+UseParallelGC -XX:+UseLargePages -XX:+UseNUMA`

Joel Sikström has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains six commits: - Merge branch 'master' into JDK-8372150_parallel_minheapsize_numa_largepages - Choose large page size based on MaxHeapSize - Revert "8372150: Parallel: Tighten requirements around MinHeapSize with NUMA and Large Pages" This reverts commit c02e08ade597193d70d1eb21036845bdd0304d51. - Revert "Albert review feedback" This reverts commit 66928d22112c1ac516e4b654c28249fdedf0dba9. - Albert review feedback - 8372150: Parallel: Tighten requirements around MinHeapSize with NUMA and Large Pages ------------- Changes: https://git.openjdk.org/jdk/pull/28394/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28394&range=03 Stats: 166 lines in 11 files changed: 79 ins; 64 del; 23 mod Patch: https://git.openjdk.org/jdk/pull/28394.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28394/head:pull/28394 PR: https://git.openjdk.org/jdk/pull/28394 From jsikstro at openjdk.org Wed Nov 26 09:37:50 2025 From: jsikstro at openjdk.org (Joel =?UTF-8?B?U2lrc3Ryw7Zt?=) Date: Wed, 26 Nov 2025 09:37:50 GMT Subject: RFR: 8372150: Parallel: Tighten requirements around heap sizes with NUMA and Large Pages [v3] In-Reply-To: References: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com> Message-ID: On Mon, 24 Nov 2025 11:01:21 GMT, Joel Sikstr?m wrote: >> Hello, >> >> Today, Parallel decides to opt out of using Large pages if the heap size, either minimum, initial or maximum, does not cover enough Large pages for all spaces. Additionally, if we don't get enough heap size for at least one OS page per MutableNUMASpace (one per NUMA-node), Parallel decides to run in a NUMA-degraded mode, where it skips allocating memory locally for some NUMA-nodes. Both of these issues are problematic if we want to start the JVM with a default initial heap size that is equal to the minimum heap size (see [JDK-8371986](https://bugs.openjdk.org/browse/JDK-8371986)). 
To solve this, we should consider making sure that the minimum heap size is always enough to cover precisely one page per space, where the page size may be Large or not. >> >> For completeness, when user-provided settings for UseNUMA, UseLargePages and heap sizes can't be satisfied at the same time, one must be prioritised over others. Today, we prioritise heap size settings over both UseNUMA and UseLargePages. This change suggest shifting the (primary) priority to UseNUMA and UseLargePages, by bumping MinHeapSize, InitialHeapSize and MaxHeapSize to an adequate number, if not already enough. By bumping the minimum heap size to an adequate number, we are also bumping the lower-limit for the initial heap size and maximum heap size, which must be equal to or greater than the minimum heap size. >> >> However, a problem with this approach is that if the Large page size is very large (e.g., 512MB or 1GB), the minimum, initial, and maybe the maximum heap size will be bumped to a very large number as well. To mitigate this impact, we look at what Large page size can be used based on the maximum heap size instead. This is because, running the JVM in default configuration, the maximum heap size will almost always be large enough to cover enough Large pages, so we bump the minimum and initial to that value instead. But, if the maximum heap size is not enough, we opt-out of using Large pages, which is consistent with the old behavior. >> >> Testing: >> * Oracle's tier1-4 >> * tier1-3 with the flags `-XX:+UseParallelGC -XX:+UseLargePages -XX:+UseNUMA` > > Joel Sikstr?m has updated the pull request incrementally with three additional commits since the last revision: > > - Choose large page size based on MaxHeapSize > - Revert "8372150: Parallel: Tighten requirements around MinHeapSize with NUMA and Large Pages" > > This reverts commit c02e08ade597193d70d1eb21036845bdd0304d51. > - Revert "Albert review feedback" > > This reverts commit 66928d22112c1ac516e4b654c28249fdedf0dba9. 
I merged with master after https://github.com/openjdk/jdk/pull/28492 is now integrated to get Shenandoah working with the changes in this patch. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28394#issuecomment-3580435943 From kbarrett at openjdk.org Wed Nov 26 09:41:59 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 26 Nov 2025 09:41:59 GMT Subject: RFR: 8372528: Unify atomic exchange and compare exchange [v2] In-Reply-To: References: Message-ID: On Wed, 26 Nov 2025 09:31:34 GMT, Axel Boldt-Christmas wrote: >> AtomicAccess::xchg is only required to support `4` bytes and `sizeof(intptr_t)` size. >> This restriction added a lot of extra logic to the Atomic implementation because >> we have a set of features we must support (including compare exchange) for `1`, `4` and `8` byte atomics on all platforms. We have some checks for unsupported `8` byte compare exchange (`VM_Version::supports_cx8()`), but the Atomic class does not try to handle these for generating its supported functions. On such a platform we would more than likely get a linking error. >> >> I propose we change requirement for exchange to `1`, `4` and `8` bytes to achieve parity with compare exchange. Initially by implementing exchange via the `AtomicAccess::XchgUsingCmpxch`. And have follow up RFEs for each applicable platform where we specialize `AtomicAccess::PlatformXchg<1>`. >> >> This enhancement both simplifies the Atomic implementation and provides exchange capabilities for types like `bool` and enums represented by a byte. >> >> _It is a little unclear how we deal with `VM_Version::supports_cx8()`. Its existence makes it impossible to use `compare_exchange` on `int64_t` in general code. Currently the `Atomic` implementation assumes that `exchange` can always be used on `8` byte integers (at least going by the gtest). Even though `AtomicAccess` only specifies `4` bytes and the platform size. This PR changes this to `1`, `4` and `8` bytes. 
But not sure if the previous behaviour / implicit requirements is an oversight a similar property to `VM_Version::supports_cx8()` should apply here for `exchange`._ >> >> * Testing >> * Extended gtest / (no other users of Atomic byte with exchange exists. >> * GHA >> * Running Tier 1-5 on Oracle supported platforms > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > Update test/hotspot/gtest/runtime/test_atomic.cpp > > Co-authored-by: Stefan Karlsson Mostly good. Shouldn't there also be some updates to `test_atomicAccess.cpp`? test/hotspot/gtest/runtime/test_atomic.cpp line 296: > 294: TEST_VM(AtomicEnumTest, scoped_enum_64_bit) { > 295: // Check if 64-bit atomics are available on the machine. > 296: if (!VM_Version::supports_cx8()) return; I don't think we need this check. Just assume it's true, and we'll get a link error if that ever changes, which is really not expected. ------------- Changes requested by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28498#pullrequestreview-3510015289 PR Review Comment: https://git.openjdk.org/jdk/pull/28498#discussion_r2564234459 From jsjolen at openjdk.org Wed Nov 26 09:42:00 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 26 Nov 2025 09:42:00 GMT Subject: RFR: 8372528: Unify atomic exchange and compare exchange [v2] In-Reply-To: References: Message-ID: On Wed, 26 Nov 2025 09:31:34 GMT, Axel Boldt-Christmas wrote: >> AtomicAccess::xchg is only required to support `4` bytes and `sizeof(intptr_t)` size. >> This restriction added a lot of extra logic to the Atomic implementation because >> we have a set of features we must support (including compare exchange) for `1`, `4` and `8` byte atomics on all platforms. We have some checks for unsupported `8` byte compare exchange (`VM_Version::supports_cx8()`), but the Atomic class does not try to handle these for generating its supported functions. 
On such a platform we would more than likely get a linking error. >> >> I propose we change requirement for exchange to `1`, `4` and `8` bytes to achieve parity with compare exchange. Initially by implementing exchange via the `AtomicAccess::XchgUsingCmpxch`. And have follow up RFEs for each applicable platform where we specialize `AtomicAccess::PlatformXchg<1>`. >> >> This enhancement both simplifies the Atomic implementation and provides exchange capabilities for types like `bool` and enums represented by a byte. >> >> _It is a little unclear how we deal with `VM_Version::supports_cx8()`. Its existence makes it impossible to use `compare_exchange` on `int64_t` in general code. Currently the `Atomic` implementation assumes that `exchange` can always be used on `8` byte integers (at least going by the gtest). Even though `AtomicAccess` only specifies `4` bytes and the platform size. This PR changes this to `1`, `4` and `8` bytes. But not sure if the previous behaviour / implicit requirements is an oversight a similar property to `VM_Version::supports_cx8()` should apply here for `exchange`._ >> >> * Testing >> * Extended gtest / (no other users of Atomic byte with exchange exists. >> * GHA >> * Running Tier 1-5 on Oracle supported platforms > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > Update test/hotspot/gtest/runtime/test_atomic.cpp > > Co-authored-by: Stefan Karlsson src/hotspot/cpu/ppc/atomicAccess_ppc.hpp line 162: > 160: template<> > 161: struct AtomicAccess::PlatformXchg<1> : AtomicAccess::XchgUsingCmpxchg<1> {}; > 162: What platforms are we not adding this snippet to? In other words; Can we move this to the generic code instead? 
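
For readers following the thread: the `XchgUsingCmpxchg<1>` base in the snippet above builds an exchange out of compare-exchange. A minimal sketch of that general technique follows, written against `std::atomic` rather than HotSpot's `AtomicAccess`. The function name `exchange_via_cas` is hypothetical, the default sequentially consistent ordering is an assumption, and this is not the HotSpot implementation.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper, not HotSpot code: atomically store new_value into
// dest and return the value it replaced, using only compare-exchange.
template <typename T>
T exchange_via_cas(std::atomic<T>& dest, T new_value) {
  // Start from a snapshot; it only needs to be a plausible guess.
  T observed = dest.load(std::memory_order_relaxed);
  // On failure, compare_exchange_weak overwrites `observed` with the
  // current contents of dest, so each retry compares against fresh data.
  while (!dest.compare_exchange_weak(observed, new_value)) {
    // Retry until the swap succeeds.
  }
  // `observed` now holds the value that was replaced.
  return observed;
}
```

The loop makes a one-byte exchange available wherever a one-byte compare-exchange exists, which is the property the PR relies on; a platform with a native byte exchange instruction could later specialize it away.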
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28498#discussion_r2564253007

From kevinw at openjdk.org Wed Nov 26 09:58:53 2025
From: kevinw at openjdk.org (Kevin Walls)
Date: Wed, 26 Nov 2025 09:58:53 GMT
Subject: RFR: 8372380: Make hs_err reporting more robust for unattached threads [v3]
In-Reply-To:
References:
Message-ID:

On Mon, 24 Nov 2025 02:34:39 GMT, David Holmes wrote:

>> There were a number of places where the code called by hs_err reporting assumed/required an attached current thread. These would then cause secondary failures during hs_err reporting. Using a simple example of an unattached thread causing a SEGV I went through and eliminated all the problems I encountered. In some cases the thread dependency was obvious and easy to address directly, but in others we just skip that section at the top-level.
>> 
>> Testing:
>> - manual inspection of hs_err file, for different GCs
>> - tiers 1-3 sanity
>> 
>> Thanks
> 
> David Holmes has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix include order

Nice.

-------------

Marked as reviewed by kevinw (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/28470#pullrequestreview-3510124704

From kbarrett at openjdk.org Wed Nov 26 10:00:49 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Wed, 26 Nov 2025 10:00:49 GMT
Subject: RFR: 8372528: Unify atomic exchange and compare exchange [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 26 Nov 2025 09:38:55 GMT, Johan Sjölen wrote:

>> Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Update test/hotspot/gtest/runtime/test_atomic.cpp
>>   
>>   Co-authored-by: Stefan Karlsson
> 
> src/hotspot/cpu/ppc/atomicAccess_ppc.hpp line 162:
> 
>> 160: template<>
>> 161: struct AtomicAccess::PlatformXchg<1> : AtomicAccess::XchgUsingCmpxchg<1> {};
>> 162:
> 
> What platforms are we not adding this snippet to?
In other words; Can we move this to the generic code instead? Today, none. Tomorrow, x86, and perhaps some others. (I think not arm/aarch64, I don't remember if ppc has byte atomics but I think not, and I have no idea about riscv or s390.) But there is a trick that I used for PlatformBitops that might be applied here too. See the dummy bool template parameter. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28498#discussion_r2564330309 From aboldtch at openjdk.org Wed Nov 26 10:06:53 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 26 Nov 2025 10:06:53 GMT Subject: RFR: 8372528: Unify atomic exchange and compare exchange [v2] In-Reply-To: References: Message-ID: On Wed, 26 Nov 2025 09:58:27 GMT, Kim Barrett wrote: >> src/hotspot/cpu/ppc/atomicAccess_ppc.hpp line 162: >> >>> 160: template<> >>> 161: struct AtomicAccess::PlatformXchg<1> : AtomicAccess::XchgUsingCmpxchg<1> {}; >>> 162: >> >> What platforms are we not adding this snippet to? In other words; Can we move this to the generic code instead? > > Today, none. Tomorrow, x86, and perhaps some others. (I think not arm/aarch64, > I don't remember if ppc has byte atomics but I think not, and I have no idea > about riscv or s390.) > > But there is a trick that I used for PlatformBitops that might be applied here > too. See the dummy bool template parameter. I have followup specialisations for multiple os-cpu combinations. Will create RFEs for them soon. But my aim is to have those go into JDK 27. 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28498#discussion_r2564351594

From mbaesken at openjdk.org Wed Nov 26 10:08:54 2025
From: mbaesken at openjdk.org (Matthias Baesken)
Date: Wed, 26 Nov 2025 10:08:54 GMT
Subject: RFR: 8371893: [macOS] use dead_strip linker option to reduce binary size [v5]
In-Reply-To:
References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com>
Message-ID:

On Tue, 25 Nov 2025 13:12:06 GMT, Matthias Baesken wrote:

>> The dead_strip linker option on macOS removes functions and data that are unreachable by the entry point or exported symbols.
>> Setting it can reduce the size of some binaries we generate quite a lot, for example (product build, Xcode 15 is used) :
>> (before -> after setting the option)
>> 
>>     1.4M -> 1.1M images/jdk/lib/libfontmanager.dylib
>>     264K -> 248K images/jdk/lib/libjavajpeg.dylib
>>     152K -> 132K images/jdk/lib/libjli.dylib
>>     388K -> 296K images/jdk/lib/liblcms.dylib
>>     164K -> 128K images/jdk/lib/libzip.dylib
>> 
>> 
>> and libjvm :
>> 
>>     20M -> 18M images/jdk/lib/server/libjvm.dylib
>>     146M -> 137M images/jdk/lib/server/libjvm.dylib.dSYM
> 
> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Use dead_strip on macOS arrch64 AND x86_64

For macOS x86_64 it works too, at least for some of the libs. See the sizes without and with dead_strip:

    1200 -> 1164 /jdk/lib/libawt_lwawt.dylib
    1364 -> 1048 /jdk/lib/libfontmanager.dylib
     636 ->  632 /jdk/lib/libfreetype.dylib
     240 ->  236 /jdk/lib/libjavajpeg.dylib
     280 ->  276 jdk/lib/libjdwp.dylib
      92 ->   92 /jdk/lib/libjli.dylib
     428 ->  320 /jdk/lib/liblcms.dylib
     716 ->  688 /jdk/lib/libmlib_image.dylib
      60 ->   60 /jdk/lib/libzip.dylib

For some strange reason, it does not help on libzip and libjli. But those are smaller anyway compared to aarch64.
(It is a bit surprising that libzip is twice as large on aarch64!)
------------- PR Comment: https://git.openjdk.org/jdk/pull/28319#issuecomment-3580564402 From stuefe at openjdk.org Wed Nov 26 10:20:04 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 26 Nov 2025 10:20:04 GMT Subject: RFR: 8372150: Parallel: Tighten requirements around heap sizes with NUMA and Large Pages [v4] In-Reply-To: References: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com> Message-ID: <3JxmiQCu88Zf7syywvwub2vq3AfE7Wk1AG0KRD1gV_4=.524bbaa4-1a87-4a9e-a877-d906b3767feb@github.com> On Wed, 26 Nov 2025 09:37:48 GMT, Joel Sikstr?m wrote: >> Hello, >> >> Today, Parallel decides to opt out of using Large pages if the heap size, either minimum, initial or maximum, does not cover enough Large pages for all spaces. Additionally, if we don't get enough heap size for at least one OS page per MutableNUMASpace (one per NUMA-node), Parallel decides to run in a NUMA-degraded mode, where it skips allocating memory locally for some NUMA-nodes. Both of these issues are problematic if we want to start the JVM with a default initial heap size that is equal to the minimum heap size (see [JDK-8371986](https://bugs.openjdk.org/browse/JDK-8371986)). To solve this, we should consider making sure that the minimum heap size is always enough to cover precisely one page per space, where the page size may be Large or not. >> >> For completeness, when user-provided settings for UseNUMA, UseLargePages and heap sizes can't be satisfied at the same time, one must be prioritised over others. Today, we prioritise heap size settings over both UseNUMA and UseLargePages. This change suggest shifting the (primary) priority to UseNUMA and UseLargePages, by bumping MinHeapSize, InitialHeapSize and MaxHeapSize to an adequate number, if not already enough. 
By bumping the minimum heap size to an adequate number, we are also bumping the lower limit for the initial heap size and maximum heap size, which must be equal to or greater than the minimum heap size.
>> 
>> However, a problem with this approach is that if the Large page size is very large (e.g., 512MB or 1GB), the minimum, initial, and maybe the maximum heap size will be bumped to a very large number as well. To mitigate this impact, we look at what Large page size can be used based on the maximum heap size instead. This is because, running the JVM in default configuration, the maximum heap size will almost always be large enough to cover enough Large pages, so we bump the minimum and initial to that value instead. But, if the maximum heap size is not enough, we opt out of using Large pages, which is consistent with the old behavior.
>> 
>> Testing:
>> * Oracle's tier1-4
>> * tier1-3 with the flags `-XX:+UseParallelGC -XX:+UseLargePages -XX:+UseNUMA`
> 
> Joel Sikström has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits:
> 
>  - Merge branch 'master' into JDK-8372150_parallel_minheapsize_numa_largepages
>  - Choose large page size based on MaxHeapSize
>  - Revert "8372150: Parallel: Tighten requirements around MinHeapSize with NUMA and Large Pages"
>    
>    This reverts commit c02e08ade597193d70d1eb21036845bdd0304d51.
>  - Revert "Albert review feedback"
>    
>    This reverts commit 66928d22112c1ac516e4b654c28249fdedf0dba9.
>  - Albert review feedback
>  - 8372150: Parallel: Tighten requirements around MinHeapSize with NUMA and Large Pages

This looks good to me (not that you need another review). A jtreg regression test would be nice, possibly in a separate RFE. What do other GCs do if the heap is smaller than the smallest large page size? One could also consider the super-large page size issue a user error that should result in a VM exit.

-------------

Marked as reviewed by stuefe (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/28394#pullrequestreview-3510213568 From aboldtch at openjdk.org Wed Nov 26 10:20:14 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 26 Nov 2025 10:20:14 GMT Subject: RFR: 8372528: Unify atomic exchange and compare exchange [v3] In-Reply-To: References: Message-ID: <2eO9uSwDm-i18M0ixBHICBJmA2jHZaDRj6kzXwg6IfQ=.c54a4bb8-a8ad-40d1-baf7-c07b22565603@github.com> > AtomicAccess::xchg is only required to support `4` bytes and `sizeof(intptr_t)` size. > This restriction added a lot of extra logic to the Atomic implementation because > we have a set of features we must support (including compare exchange) for `1`, `4` and `8` byte atomics on all platforms. We have some checks for unsupported `8` byte compare exchange (`VM_Version::supports_cx8()`), but the Atomic class does not try to handle these for generating its supported functions. On such a platform we would more than likely get a linking error. > > I propose we change requirement for exchange to `1`, `4` and `8` bytes to achieve parity with compare exchange. Initially by implementing exchange via the `AtomicAccess::XchgUsingCmpxch`. And have follow up RFEs for each applicable platform where we specialize `AtomicAccess::PlatformXchg<1>`. > > This enhancement both simplifies the Atomic implementation and provides exchange capabilities for types like `bool` and enums represented by a byte. > > _It is a little unclear how we deal with `VM_Version::supports_cx8()`. Its existence makes it impossible to use `compare_exchange` on `int64_t` in general code. Currently the `Atomic` implementation assumes that `exchange` can always be used on `8` byte integers (at least going by the gtest). Even though `AtomicAccess` only specifies `4` bytes and the platform size. This PR changes this to `1`, `4` and `8` bytes. 
But not sure if the previous behaviour / implicit requirements is an oversight a similar property to `VM_Version::supports_cx8()` should apply here for `exchange`._ > > * Testing > * Extended gtest / (no other users of Atomic byte with exchange exists. > * GHA > * Running Tier 1-5 on Oracle supported platforms Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: - Remove VM_Version::supports_cx8() conditions - Add AtomicAccessXchgTest for 1 byte ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28498/files - new: https://git.openjdk.org/jdk/pull/28498/files/31993dd8..51a1c84d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28498&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28498&range=01-02 Stats: 11 lines in 2 files changed: 5 ins; 6 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28498.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28498/head:pull/28498 PR: https://git.openjdk.org/jdk/pull/28498 From aboldtch at openjdk.org Wed Nov 26 10:20:16 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 26 Nov 2025 10:20:16 GMT Subject: RFR: 8372528: Unify atomic exchange and compare exchange [v2] In-Reply-To: References: Message-ID: On Wed, 26 Nov 2025 09:37:15 GMT, Kim Barrett wrote: > Shouldn't there also be some updates to test_atomicAccess.cpp? Right. You are correct. Forgot that we had AtomicAccess as well as Atomic. > I don't think we need this check. Just assume it's true, and we'll get a link error if that ever changes, which is really not expected. Removed it from the tests. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28498#issuecomment-3580609969 From stuefe at openjdk.org Wed Nov 26 10:20:55 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 26 Nov 2025 10:20:55 GMT Subject: RFR: 8253683: Clean up and clarify uses of os::vm_allocation_granularity In-Reply-To: References: Message-ID: On Tue, 25 Nov 2025 14:31:39 GMT, Casper Norrbin wrote: > Hi everyone, > > `os::vm_allocation_granularity()` is meant to describe the alignment restrictions of the operating system when we reserve memory. That is 64 KiB on Windows (`VirtualAlloc`) and 256 MiB on AIX (with `shmat`). On every other platform it happens to match the page size. The page size (available via `os::vm_page_size()`) is what matters when we later commit or protect the reserved pages. > > Because the functions are poorly documented and the two numbers are identical on most systems, they have gradually been used more and more interchangeably. We now have many code paths that round **sizes** up to `os::vm_allocation_granularity()` or assert that a size is a multiple of it. That is wrong. Only addresses need that alignment, sizes merely have to be page-aligned. Places that round sizes should instead use `os::vm_page_size()` as they are unrelated to attach alignment. > > For this change I have gone over the call sites of `os::vm_allocation_granularity()` and where it was being used to pad or sanity-check a size I have instead replaced it with `os::vm_page_size()`. The calls that genuinely deal with an attach address are left untouched. > > Testing: > - Oracle tiers 1-8 Thank you for doing this onerous work. I plan to look over this later when I find time; others might too. 
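
The distinction drawn in the quoted description (attach addresses need allocation-granularity alignment, sizes only need page alignment) can be sketched as below. Everything here is illustrative: `ReservationRequest`, `make_reservation_request` and `align_up_sketch` are hypothetical names, the granularity and page size are passed in as plain parameters instead of coming from `os::vm_allocation_granularity()` and `os::vm_page_size()`, and none of it is taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Round value up to a multiple of alignment (alignment must be a power of two).
inline uint64_t align_up_sketch(uint64_t value, uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}

// Hypothetical description of a reservation: where to attach and how much.
struct ReservationRequest {
  uint64_t address;  // attach address, aligned to the allocation granularity
  uint64_t size;     // size in bytes, aligned to the page size only
};

// Hypothetical helper: the address hint is rounded to the allocation
// granularity, while the size is rounded to the (possibly smaller) page size.
inline ReservationRequest make_reservation_request(uint64_t address_hint,
                                                   uint64_t size,
                                                   uint64_t allocation_granularity,
                                                   uint64_t page_size) {
  return ReservationRequest{align_up_sketch(address_hint, allocation_granularity),
                            align_up_sketch(size, page_size)};
}
```

With the Windows-like numbers from the description (64 KiB granularity) and an assumed 4 KiB page, a 5000-byte request is padded to 8192 bytes rather than to 65536, which is the kind of over-rounding the cleanup is meant to remove.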
-------------

PR Comment: https://git.openjdk.org/jdk/pull/28493#issuecomment-3580614285

From eastigeevich at openjdk.org Wed Nov 26 10:23:07 2025
From: eastigeevich at openjdk.org (Evgeny Astigeevich)
Date: Wed, 26 Nov 2025 10:23:07 GMT
Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v6]
In-Reply-To:
References: <85dBRXdwqMOffQvXGI9J_zhfLnwZ0LrY_Wj4w0nrpbM=.88de1041-c752-49aa-8ab2-600b92f8559d@github.com>
Message-ID:

On Wed, 26 Nov 2025 09:27:31 GMT, Erik Österlund wrote:

>> I replaced the call of `ICache::invalidate_word()` with:
>>
>>   asm volatile("dsb ish \n"
>>                "ic ivau, xzr \n"
>>                "isb \n"
>>                : : : "memory");
>>
>> The code executed in `ICache::invalidate_word()` when all checks are done:
>>
>>   dsb ish
>>   ic ivau
>>   dsb ish
>>   isb
>>
>> I use `xzr` in `ic ivau` because an address in it does not matter. The instruction is trapped and ignored.
>> I think we don't need the second `dsb` because we will have `dsb sy` in the trap handler.
>
> I don't know if we want to jeopardize the correctness of the JVM code based on the exact instructions that are *currently* used to mitigate this issue in the kernel. Eliding the trailing dsb ish because we know the kernel mitigation runs it, seems unnecessarily fragile to me; if the kernel comes up with some smarter and cheaper way of mitigating this in the future, using some other magic incantation, then I don't want to have a correctness issue because of that implicit assumption.
>
> Is it noticeably expensive to run the trailing dsb again?

Yes, we need dsb if we use ic, according to the Arm manual. They are redundant if we have hardware instruction cache coherence enabled. On the one hand, we know that the hardware icache coherence is working and ic is ignored. On the other hand, we check that the hardware icache coherence is disabled, so we should follow the Arm ARM. I don't expect that having dsb has a noticeable performance impact. I haven't seen any.
I agree with prioritizing correctness. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2564406749 From eastigeevich at openjdk.org Wed Nov 26 10:48:14 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Wed, 26 Nov 2025 10:48:14 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v7] In-Reply-To: References: Message-ID: > Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1. > > Neoverse-N1 implementations mitigate erratum 1542419 with a workaround: > - Disable coherent icache. > - Trap IC IVAU instructions. > - Execute: > - `tlbi vae3is, xzr` > - `dsb sy` > > `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address). It waits for all memory accesses using in-scope old translation information to complete before it is considered complete. > > As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests: > > "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround." > > This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions. > > Changes include: > > * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization. 
> * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address. > * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures. > * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk. > > Benchmarking results: Neoverse-N1 r3p1 (Graviton 2) > > - Baseline > > $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGC... Evgeny Astigeevich has updated the pull request incrementally with two additional commits since the last revision: - Fix code style - Correct ifdef; Add dsb after ic ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28328/files - new: https://git.openjdk.org/jdk/pull/28328/files/42745e56..17456558 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=05-06 Stats: 11 lines in 6 files changed: 7 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28328.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28328/head:pull/28328 PR: https://git.openjdk.org/jdk/pull/28328 From serb at openjdk.org Wed Nov 26 11:03:49 2025 From: serb at openjdk.org (Sergey Bylokhov) Date: Wed, 26 Nov 2025 11:03:49 GMT Subject: RFR: 8371893: [macOS] use dead_strip linker option to reduce binary size [v5] In-Reply-To: References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com> 
Message-ID: <5xGU22WgiZiCJGUU3G3e1-SAty0aXnjzEbj3nx30y5c=.337ef8d2-e79e-477f-916c-a344079370a9@github.com>

On Tue, 25 Nov 2025 13:12:06 GMT, Matthias Baesken wrote:

>> The dead_strip linker option on macOS removes functions and data that are unreachable by the entry point or exported symbols.
>> Setting it can reduce the size of some binaries we generate quite a lot, for example (product build, Xcode 15 is used) :
>> (before -> after setting the option)
>>
>> 1.4M -> 1.1M images/jdk/lib/libfontmanager.dylib
>> 264K -> 248K images/jdk/lib/libjavajpeg.dylib
>> 152K -> 132K images/jdk/lib/libjli.dylib
>> 388K -> 296K images/jdk/lib/liblcms.dylib
>> 164K -> 128K images/jdk/lib/libzip.dylib
>>
>> and libjvm :
>>
>> 20M -> 18M images/jdk/lib/server/libjvm.dylib
>> 146M -> 137M images/jdk/lib/server/libjvm.dylib.dSYM
>
> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision:
>
>   Use dead_strip on macOS arrch64 AND x86_64

I'll run the UI tests for this.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28319#issuecomment-3580783766

From jsikstro at openjdk.org Wed Nov 26 11:02:10 2025
From: jsikstro at openjdk.org (Joel Sikström)
Date: Wed, 26 Nov 2025 11:02:10 GMT
Subject: RFR: 8372150: Parallel: Tighten requirements around heap sizes with NUMA and Large Pages [v5]
In-Reply-To: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com>
References: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com>
Message-ID:

> Hello,
>
> Today, Parallel decides to opt out of using Large pages if the heap size, either minimum, initial or maximum, does not cover enough Large pages for all spaces. Additionally, if we don't get enough heap size for at least one OS page per MutableNUMASpace (one per NUMA-node), Parallel decides to run in a NUMA-degraded mode, where it skips allocating memory locally for some NUMA-nodes.
Both of these issues are problematic if we want to start the JVM with a default initial heap size that is equal to the minimum heap size (see [JDK-8371986](https://bugs.openjdk.org/browse/JDK-8371986)). To solve this, we should consider making sure that the minimum heap size is always enough to cover precisely one page per space, where the page size may be Large or not. > > For completeness, when user-provided settings for UseNUMA, UseLargePages and heap sizes can't be satisfied at the same time, one must be prioritised over others. Today, we prioritise heap size settings over both UseNUMA and UseLargePages. This change suggest shifting the (primary) priority to UseNUMA and UseLargePages, by bumping MinHeapSize, InitialHeapSize and MaxHeapSize to an adequate number, if not already enough. By bumping the minimum heap size to an adequate number, we are also bumping the lower-limit for the initial heap size and maximum heap size, which must be equal to or greater than the minimum heap size. > > However, a problem with this approach is that if the Large page size is very large (e.g., 512MB or 1GB), the minimum, initial, and maybe the maximum heap size will be bumped to a very large number as well. To mitigate this impact, we look at what Large page size can be used based on the maximum heap size instead. This is because, running the JVM in default configuration, the maximum heap size will almost always be large enough to cover enough Large pages, so we bump the minimum and initial to that value instead. But, if the maximum heap size is not enough, we opt-out of using Large pages, which is consistent with the old behavior. 
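
The policy described above (pick the large page size from the maximum heap size, then raise the minimum and initial heap sizes so every space gets at least one page) might look roughly like the following sketch. It is an interpretation of the prose, not the actual Parallel GC code: `HeapSizing`, `choose_sizing` and `required_pages` (the number of pages needed, for example one per space) are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical result type for this sketch.
struct HeapSizing {
  std::size_t page_size;
  bool use_large_pages;
  std::size_t min_heap;
  std::size_t initial_heap;
  std::size_t max_heap;
};

HeapSizing choose_sizing(std::size_t min_heap, std::size_t initial_heap,
                         std::size_t max_heap, std::size_t base_page_size,
                         std::vector<std::size_t> large_page_sizes,
                         std::size_t required_pages) {
  assert(required_pages > 0);
  assert(min_heap <= initial_heap && initial_heap <= max_heap);

  // Try the largest page size first.
  std::sort(large_page_sizes.begin(), large_page_sizes.end(),
            std::greater<std::size_t>());

  std::size_t page_size = base_page_size;
  bool use_large_pages = false;
  for (std::size_t candidate : large_page_sizes) {
    // The maximum heap size decides whether a large page size is usable.
    if (candidate <= max_heap / required_pages) {
      page_size = candidate;
      use_large_pages = true;
      break;
    }
  }
  // If no candidate fits, large pages are not used (the old behaviour).

  // Bump the lower bounds so that every space can hold at least one page.
  const std::size_t lower_bound = page_size * required_pages;
  const std::size_t new_min = std::max(min_heap, lower_bound);
  const std::size_t new_initial = std::max(initial_heap, new_min);
  const std::size_t new_max = std::max(max_heap, new_initial);
  return HeapSizing{page_size, use_large_pages, new_min, new_initial, new_max};
}
```

With a 1 GB maximum heap, four required pages and large page sizes of 2 MB and 1 GB, this sketch selects 2 MB pages and an 8 MB minimum heap, rather than bumping the minimum to 4 GB.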
> > Testing: > * Oracle's tier1-4 > * tier1-3 with the flags `-XX:+UseParallelGC -XX:+UseLargePages -XX:+UseNUMA` Joel Sikstr?m has updated the pull request incrementally with one additional commit since the last revision: Re-order methods for consistency in class hierarchy ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28394/files - new: https://git.openjdk.org/jdk/pull/28394/files/5662af37..6996c49e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28394&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28394&range=03-04 Stats: 106 lines in 5 files changed: 39 ins; 41 del; 26 mod Patch: https://git.openjdk.org/jdk/pull/28394.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28394/head:pull/28394 PR: https://git.openjdk.org/jdk/pull/28394 From jsikstro at openjdk.org Wed Nov 26 11:08:51 2025 From: jsikstro at openjdk.org (Joel =?UTF-8?B?U2lrc3Ryw7Zt?=) Date: Wed, 26 Nov 2025 11:08:51 GMT Subject: RFR: 8372150: Parallel: Tighten requirements around heap sizes with NUMA and Large Pages [v4] In-Reply-To: <3JxmiQCu88Zf7syywvwub2vq3AfE7Wk1AG0KRD1gV_4=.524bbaa4-1a87-4a9e-a877-d906b3767feb@github.com> References: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com> <3JxmiQCu88Zf7syywvwub2vq3AfE7Wk1AG0KRD1gV_4=.524bbaa4-1a87-4a9e-a877-d906b3767feb@github.com> Message-ID: On Wed, 26 Nov 2025 10:16:48 GMT, Thomas Stuefe wrote: >> Joel Sikstr?m has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: >> >> - Merge branch 'master' into JDK-8372150_parallel_minheapsize_numa_largepages >> - Choose large page size based on MaxHeapSize >> - Revert "8372150: Parallel: Tighten requirements around MinHeapSize with NUMA and Large Pages" >> >> This reverts commit c02e08ade597193d70d1eb21036845bdd0304d51. >> - Revert "Albert review feedback" >> >> This reverts commit 66928d22112c1ac516e4b654c28249fdedf0dba9. 
>> - Albert review feedback >> - 8372150: Parallel: Tighten requirements around MinHeapSize with NUMA and Large Pages > > This looks good to me (not that you need another review). A jtreg regression test would be nice, possibly in a separate RFE. > > What do other GCs do if heap is smaller than smallest large page size? > > The super-large page size issue one could also consider a user error that should result in a VM exit. Thank you for looking at this @tstuefe. In ZGC, we only support 2MB large pages, and we can commit any number of ZPages (regions), which are also aligned to 2MB so it always works out. I agree that super-large page sizes could be considered a configuration issue. We were a bit worried about when >=512M large page sizes are the default on the system. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28394#issuecomment-3580799409 From mbaesken at openjdk.org Wed Nov 26 11:20:52 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 26 Nov 2025 11:20:52 GMT Subject: RFR: 8371893: [macOS] use dead_strip linker option to reduce binary size [v5] In-Reply-To: <5xGU22WgiZiCJGUU3G3e1-SAty0aXnjzEbj3nx30y5c=.337ef8d2-e79e-477f-916c-a344079370a9@github.com> References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com> <5xGU22WgiZiCJGUU3G3e1-SAty0aXnjzEbj3nx30y5c=.337ef8d2-e79e-477f-916c-a344079370a9@github.com> Message-ID: On Wed, 26 Nov 2025 11:01:22 GMT, Sergey Bylokhov wrote: > I'll run UI the tests for this. Great, thanks ! 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/28319#issuecomment-3580838458

From jbhateja at openjdk.org Wed Nov 26 11:34:11 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Wed, 26 Nov 2025 11:34:11 GMT
Subject: RFR: 8370691: Add new Float16Vector type and enable intrinsification of vector operations supported by auto-vectorizer [v5]
In-Reply-To:
References:
Message-ID:

> Add a new Float16Vector type and corresponding concrete vector classes, in addition to existing primitive vector types, maintaining operation parity with the FloatVector type.
> - Add necessary inline expander support.
> - Enable intrinsification for a few vector operations, namely ADD/SUB/MUL/DIV/MAX/MIN/FMA.
> - Use existing Float16 vector IR and backend support.
> - Extended the existing VectorAPI JTREG test suite for the newly added Float16Vector operations.
>
> The idea here is to first be on par with Float16 auto-vectorization support before intrinsifying new operations (conversions, reduction, etc).
>
> The following are the performance numbers for some of the selected Float16Vector benchmarking kernels compared to equivalent auto-vectorized Float16OperationsBenchmark kernels.
>
> [image: performance numbers, not preserved in this archive]
>
> Initial RFP[1] was floated on the panama-dev mailing list.
>
> Kindly review the draft PR and share your feedback.
> > Best Regards, > Jatin > > [1] https://mail.openjdk.org/pipermail/panama-dev/2025-August/021100.html Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Cleanups ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28002/files - new: https://git.openjdk.org/jdk/pull/28002/files/aca6cc5d..756a0d0c Webrevs: - full: Webrev is not available because diff is too large - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28002&range=03-04 Stats: 26 lines in 9 files changed: 5 ins; 7 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/28002.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28002/head:pull/28002 PR: https://git.openjdk.org/jdk/pull/28002 From stuefe at openjdk.org Wed Nov 26 11:47:50 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 26 Nov 2025 11:47:50 GMT Subject: RFR: 8372150: Parallel: Tighten requirements around heap sizes with NUMA and Large Pages [v4] In-Reply-To: <3JxmiQCu88Zf7syywvwub2vq3AfE7Wk1AG0KRD1gV_4=.524bbaa4-1a87-4a9e-a877-d906b3767feb@github.com> References: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com> <3JxmiQCu88Zf7syywvwub2vq3AfE7Wk1AG0KRD1gV_4=.524bbaa4-1a87-4a9e-a877-d906b3767feb@github.com> Message-ID: On Wed, 26 Nov 2025 10:16:48 GMT, Thomas Stuefe wrote: >> Joel Sikstr?m has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: >> >> - Merge branch 'master' into JDK-8372150_parallel_minheapsize_numa_largepages >> - Choose large page size based on MaxHeapSize >> - Revert "8372150: Parallel: Tighten requirements around MinHeapSize with NUMA and Large Pages" >> >> This reverts commit c02e08ade597193d70d1eb21036845bdd0304d51. >> - Revert "Albert review feedback" >> >> This reverts commit 66928d22112c1ac516e4b654c28249fdedf0dba9. 
>>  - Albert review feedback
>>  - 8372150: Parallel: Tighten requirements around MinHeapSize with NUMA and Large Pages
>
> This looks good to me (not that you need another review). A jtreg regression test would be nice, possibly in a separate RFE.
>
> What do other GCs do if heap is smaller than smallest large page size?
>
> The super-large page size issue one could also consider a user error that should result in a VM exit.

> Thank you for looking at this @tstuefe. In ZGC, we only support 2MB large pages, and we can commit any number of ZPages (regions), which are also aligned to 2MB so it always works out.
>
> I agree that super-large page sizes could be considered a configuration issue. We were a bit worried about when >=512M large page sizes are the default on the system.

Yeah, and I agree it's better not to rock that boat. The person setting up the system may not be talking to the one determining the VM options.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28394#issuecomment-3580937173

From kbarrett at openjdk.org Wed Nov 26 11:48:54 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Wed, 26 Nov 2025 11:48:54 GMT
Subject: RFR: 8372528: Unify atomic exchange and compare exchange [v3]
In-Reply-To: <2eO9uSwDm-i18M0ixBHICBJmA2jHZaDRj6kzXwg6IfQ=.c54a4bb8-a8ad-40d1-baf7-c07b22565603@github.com>
References: <2eO9uSwDm-i18M0ixBHICBJmA2jHZaDRj6kzXwg6IfQ=.c54a4bb8-a8ad-40d1-baf7-c07b22565603@github.com>
Message-ID:

On Wed, 26 Nov 2025 10:20:14 GMT, Axel Boldt-Christmas wrote:

>> AtomicAccess::xchg is only required to support `4` bytes and `sizeof(intptr_t)` size.
>> This restriction added a lot of extra logic to the Atomic implementation because
>> we have a set of features we must support (including compare exchange) for `1`, `4` and `8` byte atomics on all platforms. We have some checks for unsupported `8` byte compare exchange (`VM_Version::supports_cx8()`), but the Atomic class does not try to handle these for generating its supported functions.
On such a platform we would more than likely get a linking error. >> >> I propose we change requirement for exchange to `1`, `4` and `8` bytes to achieve parity with compare exchange. Initially by implementing exchange via the `AtomicAccess::XchgUsingCmpxch`. And have follow up RFEs for each applicable platform where we specialize `AtomicAccess::PlatformXchg<1>`. >> >> This enhancement both simplifies the Atomic implementation and provides exchange capabilities for types like `bool` and enums represented by a byte. >> >> _It is a little unclear how we deal with `VM_Version::supports_cx8()`. Its existence makes it impossible to use `compare_exchange` on `int64_t` in general code. Currently the `Atomic` implementation assumes that `exchange` can always be used on `8` byte integers (at least going by the gtest). Even though `AtomicAccess` only specifies `4` bytes and the platform size. This PR changes this to `1`, `4` and `8` bytes. But not sure if the previous behaviour / implicit requirements is an oversight a similar property to `VM_Version::supports_cx8()` should apply here for `exchange`._ >> >> * Testing >> * Extended gtest / (no other users of Atomic byte with exchange exists. >> * GHA >> * Running Tier 1-5 on Oracle supported platforms > > Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: > > - Remove VM_Version::supports_cx8() conditions > - Add AtomicAccessXchgTest for 1 byte Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28498#pullrequestreview-3510529915

From kevinw at openjdk.org Wed Nov 26 12:04:49 2025
From: kevinw at openjdk.org (Kevin Walls)
Date: Wed, 26 Nov 2025 12:04:49 GMT
Subject: RFR: 8369736 - Add management interface for AOT cache creation [v8]
In-Reply-To:
References:
Message-ID:

On Tue, 25 Nov 2025 12:22:21 GMT, Alan Bateman wrote:

>> src/jdk.management/share/classes/com/sun/management/internal/PlatformMBeanProviderImpl.java line 192:
>>
>>> 190: HotSpotAOTCacheMXBean impl = this.impl;
>>> 191: if (impl == null) {
>>> 192: this.impl = impl = new HotSpotAOTCacheImpl(ManagementFactoryHelper.getVMManagement());
>>
>> This assignment is unusual. Are we trying to avoid a synchronized block? Other nameToMBeanMap() methods are like:
>> return Collections.singletonMap(ManagementFactory.MEMORY_MXBEAN_NAME, ManagementFactoryHelper.getMemoryMXBean());
>>
>> ..where the ManagementFactoryHelper.getMemoryMXBean() method is synchronized and creates the impl if needed.
>
> I don't see a correctness issue with this. Maybe in the future we will be able to use LazyConstant here.

Sure, I'm just pointing out that we have a load of existing nameToMBeanMap() methods that do things differently.
OK I now see this one is doing what the new VirtualThreadSchedulerMXBean did.

The others are different: commonly the nameToMBeanMap() methods in PlatformMBeanProviderImpl.java are synchronized, or they call a getXXMXBean() method which is synchronized.

Maybe these old methods don't need to be synchronized, if this all gets done at startup in PlatformMBeanProviderImpl init(), the mbeans will always be created once.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28010#discussion_r2564720463 From mgronlun at openjdk.org Wed Nov 26 12:18:31 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 26 Nov 2025 12:18:31 GMT Subject: RFR: 8365400: Enhance JFR to emit file and module metadata for class loading Message-ID: Greetings, this enhancement adds a "source" field, label "Location" to the jdk.ClassDefine event. To enable this functionality, JFR needs a concurrent symbol table. We can build a simpler version of a concurrent hash table, taking advantage of the JFR epoch system. This will be useful also for planned future enhancements. Extensions are made to AOT to consistently report identical canonical paths for classes as non-AOT code paths. Testing: jdk_jfr, manual AOT verification, stress testing Thanks Markus ------------- Commit messages: - symboltable_and_location Changes: https://git.openjdk.org/jdk/pull/28505/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28505&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8365400 Stats: 1360 lines in 34 files changed: 1034 ins; 162 del; 164 mod Patch: https://git.openjdk.org/jdk/pull/28505.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28505/head:pull/28505 PR: https://git.openjdk.org/jdk/pull/28505 From alanb at openjdk.org Wed Nov 26 12:25:53 2025 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 26 Nov 2025 12:25:53 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v8] In-Reply-To: References: Message-ID: On Wed, 26 Nov 2025 12:01:40 GMT, Kevin Walls wrote: >> I don't see a correctness issue with this. Maybe in the future we will be able to use LazyConstant here. > > Sure, I'm just pointing out that we have a load of existing nameToMBeanMap() methods that do things differently. > OK I now see this one is doing what the new VirtualThreadSchedulerMXBean did. 
> > The others are different: commonly the nameToMBeanMap() methods in PlatformMBeanProviderImpl.java are synchronized, or they call a getXXMXBean() method which is synchronized. > > Maybe these old methods don't need to be synchronized, if this all gets done at startup in PlatformMBeanProviderImpl init(), the mbeans will always be created once. The older code pre-dates unmodifiable maps (JEP 269), it could be modernized some time. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28010#discussion_r2564793415 From lucy at openjdk.org Wed Nov 26 12:42:56 2025 From: lucy at openjdk.org (Lutz Schmidt) Date: Wed, 26 Nov 2025 12:42:56 GMT Subject: RFR: 8371893: [macOS] use dead_strip linker option to reduce binary size [v5] In-Reply-To: References: <5GKe55IFkJgBmku-Y-X4mcoy0V43x-ZXBzwb7EbwqTU=.23fe4819-b951-4364-a7b5-315e2b6d5824@github.com> Message-ID: On Tue, 25 Nov 2025 13:12:06 GMT, Matthias Baesken wrote: >> The dead_strip linker option on macOS removes functions and data that are unreachable by the entry point or exported symbols. >> Setting it can reduce the size of some binaries we generate quite a lot, for example (product build, Xcode 15 is used) : >> (before -> after setting the option) >> >> 1.4M -> 1.1M images/jdk/lib/libfontmanager.dylib >> 264K -> 248K images/jdk/lib/libjavajpeg.dylib >> 152K -> 132K images/jdk/lib/libjli.dylib >> 388K -> 296K images/jdk/lib/liblcms.dylib >> 164K -> 128K images/jdk/lib/libzip.dylib >> >> >> and libjvm : >> >> 20M -> 18M images/jdk/lib/server/libjvm.dylib >> 146M -> 137M images/jdk/lib/server/libjvm.dylib.dSYM > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Use dead_strip on macOS arrch64 AND x86_64 Looks good, finally. Lots of discussion for such a tiny change... :-) ------------- Marked as reviewed by lucy (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28319#pullrequestreview-3510772202 From ayang at openjdk.org Wed Nov 26 12:45:58 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 26 Nov 2025 12:45:58 GMT Subject: RFR: 8372150: Parallel: Tighten requirements around heap sizes with NUMA and Large Pages [v5] In-Reply-To: References: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com> Message-ID: <2CiQdUDN8DPOaM26i-IUY3-TNNslGwEKu5SVLNkh6RI=.c4933be6-c563-4f26-9cc7-3b049419cec6@github.com> On Wed, 26 Nov 2025 11:02:10 GMT, Joel Sikstr?m wrote: >> Hello, >> >> Today, Parallel decides to opt out of using Large pages if the heap size, either minimum, initial or maximum, does not cover enough Large pages for all spaces. Additionally, if we don't get enough heap size for at least one OS page per MutableNUMASpace (one per NUMA-node), Parallel decides to run in a NUMA-degraded mode, where it skips allocating memory locally for some NUMA-nodes. Both of these issues are problematic if we want to start the JVM with a default initial heap size that is equal to the minimum heap size (see [JDK-8371986](https://bugs.openjdk.org/browse/JDK-8371986)). To solve this, we should consider making sure that the minimum heap size is always enough to cover precisely one page per space, where the page size may be Large or not. >> >> For completeness, when user-provided settings for UseNUMA, UseLargePages and heap sizes can't be satisfied at the same time, one must be prioritised over others. Today, we prioritise heap size settings over both UseNUMA and UseLargePages. This change suggest shifting the (primary) priority to UseNUMA and UseLargePages, by bumping MinHeapSize, InitialHeapSize and MaxHeapSize to an adequate number, if not already enough. 
By bumping the minimum heap size to an adequate number, we are also bumping the lower-limit for the initial heap size and maximum heap size, which must be equal to or greater than the minimum heap size. >> >> However, a problem with this approach is that if the Large page size is very large (e.g., 512MB or 1GB), the minimum, initial, and maybe the maximum heap size will be bumped to a very large number as well. To mitigate this impact, we look at what Large page size can be used based on the maximum heap size instead. This is because, running the JVM in default configuration, the maximum heap size will almost always be large enough to cover enough Large pages, so we bump the minimum and initial to that value instead. But, if the maximum heap size is not enough, we opt-out of using Large pages, which is consistent with the old behavior. >> >> Testing: >> * Oracle's tier1-4 >> * tier1-3 with the flags `-XX:+UseParallelGC -XX:+UseLargePages -XX:+UseNUMA` > > Joel Sikstr?m has updated the pull request incrementally with one additional commit since the last revision: > > Re-order methods for consistency in class hierarchy Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28394#pullrequestreview-3510786811 From mbaesken at openjdk.org Wed Nov 26 12:53:03 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 26 Nov 2025 12:53:03 GMT Subject: Integrated: 8371626: [linux] use icf=all for linking libraries In-Reply-To: References: Message-ID: On Tue, 11 Nov 2025 14:27:53 GMT, Matthias Baesken wrote: > Identical code folding can reduce the size of some libs, especially libjvm. However not all linkers support the flag/feature so we have to limit it to some environments. 
> See for example this manpage : > https://manpages.debian.org/testing/lld-7/ld.lld-7.1 > > > sizes of libjvm.so with / without -icf=all > linux aarch64 : 25888 / 27112 K > linux x86_64 : 27952 / 29072 K > > > (for most other native libs the identical code folding has no effect, because there is nothing to fold) This pull request has now been integrated. Changeset: 4ae2f31f Author: Matthias Baesken URL: https://git.openjdk.org/jdk/commit/4ae2f31f3d2027daa0a5ccba6180e7bb27413aa5 Stats: 8 lines in 2 files changed: 8 ins; 0 del; 0 mod 8371626: [linux] use icf=all for linking libraries Reviewed-by: goetz, erikj ------------- PR: https://git.openjdk.org/jdk/pull/28236 From mbaesken at openjdk.org Wed Nov 26 12:53:01 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 26 Nov 2025 12:53:01 GMT Subject: RFR: 8371626: [linux] use icf=all for linking libraries [v3] In-Reply-To: References: Message-ID: On Wed, 12 Nov 2025 15:46:09 GMT, Matthias Baesken wrote: >> Identical code folding can reduce the size of some libs, especially libjvm. However not all linkers support the flag/feature so we have to limit it to some environments. >> See for example this manpage : >> https://manpages.debian.org/testing/lld-7/ld.lld-7.1 >> >> >> sizes of libjvm.so with / without -icf=all >> linux aarch64 : 25888 / 27112 K >> linux x86_64 : 27952 / 29072 K >> >> >> (for most other native libs the identical code folding has no effect, because there is nothing to fold) > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Limit icf to release builds Thanks for the reviews ! 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/28236#issuecomment-3581179600

From kevinw at openjdk.org Wed Nov 26 12:54:50 2025
From: kevinw at openjdk.org (Kevin Walls)
Date: Wed, 26 Nov 2025 12:54:50 GMT
Subject: RFR: 8369736 - Add management interface for AOT cache creation [v8]
In-Reply-To:
References:
Message-ID: <9nPLu2Fgp-9WizodN4foJP6NA0IvpQGyghqUI-LlzYI=.6560feb6-3c80-4e87-9e85-8c66de78df02@github.com>

On Wed, 26 Nov 2025 12:23:26 GMT, Alan Bateman wrote:

>> Sure, I'm just pointing out that we have a load of existing nameToMBeanMap() methods that do things differently.
>> OK I now see this one is doing what the new VirtualThreadSchedulerMXBean did.
>>
>> The others are different: commonly the nameToMBeanMap() methods in PlatformMBeanProviderImpl.java are synchronized, or they call a getXXMXBean() method which is synchronized.
>>
>> Maybe these old methods don't need to be synchronized, if this all gets done at startup in PlatformMBeanProviderImpl init(), the mbeans will always be created once.
>
> The older code pre-dates unmodifiable maps (JEP 269), it could be modernized some time.

Thanks, yes, it would be good to do that some time. It looks like an effort with the older accessors to enforce that they are singletons. That may not be important for all the mxbeans.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28010#discussion_r2564893489

From stefank at openjdk.org Wed Nov 26 13:10:26 2025
From: stefank at openjdk.org (Stefan Karlsson)
Date: Wed, 26 Nov 2025 13:10:26 GMT
Subject: RFR: 8372150: Parallel: Tighten requirements around heap sizes with NUMA and Large Pages [v5]
In-Reply-To:
References: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com>
Message-ID:

On Wed, 26 Nov 2025 11:02:10 GMT, Joel Sikström wrote:

>> Hello,
>>
>> Today, Parallel decides to opt out of using Large pages if the heap size, either minimum, initial or maximum, does not cover enough Large pages for all spaces.
Additionally, if we don't get enough heap size for at least one OS page per MutableNUMASpace (one per NUMA-node), Parallel decides to run in a NUMA-degraded mode, where it skips allocating memory locally for some NUMA-nodes. Both of these issues are problematic if we want to start the JVM with a default initial heap size that is equal to the minimum heap size (see [JDK-8371986](https://bugs.openjdk.org/browse/JDK-8371986)). To solve this, we should consider making sure that the minimum heap size is always enough to cover precisely one page per space, where the page size may be Large or not. >> >> For completeness, when user-provided settings for UseNUMA, UseLargePages and heap sizes can't be satisfied at the same time, one must be prioritised over the others. Today, we prioritise heap size settings over both UseNUMA and UseLargePages. This change suggests shifting the (primary) priority to UseNUMA and UseLargePages, by bumping MinHeapSize, InitialHeapSize and MaxHeapSize to an adequate number, if not already enough. By bumping the minimum heap size to an adequate number, we are also bumping the lower limit for the initial heap size and maximum heap size, which must be equal to or greater than the minimum heap size. >> >> However, a problem with this approach is that if the Large page size is very large (e.g., 512MB or 1GB), the minimum, initial, and maybe the maximum heap size will be bumped to a very large number as well. To mitigate this impact, we look at what Large page size can be used based on the maximum heap size instead. This is because, running the JVM in default configuration, the maximum heap size will almost always be large enough to cover enough Large pages, so we bump the minimum and initial to that value instead. But, if the maximum heap size is not enough, we opt out of using Large pages, which is consistent with the old behavior.
>> >> Testing: >> * Oracle's tier1-4 >> * tier1-3 with the flags `-XX:+UseParallelGC -XX:+UseLargePages -XX:+UseNUMA` > > Joel Sikstr?m has updated the pull request incrementally with one additional commit since the last revision: > > Re-order methods for consistency in class hierarchy Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28394#pullrequestreview-3510885908 From shade at openjdk.org Wed Nov 26 13:49:29 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 26 Nov 2025 13:49:29 GMT Subject: RFR: 8357258: x86: Improve receiver type profiling reliability [v4] In-Reply-To: References: Message-ID: > See the bug for discussion what issues current machinery has. > > This PR executes the plan outlined in the bug: > 1. Common the receiver type profiling code in interpreter and C1 > 2. Rewrite receiver type profiling code to only do atomic receiver slot installations > 3. Trim `C1OptimizeVirtualCallProfiling` to only claim slots when receiver is installed > > This PR does _not_ do atomic counter updates themselves, as it may have much wider performance implications, including regressions. This PR should be at least performance neutral. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `compiler/` > - [x] Linux x86_64 server fastdebug, `all` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: - Grossly simplify register shuffling - More asserts - More comment touchups - Inline code comments - Mention the updater in ReceiverTypeData - type_profile -> profile_receiver_type - Stylistic: remove redundant assert - Merge branch 'master' into JDK-8357258-x86-c1-optimize-virt-calls - Merge branch 'master' into JDK-8357258-x86-c1-optimize-virt-calls - Merge branch 'master' into JDK-8357258-x86-c1-optimize-virt-calls - ... 
and 2 more: https://git.openjdk.org/jdk/compare/5291e1c1...33e4edb1 ------------- Changes: https://git.openjdk.org/jdk/pull/25305/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25305&range=03 Stats: 381 lines in 8 files changed: 165 ins; 197 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/25305.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25305/head:pull/25305 PR: https://git.openjdk.org/jdk/pull/25305 From shade at openjdk.org Wed Nov 26 13:49:31 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 26 Nov 2025 13:49:31 GMT Subject: RFR: 8357258: x86: Improve receiver type profiling reliability [v3] In-Reply-To: References: Message-ID: On Wed, 24 Sep 2025 13:08:14 GMT, Aleksey Shipilev wrote: >> See the bug for discussion what issues current machinery has. >> >> This PR executes the plan outlined in the bug: >> 1. Common the receiver type profiling code in interpreter and C1 >> 2. Rewrite receiver type profiling code to only do atomic receiver slot installations >> 3. Trim `C1OptimizeVirtualCallProfiling` to only claim slots when receiver is installed >> >> This PR does _not_ do atomic counter updates themselves, as it may have much wider performance implications, including regressions. This PR should be at least performance neutral. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler/` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: > > - Merge branch 'master' into JDK-8357258-x86-c1-optimize-virt-calls > - Merge branch 'master' into JDK-8357258-x86-c1-optimize-virt-calls > - Drop atomic counters > - Initial version When looking at this PR again, I realized shuffling could be much simpler if we do it outside the loop. I am testing new revision now and would do a few touchups. I'll say when the patch is ready for more thorough look. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25305#issuecomment-3581395413 From shade at openjdk.org Wed Nov 26 13:49:33 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 26 Nov 2025 13:49:33 GMT Subject: RFR: 8357258: x86: Improve receiver type profiling reliability [v3] In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 17:10:33 GMT, John R Rose wrote: >> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: >> >> - Merge branch 'master' into JDK-8357258-x86-c1-optimize-virt-calls >> - Merge branch 'master' into JDK-8357258-x86-c1-optimize-virt-calls >> - Drop atomic counters >> - Initial version > > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4845: > >> 4843: push(temp_reg); >> 4844: movptr(temp_reg, recv); >> 4845: recv_reg = temp_reg; > > I can mentally do the appropriate `assert_different_registers` here, but an explicit one to confirm would be better. > (Same comment for the next arm of the if/else.) https://github.com/openjdk/jdk/pull/25305#issuecomment-3581395413 :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25305#discussion_r2565073504 From chagedorn at openjdk.org Wed Nov 26 14:31:58 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Wed, 26 Nov 2025 14:31:58 GMT Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v4] In-Reply-To: <6qShqR-Ohv7vamoJ_B4Ev-poU8SB96eTBo4HFJrylcI=.dac5a26f-c9f0-445b-8f1c-a7c719fa27ae@github.com> References: <6qShqR-Ohv7vamoJ_B4Ev-poU8SB96eTBo4HFJrylcI=.dac5a26f-c9f0-445b-8f1c-a7c719fa27ae@github.com> Message-ID: <4QQp7C7iIVfVs1MoUMC56KCgVGpXu5ziTHfZ-f2pk6o=.4ca7e1a8-3f31-44d3-aaec-30429ed7e2b0@github.com> On Tue, 25 Nov 2025 12:52:35 GMT, Roland Westrelin wrote: >> This is a variant of 8332827. In 8332827, an array access becomes >> dependent on a range check `CastII` for another array access. 
When, >> after loop opts are over, that RC `CastII` was removed, the array >> access could float and an out-of-bounds access happened. With the fix >> for 8332827, RC `CastII`s are no longer removed. >> >> With this one what happens is that some transformations applied after >> loop opts are over widen the type of the RC `CastII`. As a result, the >> type of the RC `CastII` is no longer narrower than that of its input, >> the `CastII` is removed and the dependency is lost. >> >> There are 2 transformations that cause this to happen: >> >> - after loop opts are over, the types of the `CastII` nodes are widened >> so nodes that have the same inputs but a slightly different type can >> common. >> >> - When pushing a `CastII` through an `Add`, if the types of both inputs >> of the `Add` are non-constant, then we end up widening the type >> (the resulting `Add` has a type that's wider than that of the >> initial `CastII`). >> >> There are already 3 types of `Cast` nodes depending on the >> optimizations that are allowed. Either the `Cast` is floating >> (`depends_only_test()` returns `true`) or pinned. Either the `Cast` >> can be removed if it no longer narrows the type of its input or >> not. We already have variants of the `CastII`: >> >> - if the Cast can float and can be removed when it doesn't narrow the type >> of its input. >> >> - if the Cast is pinned and can be removed when it doesn't narrow the type >> of its input. >> >> - if the Cast is pinned and can't be removed when it doesn't narrow >> the type of its input. >> >> What we need here, I think, is the 4th combination: >> >> - if the Cast can float and can't be removed when it doesn't narrow >> the type of its input. >> >> Anyway, things are becoming confusing with all these different >> variants named in ways that don't always help figure out what >> constraints each of them operates under. So I refactored this and that's >> the biggest part of this change.
The fix consists in marking `Cast` >> nodes when their type is widen in a way that prevents them from being >> optimized out. >> >> Tobias ran performance testing with a slightly different version of >> this change and there was no regression. > > Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains nine commits: > > - review > - review > - Merge branch 'master' into JDK-8354282 > - review > - infinite loop in gvn fix > - renaming > - merge > - Merge branch 'master' into JDK-8354282 > - fix & test Introducing a 4th dependency type looks reasonable. It's also nice to see one more refactoring in that area which makes it very expressive now. Thanks for doing that! I left some suggestions to possibly further improve the code. src/hotspot/share/opto/castnode.cpp line 40: > 38: const ConstraintCastNode::DependencyType ConstraintCastNode::DependencyType::FloatingNonNarrowing(true, false, "floating non narrowing dependency"); // not pinned, doesn't narrow type > 39: const ConstraintCastNode::DependencyType ConstraintCastNode::DependencyType::NonFloatingNarrowing(false, true, "now floating narrowing dependency"); // pinned, narrows type > 40: const ConstraintCastNode::DependencyType ConstraintCastNode::DependencyType::NonFloatingNonNarrowing(false, false, "non floating non narrowing dependency"); // pinned, doesn't narrow type Adding `-`: Suggestion: const ConstraintCastNode::DependencyType ConstraintCastNode::DependencyType::FloatingNonNarrowing(true, false, "floating non-narrowing dependency"); // not pinned, doesn't narrow type const ConstraintCastNode::DependencyType ConstraintCastNode::DependencyType::NonFloatingNarrowing(false, true, "non-floating narrowing dependency"); // pinned, narrows type const ConstraintCastNode::DependencyType ConstraintCastNode::DependencyType::NonFloatingNonNarrowing(false, false, "non-floating non-narrowing dependency"); // pinned, doesn't narrow type 
src/hotspot/share/opto/castnode.cpp line 50: > 48: if (!_dependency.narrows_type()) { > 49: return this; > 50: } I suggest to split the comment to make it more clear: Suggestion: if (!_dependency.narrows_type()) { // If this cast doesn't carry a type dependency (i.e. not used for type narrowing), we cannot optimize it. return this; } // This cast node carries a type depedency. We can remove it if: // - Its input has a narrower type // - There's a dominating cast with same input but narrower type src/hotspot/share/opto/castnode.cpp line 634: > 632: if (wide_t != bottom_t) { > 633: // Widening the type of the Cast (to allow some commoning) causes the Cast to change how it can be optimized (if > 634: // type of its input is narrower than the Cast's type, we can't remove it to not loose the dependency). Suggestion: // type of its input is narrower than the Cast's type, we can't remove it to not loose the control dependency). src/hotspot/share/opto/castnode.hpp line 101: > 99: } > 100: return NonFloatingNonNarrowing; > 101: } Just a side note: We seem to mix the terms "(non-)pinned" with "(non-)floating" freely. Should we stick to just one? But maybe it's justified to use both depending on the situation/code context. src/hotspot/share/opto/castnode.hpp line 120: > 118: // be removed in any case otherwise the sunk node floats back into the loop. > 119: static const DependencyType NonFloatingNonNarrowing; > 120: I needed a moment to completely understand all these combinations. I rewrote the definitions in this process a little bit. Feel free to take some of it over: // All the possible combinations of floating/narrowing with example use cases: // Use case example: Range Check CastII // Floating: The Cast is only dependent on the single range check. // Narrowing: The Cast narrows the type to a positive index. If the input to the Cast is narrower, we can safely // remove the cast because the array access will be safe. 
static const DependencyType FloatingNarrowing; // Use case example: Widening Cast nodes' types after loop opts: We want to common Casts with slightly different types. // Floating: These Casts only depend on the single control. // NonNarrowing: Even when the input type is narrower, we are not removing the Cast. Otherwise, the dependency // to the single control is lost, and an array access could float above its range check because we // just removed the dependency to the range check by removing the Cast. This could lead to an // out-of-bounds access. static const DependencyType FloatingNonNarrowing; // Use case example: An array accesses that is no longer dependent on a single range check (e.g. range check smearing). // NonFloating: The array access must be pinned below all the checks it depends on. If the check it directly depends // on with a control input is hoisted, we do hoist the Cast as well. If we allowed the Cast to float, // we risk that the array access ends up above another check it depends on (we cannot model two control // dependencies for a node in the IR). This could lead to an out-of-bounds access. // Narrowing: If the Cast does not narrow the input type, then it's safe to remove the cast because the array access // will be safe. static const DependencyType NonFloatingNarrowing; // Use case example: Sinking nodes out of a loop // Non-Floating & Non-Narrowing: We don't want the Cast that forces the node to be out of loop to be removed in any // case. Otherwise, the sunk node could float back into the loop, undoing the sinking. // This Cast is only used for pinning without caring about narrowing types. 
static const DependencyType NonFloatingNonNarrowing; test/hotspot/jtreg/compiler/c2/irTests/TestPushAddThruCast.java line 100: > 98: @Run(test = "test3") > 99: public static void test3_runner() { > 100: i = RANDOM.nextInt(3, length-1); Suggestion: i = RANDOM.nextInt(3, length - 1); ------------- PR Review: https://git.openjdk.org/jdk/pull/24575#pullrequestreview-3510584501 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2565071692 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2565111822 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2565208320 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2565130012 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2565000528 PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2565211189 From fthevene at redhat.com Wed Nov 26 14:53:51 2025 From: fthevene at redhat.com (Frederic Thevenet) Date: Wed, 26 Nov 2025 15:53:51 +0100 Subject: Any reason why +PrintFlagsFinal requires unlocking experimental and diagnostic flags to print their default values? Message-ID: <849ebf93-c3fb-45c9-92c8-21a49b3e9946@redhat.com> Hi, Currently, using +PrintFlagsFinal prints out all JVM flags and their values, even if they were not modified from their default, except for 'locked' flags, i.e. Experimental and Diagnostic flags. In order to have those printed out as well, one must first 'unlock' them (with +UnlockExperimentalVMOptions, for instance). Now, is there a strong reason for not always displaying the default values for those in scenarios where there are no concerns that the output might be too large (that is, when calling upon 'JVMFlag::printFlags' with 'skipDefaults' set to false, like PrintFlagsFinal does)?
The reason for this question is that when chasing a bug in scenarios where one can only rely on logs or output provided by tools that use +PrintFlagsFinal, getting the default values *under the conditions in which those logs were produced* can be tricky as it depends on the exact version of the JDK that was running, and some values can be changed by ergonomics. If you need to know the default for experimental flags -- which given their nature can and do change often -- your choices are to either ask for these logs to be generated again using +UnlockExperimentalVMOptions (even if there is no intention of changing an experimental flag) or to go on a time-consuming deep dive into the code base for the exact version of the JDK that was used. Neither is ideal. Please note that in cases where unchanged flags are not printed out (like in an hs_err report), no longer requiring to unlock all flags to print them out would have no side effect, i.e. it would not increase the amount of noise in the report (as in this case only modified flags are printed out, and for experimental flags this can only happen if '+UnlockExperimentalVMOptions' is set to begin with). Regards, -- Frederic Thevenet Senior Software Engineer - OpenJDK Red Hat France BAF5 C2D2 0BE0 1715 5EE1 0815 2065 AD47 B326 EB92 From shade at openjdk.org Wed Nov 26 15:55:38 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 26 Nov 2025 15:55:38 GMT Subject: RFR: 8357258: x86: Improve receiver type profiling reliability [v5] In-Reply-To: References: Message-ID: > See the bug for discussion what issues current machinery has. > > This PR executes the plan outlined in the bug: > 1. Common the receiver type profiling code in interpreter and C1 > 2. Rewrite receiver type profiling code to only do atomic receiver slot installations > 3.
Trim `C1OptimizeVirtualCallProfiling` to only claim slots when receiver is installed > > This PR does _not_ do atomic counter updates themselves, as it may have much wider performance implications, including regressions. This PR should be at least performance neutral. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `compiler/` > - [x] Linux x86_64 server fastdebug, `all` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: - Merge branch 'master' into JDK-8357258-x86-c1-optimize-virt-calls - Tighten up some more - Offset is always rscratch1, no need to save it - Grossly simplify register shuffling - More asserts - More comment touchups - Inline code comments - Mention the updater in ReceiverTypeData - type_profile -> profile_receiver_type - Stylistic: remove redundant assert - ... and 5 more: https://git.openjdk.org/jdk/compare/c028369d...c441209a ------------- Changes: https://git.openjdk.org/jdk/pull/25305/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25305&range=04 Stats: 383 lines in 8 files changed: 167 ins; 197 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/25305.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25305/head:pull/25305 PR: https://git.openjdk.org/jdk/pull/25305 From mablakatov at openjdk.org Wed Nov 26 16:13:16 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Wed, 26 Nov 2025 16:13:16 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v8] In-Reply-To: References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Message-ID: On Mon, 10 Nov 2025 22:02:06 GMT, Evgeny Astigeevich wrote: >> test/hotspot/jtreg/compiler/sharedstubs/SharedRuntimeCallTrampolineTest.java line 87: >> >>> 85: >>> 86: private static void checkOutput(OutputAnalyzer output) { >>> 87: String testMethodStdout = getTestMethodStdout(output); >> >> Can you 
add a description what output format is expected? Adding an example will help a lot. > > The test expects runtime calls. What will result them to appear? I've added some comments and extended the regular expressions so it's hopefully more clear what lines the code expects. Please see https://github.com/openjdk/jdk/pull/25954/commits/ac36641d980c057fb7060e0edd782514a926601f. Runtime calls are emitted for the two `new` expressions in `RuntimeCallTest.test(int)`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2565611140 From mablakatov at openjdk.org Wed Nov 26 16:13:23 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Wed, 26 Nov 2025 16:13:23 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v8] In-Reply-To: References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Message-ID: <6tixd47-R7Swo7i7pBpj5ds7zywOFd1SmchD6CZ91ug=.9170a74c-b7c2-453f-9878-af7bc63f1a7f@github.com> On Mon, 10 Nov 2025 21:58:21 GMT, Evgeny Astigeevich wrote: >> Mikhail Ablakatov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 18 commits: >> >> - Merge commit 'f6f87bb6759c86d941453a1776e8abfdffc48183' into 8359359 >> - the only trampoline in ArrayCopyStub is never shared >> - fixup: a shared trampoline must branch to a statically bound method >> - share static call trampolines generated by C1 as well >> - assert callee is nullptr for runtime calls >> - assert that call sites offsets aren't missing >> - cleanup: rephrase comments in macroAssembler_aarch64.hpp >> - Merge commit 'fd29677479797956e0d205b5ce6e7cb9ad407bd1' into 8359359 >> - Merge commit '41520998aa8808452ee384b213b2a77c7bad668d' >> - remove implementation-dependent logic from emit_shared_trampolines() >> - ... 
and 8 more: https://git.openjdk.org/jdk/compare/f6f87bb6...871903f4 > > test/hotspot/jtreg/compiler/sharedstubs/SharedRuntimeCallTrampolineTest.java line 107: > >> 105: .map(reloc -> new String(reloc.addr())) >> 106: .collect(Collectors.toList()); >> 107: if (trampolineAddrs.stream().distinct().count() >= trampolineAddrs.size()) { > > For better readability, could you please create a meaningful variable for `trampolineAddrs.stream().distinct().count()`? You can reuse it in the exception message as well. Done, please see https://github.com/openjdk/jdk/pull/25954/commits/ac36641d980c057fb7060e0edd782514a926601f > test/hotspot/jtreg/compiler/sharedstubs/SharedStaticCallTrampolineTest.java line 53: > >> 51: import jdk.test.lib.process.ProcessTools; >> 52: >> 53: public class SharedStaticCallTrampolineTest { > > Similar comments as to `SharedRuntimeCallTrampolineTest.java` Done, please see https://github.com/openjdk/jdk/commit/ac36641d980c057fb7060e0edd782514a926601f. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2565611348 PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2565612724 From roland at openjdk.org Wed Nov 26 16:14:43 2025 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 26 Nov 2025 16:14:43 GMT Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v5] In-Reply-To: References: Message-ID: > This is a variant of 8332827. In 8332827, an array access becomes > dependent on a range check `CastII` for another array access. When, > after loop opts are over, that RC `CastII` was removed, the array > access could float and an out of bound access happened. With the fix > for 8332827, RC `CastII`s are no longer removed. > > With this one what happens is that some transformations applied after > loop opts are over widen the type of the RC `CastII`. 
As a result, the > type of the RC `CastII` is no longer narrower than that of its input, > the `CastII` is removed and the dependency is lost. > > There are 2 transformations that cause this to happen: > > - after loop opts are over, the types of the `CastII` nodes are widened > so nodes that have the same inputs but a slightly different type can > common. > > - When pushing a `CastII` through an `Add`, if the types of both inputs > of the `Add` are non-constant, then we end up widening the type > (the resulting `Add` has a type that's wider than that of the > initial `CastII`). > > There are already 3 types of `Cast` nodes depending on the > optimizations that are allowed. Either the `Cast` is floating > (`depends_only_test()` returns `true`) or pinned. Either the `Cast` > can be removed if it no longer narrows the type of its input or > not. We already have variants of the `CastII`: > > - if the Cast can float and can be removed when it doesn't narrow the type > of its input. > > - if the Cast is pinned and can be removed when it doesn't narrow the type > of its input. > > - if the Cast is pinned and can't be removed when it doesn't narrow > the type of its input. > > What we need here, I think, is the 4th combination: > > - if the Cast can float and can't be removed when it doesn't narrow > the type of its input. > > Anyway, things are becoming confusing with all these different > variants named in ways that don't always help figure out what > constraints each of them operates under. So I refactored this and that's > the biggest part of this change. The fix consists in marking `Cast` > nodes when their type is widened in a way that prevents them from being > optimized out. > > Tobias ran performance testing with a slightly different version of > this change and there was no regression.
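The four floating/narrowing combinations in the description above can be pictured with a small toy model. This is only a sketch in Java for illustration: the real code is C++ in `castnode.hpp`, and the enum, the `removableWhenNotNarrowing()` and `withNonNarrowing()` helpers, and the printing are assumptions made here, not names from the patch. The only things taken from the thread are the four combination names and the (floating, narrows type) pairing.

```java
// Toy model only: HotSpot's real ConstraintCastNode::DependencyType is C++.
// All method names below are hypothetical and chosen for illustration.
class CastDependencyModel {
    enum DependencyType {
        FLOATING_NARROWING(true, true),          // may float, removable once it no longer narrows
        FLOATING_NON_NARROWING(true, false),     // may float, must never be removed (the new 4th case)
        NON_FLOATING_NARROWING(false, true),     // pinned, removable once it no longer narrows
        NON_FLOATING_NON_NARROWING(false, false); // pinned, must never be removed

        final boolean floating;
        final boolean narrowsType;

        DependencyType(boolean floating, boolean narrowsType) {
            this.floating = floating;
            this.narrowsType = narrowsType;
        }

        // A cast may be dropped only if its sole purpose was narrowing the type.
        boolean removableWhenNotNarrowing() {
            return narrowsType;
        }

        // Assumed modelling of the fix: once the type of a cast is widened,
        // keep whether it floats but stop treating it as removable.
        DependencyType withNonNarrowing() {
            return floating ? FLOATING_NON_NARROWING : NON_FLOATING_NON_NARROWING;
        }
    }

    public static void main(String[] args) {
        for (DependencyType d : DependencyType.values()) {
            System.out.println(d + ": floating=" + d.floating
                + ", removable=" + d.removableWhenNotNarrowing()
                + ", after widening=" + d.withNonNarrowing());
        }
    }
}
```

In this model a range check `CastII` that starts as `FLOATING_NARROWING` becomes `FLOATING_NON_NARROWING` after widening, which is the combination the description says was missing.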
Roland Westrelin has updated the pull request incrementally with four additional commits since the last revision: - Update src/hotspot/share/opto/castnode.cpp Co-authored-by: Christian Hagedorn - Update src/hotspot/share/opto/castnode.cpp Co-authored-by: Christian Hagedorn - Update src/hotspot/share/opto/castnode.cpp Co-authored-by: Christian Hagedorn - Update test/hotspot/jtreg/compiler/c2/irTests/TestPushAddThruCast.java Co-authored-by: Christian Hagedorn ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24575/files - new: https://git.openjdk.org/jdk/pull/24575/files/3569280e..2aa918e2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24575&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24575&range=03-04 Stats: 13 lines in 2 files changed: 5 ins; 3 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/24575.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24575/head:pull/24575 PR: https://git.openjdk.org/jdk/pull/24575 From mablakatov at openjdk.org Wed Nov 26 16:25:27 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Wed, 26 Nov 2025 16:25:27 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method [v8] In-Reply-To: <52JasB76bWD5S9vvAGsjitHHblK0jBqPNGnHr_x1lmM=.2940f5b7-ae97-4809-a4ea-3b4a64df961f@github.com> References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> <52JasB76bWD5S9vvAGsjitHHblK0jBqPNGnHr_x1lmM=.2940f5b7-ae97-4809-a4ea-3b4a64df961f@github.com> Message-ID: On Mon, 10 Nov 2025 22:19:19 GMT, Evgeny Astigeevich wrote: >> Mikhail Ablakatov has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 18 commits: >> >> - Merge commit 'f6f87bb6759c86d941453a1776e8abfdffc48183' into 8359359 >> - the only trampoline in ArrayCopyStub is never shared >> - fixup: a shared trampoline must branch to a statically bound method >> - share static call trampolines generated by C1 as well >> - assert callee is nullptr for runtime calls >> - assert that call sites offsets aren't missing >> - cleanup: rephrase comments in macroAssembler_aarch64.hpp >> - Merge commit 'fd29677479797956e0d205b5ce6e7cb9ad407bd1' into 8359359 >> - Merge commit '41520998aa8808452ee384b213b2a77c7bad668d' >> - remove implementation-dependent logic from emit_shared_trampolines() >> - ... and 8 more: https://git.openjdk.org/jdk/compare/f6f87bb6...871903f4 > > test/hotspot/jtreg/compiler/sharedstubs/SharedStaticCallTrampolineTest.java line 121: > >> 119: .filter(addr -> Collections.frequency(trampolineAddrs, addr) == 1) >> 120: .collect(Collectors.toList()); >> 121: if (uniqueTrampolineAddrs.size() == 0) { > > Should we expect this to be 1? Possible values: 0, 1 or 3? > 0 - incorrect mapping of a call site > 1 - everything is correct > 3 - sharing does not work Yes, the code expects this to be 1. I'd argue that testing for 3 here would duplicate the check already performed above ([line 119](https://github.com/openjdk/jdk/pull/25954/files#diff-96cb53c21e0d1344a4359e395671d20c5c4dc8493a87c7716936f35258f939cfR119)). Currently, `checkOutput(OutputAnalyzer)` explicitly checks for two kinds of incorrect behaviors: 1. No trampoline stubs shared across static calls 2. A trampoline stub is shared while it should not be If neither of these issues is detected, the code implicitly treats the behavior as correct. 
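The two checks described above can be sketched roughly as follows. This is a simplified stand-alone illustration, not the test's actual code: the class name, the `check` helper, the exception messages and the sample addresses are hypothetical. Only the `distinct().count()` comparison and the `Collections.frequency(...) == 1` filter come from the snippets quoted in this thread.

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-alone version of the two checks; names are illustrative only.
class TrampolineSharingCheck {
    // trampolineAddrs holds one trampoline address per static call site.
    static void check(List<String> trampolineAddrs) {
        // Check 1: at least two call sites must point at the same trampoline.
        long distinctCount = trampolineAddrs.stream().distinct().count();
        if (distinctCount >= trampolineAddrs.size()) {
            throw new RuntimeException("No trampoline stubs shared: " + distinctCount
                + " distinct addresses for " + trampolineAddrs.size() + " call sites");
        }
        // Check 2: some trampoline must be used by exactly one call site,
        // otherwise a stub was shared that should not have been.
        List<String> uniqueTrampolineAddrs = trampolineAddrs.stream()
            .filter(addr -> Collections.frequency(trampolineAddrs, addr) == 1)
            .collect(Collectors.toList());
        if (uniqueTrampolineAddrs.isEmpty()) {
            throw new RuntimeException("A trampoline stub is shared while it should not be");
        }
    }

    public static void main(String[] args) {
        // Made-up addresses: two calls share 0x1000, one call has its own stub.
        check(List.of("0x1000", "0x1000", "0x2000"));
        System.out.println("checks passed");
    }
}
```

With three call sites, three distinct addresses fail the first check, and a single address used by all three fails the second, which matches the reasoning that a separate test for "3" would duplicate the first check.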
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2565648787 From macarte at openjdk.org Wed Nov 26 16:26:04 2025 From: macarte at openjdk.org (Mat Carter) Date: Wed, 26 Nov 2025 16:26:04 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v9] In-Reply-To: References: Message-ID: > Add jdk.management.AOTCacheMXBean. The interface provides a single action that, when called, causes a hosted JVM that is currently recording AOT information to stop recording. Existing functionality is preserved: when stopped, the JVM will create the required artifacts based on the execution mode. Conveniently, as the application running on the JVM has not stopped (as was previously the only way to stop recording), the application will resume execution after the artifacts have been generated. > > The interface will return TRUE if a recording was successfully stopped; in all other cases (not recording, etc.) it will return FALSE. > > It follows that invoking the action on a JVM that is recording, twice in succession, should (barring internal errors) produce the following two responses: > > TRUE > FALSE > > Passes tier1 on linux (x64) and windows (x64) Mat Carter has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: - Merge branch 'master' into JDK-8369736 - Remove single whitespace - Incorporate changes from the CSR - Revert "Adding test to validate using DiagnosticCommand MBean to invoke AOT.end_recording" Commit was intended for parent branch (that this branch is based on) This reverts commit bff7cb7408554232c13a57bba10b67a9fd19b811.
- Adding test to validate using DiagnosticCommand MBean to invoke AOT.end_recording - Updated test based on comments - Merge branch 'JDK-8369736' of https://github.com/macarte/jdk into JDK-8369736 - Update src/jdk.management/share/classes/jdk/management/HotSpotAOTCacheMXBean.java Co-authored-by: Dan Heidinga - Update src/jdk.management/share/classes/jdk/management/HotSpotAOTCacheMXBean.java Co-authored-by: Dan Heidinga - Wording and format changes - ... and 5 more: https://git.openjdk.org/jdk/compare/c028369d...a12bfa03 ------------- Changes: https://git.openjdk.org/jdk/pull/28010/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28010&range=08 Stats: 433 lines in 11 files changed: 338 ins; 0 del; 95 mod Patch: https://git.openjdk.org/jdk/pull/28010.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28010/head:pull/28010 PR: https://git.openjdk.org/jdk/pull/28010 From vpaprotski at openjdk.org Wed Nov 26 16:47:23 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Wed, 26 Nov 2025 16:47:23 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v7] In-Reply-To: References: Message-ID: On Wed, 26 Nov 2025 00:55:18 GMT, David Holmes wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: >> >> comments from Jatin > > The new test can only run on x86 but it is not restricted to x86, thus it fails when run on Aarch64. @dholmes-ora Hi David, need some help with this please, don't have access to an ARM system to reproduce (or the ARM expertise).. could you point me at the failing job if thats available? Or some log if not? - Is it an issue with the options (i.e. `-XX:UseAVX=2` perhaps). I probably should had added `-XX:+IgnoreUnrecognizedVMOptions` to it.. - Otherwise, I am stumped.. the test case isn't architecture-specific.. it calls two methods (one of which is annotated as an intrinsic..) and expects them to return the same value.. i.e. 
Java and Intrinsic version should behave the same.. - Only thing I can think of.. The ARM implementation took some shortcuts in name of optimization. This can be entirely valid if the code calling the intrinsics never should get some specific value (-ranges). i.e. the tests RNG be further restricted.. - Otherwise.. is it possible its a bug in the ARM intrinsic? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3582205814 From vpaprotski at openjdk.org Wed Nov 26 16:52:03 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Wed, 26 Nov 2025 16:52:03 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v3] In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 21:00:39 GMT, Anthony Scarpino wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: >> >> next set of comments > > Marked as reviewed by ascarpino (Reviewer). Oh.. realized that I should had checked JBS.. thanks @ascarpino for resolving the bug I caused! At least its just the option.. whew. > @dholmes-ora Hi David, need some help with this please, don't have access to an ARM system to reproduce (or the ARM expertise).. could you point me at the failing job if thats available? Or some log if not? > > * Is it an issue with the options (i.e. `-XX:UseAVX=2` perhaps). I probably should had added `-XX:+IgnoreUnrecognizedVMOptions` to it.. > * Otherwise, I am stumped.. the test case isn't architecture-specific.. it calls two methods (one of which is annotated as an intrinsic..) and expects them to return the same value.. i.e. Java and Intrinsic version should behave the same.. > * Only thing I can think of.. The ARM implementation took some shortcuts in name of optimization. This can be entirely valid if the code calling the intrinsics never should get some specific value (-ranges). i.e. the tests RNG be further restricted.. > * Otherwise.. 
is it possible its a bug in the ARM intrinsic? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3582226267 From eastigeevich at openjdk.org Wed Nov 26 17:00:22 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Wed, 26 Nov 2025 17:00:22 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v8] In-Reply-To: References: Message-ID: > Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1. > > Neoverse-N1 implementations mitigate erratum 1542419 with a workaround: > - Disable coherent icache. > - Trap IC IVAU instructions. > - Execute: > - `tlbi vae3is, xzr` > - `dsb sy` > > `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address). It waits for all memory accesses using in-scope old translation information to complete before it is considered complete. > > As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests: > > "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround." > > This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions. > > Changes include: > > * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization. 
> * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address. > * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures. > * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk. > > Benchmarking results: Neoverse-N1 r3p1 (Graviton 2) > > - Baseline > > $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGC... Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: Fix regressions for Java methods without field accesses ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28328/files - new: https://git.openjdk.org/jdk/pull/28328/files/17456558..d36be373 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=06-07 Stats: 43 lines in 9 files changed: 27 ins; 1 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/28328.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28328/head:pull/28328 PR: https://git.openjdk.org/jdk/pull/28328 From eastigeevich at openjdk.org Wed Nov 26 17:04:52 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Wed, 26 Nov 2025 17:04:52 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v6] In-Reply-To: References: Message-ID: On Wed, 26 Nov 2025 09:04:19 GMT, Axel 
Boldt-Christmas wrote:

>> Evgeny Astigeevich has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Remove redundant include
>> - Move ICacheInvalidationContext::pd_ to icache_linux_aarch64
>
> Some style comments.

@xmas92 I fixed regressions for Java methods without field accesses I saw:

- `-XX:+NeoverseN1Errata1542419` before the fix

Benchmark                       (accessedFieldCount)  (methodCount)  Mode  Cnt   Score    Error  Units
GCPatchingNmethodCost.fullGC                       0           5000  avgt    3  88.865 ± 19.299  ms/op
GCPatchingNmethodCost.systemGC                     0           5000  avgt    3  90.572 ± 14.750  ms/op
GCPatchingNmethodCost.youngGC                      0           5000  avgt    3  10.219 ±  0.877  ms/op

- `-XX:+NeoverseN1Errata1542419` after the fix

Benchmark                       (accessedFieldCount)  (methodCount)  Mode  Cnt   Score    Error  Units
GCPatchingNmethodCost.fullGC                       0           5000  avgt    3  60.847 ± 23.735  ms/op
GCPatchingNmethodCost.systemGC                     0           5000  avgt    3  62.338 ±  5.663  ms/op
GCPatchingNmethodCost.youngGC                      0           5000  avgt    3   4.956 ±  1.440  ms/op

- `-XX:-NeoverseN1Errata1542419`

Benchmark                       (accessedFieldCount)  (methodCount)  Mode  Cnt   Score    Error  Units
GCPatchingNmethodCost.fullGC                       0           5000  avgt    3  67.144 ± 15.187  ms/op
GCPatchingNmethodCost.systemGC                     0           5000  avgt    3  70.181 ± 30.271  ms/op
GCPatchingNmethodCost.youngGC                      0           5000  avgt    3   7.906 ±  2.118  ms/op

I'll check SpecJVM as well.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3582302388

From eastigeevich at openjdk.org Wed Nov 26 18:30:23 2025
From: eastigeevich at openjdk.org (Evgeny Astigeevich)
Date: Wed, 26 Nov 2025 18:30:23 GMT
Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v9]
In-Reply-To:
References:
Message-ID:

> Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1.
>
> Neoverse-N1 implementations mitigate erratum 1542419 with a workaround:
> - Disable coherent icache.
> - Trap IC IVAU instructions.
> - Execute: > - `tlbi vae3is, xzr` > - `dsb sy` > > `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address). It waits for all memory accesses using in-scope old translation information to complete before it is considered complete. > > As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests: > > "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround." > > This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions. > > Changes include: > > * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization. > * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address. > * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures. > * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk. 
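The batching idea in the list above can be illustrated with a minimal C++ sketch. This is not the PR's `ICacheInvalidationContext`: `DeferredInvalidationScope`, `expensive_icache_invalidate` and `patch_all` are hypothetical names chosen for illustration, and the invalidation is modelled by a counter rather than by real cache maintenance. The only point shown is that patching N instructions inside one deferring scope costs a single invalidation instead of N.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Counts how often the (modelled) expensive invalidation runs.
int g_invalidation_count = 0;

// Hypothetical stand-in for a trapped icache invalidation; it only counts calls.
void expensive_icache_invalidate(const void* /*addr*/) {
  ++g_invalidation_count;
}

// Hypothetical RAII scope (an assumption, not HotSpot code): when deferring,
// patches only mark the scope as dirty and one invalidation is issued on exit.
class DeferredInvalidationScope {
 public:
  DeferredInvalidationScope(const void* code_start, bool defer)
      : _code_start(code_start), _defer(defer), _pending(false) {}

  DeferredInvalidationScope(const DeferredInvalidationScope&) = delete;
  DeferredInvalidationScope& operator=(const DeferredInvalidationScope&) = delete;

  ~DeferredInvalidationScope() {
    // The exact address is assumed not to matter, so the code start is used.
    if (_pending) {
      expensive_icache_invalidate(_code_start);
    }
  }

  // Writes one instruction word and either invalidates now or defers.
  void patch(std::uint32_t* insn, std::uint32_t value) {
    *insn = value;
    if (_defer) {
      _pending = true;
    } else {
      expensive_icache_invalidate(insn);
    }
  }

 private:
  const void* _code_start;
  bool _defer;
  bool _pending;
};

// Patches every word in `code` and returns how many invalidations were issued.
int patch_all(std::vector<std::uint32_t>& code, bool defer) {
  g_invalidation_count = 0;
  {
    DeferredInvalidationScope scope(code.data(), defer);
    for (std::uint32_t& insn : code) {
      scope.patch(&insn, 0xd503201fu);  // AArch64 NOP encoding
    }
  }  // a deferred invalidation, if any, happens here
  return g_invalidation_count;
}
```

With eight patched words, `patch_all(code, false)` issues eight invalidations and `patch_all(code, true)` issues one, which is the effect the `defer_icache_invalidation` parameter is described as enabling.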
> > Benchmarking results: Neoverse-N1 r3p1 (Graviton 2) > > - Baseline > > $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGC... Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: Fix linux-cross-compile aarch64 build ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28328/files - new: https://git.openjdk.org/jdk/pull/28328/files/d36be373..ae3b97e8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=07-08 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28328.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28328/head:pull/28328 PR: https://git.openjdk.org/jdk/pull/28328 From shade at openjdk.org Wed Nov 26 19:42:03 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 26 Nov 2025 19:42:03 GMT Subject: RFR: 8357258: x86: Improve receiver type profiling reliability [v3] In-Reply-To: References: Message-ID: On Sat, 22 Nov 2025 21:42:49 GMT, John R Rose wrote: >> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: >> >> - Merge branch 'master' into JDK-8357258-x86-c1-optimize-virt-calls >> - Merge branch 'master' into JDK-8357258-x86-c1-optimize-virt-calls >> - Drop atomic counters >> - Initial version > > Code is good. Consider changing a name and adding documentation. 
Tests are still passing, ready for another review round, @rose00, @vnkozlov :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/25305#issuecomment-3582957544 From dholmes at openjdk.org Wed Nov 26 20:03:58 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 26 Nov 2025 20:03:58 GMT Subject: RFR: 8372380: Make hs_err reporting more robust for unattached threads [v3] In-Reply-To: References: Message-ID: <-MPtd7huBegCx4mR6padf46eMMAMDPs9tvnYsx8RVyE=.6cc4a769-6ead-4603-81df-68cbc9c9be26@github.com> On Tue, 25 Nov 2025 08:24:23 GMT, Aleksey Shipilev wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix include order > > All right then! Thanks for the reviews @shipilev , @xmas92 and @kevinjwalls ! ------------- PR Comment: https://git.openjdk.org/jdk/pull/28470#issuecomment-3583019242 From dholmes at openjdk.org Wed Nov 26 20:04:00 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 26 Nov 2025 20:04:00 GMT Subject: Integrated: 8372380: Make hs_err reporting more robust for unattached threads In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 02:14:15 GMT, David Holmes wrote: > There were a number of places where the code called by hs_err reporting assumed/required an attached current thread. These would then cause secondary failures during hs_err reporting. Using a simple example of an unattached thread causing a SEGV I went through and eliminated all the problems I encountered. In some cases the thread dependency was obvious and easy to address directly, but in others we just skip that section at the top-level. > > Testing: > - manual inspection of hs_err file, for different GCs > - tiers 1-3 sanity > > Thanks This pull request has now been integrated. 
Changeset: 6e920fbd Author: David Holmes URL: https://git.openjdk.org/jdk/commit/6e920fbdab17201886804bb53b59188b362f541d Stats: 11 lines in 4 files changed: 6 ins; 0 del; 5 mod 8372380: Make hs_err reporting more robust for unattached threads Reviewed-by: shade, aboldtch, kevinw ------------- PR: https://git.openjdk.org/jdk/pull/28470 From macarte at openjdk.org Wed Nov 26 20:30:12 2025 From: macarte at openjdk.org (Mat Carter) Date: Wed, 26 Nov 2025 20:30:12 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v10] In-Reply-To: References: Message-ID: <4yg-7y-d9fzKeFZwZm1KNF6xn2tXGk8y8QaUCk6ly0w=.d0a6ba6b-81a6-482c-9af6-fe3584f2de1d@github.com> > Add jdk.management.AOTCacheMXBean. The interface provides a single action that when called will cause any hosted JVM currently recording AOT information will stop recording. Existing functionality is preserved: when stopped the JVM will create the required artifacts based on the execution mode. Conveniently as the application running on the JVM has not stopped (as was previously the only way to stop recording), the application will resume execution after the artifacts have been generated. > > The interface will return TRUE if a recording was successfully stopped, in all other cases (not recording etc.) 
will return FALSE > > It follows that invoking the action on a JVM that is recording, twice in succession, should (baring internal errors) produce the following two responses: > > TRUE > FALSE > > Passes tier1 on linux (x64) and windows (x64) Mat Carter has updated the pull request incrementally with one additional commit since the last revision: Incorporate changes to aotMetaspace from dependent commit ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28010/files - new: https://git.openjdk.org/jdk/pull/28010/files/a12bfa03..576418a6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28010&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28010&range=08-09 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28010.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28010/head:pull/28010 PR: https://git.openjdk.org/jdk/pull/28010 From macarte at openjdk.org Wed Nov 26 20:35:11 2025 From: macarte at openjdk.org (Mat Carter) Date: Wed, 26 Nov 2025 20:35:11 GMT Subject: RFR: 8369736 - Add management interface for AOT cache creation [v11] In-Reply-To: References: Message-ID: > Add jdk.management.AOTCacheMXBean. The interface provides a single action that when called will cause any hosted JVM currently recording AOT information will stop recording. Existing functionality is preserved: when stopped the JVM will create the required artifacts based on the execution mode. Conveniently as the application running on the JVM has not stopped (as was previously the only way to stop recording), the application will resume execution after the artifacts have been generated. > > The interface will return TRUE if a recording was successfully stopped, in all other cases (not recording etc.) 
will return FALSE
>
> It follows that invoking the action on a JVM that is recording, twice in succession, should (barring internal errors) produce the following two responses:
>
> TRUE
> FALSE
>
> Passes tier1 on linux (x64) and windows (x64)

Mat Carter has updated the pull request incrementally with one additional commit since the last revision:

  Fixed spaces and CRLF

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/28010/files
  - new: https://git.openjdk.org/jdk/pull/28010/files/576418a6..0086e0ec

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28010&range=10
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28010&range=09-10

  Stats: 93 lines in 2 files changed: 0 ins; 0 del; 93 mod
  Patch: https://git.openjdk.org/jdk/pull/28010.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28010/head:pull/28010

PR: https://git.openjdk.org/jdk/pull/28010

From dhanalla at openjdk.org Wed Nov 26 21:06:26 2025
From: dhanalla at openjdk.org (Dhamoder Nalla)
Date: Wed, 26 Nov 2025 21:06:26 GMT
Subject: RFR: 8308776: [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 [v3]
In-Reply-To:
References:
Message-ID: <6lET0hX7bDMhtRlvnIw2R4Df_pcIrihcHNjYrqKvTMI=.72367a3f-a937-45ff-9f96-f7a2cb6f2ad4@github.com>

> This PR introduces an optimized AArch64 intrinsic for Math.log using reciprocal refinement and a table-driven polynomial.
> It improves throughput for double logarithms while preserving IEEE-754 corner case behavior (±0, subnormals, negatives, NaN).
>
> The micro-benchmark results from MathBench and StrictMathBench below show the performance improvement of Math.log:
>
> **Before change**
>
> Benchmark | Mode | Cnt | Score | Error | Units
> -- | -- | -- | -- | -- | --
> MathBench.logDouble | thrpt | 10 | **15549.705** | ±357.439 | ops/ms
> StrictMathBench.logDouble | thrpt | 10 | 219408.158 | ±16484.680 | ops/ms
>
> **After adding Math.log intrinsic**
>
> Benchmark | Mode | Cnt | Score | Error | Units
> -- | -- | -- | -- | -- | --
> MathBench.logDouble | thrpt | 10 | **300086.773** | ±6675.936 | ops/ms
> StrictMathBench.logDouble | thrpt | 10 | 226521.817 | ±4038.975 | ops/ms

Dhamoder Nalla has updated the pull request incrementally with one additional commit since the last revision:

  [AArch64] Math.log is 10% slower than StrictMath.log on aarch64

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/28306/files
  - new: https://git.openjdk.org/jdk/pull/28306/files/06b3dd4d..60a0b8e9

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28306&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28306&range=01-02

  Stats: 3 lines in 2 files changed: 1 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/28306.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28306/head:pull/28306

PR: https://git.openjdk.org/jdk/pull/28306

From pchilanomate at openjdk.org Wed Nov 26 22:34:37 2025
From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
Date: Wed, 26 Nov 2025 22:34:37 GMT
Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v8]
In-Reply-To:
References:
Message-ID:

> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don't honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don't require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes.
>
> This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism.
Here is a summary of the key changes:
>
> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits.
> An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so for now I kept this simpler version.
>
> - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we still need to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`.
>
> - The code was previously structured in terms of mount and un...
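The Dekker-style handshake described in the first bullet above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: it uses `std::atomic` with sequentially consistent ordering instead of HotSpot's `AtomicAccess` and explicit fences, it collapses the per-vthread and global counters into one, and `TransitionState`, `try_start_transition`, `end_transition`, `disable_transitions` and `enable_transitions` are hypothetical names that do not appear in the patch.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared state: one "in transition" flag and one disable counter.
struct TransitionState {
  std::atomic<bool> in_transition{false};
  std::atomic<int> disable_count{0};
};

// Virtual thread side: publish the flag first, then read the counter.
// Returns true if the fast path may proceed, false if it must back off
// (the real code would take a slow path and wait).
bool try_start_transition(TransitionState& s) {
  s.in_transition.store(true, std::memory_order_seq_cst);
  if (s.disable_count.load(std::memory_order_seq_cst) > 0) {
    s.in_transition.store(false, std::memory_order_seq_cst);
    return false;
  }
  return true;
}

void end_transition(TransitionState& s) {
  s.in_transition.store(false, std::memory_order_seq_cst);
}

// Disabler side: publish the counter first, then read the flag.
// Returns true if no transition is in progress; false means the disabler
// would have to wait for the in-flight transition to finish.
bool disable_transitions(TransitionState& s) {
  s.disable_count.fetch_add(1, std::memory_order_seq_cst);
  return !s.in_transition.load(std::memory_order_seq_cst);
}

void enable_transitions(TransitionState& s) {
  s.disable_count.fetch_sub(1, std::memory_order_seq_cst);
}
```

Because each side writes its own variable before reading the other's, and all four accesses are sequentially consistent, the two sides cannot both miss each other: either the virtual thread observes a non-zero counter and backs off, or the disabler observes the "in transition" flag and waits. Weakening the ordering on the store-then-load pairs would reintroduce the race, which is why the quoted text mentions an extra fence in `startTransition`.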
Patricio Chilano Mateo has updated the pull request incrementally with two additional commits since the last revision: - More changes from Coleen's review - Drop VTMS from names ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28361/files - new: https://git.openjdk.org/jdk/pull/28361/files/dee2b843..623bc518 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=06-07 Stats: 203 lines in 16 files changed: 41 ins; 0 del; 162 mod Patch: https://git.openjdk.org/jdk/pull/28361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361 PR: https://git.openjdk.org/jdk/pull/28361 From pchilanomate at openjdk.org Wed Nov 26 22:44:17 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 26 Nov 2025 22:44:17 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v9] In-Reply-To: References: Message-ID: > When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don?t honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don?t require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes. > > This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. 
Here is a summary of the key changes:
>
> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e. the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the "in transition" bits for the carrier and vthread *before* reading the "transition disabled" counters. The thread requesting to disable transitions increments the "transition disabled" counter *before* reading the "in transition" bits.
> An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis shows no difference with current mainline so for now I kept this simpler version.
>
> - Ending the transition doesn't need to check if transitions are disabled (equivalent to not needing to poll when transitioning from unsafe to safe safepoint state). But we still need to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`.
>
> - The code was previously structured in terms of mount and un...
Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: missing to initialize _is_disabler_at_start ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28361/files - new: https://git.openjdk.org/jdk/pull/28361/files/623bc518..7aa02a46 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28361&range=07-08 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28361.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28361/head:pull/28361 PR: https://git.openjdk.org/jdk/pull/28361 From pchilanomate at openjdk.org Wed Nov 26 22:44:18 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 26 Nov 2025 22:44:18 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v3] In-Reply-To: References: <-MtOiSQVDvlQD7sbfeBiqF00_ZN9_aNt3zd2LZLljyo=.eeabb717-359d-4420-89aa-ed1b305beee5@github.com> Message-ID: On Wed, 26 Nov 2025 07:29:37 GMT, David Holmes wrote: >> I?d prefer to leave it as a plain store to avoid the unnecessary extra fence. > > But it isn't then an atomic update. Only the disablers write to this counter while holding `VThreadTransition_lock` (verified in the assert above). But we still need to use `AtomicAccess` for the store because it can be concurrently read by the virtual thread. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2566654448 From pchilanomate at openjdk.org Wed Nov 26 22:51:55 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 26 Nov 2025 22:51:55 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v7] In-Reply-To: <0e65HV5RscFZN_q4JGzXA7k5jlT7gw7klerMqbfz4GU=.598cedb2-fd53-458b-9047-4d712661cbe4@github.com> References: <0e65HV5RscFZN_q4JGzXA7k5jlT7gw7klerMqbfz4GU=.598cedb2-fd53-458b-9047-4d712661cbe4@github.com> Message-ID: On Tue, 25 Nov 2025 23:10:43 GMT, Coleen Phillimore wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> keep preexisting rebind order for mount > > src/hotspot/share/classfile/javaClasses.cpp line 1688: > >> 1686: int java_lang_Thread::_jvmti_thread_state_offset; >> 1687: int java_lang_Thread::_VTMS_transition_disable_count_offset; >> 1688: int java_lang_Thread::_is_in_VTMS_transition_offset; > > Since you're renaming these anyway, can we drop the VTMS part? Just call it vthread_transition_disable_count_offset and is_in_vthread_transition_offset? There are other VTMS named things that aren't these flags but they can stay. Maybe migrate other names at some future point. I dropped VTMS from all names. > src/hotspot/share/opto/library_call.cpp line 3046: > >> 3044: } >> 3045: >> 3046: bool LibraryCallKit::inline_native_vthread_start_transition(address funcAddr, const char* funcName, bool is_final_transition) { > > Would it be helpful to add a comment above this to say what this does? This is supposed to match some non-intrinsic code and might be helpful if you referenced that here. Added a comment. 
> src/hotspot/share/prims/jvm.cpp line 3671: > >> 3669: >> 3670: JVM_ENTRY(void, JVM_VirtualThreadStartFinalTransition(JNIEnv* env, jobject vthread)) >> 3671: oop vt = JNIHandles::resolve_external_guard(vthread); > > Why do the opto runtime versions set is_in_VTMTS_transition in both the java.lang.Thread and JavaThread and these don't? Because we set them in the intrinsic when trying to start the transition. Method `MountUnmountDisabler::start_transition` expects them to be false so we need to clear them in the opto versions. > src/hotspot/share/runtime/mountUnmountDisabler.hpp line 34: > >> 32: >> 33: class MountUnmountDisabler : public AnyObj { >> 34: static volatile int _global_start_transition_disable_count; > > Can you describe this variable - when is it set and why is there a global disabler? What does it mean to have 'n' active disablers? > > A comment at the beginning of MountUnmountDisabler to say something of the effect that during virtual thread mounting and unmounting, JVMTI and operations that need to examine thread state need to be disabled. Or is it the converse? During JVMTI and operations that examine the state of threads, virtual thread mounting and unmounting must wait until these operations are complete. This class is for the latter right? Added a comment for the class and this counter. > src/hotspot/share/runtime/mutexLocker.cpp line 52: > >> 50: Mutex* JvmtiThreadState_lock = nullptr; >> 51: Monitor* EscapeBarrier_lock = nullptr; >> 52: Monitor* VTMSTransition_lock = nullptr; > > oh you could drop the name VTMS and call it VThreadTransitionLock can't you? Done. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2566661560 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2566662105 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2566663466 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2566666104 PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2566666238 From pchilanomate at openjdk.org Wed Nov 26 22:51:58 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 26 Nov 2025 22:51:58 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v9] In-Reply-To: <0e65HV5RscFZN_q4JGzXA7k5jlT7gw7klerMqbfz4GU=.598cedb2-fd53-458b-9047-4d712661cbe4@github.com> References: <0e65HV5RscFZN_q4JGzXA7k5jlT7gw7klerMqbfz4GU=.598cedb2-fd53-458b-9047-4d712661cbe4@github.com> Message-ID: <9R5lVpD1GBtUw9g9Bc5X7wSEI2a-oFM2Q29HUmyqSmc=.5fb087cf-4305-4bf1-b730-8a3bda7fbe9a@github.com> On Tue, 25 Nov 2025 23:27:47 GMT, Coleen Phillimore wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> missing to initialize _is_disabler_at_start > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1772: > >> 1770: >> 1771: assert(java_thread != nullptr, "sanity check"); >> 1772: assert(!java_thread->is_in_VTMS_transition(), "sanity check"); > > Why don't you need these asserts anymore? We can?t assert this because it could be temporarily set by the target while trying to transition. Previously we had two fields in JavaThread, `_VTMS_transition_mark` and `_is_in_VTMS_transition`. `_VTMS_transition_mark` was set first (checked by the disabler), and if transitions were disabled we waited. Once the transition could start we set `_is_in_VTMS_transition`. Going over the changes I see I removed one assert in `JvmtiEnvBase::get_vthread_jvf` that should be okay to keep, so I restored it. 
Also added an assert in `JavaThread::is_in_VTMS_transition()` (now `is_in_vthread_transition`) to verify that if it's accessed from another thread then it has to be done from a safe context where the value will not change right after checking. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2566664593 From pchilanomate at openjdk.org Wed Nov 26 22:51:59 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 26 Nov 2025 22:51:59 GMT Subject: RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v7] In-Reply-To: References: <0e65HV5RscFZN_q4JGzXA7k5jlT7gw7klerMqbfz4GU=.598cedb2-fd53-458b-9047-4d712661cbe4@github.com> Message-ID: On Tue, 25 Nov 2025 23:45:14 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/javaThread.cpp line 1152: >> >>> 1150: bool JavaThread::is_in_VTMS_transition() const { >>> 1151: return AtomicAccess::load(&_is_in_VTMS_transition); >>> 1152: } >> >> Is the JavaThread version always the same as the java_lang_Thread::is_in_VTMS_transition(threadOop()) value? > > Why is there the same flag with the same name in both the Java class and C++ JavaThread? Might be an efficient cache, so something should say that (if true). The one in `JavaThread` is needed for the `disable_transition_for_all` case. Processing each vthread is not viable, so we instead process all `JavaThreads`. If no `JavaThread` is in a transition then it implies no vthread is in a transition. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2566665257 From xgong at openjdk.org Thu Nov 27 01:50:33 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Thu, 27 Nov 2025 01:50:33 GMT Subject: RFR: 8372136: VectorAPI: Refactor subword gather load API java implementation Message-ID: The current subword (`byte`/`short`) gather load API implementation is not well-suited for platforms that provide native vector instructions for these operations.
As **discussed in PR [1]**, we'd like to re-implement these APIs with a **unified cross-platform** solution. The main idea is to re-implement the API at Java-level, by performing multiple sub-gather operations. Each sub-gather operation loads a portion of elements using a specific index vector by calling the HotSpot intrinsic API. The partial results are then merged using vector `slice` and `or` operations. This design simplifies the VM compiler intrinsic implementation and better aligns with the Vector API design principles. Key changes: 1. Re-implement the subword gather load API at the Java level. The HotSpot intrinsic `VectorSupport.loadWithMap` is simplified by reducing the vector index parameters from four (vix1-vix4) to a single parameter. 2. Adjust the compiler intrinsic implementation to support the new Java API, including updates to the x86 backend implementation. The performance impact varies across different scenarios on X86. I tested the performance with different AVX levels on an X86 machine that supports AVX512. To achieve optimal performance, I also **applied PR [2]**, which improves the performance of the **`slice()`** API on X86. Following is the summarized performance gains, where: - "non masked" means the gather operation is not the masked gather API. - "masked" means the gather operation is the masked gather API. - "1 gather cases" means the gather API is implemented with a single gather operation. E.g. Load `Short128Vector` with `MaxVectorSize=256`. - "2 gather cases" means the gather API is implemented with 2 parts of gather operations. E.g. Load `Short256Vector` with `MaxVectorSize=256`. - "4 gather cases" means the gather API is implemented with 4 parts of gather operations. E.g. Load `Byte256Vector` with `MaxVectorSize=256`. - "Un-intrinsified" means the gather operation is not supported to be intrinsified by hotspot. E.g. Load `Byte512Vector` with `MaxVectorSize=256`. 
The significant performance uplifts come from the Java-level changes, which remove the vector index generation and range checks for such cases.

----------------------------------------------------------------------------
               |          UseAVX=3           |          UseAVX=2           |
               |-----------------------------|-----------------------------|
               |  non masked  |    masked    |  non masked  |    masked    |
               |--------------|--------------|--------------|--------------|
1 gather cases | 0.99 ~ 1.06x | 0.94 ~ 1.11x | 0.94 ~ 1.00x | 0.99 ~ 1.11x |
---------------|--------------|--------------|--------------|--------------|
2 gather cases | 0.94 ~ 1.01x | 0.88 ~ 0.97x | 0.8 ~ 1.13x  | 0.82 ~ 0.93x |
---------------|--------------|--------------|--------------|--------------|
4 gather cases | 0.92 ~ 0.95x | 0.84 ~ 0.88x | 0.98 ~ 1.06x | 0.81 ~ 0.92x |
---------------|--------------|--------------|--------------|--------------|
Un-intrinsified|     N/A      |     N/A      | 1.48 ~ 1.65x | 1.1 ~ 1.53x  |
---------------|--------------|--------------|--------------|--------------|

There are performance regressions, especially for APIs that need splitting and merging operations, and the regressions are more significant for the masked cases. This is caused by the additional vector/mask slice and merging operations in Java code, which I think is unavoidable. Note-1: Compared with before, this patch **disables** the gather API intrinsification for **64-bit species** when **`MaxVectorSize=8`**, because it would generate a 16-bit vector, which is smaller than the supported minimum vector size of 32-bit. This limitation can be addressed by adjusting the IR pattern in the future. However, this requires significant refactoring of the X86 backend implementation, which is challenging for me, so I'd like to leave this as separate work. It would be very helpful if I could get help from the X86 experts. Note-2: This patch only includes the refactoring of the Java API code and the HotSpot x86 backend implementation.
A follow-up patch will add the support for the AArch64 SVE backend. [1] https://github.com/openjdk/jdk/pull/26236 [2] https://github.com/openjdk/jdk/pull/24104 ------------- Commit messages: - 8372136: VectorAPI: Refactor subword gather load API java implementation Changes: https://git.openjdk.org/jdk/pull/28520/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28520&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372136 Stats: 558 lines in 13 files changed: 383 ins; 78 del; 97 mod Patch: https://git.openjdk.org/jdk/pull/28520.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28520/head:pull/28520 PR: https://git.openjdk.org/jdk/pull/28520 From xgong at openjdk.org Thu Nov 27 01:50:33 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Thu, 27 Nov 2025 01:50:33 GMT Subject: RFR: 8372136: VectorAPI: Refactor subword gather load API java implementation In-Reply-To: References: Message-ID: <4dKiP3u-Kcutl_lVtumJdVmeC1r2ZAJR_YQwuE7sxV8=.bd847945-5ce3-4cff-973e-7d4421f837c1@github.com> On Thu, 27 Nov 2025 01:42:07 GMT, Xiaohong Gong wrote: > The current subword (`byte`/`short`) gather load API implementation is not well-suited for platforms that provide native vector instructions for these operations. As **discussed in PR [1]**, we'd like to re-implement these APIs with a **unified cross-platform** solution. > > The main idea is to re-implement the API at Java-level, by performing multiple sub-gather operations. Each sub-gather operation loads a portion of elements using a specific index vector by calling the HotSpot intrinsic API. The partial results are then merged using vector `slice` and `or` operations. This design simplifies the VM compiler intrinsic implementation and better aligns with the Vector API design principles. > > Key changes: > 1. Re-implement the subword gather load API at the Java level. 
The HotSpot intrinsic `VectorSupport.loadWithMap` is simplified by reducing the vector index parameters from four (vix1-vix4) to a single parameter. > 2. Adjust the compiler intrinsic implementation to support the new Java API, including updates to the x86 backend implementation. > > The performance impact varies across different scenarios on X86. I tested the performance with different AVX levels on an X86 machine that supports AVX512. To achieve optimal performance, I also **applied PR [2]**, which improves the performance of the **`slice()`** API on X86. Following is the summarized performance gains, where: > > - "non masked" means the gather operation is not the masked gather API. > - "masked" means the gather operation is the masked gather API. > - "1 gather cases" means the gather API is implemented with a single gather operation. E.g. Load `Short128Vector` with `MaxVectorSize=256`. > - "2 gather cases" means the gather API is implemented with 2 parts of gather operations. E.g. Load `Short256Vector` with `MaxVectorSize=256`. > - "4 gather cases" means the gather API is implemented with 4 parts of gather operations. E.g. Load `Byte256Vector` with `MaxVectorSize=256`. > - "Un-intrinsified" means the gather operation is not supported to be intrinsified by hotspot. E.g. Load `Byte512Vector` with `MaxVectorSize=256`. The significant performance uplifts come from the Java-level changes, which remove the vector index generation and range checks for such cases. > > > ---------------------------------------------------------------------------- > | UseAVX=3 | UseAVX=2 | > |-----------------------------|-----------------------------| > | non maske... Following are the performance changes of JMH `org.openjdk.bench.jdk.incubator.vector.GatherOperationsBenchmark` with **-XX:UseAVX=2|3**, respectively.
[Four images: benchmark result charts attached to the original comment.] ------------- PR Comment: https://git.openjdk.org/jdk/pull/28520#issuecomment-3583900604 From serb at openjdk.org Thu Nov 27 01:51:52 2025 From: serb at openjdk.org (Sergey Bylokhov) Date: Thu, 27 Nov 2025 01:51:52 GMT Subject: RFR: 8345265: Minor improvements for LTO across all compilers [v2] In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 07:59:24 GMT, Julian Waters wrote: >gcc documentation states that you typically need to pass the same options to the link step from the compile step, since the linker when LTO is active is actually the compiler in disguise. So it seems to be a pre-existing bug which affects the old JDK updates as well? Perhaps we could extract that change, fix it separately, and then backport it? Or do you plan to backport this one? ------------- PR Comment: https://git.openjdk.org/jdk/pull/22464#issuecomment-3583903828 From xgong at openjdk.org Thu Nov 27 02:00:48 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Thu, 27 Nov 2025 02:00:48 GMT Subject: RFR: 8372136: VectorAPI: Refactor subword gather load API java implementation In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 01:42:07 GMT, Xiaohong Gong wrote: > The current subword (`byte`/`short`) gather load API implementation is not well-suited for platforms that provide native vector instructions for these operations. As **discussed in PR [1]**, we'd like to re-implement these APIs with a **unified cross-platform** solution. > > The main idea is to re-implement the API at Java-level, by performing multiple sub-gather operations. Each sub-gather operation loads a portion of elements using a specific index vector by calling the HotSpot intrinsic API. The partial results are then merged using vector `slice` and `or` operations. This design simplifies the VM compiler intrinsic implementation and better aligns with the Vector API design principles. > > Key changes: > 1. Re-implement the subword gather load API at the Java level.
The HotSpot intrinsic `VectorSupport.loadWithMap` is simplified by reducing the vector index parameters from four (vix1-vix4) to a single parameter. > 2. Adjust the compiler intrinsic implementation to support the new Java API, including updates to the x86 backend implementation. > > The performance impact varies across different scenarios on X86. I tested the performance with different AVX levels on an X86 machine that supports AVX512. To achieve optimal performance, I also **applied PR [2]**, which improves the performance of the **`slice()`** API on X86. Following is the summarized performance gains, where: > > - "non masked" means the gather operation is not the masked gather API. > - "masked" means the gather operation is the masked gather API. > - "1 gather cases" means the gather API is implemented with a single gather operation. E.g. Load `Short128Vector` with `MaxVectorSize=256`. > - "2 gather cases" means the gather API is implemented with 2 parts of gather operations. E.g. Load `Short256Vector` with `MaxVectorSize=256`. > - "4 gather cases" means the gather API is implemented with 4 parts of gather operations. E.g. Load `Byte256Vector` with `MaxVectorSize=256`. > - "Un-intrinsified" means the gather operation is not supported to be intrinsified by hotspot. E.g. Load `Byte512Vector` with `MaxVectorSize=256`. The singificant performance uplifts comes from the Java-level changes which removes the vector index generation and range checks for such cases. > > > ---------------------------------------------------------------------------- > | UseAVX=3 | UseAVX=2 | > |-----------------------------|-----------------------------| > | non maske... Hi @iwanowww , @PaulSandoz , @sviswa7, @jatin-bhateja, this is a refactoring patch for subword gather-load APIs together with the X86 changes as we discussed in https://github.com/openjdk/jdk/pull/26236. Could you please help take a look? 
Since I'm not quite familiar with X86 instructions, any feedback or help from @sviswa7 or @jatin-bhateja would be very helpful. There are performance regressions with the current version, but I think it still has improvement opportunities for the X86 codegen. Hence, I'd appreciate any help with that! Thanks a lot in advance! ------------- PR Comment: https://git.openjdk.org/jdk/pull/28520#issuecomment-3583917222 From david.holmes at oracle.com Thu Nov 27 02:05:34 2025 From: david.holmes at oracle.com (David Holmes) Date: Thu, 27 Nov 2025 12:05:34 +1000 Subject: Any reason why +PrintFlagsFinal requires unlocking experimental and diagnostic flags to print their default values? In-Reply-To: <849ebf93-c3fb-45c9-92c8-21a49b3e9946@redhat.com> References: <849ebf93-c3fb-45c9-92c8-21a49b3e9946@redhat.com> Message-ID: <5ff2a0a5-9d61-466d-9ada-5ba911185815@oracle.com> On 27/11/2025 12:53 am, Frederic Thevenet wrote: > Hi, > > Currently, using +PrintFlagsFinal prints out all JVM flags and their > values, even if they were not modified from their default, except for > 'locked' flags, i.e. Experimental and Diagnostic flags. In order to have > those printed out as well, one must first 'unlock' them (with > +UnlockExperimentalVMOptions, for instance). I think this was simply a pragmatic decision to avoid overwhelming the user with information that should not be relevant. > Now, is there a strong reason for not always displaying the default > values for those in scenarios where there are no concerns that the output > might be too large (that is when calling upon 'JVMFlag::printFlags' with > 'skipDefaults' set to false, like PrintFlagsFinal does)?
> > The reason for this question is that when chasing a bug in scenarios > where one can only rely on logs or output provided by tools that use > +PrintFlagsFinal, getting the default values *in the conditions that > those logs were produced* can be tricky as it depends on the exact > version of the JDK that was running, and some values can be changed by > ergonomics. Ouch. I think that would be a poor design choice for diagnostic, and especially experimental flags! > If you need to know the default for experimental flags -- which given > their nature can and do change often -- your choices are to either ask > for these logs to be generated again using +UnlockExperimentalVMOptions > (even if there is no intention of changing an experimental flag) or to > go on a time-consuming deep dive into the code base for the exact > version of the JDK that was used. Neither is ideal. True, but for experimental flags in particular, unless you are deep diving into the code how can you know whether a particular flag and its value are relevant to your debugging in the first place? That said, I don't see any harm in providing a way to print all flags, though whether by default or by a new -XX:PrintAllFlagsFinal flag, I'm not sure. Cheers, David > Please note that in cases where unchanged flags are not printed out > (like in an hs_err report), no longer requiring to unlock all flags to > print them out would have no side effect, i.e. it would not increase > the amount of noise in the report (as in this case only modified flags > are printed out, and for experimental flags this can only happen if > '+UnlockExperimentalVMOptions' is set to begin with).
> > Regards, > From dholmes at openjdk.org Thu Nov 27 02:42:49 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 27 Nov 2025 02:42:49 GMT Subject: RFR: 8372528: Unify atomic exchange and compare exchange In-Reply-To: <9WmhQ889qBk7gNvsBKMNESdi1A79MFh98EP-WMVH5tc=.d6e81a56-b9b9-46d7-89f7-8e0fdb1678e9@github.com> References: <9WmhQ889qBk7gNvsBKMNESdi1A79MFh98EP-WMVH5tc=.d6e81a56-b9b9-46d7-89f7-8e0fdb1678e9@github.com> Message-ID: On Wed, 26 Nov 2025 09:22:07 GMT, Kim Barrett wrote: > @dholmes-ora might remember. I think it was just deferred to an indeterminate future release and was forgotten about. It is a very small cleanup. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28498#issuecomment-3583989625 From jwaters at openjdk.org Thu Nov 27 04:26:51 2025 From: jwaters at openjdk.org (Julian Waters) Date: Thu, 27 Nov 2025 04:26:51 GMT Subject: RFR: 8345265: Minor improvements for LTO across all compilers [v2] In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 01:49:07 GMT, Sergey Bylokhov wrote: > > gcc documentation states that you typically need to pass the same options to the link step from the compile step, since the linker when LTO is active is actually the compiler in disguise. > > So it seems a pre-existing bug which affects the old jdk updates as well? probably we can extract that change and fix it separately and then backport? or do you plan to backport this one? Hmm, I didn't create this with the idea of backporting it in mind. Should I do that? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/22464#issuecomment-3584144134 From shade at openjdk.org Thu Nov 27 05:32:56 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 27 Nov 2025 05:32:56 GMT Subject: Integrated: 8372285: G1: Micro-optimize x86 barrier code In-Reply-To: References: Message-ID: <5wcNnZXC3qNWm375kSVxTwt1GpPWH_kdN55AwMzASks=.7403ccb8-9eac-4b0c-b39b-b8a8428c31d4@github.com> On Fri, 21 Nov 2025 09:06:54 GMT, Aleksey Shipilev wrote: > We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through. > > The patch is deliberately on simpler side, so we can backport it to 25u, if need arises. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `tier1` > - [x] Linux x86_64 server fastdebug, `all` This pull request has now been integrated. 
Changeset: 848c0c79 Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/848c0c79b69c489db6c6bbb24644134fe33fd0ec Stats: 90 lines in 1 file changed: 21 ins; 29 del; 40 mod 8372285: G1: Micro-optimize x86 barrier code Reviewed-by: tschatzl, ayang ------------- PR: https://git.openjdk.org/jdk/pull/28446 From duke at openjdk.org Thu Nov 27 08:39:55 2025 From: duke at openjdk.org (Harshit470250) Date: Thu, 27 Nov 2025 08:39:55 GMT Subject: RFR: 8347396: Efficient TypeFunc creations [v2] In-Reply-To: References: Message-ID: On Mon, 27 Oct 2025 05:11:47 GMT, Harshit470250 wrote: >> This PR do similar changes done by [JDK-8330851](https://bugs.openjdk.org/browse/JDK-8330851) on the GC TypeFunc creation as suggested by [JDK-8347396](https://bugs.openjdk.org/browse/JDK-8347396). As discussed in [https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686,](https://github.com/openjdk/jdk/pull/21782#discussion_r1906535686) I have put guard on the shenandoah gc specific part of the code. > > Harshit470250 has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge master > - update make_barrier_type > - Merge branch 'openjdk:master' into new_pr > - Merge branch 'openjdk:master' into new_pr > - My chages I have made some changes [here](https://github.com/Harshit470250/jdk/pull/2). Should I bring these changes here or should I do something else? ------------- PR Comment: https://git.openjdk.org/jdk/pull/27279#issuecomment-3584730627 From kbarrett at openjdk.org Thu Nov 27 08:43:22 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 27 Nov 2025 08:43:22 GMT Subject: RFR: 8372650: Convert GenericWaitBarrier to use Atomic Message-ID: Please review this change to GenericWaitBarrier to use Atomic rather than directly applying AtomicAccess to volatile members. 
Testing: mach5 tier1-5 ------------- Commit messages: - convert GenericWaitBarrier to use Atomic Changes: https://git.openjdk.org/jdk/pull/28527/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28527&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372650 Stats: 23 lines in 2 files changed: 1 ins; 1 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/28527.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28527/head:pull/28527 PR: https://git.openjdk.org/jdk/pull/28527 From jsikstro at openjdk.org Thu Nov 27 09:42:04 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Thu, 27 Nov 2025 09:42:04 GMT Subject: RFR: 8372150: Parallel: Tighten requirements around heap sizes with NUMA and Large Pages [v5] In-Reply-To: References: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com> Message-ID: On Wed, 26 Nov 2025 11:02:10 GMT, Joel Sikström wrote: >> Hello, >> >> Today, Parallel decides to opt out of using Large pages if the heap size, either minimum, initial or maximum, does not cover enough Large pages for all spaces. Additionally, if we don't get enough heap size for at least one OS page per MutableNUMASpace (one per NUMA-node), Parallel decides to run in a NUMA-degraded mode, where it skips allocating memory locally for some NUMA-nodes. Both of these issues are problematic if we want to start the JVM with a default initial heap size that is equal to the minimum heap size (see [JDK-8371986](https://bugs.openjdk.org/browse/JDK-8371986)). To solve this, we should consider making sure that the minimum heap size is always enough to cover precisely one page per space, where the page size may be Large or not. >> >> For completeness, when user-provided settings for UseNUMA, UseLargePages and heap sizes can't be satisfied at the same time, one must be prioritised over others. Today, we prioritise heap size settings over both UseNUMA and UseLargePages.
This change suggests shifting the (primary) priority to UseNUMA and UseLargePages, by bumping MinHeapSize, InitialHeapSize and MaxHeapSize to an adequate number, if not already enough. By bumping the minimum heap size to an adequate number, we are also bumping the lower-limit for the initial heap size and maximum heap size, which must be equal to or greater than the minimum heap size. >> >> However, a problem with this approach is that if the Large page size is very large (e.g., 512MB or 1GB), the minimum, initial, and maybe the maximum heap size will be bumped to a very large number as well. To mitigate this impact, we look at what Large page size can be used based on the maximum heap size instead. This is because, running the JVM in default configuration, the maximum heap size will almost always be large enough to cover enough Large pages, so we bump the minimum and initial to that value instead. But, if the maximum heap size is not enough, we opt-out of using Large pages, which is consistent with the old behavior. >> >> Testing: >> * Oracle's tier1-8 >> * tier1-4 with the flags `-XX:+UseParallelGC -XX:+UseLargePages -XX:+UseNUMA` > > Joel Sikström has updated the pull request incrementally with one additional commit since the last revision: > > Re-order methods for consistency in class hierarchy Thank you for the reviews everyone! I've re-run the testing listed in the PR description and done some local testing with different large page sizes, and it looks good. One concern for the future is that tests _might_ behave strangely or perhaps fail when run with Parallel, UseLargePages and large page sizes >2MB. We'll keep this in mind moving forward, either adjusting the tests to the new behavior or excluding tests when run with large pages. The failing test in GHA is due to an unrelated failure (see [JDK-8372585](https://bugs.openjdk.org/browse/JDK-8372585)).
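As a rough illustration of the sizing rule described in the PR text quoted above, here is a minimal C++ sketch. It is not the code from the changeset: `HeapConfig`, `adjust_heap_sizes`, and the `spaces` count are hypothetical names and inputs chosen for the example. Under those assumptions, the sketch picks the large page size only when the maximum heap can hold one large page per space, otherwise falls back to the small page size, and then raises the minimum, initial and maximum heap sizes so that each space can hold at least one page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical container for the values the sketch works with; the field
// names are illustrative and are not HotSpot identifiers.
struct HeapConfig {
  uint64_t min_heap;
  uint64_t initial_heap;
  uint64_t max_heap;
  uint64_t page_size;       // page size the heap ends up using
  bool     use_large_pages; // false when the sketch opts out of large pages
};

// spaces: number of spaces that each need at least one page (assumed input).
inline HeapConfig adjust_heap_sizes(uint64_t min_heap, uint64_t initial_heap,
                                    uint64_t max_heap, uint64_t small_page,
                                    uint64_t large_page, uint64_t spaces) {
  assert(spaces > 0 && small_page > 0 && large_page >= small_page);

  // Decide on the page size by looking at the maximum heap size only.
  const bool large_fits = max_heap >= large_page * spaces;
  const uint64_t page = large_fits ? large_page : small_page;

  // Each space must be able to hold one page of the chosen size.
  const uint64_t required = page * spaces;

  HeapConfig cfg;
  cfg.min_heap        = std::max(min_heap, required);
  cfg.initial_heap    = std::max(initial_heap, cfg.min_heap);
  cfg.max_heap        = std::max(max_heap, cfg.initial_heap);
  cfg.page_size       = page;
  cfg.use_large_pages = large_fits;
  return cfg;
}
```

In this model, with 2 MB large pages, five spaces and a 1 GB maximum heap, a 1 MB minimum and initial heap would be raised to 10 MB; with 1 GB large pages the same settings would fall back to small pages and leave the 1 MB minimum untouched.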
------------- PR Comment: https://git.openjdk.org/jdk/pull/28394#issuecomment-3584949256 From jsikstro at openjdk.org Thu Nov 27 09:42:06 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Thu, 27 Nov 2025 09:42:06 GMT Subject: Integrated: 8372150: Parallel: Tighten requirements around heap sizes with NUMA and Large Pages In-Reply-To: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com> References: <_nJ3sxtl4WL88gpE1VQgR-lSry1K_t60lqc8am2cWzU=.09c65da8-9cae-41c5-9eac-ee9172149b39@github.com> Message-ID: On Wed, 19 Nov 2025 15:44:57 GMT, Joel Sikström wrote: > Hello, > > Today, Parallel decides to opt out of using Large pages if the heap size, either minimum, initial or maximum, does not cover enough Large pages for all spaces. Additionally, if we don't get enough heap size for at least one OS page per MutableNUMASpace (one per NUMA-node), Parallel decides to run in a NUMA-degraded mode, where it skips allocating memory locally for some NUMA-nodes. Both of these issues are problematic if we want to start the JVM with a default initial heap size that is equal to the minimum heap size (see [JDK-8371986](https://bugs.openjdk.org/browse/JDK-8371986)). To solve this, we should consider making sure that the minimum heap size is always enough to cover precisely one page per space, where the page size may be Large or not. > > For completeness, when user-provided settings for UseNUMA, UseLargePages and heap sizes can't be satisfied at the same time, one must be prioritised over others. Today, we prioritise heap size settings over both UseNUMA and UseLargePages. This change suggests shifting the (primary) priority to UseNUMA and UseLargePages, by bumping MinHeapSize, InitialHeapSize and MaxHeapSize to an adequate number, if not already enough.
By bumping the minimum heap size to an adequate number, we are also bumping the lower-limit for the initial heap size and maximum heap size, which must be equal to or greater than the minimum heap size. > > However, a problem with this approach is that if the Large page size is very large (e.g., 512MB or 1GB), the minimum, initial, and maybe the maximum heap size will be bumped to a very large number as well. To mitigate this impact, we look at what Large page size can be used based on the maximum heap size instead. This is because, running the JVM in default configuration, the maximum heap size will almost always be large enough to cover enough Large pages, so we bump the minimum and initial to that value instead. But, if the maximum heap size is not enough, we opt-out of using Large pages, which is consistent with the old behavior. > > Testing: > * Oracle's tier1-8 > * tier1-4 with the flags `-XX:+UseParallelGC -XX:+UseLargePages -XX:+UseNUMA` This pull request has now been integrated. Changeset: 4ac33956 Author: Joel Sikström URL: https://git.openjdk.org/jdk/commit/4ac33956343bbfa3619ccb029ceed6c5a402f775 Stats: 201 lines in 11 files changed: 97 ins; 84 del; 20 mod 8372150: Parallel: Tighten requirements around heap sizes with NUMA and Large Pages Reviewed-by: ayang, stefank, aboldtch, stuefe ------------- PR: https://git.openjdk.org/jdk/pull/28394 From jsikstro at openjdk.org Thu Nov 27 10:02:19 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Thu, 27 Nov 2025 10:02:19 GMT Subject: RFR: 8371986: Remove the default value of InitialRAMPercentage Message-ID: Hello, This RFE changes the default of `InitialRAMPercentage` to 0, effectively removing its default value. Please see the CSR for specific details on this change. Changing the default value to 0 results in the behavior that the initial heap size (InitialHeapSize) is set to the minimum heap size (MinHeapSize).
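To illustrate that effect, the initial heap size computation can be modelled in isolation. This is a sketch under stated assumptions, not HotSpot code: `reasonable_initial_heap` and its parameters are hypothetical stand-ins for the `MaxRAM`, `InitialRAMPercentage`, `MinHeapSize` and `MaxHeapSize` values, and `reasonable_minimum` is simply passed in rather than computed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Standalone model; all parameters are plain values rather than JVM flags.
inline uint64_t reasonable_initial_heap(uint64_t max_ram,
                                        double initial_ram_percentage,
                                        uint64_t reasonable_minimum,
                                        uint64_t min_heap_size,
                                        uint64_t max_heap_size) {
  assert(min_heap_size <= max_heap_size);

  // Share of RAM requested for the initial heap; 0 when the percentage is 0.
  uint64_t initial =
      (uint64_t)(((double)max_ram * initial_ram_percentage) / 100);

  // Never start below the minimum heap size...
  initial = std::max({initial, reasonable_minimum, min_heap_size});
  // ...and never above the maximum heap size.
  initial = std::min(initial, max_heap_size);
  return initial;
}
```

With a percentage of 0 the first term is 0, so the result is whichever is larger of `reasonable_minimum` and `min_heap_size`. For comparison, in this model 16 GB of RAM with a percentage of 1.5625 would give 256 MB.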
This is because of the following lines in `Arguments::set_heap_size()`, which take the MAX value of MinHeapSize as well:

    uint64_t initial_memory = (uint64_t)(((double)MaxRAM * InitialRAMPercentage) / 100);
    ...
    reasonable_initial = MAX3(reasonable_initial, reasonable_minimum, MinHeapSize);
    reasonable_initial = MIN2(reasonable_initial, MaxHeapSize);

This change improves startup performance for all GCs, but affects the time-to-peak performance in some out-of-the-box configurations for some GCs. This is mainly visible in ParallelGC. ------------- Commit messages: - 8371986: Remove the default value of InitialRAMPercentage Changes: https://git.openjdk.org/jdk/pull/28530/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28530&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371986 Stats: 4 lines in 3 files changed: 0 ins; 1 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/28530.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28530/head:pull/28530 PR: https://git.openjdk.org/jdk/pull/28530 From roland at openjdk.org Thu Nov 27 10:03:56 2025 From: roland at openjdk.org (Roland Westrelin) Date: Thu, 27 Nov 2025 10:03:56 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v13] In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 15:46:20 GMT, Zihao Lin wrote: >> This patch remove slice parameter from LoadNode::make >> >> I have done more work which remove slice paramater from StoreNode::make. >> >> Mention in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 >> >> Hi team, I am new, I'd appreciate any guidance. Thank a lot! > > Zihao Lin has updated the pull request incrementally with one additional commit since the last revision: > > Fix test failed Changes requested by roland (Reviewer).
src/hotspot/share/gc/shenandoah/c2/shenandoahBarrierSetC2.cpp line 720: > 718: if (ShenandoahCardBarrier) { > 719: post_barrier(kit, kit->control(), access.raw_access(), access.base(), > 720: access.addr(), access.alias_idx(), new_val, T_OBJECT, true); `access.alias_idx()` should be `C->get_alias_index(kit.gvn().type(access.addr()))` So I think we want to remove `uint _alias_idx;` from `C2AtomicParseAccess` as well. This could be done as a follow up if you think this change has already gotten too complicated. src/hotspot/share/opto/escape.cpp line 4488: > 4486: const TypePtr* adr_type = proj->adr_type(); > 4487: const TypePtr* new_adr_type = tinst->add_offset(adr_type->offset()); > 4488: if (adr_type != new_adr_type) { Can you explain that change? Did something go wrong in a merge? src/hotspot/share/opto/graphKit.cpp line 1703: > 1701: BasicType bt, > 1702: DecoratorSet decorators) { > 1703: C2AccessValuePtr addr(adr, adr_type); `adr_type` no longer used in this and next methods. ------------- PR Review: https://git.openjdk.org/jdk/pull/24258#pullrequestreview-3514352600 PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2567870138 PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2567854115 PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2567875036 From roland at openjdk.org Thu Nov 27 10:03:58 2025 From: roland at openjdk.org (Roland Westrelin) Date: Thu, 27 Nov 2025 10:03:58 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v8] In-Reply-To: References: Message-ID: On Wed, 26 Nov 2025 08:30:51 GMT, Zihao Lin wrote: >> Hi, I give it a try, but it failed pass the test. Is it possible the original version is wrong? >> The mark word will not be `TypeRawPtr::BOTTOM`, it should equal to Klass slice index. 
> > One dump `proto_adr ` is ` 1368 AddP === _ 196 196 1367 [[ ]] Klass:precise java/util/LinkedHashMap$Entry: 0x0000000918349ca0 (java/util/Map$Entry):Constant:exact+168 *` I think the original version was wrong indeed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2567848297 From fthevene at redhat.com Thu Nov 27 10:05:55 2025 From: fthevene at redhat.com (Frederic Thevenet) Date: Thu, 27 Nov 2025 11:05:55 +0100 Subject: Any reason why +PrintFlagsFinal requires unlocking experimental and diagnostic flags to print their default values? In-Reply-To: <5ff2a0a5-9d61-466d-9ada-5ba911185815@oracle.com> References: <849ebf93-c3fb-45c9-92c8-21a49b3e9946@redhat.com> <5ff2a0a5-9d61-466d-9ada-5ba911185815@oracle.com> Message-ID: Thanks for your comments, David. On 11/27/25 03:05, David Holmes wrote: > On 27/11/2025 12:53 am, Frederic Thevenet wrote: >> Hi, >> >> Currently, using +PrintFlagsFinal prints out all JVM flags and their >> values, even if they were not modified from their default, except for >> 'locked' flags, i.e. Experimental and Diagnotic flags. In order to >> have those printed out as well, one must first 'unlock' them (with >> +UnlockExperimentalVMOptions, for instance). > > I think this was simply a pragmatic decision to avoid overwhelming the > user with information that should not be relevant. This make sense. What I note is that there hasn't been a strong reason against it either, that would have eluded me so far. >> Now, is their a strong reason for not always displaying the default >> values for those in scenarios were there is no concerns that the >> output might be too large (that is when calling upon >> 'JVMFlag::printFlags' with 'skipDefaults' set to false, like >> PrintFlagsFinal does)? 
>>
>> The reason for this question is that when chasing a bug in scenarios
>> where one can only rely on logs or output provided by tools that use
>> +PrintFlagsFinal, getting the default values *in the conditions that
>> those logs were produced* can be tricky as it depends on the exact
>> version of the JDK that was running, and some values can be changed
>> by ergonomics.
>
> Ouch. I think that would be a poor design choice for diagnostic, and
> especially experimental flags!

Right, I see your point. Even though I originally didn't really see a case where doing so would be detrimental when looking at the current call sites for 'printFlags' where 'skipDefaults' is false, a blanket change on the established behaviour for this method would indeed be ill-advised.

>> If you need to know the default for experimental flags -- which given
>> their nature can and do change often -- your choices are to either
>> ask for these logs to be generated again using
>> +UnlockExperimentalVMOptions (even if there is no intention of
>> changing an experimental flag) or to go on a time-consuming deep dive
>> into the code base for the exact version of the JDK that was used.
>> Neither is ideal.
>
> True, but for experimental flags in particular, unless you are deep
> diving into the code how can you know whether a particular flag and
> its value are relevant to your debugging in the first place?
>
> That said, I don't see any harm in providing a way to print all flags,
> though whether by default or by a new -XX:PrintAllFlagsFinal flag, I'm
> not sure.

Following my initial train of thought, my preference would go to changing the default behaviour for +PrintFlagsFinal, but adding a new flag would still be an improvement over the need to specify +UnlockDiagnosticVMOptions and +UnlockExperimentalVMOptions.

> Cheers,
> David

>> Please note that in cases where unchanged flags are not printed out
>> (like in an hs_err report), no longer requiring to unlock all flags
>> to print them out would have no side effect, i.e. it would not
>> increase the amount of noise in the report (as in this case only
>> modified flags are printed out, and for experimental flags this
>> can only happen if '+UnlockExperimentalVMOptions' is set to begin with).
>>
>> Regards,
>>
>

Regards,

--
Frederic Thevenet
Senior Software Engineer - OpenJDK
Red Hat France
BAF5 C2D2 0BE0 1715 5EE1 0815 2065 AD47 B326 EB92

From azafari at openjdk.org Thu Nov 27 10:25:03 2025
From: azafari at openjdk.org (Afshin Zafari)
Date: Thu, 27 Nov 2025 10:25:03 GMT
Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer [v7]
In-Reply-To:
References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com>
Message-ID:

On Thu, 9 Oct 2025 02:52:57 GMT, David Holmes wrote:

>> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision:
>>
>> fixed MAX2 template parameter
>
> Also I think this might need to account for the changes to `Arguments::set_heap_size` being done in https://github.com/openjdk/jdk/pull/27224

Thank you @dholmes-ora, @xmas92, @jdksjolen and @fandreuz for your reviews.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26955#issuecomment-3585108751

From shade at openjdk.org Thu Nov 27 10:31:54 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Thu, 27 Nov 2025 10:31:54 GMT
Subject: RFR: 8372650: Convert GenericWaitBarrier to use Atomic
In-Reply-To:
References:
Message-ID:

On Thu, 27 Nov 2025 08:36:41 GMT, Kim Barrett wrote:

> Please review this change to GenericWaitBarrier to use Atomic rather than
> directly applying AtomicAccess to volatile members.
> > Testing: mach5 tier1-5 Looks right, except one place: src/hotspot/share/utilities/waitBarrier_generic.cpp line 199: > 197: } > 198: } > 199: assert(_outstanding_wakeups.load_acquire() == 0, "Post disarm: Should not have outstanding wakeups"); Does not have to be `_acquire`, original load is relaxed. ------------- PR Review: https://git.openjdk.org/jdk/pull/28527#pullrequestreview-3514487147 PR Review Comment: https://git.openjdk.org/jdk/pull/28527#discussion_r2567965169 From azafari at openjdk.org Thu Nov 27 10:32:54 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 27 Nov 2025 10:32:54 GMT Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer [v7] In-Reply-To: References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> Message-ID: On Thu, 9 Oct 2025 02:52:57 GMT, David Holmes wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed MAX2 template parameter > > Also I think this might need to account for the changes to `Arguments::set_heap_size` being done in https://github.com/openjdk/jdk/pull/27224 @dholmes-ora and @jdksjolen , re-reviews are required. TIA. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26955#issuecomment-3585140706 From roland at openjdk.org Thu Nov 27 12:29:56 2025 From: roland at openjdk.org (Roland Westrelin) Date: Thu, 27 Nov 2025 12:29:56 GMT Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v6] In-Reply-To: References: Message-ID: > This is a variant of 8332827. In 8332827, an array access becomes > dependent on a range check `CastII` for another array access. When, > after loop opts are over, that RC `CastII` was removed, the array > access could float and an out of bound access happened. With the fix > for 8332827, RC `CastII`s are no longer removed. 
>
> With this one what happens is that some transformations applied after
> loop opts are over widen the type of the RC `CastII`. As a result, the
> type of the RC `CastII` is no longer narrower than that of its input,
> the `CastII` is removed and the dependency is lost.
>
> There are 2 transformations that cause this to happen:
>
> - after loop opts are over, the type of the `CastII` nodes is widened
> so nodes that have the same inputs but a slightly different type can
> be commoned.
>
> - When pushing a `CastII` through an `Add`, if the types of both inputs
> of the `Add` are non-constant, then we end up widening the type
> (the resulting `Add` has a type that's wider than that of the
> initial `CastII`).
>
> There are already 3 types of `Cast` nodes depending on the
> optimizations that are allowed. Either the `Cast` is floating
> (`depends_only_test()` returns `true`) or pinned. Either the `Cast`
> can be removed if it no longer narrows the type of its input or
> not. We already have variants of the `CastII`:
>
> - if the Cast can float and be removed when it doesn't narrow the type
> of its input.
>
> - if the Cast is pinned and can be removed when it doesn't narrow the type
> of its input.
>
> - if the Cast is pinned and can't be removed when it doesn't narrow
> the type of its input.
>
> What we need here, I think, is the 4th combination:
>
> - if the Cast can float and can't be removed when it doesn't narrow
> the type of its input.
>
> Anyway, things are becoming confusing with all these different
> variants named in ways that don't always help figure out what
> constraints each of them operates under. So I refactored this and that's
> the biggest part of this change. The fix consists in marking `Cast`
> nodes when their type is widened in a way that prevents them from being
> optimized out.
>
> Tobias ran performance testing with a slightly different version of
> this change and there was no regression.
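The two independent properties in the description above (floating or pinned, removable or not once the cast stops narrowing) give a 2x2 grid. The following is only a rough sketch of that grid: the enumerator names are hypothetical (only `NonFloatingNonNarrowing` is quoted elsewhere in this thread), and the `after_type_widening` rule is an assumption about how "marking" a widened cast could be modelled, not the code from the PR.

```cpp
#include <cassert>

// Hypothetical names that mirror the four combinations in the description;
// the actual HotSpot declarations may look different.
enum class CastDependency {
  FloatingNarrowing,       // can float; removable once it no longer narrows its input
  FloatingNonNarrowing,    // can float; must stay even if it no longer narrows (the 4th case)
  NonFloatingNarrowing,    // pinned; removable once it no longer narrows its input
  NonFloatingNonNarrowing  // pinned; must stay even if it no longer narrows
};

constexpr bool is_floating(CastDependency d) {
  return d == CastDependency::FloatingNarrowing ||
         d == CastDependency::FloatingNonNarrowing;
}

constexpr bool removable_when_not_narrowing(CastDependency d) {
  return d == CastDependency::FloatingNarrowing ||
         d == CastDependency::NonFloatingNarrowing;
}

// Assumed rule: widening the cast's type keeps the floating/pinned property
// but marks the cast so that it can no longer be optimized out.
constexpr CastDependency after_type_widening(CastDependency d) {
  return is_floating(d) ? CastDependency::FloatingNonNarrowing
                        : CastDependency::NonFloatingNonNarrowing;
}
```

Under these assumptions, a floating range check cast whose type gets widened keeps floating but stops being a candidate for removal, which is the missing fourth combination.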
Roland Westrelin has updated the pull request incrementally with two additional commits since the last revision: - review - review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24575/files - new: https://git.openjdk.org/jdk/pull/24575/files/2aa918e2..6bbda426 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24575&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24575&range=04-05 Stats: 23 lines in 2 files changed: 10 ins; 0 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/24575.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24575/head:pull/24575 PR: https://git.openjdk.org/jdk/pull/24575 From roland at openjdk.org Thu Nov 27 12:33:45 2025 From: roland at openjdk.org (Roland Westrelin) Date: Thu, 27 Nov 2025 12:33:45 GMT Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v7] In-Reply-To: References: Message-ID: <4vqKZmOZa_hvbbySegpobemqL5dNbz1qcvIlu96fjaQ=.0bb55f83-155e-4312-aee8-038ccaeb0a88@github.com> > This is a variant of 8332827. In 8332827, an array access becomes > dependent on a range check `CastII` for another array access. When, > after loop opts are over, that RC `CastII` was removed, the array > access could float and an out of bound access happened. With the fix > for 8332827, RC `CastII`s are no longer removed. > > With this one what happens is that some transformations applied after > loop opts are over widen the type of the RC `CastII`. As a result, the > type of the RC `CastII` is no longer narrower than that of its input, > the `CastII` is removed and the dependency is lost. > > There are 2 transformations that cause this to happen: > > - after loop opts are over, the type of the `CastII` nodes are widen > so nodes that have the same inputs but a slightly different type can > common. 
>
> - When pushing a `CastII` through an `Add`, if the types of both inputs
> of the `Add` are non-constant, then we end up widening the type
> (the resulting `Add` has a type that's wider than that of the
> initial `CastII`).
>
> There are already 3 types of `Cast` nodes depending on the
> optimizations that are allowed. Either the `Cast` is floating
> (`depends_only_test()` returns `true`) or pinned. Either the `Cast`
> can be removed if it no longer narrows the type of its input or
> not. We already have variants of the `CastII`:
>
> - if the Cast can float and be removed when it doesn't narrow the type
> of its input.
>
> - if the Cast is pinned and can be removed when it doesn't narrow the type
> of its input.
>
> - if the Cast is pinned and can't be removed when it doesn't narrow
> the type of its input.
>
> What we need here, I think, is the 4th combination:
>
> - if the Cast can float and can't be removed when it doesn't narrow
> the type of its input.
>
> Anyway, things are becoming confusing with all these different
> variants named in ways that don't always help figure out what
> constraints each of them operates under. So I refactored this and that's
> the biggest part of this change. The fix consists in marking `Cast`
> nodes when their type is widened in a way that prevents them from being
> optimized out.
>
> Tobias ran performance testing with a slightly different version of
> this change and there was no regression.
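Why pushing a `CastII` through an `Add` can widen the type is easiest to see with plain interval arithmetic. The sketch below is a simplified model, not HotSpot code: `Range`, `operand_range` and `push_cast_through_add` are hypothetical helpers, integer overflow is ignored, and the way each operand's range is derived is an assumption about how such a transformation would typically bound its inputs.

```cpp
#include <algorithm>
#include <cassert>

// A closed integer interval [lo, hi]; overflow is ignored in this sketch.
struct Range {
  long long lo;
  long long hi;
};

Range intersect(Range a, Range b) {
  return Range{std::max(a.lo, b.lo), std::min(a.hi, b.hi)};
}

// Tightest range for one operand, given that (operand + other) lies in `sum`.
Range operand_range(Range sum, Range operand, Range other) {
  return intersect(operand, Range{sum.lo - other.hi, sum.hi - other.lo});
}

// Type of Add(Cast(x), Cast(y)) after pushing a cast with range `cast`
// through Add(x, y). Hypothetical helper, not a HotSpot function.
Range push_cast_through_add(Range cast, Range x, Range y) {
  Range cx = operand_range(cast, x, y);
  Range cy = operand_range(cast, y, x);
  return Range{cx.lo + cy.lo, cx.hi + cy.hi};
}
```

With a cast to [0, 10] over two inputs that are both in [0, 100], each operand is bounded to [0, 10] and the new `Add` gets [0, 20], which is wider than the original cast. If one input is the constant 3, the result is [3, 10] and nothing is lost, which matches the "both inputs non-constant" condition in the description.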
Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision: whitespace ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24575/files - new: https://git.openjdk.org/jdk/pull/24575/files/6bbda426..7a65f097 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24575&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24575&range=05-06 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24575.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24575/head:pull/24575 PR: https://git.openjdk.org/jdk/pull/24575 From roland at openjdk.org Thu Nov 27 12:33:46 2025 From: roland at openjdk.org (Roland Westrelin) Date: Thu, 27 Nov 2025 12:33:46 GMT Subject: RFR: 8354282: C2: more crashes in compiled code because of dependency on removed range check CastIIs [v4] In-Reply-To: <4QQp7C7iIVfVs1MoUMC56KCgVGpXu5ziTHfZ-f2pk6o=.4ca7e1a8-3f31-44d3-aaec-30429ed7e2b0@github.com> References: <6qShqR-Ohv7vamoJ_B4Ev-poU8SB96eTBo4HFJrylcI=.dac5a26f-c9f0-445b-8f1c-a7c719fa27ae@github.com> <4QQp7C7iIVfVs1MoUMC56KCgVGpXu5ziTHfZ-f2pk6o=.4ca7e1a8-3f31-44d3-aaec-30429ed7e2b0@github.com> Message-ID: On Wed, 26 Nov 2025 14:29:06 GMT, Christian Hagedorn wrote: > Introducing a 4th dependency type looks reasonable. It's also nice to see one more refactoring in that area which makes it very expressive now. Thanks for doing that! I left some suggestions to possibly further improve the code. Thanks for the comments/suggestions. Updated change should take care of all of them. > src/hotspot/share/opto/castnode.hpp line 101: > >> 99: } >> 100: return NonFloatingNonNarrowing; >> 101: } > > Just a side note: We seem to mix the terms "(non-)pinned" with "(non-)floating" freely. Should we stick to just one? But maybe it's justified to use both depending on the situation/code context. The patch as it is now adds some extra uses of "pinned" and "floating". 
What could make sense, I suppose, would be to try to use "floating"/"non floating" instead but there are so many uses of "pinned" in the code base already, and I don't see us getting rid of them, that I wonder if it would make a difference. So, I'm not too sure what to do.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24575#issuecomment-3585614507
PR Review Comment: https://git.openjdk.org/jdk/pull/24575#discussion_r2568439115

From jsjolen at openjdk.org Thu Nov 27 12:58:00 2025
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Thu, 27 Nov 2025 12:58:00 GMT
Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer [v13]
In-Reply-To:
References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com>
Message-ID:

On Tue, 25 Nov 2025 10:19:38 GMT, Afshin Zafari wrote:

>> The issue happens when the HeapMinBaseAddress option gets 0 as input value. Since this option is used as an address, using 0 in pointer arithmetic is UB.
>> The fix is using `uintptr_t` instead of `address`/`char*`, etc. In doing that, it was found that an overflow check does not work in all cases due to checking more conditions. That overflow check is changed too. We also need to check overflow after aligning addresses and sizes of memory regions in this context. Assertions are added to check these cases.
>>
>> Tests:
>> linux-x64 tier1
>
> Afshin Zafari has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 13 additional commits since the last revision:
>
> - Merge remote-tracking branch 'origin/master' into _8351334_ubsan_nullptr_add
> - better type
> - fix arguments.cpp for HeapMinBaseAddress type.
> - Merge remote-tracking branch 'origin/master' into _8351334_ubsan_nullptr_add
> - removed redundant check of overflow.
> - subtraction for checking overflow
> - fixed MAX2 template parameter
> - fixes.
> - uintptr_t -> uint64_t
> - fixes
> - ... and 3 more: https://git.openjdk.org/jdk/compare/0ab6af85...56f8b1f3

Marked as reviewed by jsjolen (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/26955#pullrequestreview-3515205954

From shade at openjdk.org Thu Nov 27 13:29:14 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Thu, 27 Nov 2025 13:29:14 GMT
Subject: RFR: 8371986: Remove the default value of InitialRAMPercentage
In-Reply-To:
References:
Message-ID: <2usjgAiZCq5KUDw_wRjOcj2PkJX7S-Vx0fl0Dq3I_Xk=.41abf7d5-2662-4ae9-ad8b-d6f46b7ce1d4@github.com>

On Thu, 27 Nov 2025 09:53:56 GMT, Joel Sikström wrote:

> Hello,
>
> This RFE changes the default of `InitialRAMPercentage` to 0, effectively removing its default value. Please see the CSR for specific details on this change.
>
> Changing the default value to 0 results in the behavior that the initial heap size (InitialHeapSize) is set to the minimum heap size (MinHeapSize). This is because of the following lines in `Arguments::set_heap_size()`, which takes the MAX value of MinHeapSize as well:
>
>     uint64_t initial_memory = (uint64_t)(((double)MaxRAM * InitialRAMPercentage) / 100);
>     ...
>
>     reasonable_initial = MAX3(reasonable_initial, reasonable_minimum, MinHeapSize);
>     reasonable_initial = MIN2(reasonable_initial, MaxHeapSize);
>
> This change improves startup performance for all GCs, but affects the time-to-peak performance in some out-of-the-box configurations for some GCs. This is mainly visible in ParallelGC.

Marked as reviewed by shade (Reviewer).
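To see why a percentage of 0 makes the initial heap size collapse to the minimum, the quoted clamp can be modelled as a small standalone function. This is a sketch only: `initial_heap_size` and its parameters are hypothetical stand-ins for the HotSpot flags, the lines elided by "..." in the quote are assumed to simply carry `initial_memory` into `reasonable_initial`, and `std::max`/`std::min` stand in for `MAX3`/`MIN2`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical model of the quoted clamp; all parameters are in bytes
// except initial_ram_percentage.
uint64_t initial_heap_size(uint64_t max_ram,
                           double initial_ram_percentage,
                           uint64_t reasonable_minimum,
                           uint64_t min_heap_size,
                           uint64_t max_heap_size) {
  // Share of RAM requested through the percentage flag (0 when the flag is 0).
  uint64_t reasonable_initial =
      (uint64_t)(((double)max_ram * initial_ram_percentage) / 100);
  // Never go below the minimum heap size ...
  reasonable_initial =
      std::max({reasonable_initial, reasonable_minimum, min_heap_size});
  // ... and never above the maximum heap size.
  reasonable_initial = std::min(reasonable_initial, max_heap_size);
  return reasonable_initial;
}
```

With a percentage of 0 the first term is 0, so the larger of the two minimum values wins. With an illustrative non-zero percentage such as 1.5625 on a 16 GiB machine, the first term is 256 MiB and dominates as long as both minimums are smaller.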
------------- PR Review: https://git.openjdk.org/jdk/pull/28530#pullrequestreview-3515365711 From eastigeevich at openjdk.org Thu Nov 27 13:41:55 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Thu, 27 Nov 2025 13:41:55 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v3] In-Reply-To: <-cnMy4YHNCrKRqt_2Kkh9ksi-qE8ndZLB5yoyKkS3gM=.3f328f98-15a2-4736-9a6c-f9ab0705b830@github.com> References: <-cnMy4YHNCrKRqt_2Kkh9ksi-qE8ndZLB5yoyKkS3gM=.3f328f98-15a2-4736-9a6c-f9ab0705b830@github.com> Message-ID: On Tue, 25 Nov 2025 13:04:55 GMT, Andrew Haley wrote: >> Yeah patching all nmethods as one unit is basically equivalent to making the code cache processing a STW operation. Last time we processed the code cache STW was JDK 11. A dark place I don't want to go back to. It can get pretty big and mess up latency. So I'm in favour of limiting the fix and not re-introduce STW code cache processing. >> >> Otherwise yes you are correct; we perform synchronous cross modifying code with no assumptions about instruction cache coherency because we didn't trust it would actually work for all ARM implementations. Seems like that was a good bet. We rely on it on x64 still though. >> >> It's a bit surprising to me if they invalidate all TLB entries, effectively ripping out the entire virtual address space, even when a range is passed in. If so, a horrible alternative might be to use mprotect to temporarily remove execution permission on the affected per nmethod pages, and detect over shooting in the signal handler, resuming execution when execution privileges are then restored immediately after. That should limit the affected VA to close to what is actually invalidated. But it would look horrible. > >> It's a bit surprising to me if they invalidate all TLB entries, effectively ripping out the entire virtual address space, even when a range is passed in. 
If so,
>
> "Because the cache-maintenance wasn't needed, we can do the TLBI instead.
> In fact, the I-Cache line-size isn't relevant anymore, we can reduce
> the number of traps by producing a fake value.
>
> "For user-space, the kernel's work is now to trap CTR_EL0 to hide DIC,
> and produce a fake IminLine. EL3 traps the now-necessary I-Cache
> maintenance and performs the inner-shareable-TLBI that makes everything
> better."
>
> My interpretation of this is that we only need to do the synchronization dance once, at the end of the patching. But I guess we don't know exactly if we have an affected core or if the kernel workaround is in action.

@theRealAph @fisk

As we have explicit synchronization for the patched code, I decided to run an experiment of deferred icache invalidation on Graviton 3 (Neoverse V1). Graviton 3 does not have the Neoverse N1 bug. It has hardware dcache and icache coherence. Such full hardware coherence means all `ICache:invalidate` operations are just a bunch of:

    dsb ish
    isb

From my experience of implementing spin pauses, we use `isb` for pauses. So our multiple `ICache:invalidate` are a bunch of pauses.

Without deferred icache invalidation (baseline):

    Benchmark                       (accessedFieldCount)  (methodCount)  Mode  Cnt    Score    Error  Units
    GCPatchingNmethodCost.fullGC                       0           5000  avgt    3   41.290 ±  7.596  ms/op
    GCPatchingNmethodCost.fullGC                       2           5000  avgt    3   95.773 ±  6.059  ms/op
    GCPatchingNmethodCost.fullGC                       4           5000  avgt    3  137.183 ± 12.896  ms/op
    GCPatchingNmethodCost.fullGC                       8           5000  avgt    3  219.030 ± 19.101  ms/op
    GCPatchingNmethodCost.systemGC                     0           5000  avgt    3   43.762 ±  3.818  ms/op
    GCPatchingNmethodCost.systemGC                     2           5000  avgt    3   97.525 ±  8.434  ms/op
    GCPatchingNmethodCost.systemGC                     4           5000  avgt    3  139.555 ± 17.159  ms/op
    GCPatchingNmethodCost.systemGC                     8           5000  avgt    3  221.163 ±  8.908  ms/op
    GCPatchingNmethodCost.youngGC                      0           5000  avgt    3    3.052 ±  2.823  ms/op
    GCPatchingNmethodCost.youngGC                      2           5000  avgt    3   13.956 ±  1.984  ms/op
    GCPatchingNmethodCost.youngGC                      4           5000  avgt    3   22.364 ±  0.626  ms/op
    GCPatchingNmethodCost.youngGC                      8           5000  avgt    3   39.821 ±  0.241  ms/op

With deferred icache invalidation:

    Benchmark                       (accessedFieldCount)  (methodCount)  Mode  Cnt    Score    Error  Units
    GCPatchingNmethodCost.fullGC                       0           5000  avgt    3   41.212 ± 10.914  ms/op
    GCPatchingNmethodCost.fullGC                       2           5000  avgt    3   83.059 ± 17.115  ms/op
    GCPatchingNmethodCost.fullGC                       4           5000  avgt    3  110.061 ±  2.642  ms/op
    GCPatchingNmethodCost.fullGC                       8           5000  avgt    3  161.202 ±  5.750  ms/op
    GCPatchingNmethodCost.systemGC                     0           5000  avgt    3   44.061 ±  7.586  ms/op
    GCPatchingNmethodCost.systemGC                     2           5000  avgt    3   84.262 ± 11.852  ms/op
    GCPatchingNmethodCost.systemGC                     4           5000  avgt    3  112.317 ±  3.907  ms/op
    GCPatchingNmethodCost.systemGC                     8           5000  avgt    3  163.684 ±  9.732  ms/op
    GCPatchingNmethodCost.youngGC                      0           5000  avgt    3    2.949 ±  0.626  ms/op
    GCPatchingNmethodCost.youngGC                      2           5000  avgt    3    9.997 ±  1.334  ms/op
    GCPatchingNmethodCost.youngGC                      4           5000  avgt    3   14.953 ±  1.121  ms/op
    GCPatchingNmethodCost.youngGC                      8           5000  avgt    3   23.966 ±  1.656  ms/op

Improvements:
- 2 fields accessed
  - Full GC: 13%
  - System GC: 14%
  - Young GC: 28%
- 4 fields accessed
  - Full GC: 20%
  - System GC: 20%
  - Young GC: 33%
- 8 fields accessed
  - Full GC: 26%
  - System GC: 26%
  - Young GC: 40%

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3585923078

From duke at openjdk.org Thu Nov 27 15:32:00 2025
From: duke at openjdk.org (Zihao Lin)
Date: Thu, 27 Nov 2025 15:32:00 GMT
Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v13]
In-Reply-To:
References:
Message-ID:

On Thu, 27 Nov 2025 10:01:04 GMT, Roland Westrelin wrote:

>> Zihao Lin has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Fix test failed
>
> src/hotspot/share/opto/graphKit.cpp line 1703:
>
>> 1701: BasicType bt,
>> 1702: DecoratorSet decorators) {
>> 1703: C2AccessValuePtr addr(adr, adr_type);
>
> `adr_type` no longer used in this and next methods.
let's remove all unused adt_type in GraphKit ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2569258915 From duke at openjdk.org Thu Nov 27 16:01:57 2025 From: duke at openjdk.org (Zihao Lin) Date: Thu, 27 Nov 2025 16:01:57 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v13] In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 09:59:31 GMT, Roland Westrelin wrote: >> Zihao Lin has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix test failed > > src/hotspot/share/gc/shenandoah/c2/shenandoahBarrierSetC2.cpp line 720: > >> 718: if (ShenandoahCardBarrier) { >> 719: post_barrier(kit, kit->control(), access.raw_access(), access.base(), >> 720: access.addr(), access.alias_idx(), new_val, T_OBJECT, true); > > `access.alias_idx()` should be `C->get_alias_index(kit.gvn().type(access.addr()))` > > So I think we want to remove `uint _alias_idx;` from `C2AtomicParseAccess` as well. This could be done as a follow up if you think this change has already gotten too complicated. I think we can create another task focus on remove `alias_idx` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2569382908 From liach at openjdk.org Thu Nov 27 16:23:21 2025 From: liach at openjdk.org (Chen Liang) Date: Thu, 27 Nov 2025 16:23:21 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting Message-ID: Currently, the hotspot compiler (as in ciField) trusts final fields in hidden classes, record classes, and selected jdk packages. Some classes in the JDK wish to be trusted, but they cannot apply package-wide opt-in due to other legacy classes in the package, such as java.util. They currently can use `@Stable` as a workaround, but this is fragile because a stable final field may hold a trusted null, zero, or false value, which is currently treated as non-constant by ciField. 
We should add an annotation to opt-in for a whole class, mainly for legacy packages. This would benefit greatly some of our classes already using a lot of Stable, such as java.util.Optional, whose empty instance is now constant-foldable, as demonstrated in a new IR test. Paging @minborg who requested Optional folding for review. I think we can remove redundant Stable in a few other java.util classes after this patch is integrated. I plan to do that in subsequent patches. ------------- Commit messages: - Spurious change - Merge branch 'master' of https://github.com/openjdk/jdk into feature/class-final-trusting - Issue number and test update - Fixed optional and unit test - Merge branch 'master' of https://github.com/openjdk/jdk into feature/class-final-trusting - Stage Changes: https://git.openjdk.org/jdk/pull/28540/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28540&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372696 Stats: 183 lines in 13 files changed: 169 ins; 13 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28540.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28540/head:pull/28540 PR: https://git.openjdk.org/jdk/pull/28540 From liach at openjdk.org Thu Nov 27 16:27:52 2025 From: liach at openjdk.org (Chen Liang) Date: Thu, 27 Nov 2025 16:27:52 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 16:16:05 GMT, Chen Liang wrote: > Currently, the hotspot compiler (as in ciField) trusts final fields in hidden classes, record classes, and selected jdk packages. Some classes in the JDK wish to be trusted, but they cannot apply package-wide opt-in due to other legacy classes in the package, such as java.util. > > They currently can use `@Stable` as a workaround, but this is fragile because a stable final field may hold a trusted null, zero, or false value, which is currently treated as non-constant by ciField. 
> > We should add an annotation to opt-in for a whole class, mainly for legacy packages. This would benefit greatly some of our classes already using a lot of Stable, such as java.util.Optional, whose empty instance is now constant-foldable, as demonstrated in a new IR test. > > Paging @minborg who requested Optional folding for review. > > I think we can remove redundant Stable in a few other java.util classes after this patch is integrated. I plan to do that in subsequent patches. This uses another one of the 16-bit instanceKlassFlags, which requires runtime engineers to agree. Need compiler review to check if such IR tests are the best way to ensure constant folding for core library classes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28540#issuecomment-3586679853 From jvernee at openjdk.org Thu Nov 27 16:50:48 2025 From: jvernee at openjdk.org (Jorn Vernee) Date: Thu, 27 Nov 2025 16:50:48 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting In-Reply-To: References: Message-ID: <0f6w-e-F6PVzyBNmFsu37oNVKgKSxNwQMfA1Y2GC46c=.d196d665-deeb-432c-b089-a4f5494b44ec@github.com> On Thu, 27 Nov 2025 16:16:05 GMT, Chen Liang wrote: > Currently, the hotspot compiler (as in ciField) trusts final fields in hidden classes, record classes, and selected jdk packages. Some classes in the JDK wish to be trusted, but they cannot apply package-wide opt-in due to other legacy classes in the package, such as java.util. > > They currently can use `@Stable` as a workaround, but this is fragile because a stable final field may hold a trusted null, zero, or false value, which is currently treated as non-constant by ciField. > > We should add an annotation to opt-in for a whole class, mainly for legacy packages. This would benefit greatly some of our classes already using a lot of Stable, such as java.util.Optional, whose empty instance is now constant-foldable, as demonstrated in a new IR test. 
>
> Paging @minborg who requested Optional folding for review.
>
> I think we can remove redundant Stable in a few other java.util classes after this patch is integrated. I plan to do that in subsequent patches.

-------------

Marked as reviewed by jvernee (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/28540#pullrequestreview-3516377492

From aph at openjdk.org Thu Nov 27 17:11:18 2025
From: aph at openjdk.org (Andrew Haley)
Date: Thu, 27 Nov 2025 17:11:18 GMT
Subject: RFR: 8372701: Randomized profile counters
Message-ID:

Please use [this link](https://github.com/openjdk/jdk/pull/28541/files?w=1) to view the files changed.

Profile counters scale very badly.

The overhead for profiled code isn't too bad with one thread, but as the thread count increases, things go wrong very quickly.

For example, here's a benchmark from the OpenJDK test suite, run at TieredLevel 3 with one thread, then three threads:

    Benchmark                        (randomized)  Mode  Cnt    Score   Error  Units
    InterfaceCalls.test2ndInt5Types         false  avgt    4   27.468 ± 2.631  ns/op
    InterfaceCalls.test2ndInt5Types         false  avgt    4  240.010 ± 6.329  ns/op

This slowdown is caused by high memory contention on the profile counters. Not only is this slow, but it can also lose profile counts.

-------------

Commit messages:
 - whitespace
 - AArch64
 - Minimize deltas to master
 - Better
 - Inter
 - Cleanup
 - Cleanup
 - D'oh
 - More
 - More
 - ...
and 40 more: https://git.openjdk.org/jdk/compare/9a88d7f4...66ea5872

Changes: https://git.openjdk.org/jdk/pull/28541/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28541&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8372701
  Stats: 1122 lines in 26 files changed: 892 ins; 39 del; 191 mod
  Patch: https://git.openjdk.org/jdk/pull/28541.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28541/head:pull/28541

PR: https://git.openjdk.org/jdk/pull/28541

From aph at openjdk.org Thu Nov 27 17:18:35 2025
From: aph at openjdk.org (Andrew Haley)
Date: Thu, 27 Nov 2025 17:18:35 GMT
Subject: RFR: 8372701: Randomized profile counters [v2]
In-Reply-To:
References:
Message-ID:

> Please use [this link](https://github.com/openjdk/jdk/pull/28541/files?w=1) to view the files changed.
>
> Profile counters scale very badly.
>
> The overhead for profiled code isn't too bad with one thread, but as the thread count increases, things go wrong very quickly.
>
> For example, here's a benchmark from the OpenJDK test suite, run at TieredLevel 3 with one thread, then three threads:
>
>     Benchmark                        (randomized)  Mode  Cnt    Score   Error  Units
>     InterfaceCalls.test2ndInt5Types         false  avgt    4   27.468 ± 2.631  ns/op
>     InterfaceCalls.test2ndInt5Types         false  avgt    4  240.010 ± 6.329  ns/op
>
> This slowdown is caused by high memory contention on the profile counters. Not only is this slow, but it can also lose profile counts.

Andrew Haley has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 52 commits:

 - Merge remote-tracking branch 'refs/remotes/origin/JDK-8134940' into JDK-8134940
 - whitespace
 - AArch64
 - Minimize deltas to master
 - Better
 - Inter
 - Cleanup
 - Cleanup
 - Merge master
 - D'oh
 - ...
and 42 more: https://git.openjdk.org/jdk/compare/b2f97131...49d52d82 ------------- Changes: https://git.openjdk.org/jdk/pull/28541/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28541&range=01 Stats: 1150 lines in 26 files changed: 921 ins; 38 del; 191 mod Patch: https://git.openjdk.org/jdk/pull/28541.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28541/head:pull/28541 PR: https://git.openjdk.org/jdk/pull/28541 From shade at openjdk.org Thu Nov 27 17:18:38 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 27 Nov 2025 17:18:38 GMT Subject: RFR: 8372701: Randomized profile counters In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 17:04:05 GMT, Andrew Haley wrote: > Please use [this link](https://github.com/openjdk/jdk/pull/28541/files?w=1) to view the files changed. > > Profile counters scale very badly. > > The overhead for profiled code isn't too bad with one thread, but as the thread count increases, things go wrong very quickly. > > For example, here's a benchmark from the OpenJDK test suite, run at TieredLevel 3 with one thread, then three threads: > > > Benchmark (randomized) Mode Cnt Score Error Units > InterfaceCalls.test2ndInt5Types false avgt 4 27.468 ± 2.631 ns/op > InterfaceCalls.test2ndInt5Types false avgt 4 240.010 ± 6.329 ns/op > > > This slowdown is caused by high memory contention on the profile counters. Not only is this slow, but it can also lose profile counts. Impressive work. Clashes a bit with https://github.com/openjdk/jdk/pull/25305/, which commons the type profile check and makes it more robust. It would be trivial to resolve, as _that_ PR has only one place where counter is updated. Also gives you some additional budget to spare for more instructions in profiled code. So it would be nice if that PR (and probably its AArch64 version) lands first.
------------- PR Comment: https://git.openjdk.org/jdk/pull/28541#issuecomment-3586832242 From aph at openjdk.org Thu Nov 27 17:22:51 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 27 Nov 2025 17:22:51 GMT Subject: RFR: 8372701: Randomized profile counters [v2] In-Reply-To: References: Message-ID: <6mjTtKpyy3WYVnj2UEwTH6S_od8AlyEYXsy0F__yiXE=.fc1b1e35-f80a-46fa-bd33-8f36bf65e155@github.com> On Thu, 27 Nov 2025 17:18:35 GMT, Andrew Haley wrote: >> Please use [this link](https://github.com/openjdk/jdk/pull/28541/files?w=1) to view the files changed. >> >> Profile counters scale very badly. >> >> The overhead for profiled code isn't too bad with one thread, but as the thread count increases, things go wrong very quickly. >> >> For example, here's a benchmark from the OpenJDK test suite, run at TieredLevel 3 with one thread, then three threads: >> >> >> Benchmark (randomized) Mode Cnt Score Error Units >> InterfaceCalls.test2ndInt5Types false avgt 4 27.468 ± 2.631 ns/op >> InterfaceCalls.test2ndInt5Types false avgt 4 240.010 ± 6.329 ns/op >> >> >> This slowdown is caused by high memory contention on the profile counters. Not only is this slow, but it can also lose profile counts. > > Andrew Haley has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 52 commits: > > - Merge remote-tracking branch 'refs/remotes/origin/JDK-8134940' into JDK-8134940 > - whitespace > - AArch64 > - Minimize deltas to master > - Better > - Inter > - Cleanup > - Cleanup > - Merge master > - D'oh > - ... and 42 more: https://git.openjdk.org/jdk/compare/b2f97131...49d52d82 > Impressive work. > > Clashes a bit with #25305, which commons the type profile check and makes it more robust. It would be trivial to resolve, as _that_ PR has only one place where counter is updated. Also gives you some additional budget to spare for more instructions in profiled code.
So it would be nice if that PR (and probably its AArch64 version) lands first. Thanks. Sure, it can wait for that PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28541#issuecomment-3586849348 From aph at openjdk.org Thu Nov 27 17:31:51 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 27 Nov 2025 17:31:51 GMT Subject: RFR: 8372701: Randomized profile counters [v2] In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 17:18:35 GMT, Andrew Haley wrote: >> Please use [this link](https://github.com/openjdk/jdk/pull/28541/files?w=1) to view the files changed. >> >> Profile counters scale very badly. >> >> The overhead for profiled code isn't too bad with one thread, but as the thread count increases, things go wrong very quickly. >> >> For example, here's a benchmark from the OpenJDK test suite, run at TieredLevel 3 with one thread, then three threads: >> >> >> Benchmark (randomized) Mode Cnt Score Error Units >> InterfaceCalls.test2ndInt5Types false avgt 4 27.468 ± 2.631 ns/op >> InterfaceCalls.test2ndInt5Types false avgt 4 240.010 ± 6.329 ns/op >> >> >> This slowdown is caused by high memory contention on the profile counters. Not only is this slow, but it can also lose profile counts. >> >> This patch is for C1 only. It'd be easy to randomize C1 counters as well in another PR, if anyone thinks it's worth doing. > > Andrew Haley has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 52 commits: > > - Merge remote-tracking branch 'refs/remotes/origin/JDK-8134940' into JDK-8134940 > - whitespace > - AArch64 > - Minimize deltas to master > - Better > - Inter > - Cleanup > - Cleanup > - Merge master > - D'oh > - ...
and 42 more: https://git.openjdk.org/jdk/compare/b2f97131...49d52d82 The inlined profile update code is moved to a stub, then in its place we put:

    ubfx x8, rng, #26, #6   // extract the top 6 bits of the random-number generator
    cbz x8, update          // if they are zero, jump to the stub that updates the profile counter
    next_random rng         // generate the next random number

At the moment, several C2 IR tests fail with randomized profile counters because they are acutely sensitive to small changes in profile counts. I think this can probably be fixed. Also, I believe there are some kinds of event that should never be missed, even when subsampling profile counters in this way. I'd like people to advise me which events these are. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28541#issuecomment-3586876351 From aph at openjdk.org Thu Nov 27 17:34:50 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 27 Nov 2025 17:34:50 GMT Subject: RFR: 8372701: Randomized profile counters [v2] In-Reply-To: References: Message-ID: <2QTcG-2m_W9lkuPzfd1Diw2ItL35sc97Z6Gpjbrs_8c=.72a6b794-48a3-4583-acdc-c22217d8f87b@github.com> On Thu, 27 Nov 2025 17:18:35 GMT, Andrew Haley wrote: >> Please use [this link](https://github.com/openjdk/jdk/pull/28541/files?w=1) to view the files changed. >> >> Profile counters scale very badly. >> >> The overhead for profiled code isn't too bad with one thread, but as the thread count increases, things go wrong very quickly. >> >> For example, here's a benchmark from the OpenJDK test suite, run at TieredLevel 3 with one thread, then three threads: >> >> >> Benchmark (randomized) Mode Cnt Score Error Units >> InterfaceCalls.test2ndInt5Types false avgt 4 27.468 ± 2.631 ns/op >> InterfaceCalls.test2ndInt5Types false avgt 4 240.010 ± 6.329 ns/op >> >> >> This slowdown is caused by high memory contention on the profile counters. Not only is this slow, but it can also lose profile counts. >> >> This patch is for C1 only.
It'd be easy to randomize C1 counters as well in another PR, if anyone thinks it's worth doing. > > Andrew Haley has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 52 commits: > > - Merge remote-tracking branch 'refs/remotes/origin/JDK-8134940' into JDK-8134940 > - whitespace > - AArch64 > - Minimize deltas to master > - Better > - Inter > - Cleanup > - Cleanup > - Merge master > - D'oh > - ... and 42 more: https://git.openjdk.org/jdk/compare/b2f97131...49d52d82 I have only made the back-end changes to AArch64 and x86. The back-end changes are simple to make for other architectures, and will need to be done if this PR is to be merged into mainline. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28541#issuecomment-3586882842 From shade at openjdk.org Thu Nov 27 17:44:50 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 27 Nov 2025 17:44:50 GMT Subject: RFR: 8372701: Randomized profile counters [v2] In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 17:29:33 GMT, Andrew Haley wrote: > Also, I believe there are some kinds of event that should never be missed, even when subsampling profile counters in this way. I'd like people to advise me which events these are One other thing that comes into mind: the initial swing from `0` -> `1` for a type counter is important, since `0` means "never seen the type at all", and `>0` means "maybe the type is present, however rare". I would suspect subsampling a small count to `0` would cause performance anomalies. Especially if, say, this anomaly causes a deopt - reprofile - compile cycle. It would doubly hurt, if _reprofile_ would miss the type _again_. Probably hard to do with RNG, but maybe we should be doing the initial counter seed on installation without consulting RNG. I don't think current patch does it, but maybe I am looking at the wrong place. Would be fairly trivial to do after https://github.com/openjdk/jdk/pull/25305. 
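A minimal sketch of the idea above, not taken from the patch: the counter is bumped on roughly 1 in 64 events, but the first observation is always recorded so that a seen type never reads as `0`. The names `Rng`, `SampledCounter`, `record` and `kRatioBits` are hypothetical, the xorshift32 step is only a stand-in for whatever generator the patch uses, and scaling the sampled increment by the ratio is an assumption as well.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-thread generator. xorshift32 is only a stand-in here;
// the generator used by the actual patch may differ.
struct Rng {
  uint32_t state;
  explicit Rng(uint32_t seed) : state(seed) {
    assert(seed != 0 && "xorshift32 needs a non-zero seed");
  }
  uint32_t next() {
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
  }
};

// Hypothetical type-profile counter that is updated on roughly 1 in 64 events.
struct SampledCounter {
  static constexpr unsigned kRatioBits = 6;  // 2^6 = 64, assumed capture ratio
  uint64_t count = 0;

  void record(Rng& rng) {
    if (count == 0) {
      // Never subsample the 0 -> 1 transition: 0 means "type never seen".
      count = 1;
      return;
    }
    // Take the update path only when the top kRatioBits of the random value are zero.
    if ((rng.next() >> (32 - kRatioBits)) == 0) {
      // Assumption: scale the increment so the expected count tracks the true count.
      count += uint64_t{1} << kRatioBits;
    }
  }
};
```

Calling `record` once on a fresh counter always leaves `count == 1`; later calls change it only on the sampled path. The updates here are plain, non-atomic stores, so this says nothing about how a real implementation would handle concurrent threads.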
------------- PR Comment: https://git.openjdk.org/jdk/pull/28541#issuecomment-3586901495 From aph at openjdk.org Thu Nov 27 17:59:55 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 27 Nov 2025 17:59:55 GMT Subject: RFR: 8372701: Randomized profile counters [v2] In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 17:18:35 GMT, Andrew Haley wrote: >> Please use [this link](https://github.com/openjdk/jdk/pull/28541/files?w=1) to view the files changed. >> >> Profile counters scale very badly. >> >> The overhead for profiled code isn't too bad with one thread, but as the thread count increases, things go wrong very quickly. >> >> For example, here's a benchmark from the OpenJDK test suite, run at TieredLevel 3 with one thread, then three threads: >> >> >> Benchmark (randomized) Mode Cnt Score Error Units >> InterfaceCalls.test2ndInt5Types false avgt 4 27.468 ± 2.631 ns/op >> InterfaceCalls.test2ndInt5Types false avgt 4 240.010 ± 6.329 ns/op >> >> >> This slowdown is caused by high memory contention on the profile counters. Not only is this slow, but it can also lose profile counts. >> >> This patch is for C1 only. It'd be easy to randomize C1 counters as well in another PR, if anyone thinks it's worth doing. >> >> One other thing to note is that randomized profile counters degrade very badly with small decimation ratios. For example, using a ratio of 2 with `-XX:ProfileCaptureRatio=2` with a single thread results in >> >> >> Benchmark (randomized) Mode Cnt Score Error Units >> InterfaceCalls.test2ndInt5Types false avgt 4 80.147 ± 9.991 ns/op >> >> >> The problem is that the branch prediction rate drops away very badly, leading to many mispredictions. It only really makes sense to use higher decimation ratios, e.g. 64. > > Andrew Haley has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 52 commits: > > - Merge remote-tracking branch 'refs/remotes/origin/JDK-8134940' into JDK-8134940 > - whitespace > - AArch64 > - Minimize deltas to master > - Better > - Inter > - Cleanup > - Cleanup > - Merge master > - D'oh > - ... and 42 more: https://git.openjdk.org/jdk/compare/b2f97131...49d52d82 > > Also, I believe there are some kinds of event that should never be missed, even when subsampling profile counters in this way. I'd like people to advise me which events these are > > One other thing that comes into mind: the initial swing from `0` -> `1` for a type counter is important, since `0` means "never seen the type at all", and `>0` means "maybe the type is present, however rare". I would suspect subsampling a small count to `0` would cause performance anomalies. Especially if, say, this anomaly causes a deopt - reprofile - compile cycle. It would doubly hurt, if _reprofile_ would miss the type _again_. Probably hard to do with RNG, but maybe we should be doing the initial counter seed on installation without consulting RNG. I don't think current patch does it, but maybe I am looking at the wrong place. Would be fairly trivial to do after #25305. OK, all useful thoughts. I'll have a look. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28541#issuecomment-3586932951 From alanb at openjdk.org Thu Nov 27 18:50:47 2025 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 27 Nov 2025 18:50:47 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 16:16:05 GMT, Chen Liang wrote: > Currently, the hotspot compiler (as in ciField) trusts final fields in hidden classes, record classes, and selected jdk packages. Some classes in the JDK wish to be trusted, but they cannot apply package-wide opt-in due to other legacy classes in the package, such as java.util. 
> > They currently can use `@Stable` as a workaround, but this is fragile because a stable final field may hold a trusted null, zero, or false value, which is currently treated as non-constant by ciField. > > We should add an annotation to opt-in for a whole class, mainly for legacy packages. This would benefit greatly some of our classes already using a lot of Stable, such as java.util.Optional, whose empty instance is now constant-foldable, as demonstrated in a new IR test. > > Paging @minborg who requested Optional folding for review. > > I think we can remove redundant Stable in a few other java.util classes after this patch is integrated. I plan to do that in subsequent patches. src/hotspot/share/ci/ciField.cpp line 220: > 218: return false; > 219: // Explicit opt-in from system classes > 220: if (holder->trust_final_fields()) This is definitely nicer than listing specific classes. It would be nicer again once we can make this exceptions go away. src/java.base/share/classes/jdk/internal/vm/annotation/TrustFinalFields.java line 61: > 59: /// fields in classes specified by this annotation. > 60: /// > 61: /// This annotation is only recognized on privileged code and is ignored elsewhere. "privileged code" hints of protection domains, permissions or security manager. Some of the annotations are limited to classes defined by the boot loader, is it the case here too? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2569767299 PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2569764340 From liach at openjdk.org Thu Nov 27 19:01:52 2025 From: liach at openjdk.org (Chen Liang) Date: Thu, 27 Nov 2025 19:01:52 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting In-Reply-To: References: Message-ID: <1tazzYHm78XLDovV11RAQt2W-ujENi4b_frOa87Jv14=.45b6d8a1-cb76-49ac-8048-429916bc9c6c@github.com> On Thu, 27 Nov 2025 18:45:29 GMT, Alan Bateman wrote: >> Currently, the hotspot compiler (as in ciField) trusts final fields in hidden classes, record classes, and selected jdk packages. Some classes in the JDK wish to be trusted, but they cannot apply package-wide opt-in due to other legacy classes in the package, such as java.util. >> >> They currently can use `@Stable` as a workaround, but this is fragile because a stable final field may hold a trusted null, zero, or false value, which is currently treated as non-constant by ciField. >> >> We should add an annotation to opt-in for a whole class, mainly for legacy packages. This would benefit greatly some of our classes already using a lot of Stable, such as java.util.Optional, whose empty instance is now constant-foldable, as demonstrated in a new IR test. >> >> Paging @minborg who requested Optional folding for review. >> >> I think we can remove redundant Stable in a few other java.util classes after this patch is integrated. I plan to do that in subsequent patches. > > src/java.base/share/classes/jdk/internal/vm/annotation/TrustFinalFields.java line 61: > >> 59: /// fields in classes specified by this annotation. >> 60: /// >> 61: /// This annotation is only recognized on privileged code and is ignored elsewhere. > > "privileged code" hints of protection domains, permissions or security manager. Some of the annotations are limited to classes defined by the boot loader, is it the case here too? 
I took this sentence from `@AOTSafeClassInitializer`. The term "privileged" comes from this variable in `classFileParser.cpp`: https://github.com/openjdk/jdk/blob/d94c52ccf2fed3fc66d25a34254c9b581c175fa1/src/hotspot/share/classfile/classFileParser.cpp#L1818-L1820 The other annotations have this note, which seems incorrect from the hotspot excerpt: @implNote This annotation only takes effect for fields of classes loaded by the boot loader. Annotations on fields of classes loaded outside of the boot loader are ignored. This behavior seems to be originally changed by 6964a690ed9a23d4c0692da2dfbced46e1436355, referring to an inaccessible issue. What should I do with this? Should I leave this as-is and create a separate patch to update this comment for vm.annotation annotations, or fix this first and have the separate patch fix other annotations later? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2569787223 From liach at openjdk.org Thu Nov 27 19:11:47 2025 From: liach at openjdk.org (Chen Liang) Date: Thu, 27 Nov 2025 19:11:47 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 18:47:15 GMT, Alan Bateman wrote: >> Currently, the hotspot compiler (as in ciField) trusts final fields in hidden classes, record classes, and selected jdk packages. Some classes in the JDK wish to be trusted, but they cannot apply package-wide opt-in due to other legacy classes in the package, such as java.util. >> >> They currently can use `@Stable` as a workaround, but this is fragile because a stable final field may hold a trusted null, zero, or false value, which is currently treated as non-constant by ciField. >> >> We should add an annotation to opt-in for a whole class, mainly for legacy packages. 
This would benefit greatly some of our classes already using a lot of Stable, such as java.util.Optional, whose empty instance is now constant-foldable, as demonstrated in a new IR test. >> >> Paging @minborg who requested Optional folding for review. >> >> I think we can remove redundant Stable in a few other java.util classes after this patch is integrated. I plan to do that in subsequent patches. > > src/hotspot/share/ci/ciField.cpp line 220: > >> 218: return false; >> 219: // Explicit opt-in from system classes >> 220: if (holder->trust_final_fields()) > > This is definitely nicer than listing specific classes. It would be nicer again once we can make this exceptions go away. True, this occupies one of the 16 precious instance klass bits in runtime. I wish we can derive this from our final means final restrictions, but their setup is to permit use-sites to migrate more easily, and is harder for declaration sites to deduce if a declaration is easier to be permitted. We can consider blanket-trust when the JVM uses `--illegal-final-field-mutation=deny` without additional `--enable-final-field-mutation`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2569800720 From lmesnik at openjdk.org Thu Nov 27 20:17:58 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 27 Nov 2025 20:17:58 GMT Subject: RFR: 8372039: post_sampled_object_alloc is called while lock is handled Message-ID: The AOT allocates objects while holding lock. The jvmti events can't be posted in such case. The allocation sampling might be just temporary disabled while AOT objects are allocated. I prefer to disable jvmti events for allocation only, not for AOT globally. If there are more events should be generated during AOT initialization, we might want to preserve them and post after initialization is completed. The existing failure could be reproduced by running tests with jvmti stress agent and ZGC enabled. 
Like make run-test JTREG_JVMTI_STRESS_AGENT=debugger=true TEST=gc/z/TestGarbageCollectorMXBean.java Note: I replaced NoJvmtiVMObjectAllocMark, it was not used. Also it was incorrect. The NoJvmtiEventsMark should be set even if jvmti events are not enabled for this thread, since jvmti events might be enabled just in the middle of the mark. ------------- Commit messages: - 8372039: post_sampled_object_alloc is called while lock is handled Changes: https://git.openjdk.org/jdk/pull/28544/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28544&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372039 Stats: 49 lines in 5 files changed: 8 ins; 30 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/28544.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28544/head:pull/28544 PR: https://git.openjdk.org/jdk/pull/28544 From redestad at openjdk.org Thu Nov 27 20:26:52 2025 From: redestad at openjdk.org (Claes Redestad) Date: Thu, 27 Nov 2025 20:26:52 GMT Subject: RFR: 8372701: Randomized profile counters [v2] In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 17:18:35 GMT, Andrew Haley wrote: >> Please use [this link](https://github.com/openjdk/jdk/pull/28541/files?w=1) to view the files changed. >> >> Profile counters scale very badly. >> >> The overhead for profiled code isn't too bad with one thread, but as the thread count increases, things go wrong very quickly. >> >> For example, here's a benchmark from the OpenJDK test suite, run at TieredLevel 3 with one thread, then three threads: >> >> >> Benchmark (randomized) Mode Cnt Score Error Units >> InterfaceCalls.test2ndInt5Types false avgt 4 27.468 ± 2.631 ns/op >> InterfaceCalls.test2ndInt5Types false avgt 4 240.010 ± 6.329 ns/op >> >> >> This slowdown is caused by high memory contention on the profile counters. Not only is this slow, but it can also lose profile counts. >> >> This patch is for C1 only. It'd be easy to randomize C1 counters as well in another PR, if anyone thinks it's worth doing.
>> >> One other thing to note is that randomized profile counters degrade very badly with small decimation ratios. For example, using a ratio of 2 with `-XX:ProfileCaptureRatio=2` with a single thread results in >> >> >> Benchmark (randomized) Mode Cnt Score Error Units >> InterfaceCalls.test2ndInt5Types false avgt 4 80.147 ± 9.991 ns/op >> >> >> The problem is that the branch prediction rate drops away very badly, leading to many mispredictions. It only really makes sense to use higher decimation ratios, e.g. 64. > > Andrew Haley has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 52 commits: > > - Merge remote-tracking branch 'refs/remotes/origin/JDK-8134940' into JDK-8134940 > - whitespace > - AArch64 > - Minimize deltas to master > - Better > - Inter > - Cleanup > - Cleanup > - Merge master > - D'oh > - ... and 42 more: https://git.openjdk.org/jdk/compare/b2f97131...49d52d82 Happy to see a serious contender for a resolution to this long-standing issue. While it's a bit unclear how problematic it is in practice we see issues related to this in thread-heavy benchmarks (such as SPECjvm2008) regularly. > It'd be easy to randomize C1 counters as well in another PR, if anyone thinks it's worth doing. I assume you mean interpreter counters? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28541#issuecomment-3587196700 From lmesnik at openjdk.org Thu Nov 27 20:53:32 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 27 Nov 2025 20:53:32 GMT Subject: RFR: 8372039: post_sampled_object_alloc is called while lock is handled [v2] In-Reply-To: References: Message-ID: > The AOT allocates objects while holding lock. The jvmti events can't be posted in such case. The allocation sampling might be just temporary disabled while AOT objects are allocated. > > I prefer to disable jvmti events for allocation only, not for AOT globally.
If there are more events should be generated during AOT initialization, we might want to preserve them and post after initialization is completed. > > The existing failure could be reproduced by running tests with jvmti stress agent and ZGC enabled. Like > make run-test JTREG_JVMTI_STRESS_AGENT=debugger=true TEST=gc/z/TestGarbageCollectorMXBean.java > > Note: > I replaced NoJvmtiVMObjectAllocMark, it was not used. Also it was incorrect. The > NoJvmtiEventsMark should be set even if jvmti events are not enabled for this thread, since jvmti events might be enabled just in the middle of the mark. Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: Added regression test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28544/files - new: https://git.openjdk.org/jdk/pull/28544/files/7662d282..67bcdf11 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28544&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28544&range=00-01 Stats: 120 lines in 2 files changed: 120 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28544.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28544/head:pull/28544 PR: https://git.openjdk.org/jdk/pull/28544 From duke at openjdk.org Thu Nov 27 23:53:11 2025 From: duke at openjdk.org (Vishal Chand) Date: Thu, 27 Nov 2025 23:53:11 GMT Subject: RFR: 8372652: Re-enable LocalRandom clinit after monitor pinning improvements Message-ID: This PR re-enables LocalRandom clinit after monitor pinning improvements. Enabling this one would start printing random seeds, which is useful for test debugging.
------------- Commit messages: - 8372652: Re-enable LocalRandom clinit after monitor pinning improvements Changes: https://git.openjdk.org/jdk/pull/28547/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28547&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372652 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28547.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28547/head:pull/28547 PR: https://git.openjdk.org/jdk/pull/28547 From dholmes at openjdk.org Fri Nov 28 02:14:57 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 28 Nov 2025 02:14:57 GMT Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer [v13] In-Reply-To: References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> Message-ID: <5l0qGRwVRs_ipx7l6-lL_oQesl64Cy2D6O14lT2sHYs=.4bebc83d-77bb-412a-8442-5a8809dc2ba7@github.com> On Tue, 25 Nov 2025 10:19:38 GMT, Afshin Zafari wrote: >> The issue happens when the HeapMinBaseAddress option gets 0 as input value. Since this option is used as an address, then using 0 in pointer arithmetics is UB. >> The fix is using `uintptr_t` instead of `address`/`char*`, etc. In doing that, it is found that an overflow check does not work in all cases due to checking more conditions. That overflow check is changed too. We also need to check overflow after aligning addresses and sizes of memory regions in this context. Assertions are added to check these cases. >> >> Tests: >> linux-x64 tier1 > > Afshin Zafari has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains 13 additional commits since the last revision: > > - Merge remote-tracking branch 'origin/master' into _8351334_ubsan_nullptr_add > - better type > - fix arguments.cpp for HeapMinBaseAddress type. > - Merge remote-tracking branch 'origin/master' into _8351334_ubsan_nullptr_add > - removed redundant check of overflow. > - subtraction for checking overflow > - fixed MAX2 template parameter > - fixes. > - uintptr_t -> uint64_t > - fixes > - ... and 3 more: https://git.openjdk.org/jdk/compare/43640f28...56f8b1f3 Marked as reviewed by dholmes (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/26955#pullrequestreview-3517272176 From thomas.stuefe at gmail.com Fri Nov 28 05:19:36 2025 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Fri, 28 Nov 2025 06:19:36 +0100 Subject: Any reason why +PrintFlagsFinal requires unlocking experimental and diagnostic flags to print their default values? In-Reply-To: <5ff2a0a5-9d61-466d-9ada-5ba911185815@oracle.com> References: <849ebf93-c3fb-45c9-92c8-21a49b3e9946@redhat.com> <5ff2a0a5-9d61-466d-9ada-5ba911185815@oracle.com> Message-ID: I am very much in favor of printing all flags, for the reasons Frederic has given. When one supports many different releases, it is a huge timesaver not to have to look up flags but see them right there in the customer logs. The ability of PrintFlagsFinal to give me all flags, including default values, after they are resolved to their final values, is also very useful during development. For simplicity, I would prefer just to change the behavior of PrintFlagsFinal to do that, but I could live with a new PrintAllFlagsFinal. Number of normal flags: 513, incl. diagnostic: 777, incl. experimental&diagnostic: 933. (Jdk 25). So, it's a bit more. I am not bothered by this, since this list never fits onto a single screen anyway. People grep. But if others prefer an extra flag, sure, let's have one. 
On Thu, Nov 27, 2025 at 3:05 AM David Holmes wrote:

> On 27/11/2025 12:53 am, Frederic Thevenet wrote:
> > Hi,
> >
> > Currently, using +PrintFlagsFinal prints out all JVM flags and their
> > values, even if they were not modified from their default, except for
> > 'locked' flags, i.e. Experimental and Diagnostic flags. In order to have
> > those printed out as well, one must first 'unlock' them (with
> > +UnlockExperimentalVMOptions, for instance).
>
> I think this was simply a pragmatic decision to avoid overwhelming the
> user with information that should not be relevant.
>
> > Now, is there a strong reason for not always displaying the default
> > values for those in scenarios where there are no concerns that the output
> > might be too large (that is when calling upon 'JVMFlag::printFlags' with
> > 'skipDefaults' set to false, like PrintFlagsFinal does)?
> >
> > The reason for this question is that when chasing a bug in scenarios
> > where one can only rely on logs or output provided by tools that use
> > +PrintFlagsFinal, getting the default values *in the conditions that
> > those logs were produced* can be tricky as it depends on the exact
> > version of the JDK that was running, and some values can be changed by
> > ergonomics.
>
> Ouch. I think that would be a poor design choice for diagnostic, and
> especially experimental flags!
>

Not every experimental/diagnostic flag is a boolean that defaults to false and controls an opt-in feature. We have non-boolean experimental flags and boolean flags that default to true. It is not unthinkable that those are changed during VM start.
>
> > If you need to know the default for experimental flags -- which given
> > their nature can and do change often -- your choices are to either ask
> > for these logs to be generated again using +UnlockExperimentalVMOptions
> > (even if there is no intention of changing an experimental flag) or to
> > go on a time consuming deep dive into the code base for the exact
> > version of the JDK that was used. Neither is ideal.
>
> True, but for experimental flags in particular, unless you are deep
> diving into the code how can you know whether a particular flag and its
> value are relevant to your debugging in the first place?
>

I think the point here is reducing analyst strain. It's not that it is impossible to get the information otherwise, but that it's convenient and stress-reducing to have one sure way to look up all resolved flag values for a customer's JVM run. Folks who have to work with many cases involving different JVM versions would value this. BTW, we do print default values for non-experimental non-diagnostic flags, too. The same reasoning applies here: if it's not changed, you could look up the default value.

> That said, I don't see any harm in providing a way to print all flags,
> though whether by default or by a new -XX:PrintAllFlagsFinal flag, I'm
> not sure.
>

Wonderful, let's do that then.

Cheers, Thomas

From aboldtch at openjdk.org Fri Nov 28 07:53:48 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 28 Nov 2025 07:53:48 GMT Subject: RFR: 8371986: Remove the default value of InitialRAMPercentage In-Reply-To: References: Message-ID: <1_3HlFqVpfQCO0oVX-_1QqawQluSbeZgkr76-dKoEBI=.80984770-22f0-4d20-b984-30dafe47fc35@github.com> On Thu, 27 Nov 2025 09:53:56 GMT, Joel Sikström wrote: > Hello, > > This RFE changes the default of `InitialRAMPercentage` to 0, effectively removing its default value. Please see the CSR for specific details on this change.
> > Changing the default value to 0 results in the behavior that the initial heap size (InitialHeapSize) is set to the minimum heap size (MinHeapSize). This is because of the following lines in `Arguments::set_heap_size()`, which takes the MAX value of MinHeapSize as well: > > uint64_t initial_memory = (uint64_t)(((double)MaxRAM * InitialRAMPercentage) / 100); > ... > > reasonable_initial = MAX3(reasonable_initial, reasonable_minimum, MinHeapSize); > reasonable_initial = MIN2(reasonable_initial, MaxHeapSize); > > > This change improves startup performance for all GCs, but affects the time-to-peak performance in some out-of-the-box configurations for some GCs. This is mainly visible in ParallelGC. Marked as reviewed by aboldtch (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28530#pullrequestreview-3517818503 From shade at openjdk.org Fri Nov 28 08:07:51 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 28 Nov 2025 08:07:51 GMT Subject: RFR: 8372652: Re-enable LocalRandom clinit after monitor pinning improvements In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 23:33:56 GMT, Vishal Chand wrote: > This PR re-enables LocalRandom clinit after monitor pinning improvements. > Enabling this one would start printing random seeds, which is useful for test debugging. Looks fine to me. I suspect this one from @AlanBateman's Loom integration, maybe it was literally his addition? :) Also, you need to run the affected tests for extra safety. Looks like the entirety of `vmTestbase` needs to pass: `make test TEST=vmTestbase/`. ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/28547#pullrequestreview-3517852492 PR Comment: https://git.openjdk.org/jdk/pull/28547#issuecomment-3588304237 From alanb at openjdk.org Fri Nov 28 08:18:45 2025 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 28 Nov 2025 08:18:45 GMT Subject: RFR: 8372652: Re-enable LocalRandom clinit after monitor pinning improvements In-Reply-To: References: Message-ID: On Fri, 28 Nov 2025 08:03:16 GMT, Aleksey Shipilev wrote: > Looks fine to me. I suspect this one from @AlanBateman's Loom integration, maybe it was literally his addition? :) JEP 425 had many contributors, here's the original commit in the loom repo in 2020 from Leonid: https://github.com/openjdk/loom/commit/d6a26a5fc84c62527af2639cbcda105014fb439e The recent work for JDK 26 allows virtual threads to preempt/unmount while waiting for another thread to initialize a class but they do not allow a virtual thread to unmount while executing a class initializer (as there are VM frames on the stack). I think I would need more context to know if the change proposed here is okay. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28547#issuecomment-3588331383 From thartmann at openjdk.org Fri Nov 28 08:26:05 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 28 Nov 2025 08:26:05 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v3] In-Reply-To: References: Message-ID: On Wed, 26 Nov 2025 16:48:57 GMT, Volodymyr Paprotski wrote: >> Marked as reviewed by ascarpino (Reviewer). > > Oh.. realized that I should had checked JBS.. thanks @ascarpino for resolving the bug I caused! At least its just the option.. whew. > >> @dholmes-ora Hi David, need some help with this please, don't have access to an ARM system to reproduce (or the ARM expertise).. could you point me at the failing job if thats available? Or some log if not? >> >> * Is it an issue with the options (i.e. `-XX:UseAVX=2` perhaps). 
I probably should had added `-XX:+IgnoreUnrecognizedVMOptions` to it.. >> * Otherwise, I am stumped.. the test case isn't architecture-specific.. it calls two methods (one of which is annotated as an intrinsic..) and expects them to return the same value.. i.e. Java and Intrinsic version should behave the same.. >> * Only thing I can think of.. The ARM implementation took some shortcuts in name of optimization. This can be entirely valid if the code calling the intrinsics never should get some specific value (-ranges). i.e. the tests RNG be further restricted.. >> * Otherwise.. is it possible its a bug in the ARM intrinsic? This caused a regression: [JDK-8372703](https://bugs.openjdk.org/browse/JDK-8372703). @vpaprotsk Could you please have a look? Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3588349196 From kbarrett at openjdk.org Fri Nov 28 08:38:47 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 28 Nov 2025 08:38:47 GMT Subject: RFR: 8372650: Convert GenericWaitBarrier to use Atomic In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 10:28:26 GMT, Aleksey Shipilev wrote: >> Please review this change to GenericWaitBarrier to use Atomic rather than >> directly applying AtomicAccess to volatile members. >> >> Testing: mach5 tier1-5 > > src/hotspot/share/utilities/waitBarrier_generic.cpp line 199: > >> 197: } >> 198: } >> 199: assert(_outstanding_wakeups.load_acquire() == 0, "Post disarm: Should not have outstanding wakeups"); > > Does not have to be `_acquire`, original load is relaxed. Oops. Will fix. 
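For readers following the review comment above, the difference between a relaxed and an acquire load can be sketched as follows. This is a minimal illustration using `std::atomic` as a stand-in; it does not reproduce HotSpot's `Atomic` wrapper, and `WakeupCounter` and `no_outstanding_wakeups` are hypothetical names that do not appear in the patch.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a wakeup counter; not the GenericWaitBarrier code.
class WakeupCounter {
 public:
  void add(int n) { _outstanding.fetch_add(n, std::memory_order_relaxed); }
  void consume() { _outstanding.fetch_sub(1, std::memory_order_relaxed); }

  // Relaxed load: atomic, but imposes no ordering on surrounding accesses.
  int load_relaxed() const {
    return _outstanding.load(std::memory_order_relaxed);
  }

  // Acquire load: additionally orders later reads and writes after this load.
  int load_acquire() const {
    return _outstanding.load(std::memory_order_acquire);
  }

 private:
  std::atomic<int> _outstanding{0};
};

// A sanity check that only inspects the counter value itself needs no
// ordering with other memory, so a relaxed load is sufficient here.
inline bool no_outstanding_wakeups(const WakeupCounter& c) {
  return c.load_relaxed() == 0;
}
```

Both loads return the same value in a single-threaded check; the acquire form only matters when later accesses must not be reordered before the load.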
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28527#discussion_r2570824516 From duke at openjdk.org Fri Nov 28 08:49:57 2025 From: duke at openjdk.org (Zihao Lin) Date: Fri, 28 Nov 2025 08:49:57 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v13] In-Reply-To: References: Message-ID: <2oDqUvcW_3hJRPRri4uttpkgfeCovL4ZZkcI0R1bB1A=.173b3a58-d0f1-4b29-94d1-77b0a350c790@github.com> On Thu, 27 Nov 2025 09:54:39 GMT, Roland Westrelin wrote: >> Zihao Lin has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix test failed > > src/hotspot/share/opto/escape.cpp line 4488: > >> 4486: const TypePtr* adr_type = proj->adr_type(); >> 4487: const TypePtr* new_adr_type = tinst->add_offset(adr_type->offset()); >> 4488: if (adr_type != new_adr_type) { > > Can you explain that change? Did something go wrong in a merge? Here is a assert failed command: main -XX:CompileCommand=dontinline,compiler.arraycopy.TestArrayCopyMemoryChain::test* -Xbatch compiler.arraycopy.TestArrayCopyMemoryChain reason: User specified action: run main/othervm -XX:CompileCommand=dontinline,compiler.arraycopy.TestArrayCopyMemoryChain::test* -Xbatch compiler.arraycopy.TestArrayCopyMemoryChain started: Fri Nov 28 16:36:37.189 CST 2025 Mode: othervm [/othervm specified] Process id: 16782 finished: Fri Nov 28 16:36:37.350 CST 2025 elapsed time (seconds): 0.161 configuration: STDOUT: CompileCommand: dontinline compiler/arraycopy/TestArrayCopyMemoryChain.test* bool dontinline = true # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (/Users/linzihao/Desktop/jdk-dev/src/hotspot/share/opto/escape.cpp:4184), pid=16782, tid=26115 # assert(result != nullptr) failed: new projection should have been allocated # # JRE version: OpenJDK Runtime Environment (26.0) (fastdebug build 26-internal-adhoc.linzihao.jdk-dev) # Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-adhoc.linzihao.jdk-dev, 
mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, bsd-aarch64) # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again # # An error report file with more information is saved as: # /Users/linzihao/Desktop/jdk-dev/build/macosx-aarch64-server-fastdebug/test-support/jtreg_test_hotspot_jtreg_compiler_arraycopy_TestArrayCopyMemoryChain_java/scratch/0/hs_err_pid16782.log # # Compiler replay data is saved as: # /Users/linzihao/Desktop/jdk-dev/build/macosx-aarch64-server-fastdebug/test-support/jtreg_test_hotspot_jtreg_compiler_arraycopy_TestArrayCopyMemoryChain_java/scratch/0/replay_pid16782.log # # If you would like to submit a bug report, please visit: # https://bugreport.java.com/bugreport/crash.jsp # ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2570853694 From kbarrett at openjdk.org Fri Nov 28 08:55:45 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 28 Nov 2025 08:55:45 GMT Subject: RFR: 8372650: Convert GenericWaitBarrier to use Atomic [v2] In-Reply-To: References: Message-ID: > Please review this change to GenericWaitBarrier to use Atomic rather than > directly applying AtomicAccess to volatile members. 
> > Testing: mach5 tier1-5 Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: undo unintended load_acquire ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28527/files - new: https://git.openjdk.org/jdk/pull/28527/files/618e1a03..3577ba5e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28527&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28527&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28527.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28527/head:pull/28527 PR: https://git.openjdk.org/jdk/pull/28527 From duke at openjdk.org Fri Nov 28 08:57:53 2025 From: duke at openjdk.org (Zihao Lin) Date: Fri, 28 Nov 2025 08:57:53 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v13] In-Reply-To: <2oDqUvcW_3hJRPRri4uttpkgfeCovL4ZZkcI0R1bB1A=.173b3a58-d0f1-4b29-94d1-77b0a350c790@github.com> References: <2oDqUvcW_3hJRPRri4uttpkgfeCovL4ZZkcI0R1bB1A=.173b3a58-d0f1-4b29-94d1-77b0a350c790@github.com> Message-ID: <2wAnS7drj_r3dqsy5CEF9vBG40KizHsQDOxMeNymwhw=.9bc29879-eead-401c-b750-814592feff63@github.com> On Fri, 28 Nov 2025 08:47:13 GMT, Zihao Lin wrote: >> src/hotspot/share/opto/escape.cpp line 4488: >> >>> 4486: const TypePtr* adr_type = proj->adr_type(); >>> 4487: const TypePtr* new_adr_type = tinst->add_offset(adr_type->offset()); >>> 4488: if (adr_type != new_adr_type) { >> >> Can you explain that change? Did something go wrong in a merge? 
> > Here is a assert failed > > command: main -XX:CompileCommand=dontinline,compiler.arraycopy.TestArrayCopyMemoryChain::test* -Xbatch compiler.arraycopy.TestArrayCopyMemoryChain > reason: User specified action: run main/othervm -XX:CompileCommand=dontinline,compiler.arraycopy.TestArrayCopyMemoryChain::test* -Xbatch compiler.arraycopy.TestArrayCopyMemoryChain > started: Fri Nov 28 16:36:37.189 CST 2025 > Mode: othervm [/othervm specified] > Process id: 16782 > finished: Fri Nov 28 16:36:37.350 CST 2025 > elapsed time (seconds): 0.161 > configuration: > STDOUT: > CompileCommand: dontinline compiler/arraycopy/TestArrayCopyMemoryChain.test* bool dontinline = true > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/Users/linzihao/Desktop/jdk-dev/src/hotspot/share/opto/escape.cpp:4184), pid=16782, tid=26115 > # assert(result != nullptr) failed: new projection should have been allocated > # > # JRE version: OpenJDK Runtime Environment (26.0) (fastdebug build 26-internal-adhoc.linzihao.jdk-dev) > # Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-adhoc.linzihao.jdk-dev, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, bsd-aarch64) > # No core dump will be written. Core dumps have been disabled. 
To enable core dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /Users/linzihao/Desktop/jdk-dev/build/macosx-aarch64-server-fastdebug/test-support/jtreg_test_hotspot_jtreg_compiler_arraycopy_TestArrayCopyMemoryChain_java/scratch/0/hs_err_pid16782.log > # > # Compiler replay data is saved as: > # /Users/linzihao/Desktop/jdk-dev/build/macosx-aarch64-server-fastdebug/test-support/jtreg_test_hotspot_jtreg_compiler_arraycopy_TestArrayCopyMemoryChain_java/scratch/0/replay_pid16782.log > # > # If you would like to submit a bug report, please visit: > # https://bugreport.java.com/bugreport/crash.jsp > # The assert failed because `find_inst_mem()` skipped an Initialize memory projection whose `adr_type` was still the general slice, then tried to fetch the instance-specific projection from `_node_map` and got nullptr. That happens when a precise `NarrowMemProj` already exists: the code doesn't create a new one and also never records the mapping, so later lookup fails. The fix records the mapping even if the precise `NarrowMemProj` is already present (not newly created). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2570873232 From pminborg at openjdk.org Fri Nov 28 09:03:49 2025 From: pminborg at openjdk.org (Per Minborg) Date: Fri, 28 Nov 2025 09:03:49 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 16:16:05 GMT, Chen Liang wrote: > Currently, the hotspot compiler (as in ciField) trusts final fields in hidden classes, record classes, and selected jdk packages. Some classes in the JDK wish to be trusted, but they cannot apply package-wide opt-in due to other legacy classes in the package, such as java.util.
> > They currently can use `@Stable` as a workaround, but this is fragile because a stable final field may hold a trusted null, zero, or false value, which is currently treated as non-constant by ciField. > > We should add an annotation to opt in for a whole class, mainly for legacy packages. This would greatly benefit some of our classes already using a lot of Stable, such as java.util.Optional, whose empty instance is now constant-foldable, as demonstrated in a new IR test. > > Paging @minborg who requested Optional folding for review. > > I think we can remove redundant Stable in a few other java.util classes after this patch is integrated. I plan to do that in subsequent patches. I really like this one! I wonder if we could enable the new annotation `@TrustFinalFields` on package level as well so we could get rid of _all_ the special handling in `ciField.cpp`. I am not sure this is the best way to do it but it would perhaps be possible to annotate the `package-info.java` file. For example in `java.lang.invoke.package-info.java`: @TrustFinalFields package java.lang.invoke; Is there a better way to do it? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28540#issuecomment-3588443540 From duke at openjdk.org Fri Nov 28 09:15:45 2025 From: duke at openjdk.org (Zihao Lin) Date: Fri, 28 Nov 2025 09:15:45 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v14] In-Reply-To: References: Message-ID: > This patch removes the slice parameter from LoadNode::make > > I have done more work which removes the slice parameter from StoreNode::make. > > Mentioned in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 > > Hi team, I am new, I'd appreciate any guidance. Thanks a lot!
Zihao Lin has updated the pull request incrementally with one additional commit since the last revision: remove adr_type from graphKit ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24258/files - new: https://git.openjdk.org/jdk/pull/24258/files/35ec9135..18714dae Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24258&range=12-13 Stats: 62 lines in 6 files changed: 0 ins; 34 del; 28 mod Patch: https://git.openjdk.org/jdk/pull/24258.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24258/head:pull/24258 PR: https://git.openjdk.org/jdk/pull/24258 From shade at openjdk.org Fri Nov 28 09:25:51 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 28 Nov 2025 09:25:51 GMT Subject: RFR: 8372650: Convert GenericWaitBarrier to use Atomic [v2] In-Reply-To: References: Message-ID: On Fri, 28 Nov 2025 08:55:45 GMT, Kim Barrett wrote: >> Please review this change to GenericWaitBarrier to use Atomic rather than >> directly applying AtomicAccess to volatile members. >> >> Testing: mach5 tier1-5 > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > undo unintended load_acquire Looks fine now, thanks. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28527#pullrequestreview-3518102337 From aph at openjdk.org Fri Nov 28 09:28:04 2025 From: aph at openjdk.org (Andrew Haley) Date: Fri, 28 Nov 2025 09:28:04 GMT Subject: RFR: 8372701: Randomized profile counters [v2] In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 20:24:09 GMT, Claes Redestad wrote: > Happy to see a serious contender for a resolution to this long-standing issue. While it's a bit unclear how problematic it is in practice we see issues related to this in thread-heavy benchmarks (such as SPECjvm2008) regularly. 
> > > It'd be easy to randomize C1 counters as well in another PR, if anyone thinks it's worth doing. > > I assume you mean interpreter counters? Oops. yes, of course, thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/28541#issuecomment-3588526803 From duke at openjdk.org Fri Nov 28 09:59:45 2025 From: duke at openjdk.org (Vishal Chand) Date: Fri, 28 Nov 2025 09:59:45 GMT Subject: RFR: 8372652: Re-enable LocalRandom clinit after monitor pinning improvements In-Reply-To: References: Message-ID: On Fri, 28 Nov 2025 08:04:44 GMT, Aleksey Shipilev wrote: >> This PR re-enables LocalRandom clinit after monitor pinning improvements. >> Enabling this one would start printing random seeds, which is useful for test debugging. > > Also, you need to run the affected tests for extra safety. Looks like the entirety of `vmTestbase` needs to pass: `make test TEST=vmTestbase/`. @shipilev `jtreg:test/hotspot/jtreg/vmTestbase` are passing locally. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28547#issuecomment-3588626610 From alanb at openjdk.org Fri Nov 28 10:06:57 2025 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 28 Nov 2025 10:06:57 GMT Subject: RFR: 8372652: Re-enable LocalRandom clinit after monitor pinning improvements In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 23:33:56 GMT, Vishal Chand wrote: > This PR re-enables LocalRandom clinit after monitor pinning improvements. > Enabling this one would start printing random seeds, which is useful for test debugging. You'll need to test with JTREG_TEST_THREAD_FACTORY=Virtual to see if there are issues. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28547#issuecomment-3588655825 From fgao at openjdk.org Fri Nov 28 10:17:50 2025 From: fgao at openjdk.org (Fei Gao) Date: Fri, 28 Nov 2025 10:17:50 GMT Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub() [v2] In-Reply-To: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> Message-ID: > In the existing implementation, the static call stub typically emits a sequence like: > `isb; movk; movz; movz; movk; movz; movz; br`. > > This patch reimplements it using a more compact and patch-friendly sequence: > > ldr x12, Label_data > ldr x8, Label_entry > br x8 > Label_data: > 0x00000000 > 0x00000000 > Label_entry: > 0x00000000 > 0x00000000 > > The new approach places the target addresses adjacent to the code and loads them dynamically. This allows us to update the call target by modifying only the data in memory, without changing any instructions. This avoids the need for I-cache flushes or issuing an `isb`[1], which are both relatively expensive operations. > > While emitting direct branches in static stubs for small code caches can save 2 instructions compared to the new implementation, modifying those branches still requires I-cache flushes or an `isb`. This patch unifies the code generation by emitting the same static stubs for both small and large code caches. > > A microbenchmark (StaticCallStub.java) demonstrates a performance uplift of approximately 43%. 
> > > Benchmark (length) Mode Cnt Master Patch Units > StaticCallStubFar.callCompiled 1000 avgt 5 39.346 22.474 us/op > StaticCallStubFar.callCompiled 10000 avgt 5 390.05 218.478 us/op > StaticCallStubFar.callCompiled 100000 avgt 5 3869.264 2174.001 us/op > StaticCallStubNear.callCompiled 1000 avgt 5 39.093 22.582 us/op > StaticCallStubNear.callCompiled 10000 avgt 5 387.319 217.398 us/op > StaticCallStubNear.callCompiled 100000 avgt 5 3855.825 2206.923 us/op > > > All tests in Tier1 to Tier3, under both release and debug builds, have passed. > > [1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-working-with-threads Fei Gao has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Patch 'isb' to 'nop' - Merge branch 'master' into reimplement-static-call-stub - 8363620: AArch64: reimplement emit_static_call_stub() In the existing implementation, the static call stub typically emits a sequence like: `isb; movk; movz; movz; movk; movz; movz; br`. This patch reimplements it using a more compact and patch-friendly sequence: ``` ldr x12, Label_data ldr x8, Label_entry br x8 Label_data: 0x00000000 0x00000000 Label_entry: 0x00000000 0x00000000 ``` The new approach places the target addresses adjacent to the code and loads them dynamically. This allows us to update the call target by modifying only the data in memory, without changing any instructions. This avoids the need for I-cache flushes or issuing an `isb`[1], which are both relatively expensive operations. While emitting direct branches in static stubs for small code caches can save 2 bytes compared to the new implementation, modifying those branches still requires I-cache flushes or an `isb`. 
This patch unifies the code generation by emitting the same static stubs for both small and large code caches. A microbenchmark (StaticCallStub.java) demonstrates a performance uplift of approximately 43%. Benchmark (length) Mode Cnt Master Patch Units StaticCallStubFar.callCompiled 1000 avgt 5 39.346 22.474 us/op StaticCallStubFar.callCompiled 10000 avgt 5 390.05 218.478 us/op StaticCallStubFar.callCompiled 100000 avgt 5 3869.264 2174.001 us/op StaticCallStubNear.callCompiled 1000 avgt 5 39.093 22.582 us/op StaticCallStubNear.callCompiled 10000 avgt 5 387.319 217.398 us/op StaticCallStubNear.callCompiled 100000 avgt 5 3855.825 2206.923 us/op All tests in Tier1 to Tier3, under both release and debug builds, have passed. [1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-working-with-threads ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26638/files - new: https://git.openjdk.org/jdk/pull/26638/files/5f9285ca..f5a83e30 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26638&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26638&range=00-01 Stats: 610902 lines in 6782 files changed: 419425 ins; 121578 del; 69899 mod Patch: https://git.openjdk.org/jdk/pull/26638.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26638/head:pull/26638 PR: https://git.openjdk.org/jdk/pull/26638 From fgao at openjdk.org Fri Nov 28 10:17:51 2025 From: fgao at openjdk.org (Fei Gao) Date: Fri, 28 Nov 2025 10:17:51 GMT Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub() In-Reply-To: References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> Message-ID: On Tue, 25 Nov 2025 12:28:53 GMT, Andrew Haley wrote: >> In the existing implementation, the static call stub typically emits a sequence like: >> `isb; movk; movz; movz; movk; movz; movz; br`. 
>> >> This patch reimplements it using a more compact and patch-friendly sequence: >> >> ldr x12, Label_data >> ldr x8, Label_entry >> br x8 >> Label_data: >> 0x00000000 >> 0x00000000 >> Label_entry: >> 0x00000000 >> 0x00000000 >> >> The new approach places the target addresses adjacent to the code and loads them dynamically. This allows us to update the call target by modifying only the data in memory, without changing any instructions. This avoids the need for I-cache flushes or issuing an `isb`[1], which are both relatively expensive operations. >> >> While emitting direct branches in static stubs for small code caches can save 2 instructions compared to the new implementation, modifying those branches still requires I-cache flushes or an `isb`. This patch unifies the code generation by emitting the same static stubs for both small and large code caches. >> >> A microbenchmark (StaticCallStub.java) demonstrates a performance uplift of approximately 43%. >> >> >> Benchmark (length) Mode Cnt Master Patch Units >> StaticCallStubFar.callCompiled 1000 avgt 5 39.346 22.474 us/op >> StaticCallStubFar.callCompiled 10000 avgt 5 390.05 218.478 us/op >> StaticCallStubFar.callCompiled 100000 avgt 5 3869.264 2174.001 us/op >> StaticCallStubNear.callCompiled 1000 avgt 5 39.093 22.582 us/op >> StaticCallStubNear.callCompiled 10000 avgt 5 387.319 217.398 us/op >> StaticCallStubNear.callCompiled 100000 avgt 5 3855.825 2206.923 us/op >> >> >> All tests in Tier1 to Tier3, under both release and debug builds, have passed. >> >> [1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-working-with-threads > > This one has gone very quiet. @theRealAph sorry for the delay! I've updated the patch with a new commit. In the meantime, we also found that patching `isb` to `nop` can work the same as patching `isb` to `b .+4`. I've summarized my understanding in the code comments. Could you please take another look?
Thank you for your time! ------------- PR Comment: https://git.openjdk.org/jdk/pull/26638#issuecomment-3588692357 From eastigeevich at openjdk.org Fri Nov 28 10:40:52 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Fri, 28 Nov 2025 10:40:52 GMT Subject: RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance [v10] In-Reply-To: References: Message-ID: > Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1. > > Neoverse-N1 implementations mitigate erratum 1542419 with a workaround: > - Disable coherent icache. > - Trap IC IVAU instructions. > - Execute: > - `tlbi vae3is, xzr` > - `dsb sy` > > `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address). It waits for all memory accesses using in-scope old translation information to complete before it is considered complete. > > As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests: > > "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround." > > This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions. > > Changes include: > > * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization. 
> * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address. > * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures. > * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk. > > Benchmarking results: Neoverse-N1 r3p1 (Graviton 2) > > - Baseline > > $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGC... Evgeny Astigeevich has updated the pull request incrementally with two additional commits since the last revision: - Add UseDeferredICacheInvalidation to defer invalidation on CPU with hardware cache coherence - Add jtreg test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28328/files - new: https://git.openjdk.org/jdk/pull/28328/files/ae3b97e8..dbeeecf1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=08-09 Stats: 337 lines in 5 files changed: 297 ins; 17 del; 23 mod Patch: https://git.openjdk.org/jdk/pull/28328.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28328/head:pull/28328 PR: https://git.openjdk.org/jdk/pull/28328 From adinn at openjdk.org Fri Nov 28 10:41:08 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Fri, 28 Nov 2025 10:41:08 GMT Subject: RFR: 8372617: Save and restore stubgen stubs when using an AOT code cache Message-ID: This PR adds save and restore of all generated 
stubs to the AOT code cache on x86 and aarch64. Other arches are modified to deal with the related generic API changes. Small changes were required to the aarch64 and x86_64 generator code in order to meet two key constraints: 1. the first declared entry of every stub starts at the first instruction in the stub code range 2. all data/code cross-references from one stub to another target a declared stub entry ------------- Commit messages: - fix header declarations - move AOT address init impl into arch tree below StubRoutines - put AOT address table init code under INCLUDE_CDS - fix extras count to match number of unsafe handler regions - merge - more whitespace - rmeove whitespace - remove redundant comment - correct assert - fix typos - ... and 21 more: https://git.openjdk.org/jdk/compare/ac046628...0597bc6b Changes: https://git.openjdk.org/jdk/pull/28433/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28433&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372617 Stats: 5394 lines in 59 files changed: 4470 ins; 446 del; 478 mod Patch: https://git.openjdk.org/jdk/pull/28433.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28433/head:pull/28433 PR: https://git.openjdk.org/jdk/pull/28433 From adinn at openjdk.org Fri Nov 28 10:41:10 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Fri, 28 Nov 2025 10:41:10 GMT Subject: RFR: 8372617: Save and restore stubgen stubs when using an AOT code cache In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 14:59:02 GMT, Andrew Dinn wrote: > This PR adds save and restore of all generated stubs to the AOT code cache on x86 and aarch64. Other arches are modified to deal with the related generic API changes. > > Small changes were required to the aarch64 and x86_64 generator code in order to meet two key constraints: > 1. the first declared entry of every stub starts at the first instruction in the stub code range > 2.
all data/code cross-references from one stub to another target a declared stub entry This PR still needs to ensure that pre-universe stub entries are registered in the AOT address table. These stubs cannot be saved and restored because the AOT cache does not exist before universe init. As a consequence their addresses cannot be registered using the normal generate-time mechanism. The required fix will be to add them via special case handling as soon as the AOT cache and address table have been initialised (preferably at the point where the majority of external addresses are registered). This omission is not a problem for this patch, i.e. as far as references from existing stubs are concerned, since none of the saved stubs targets a pre-universe stub entry. However, they ought to be registered in case that situation changes or in case an nmethod that gets saved to and reloaded from the AOT cache needs at some point to target a pre-universe stub. src/hotspot/cpu/aarch64/stubDeclarations_aarch64.hpp line 142: > 140: do_arch_entry_init(aarch64, final, spin_wait, spin_wait, \ > 141: spin_wait, empty_spin_wait) \ > 142: /* stub only -- entries are not stored in StubRoutines::aarch64 */ \ Above comment is obsolete. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28433#issuecomment-3558574875 PR Review Comment: https://git.openjdk.org/jdk/pull/28433#discussion_r2555692076 From kvn at openjdk.org Fri Nov 28 10:41:13 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 28 Nov 2025 10:41:13 GMT Subject: RFR: 8372617: Save and restore stubgen stubs when using an AOT code cache In-Reply-To: References: Message-ID: On Thu, 20 Nov 2025 14:59:02 GMT, Andrew Dinn wrote: > This PR adds save and restore of all generated stubs to the AOT code cache on x86 and aarch64. Other arches are modified to deal with the related generic API changes. > > Small changes were required to the aarch64 and x86_64 generator code in order to meet two key constraints: > 1.
the first declared entry of every stub starts at the first instruction in the stub code range > 2. all data/code cross-references from one stub to another target a declared stub entry I noticed that we have to add a lot of local data table addresses for stubs (math intrinsics). Maybe we should consider having platform-specific AOT address tables for such addresses (and platform-specific stubs), to avoid searching in one big table. src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 611: > 609: return start; > 610: } > 611: This code pattern repeats a lot. Can we move it into `load_archive_data()`? address start = load_archive_data(stub_id); if (start != nullptr) { return start; } And have another specialized `load_archive_data` if you need `end` value. src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 13115: > 13113: StubRoutines::aarch64::init_AOTAddressTable(external_addresses); > 13114: AOTCodeCache::publish_external_addresses(external_addresses); > 13115: } This and `*init_AOTAddressTable()` methods should be under `#if INCLUDE_CDS` src/hotspot/share/runtime/stubRoutines.cpp line 317: > 315: // Non-generated init method > 316: > 317: void StubRoutines::init_AOTAddressTable() { Can this body be in a platform-specific file? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28433#issuecomment-3560789636 PR Review Comment: https://git.openjdk.org/jdk/pull/28433#discussion_r2548099002 PR Review Comment: https://git.openjdk.org/jdk/pull/28433#discussion_r2548115066 PR Review Comment: https://git.openjdk.org/jdk/pull/28433#discussion_r2548124630 From adinn at openjdk.org Fri Nov 28 10:41:14 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Fri, 28 Nov 2025 10:41:14 GMT Subject: RFR: 8372617: Save and restore stubgen stubs when using an AOT code cache In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 00:06:22 GMT, Vladimir Kozlov wrote: >> This PR adds save and restore of all generated stubs to the AOT code cache on x86 and aarch64.
Other arches are modified to deal with the related generic API changes. >> >> Small changes were required to the aarch64 and x86_64 generator code in order to meet two key constraints: >> 1. the first declared entry of every stub starts at the first instruction in the stub code range >> 2. all data/code cross-references from one stub to another target a declared stub entry > > src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 611: > >> 609: return start; >> 610: } >> 611: > > This code pattern repeats a lot. Can we move it into `load_archive_data()`? > > address start = load_archive_data(stub_id); > if (start != nullptr) { > return start; > } > > > And have another specialized `load_archive_data` if you need `end` value. Yes, I think I can probably refactor this to make it simpler and avoid returning end. I'll push a revised version soon. Return of `end` is mostly a hangover from an older version but it does still serve a purpose (see below). Originally, it was not clear what all the other specialized cases were going to be but I think it is now fairly clear that we only need to cater for: - multiple entries other than the start address - multiple extra addresses - both the above. n.b. entries and extras are distinguished by one simple criterion: the former can be accessed from AOT code while the latter cannot. So, entries need to be registered with the AOT address table beneath `load_entry` while extras can be omitted (although they almost certainly need to be registered with the current runtime by the caller). That distinction ought to be written down in a comment somewhere, also the acceptable values for both: `start` clearly must be non-null. Additional entries may be `nullptr` or in the range `[start, end)`. Extra addresses may be `nullptr` or in the range `[start, end]` i.e. extras can address the first byte after the end of the stub.
This is needed so that an extra can be used to mark the (exclusive) end address of a range that covers the tail of a stub. So far, it has always been the case that extra addresses occur in triples identifying UnsafeMemoryAccess handler regions, `(open, close, handler)`. I'm leaving it open as to whether any other uses might turn up which is why processing of extras happens in the caller rather than under `load_entry`. `load_entry` now knows and verifies how many entries it ought to be retrieving and accepts an `extras` array if and only if there are extra addresses in the archive. It also range checks the entry/extra addresses. So, all a caller needs to do is assign extra entries to the outputs passed by the caller then validate the extras count and process them as appropriate for whatever the stub understands them to mean. `end` was originally being returned by `load_entry` to allow the caller to range check both entry and extra addresses but that check is now redundant. Returning `end` is still needed because of an unnecessary complication in the current code. Addresses are converted to offsets before saving in the archive and translated back to addresses when loading. `store_entry` does ensure that all entry or extra addresses lie within the range `[start, end)` and `[start, end]` respectively. However, `nullptr` and `end` both get saved to the archive as `start - end`. That means both of them get restored as `nullptr`. Clients are expected to know whether a restored `nullptr` at some offset in the extras array is really an `end` address and substitute `end` in its place (e.g. in the case where `close == nullptr` really means `close == end`). Because of that `load_entry` has to make `end` visible to the caller. If I change things so that `nullptr` is saved as UINT_MAX, that will avoid the ambiguity, allowing the client to use the value without an extra check.
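To illustrate the encoding change proposed above, here is a minimal sketch, not the actual `store_entry`/`load_entry` code. The names `encode_offset`, `decode_offset` and `NULL_OFFSET` are hypothetical, and the sketch assumes offsets are stored as 32-bit unsigned values and that no stub is as large as `UINT32_MAX` bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for HotSpot's address type.
using address = unsigned char*;

// Assumed sentinel: reserved for nullptr so it cannot collide with
// the offset of `end` (which is simply end - start).
constexpr uint32_t NULL_OFFSET = std::numeric_limits<uint32_t>::max();

// Encode nullptr or an address in [start, end] as an offset from start.
inline uint32_t encode_offset(address start, address end, address p) {
  if (p == nullptr) {
    return NULL_OFFSET;
  }
  assert(start <= p && p <= end);
  return static_cast<uint32_t>(p - start);
}

// Decode an offset back to an address relative to the (possibly
// relocated) start of the restored stub.
inline address decode_offset(address start, uint32_t offset) {
  return offset == NULL_OFFSET ? nullptr : start + offset;
}
```

With a distinct sentinel, `nullptr` and `end` round-trip to different values, so a caller would no longer need `end` just to disambiguate a restored `nullptr`.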
@vnkozlov ^^ > src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 13115: > >> 13113: StubRoutines::aarch64::init_AOTAddressTable(external_addresses); >> 13114: AOTCodeCache::publish_external_addresses(external_addresses); >> 13115: } > > This and `*init_AOTAddressTable()` methods should be under `#if INCLUDE_CDS` Done > src/hotspot/share/runtime/stubRoutines.cpp line 317: > >> 315: // Non-generated init method >> 316: >> 317: void StubRoutines::init_AOTAddressTable() { > > Can this body be in platform specific file. Yes, I'll add an implementation to file stubRoutines_xxx.cpp for each platform. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28433#discussion_r2556329608 PR Review Comment: https://git.openjdk.org/jdk/pull/28433#discussion_r2556346740 PR Review Comment: https://git.openjdk.org/jdk/pull/28433#discussion_r2568620542 PR Review Comment: https://git.openjdk.org/jdk/pull/28433#discussion_r2556361072 From alanb at openjdk.org Fri Nov 28 10:55:54 2025 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 28 Nov 2025 10:55:54 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting In-Reply-To: <1tazzYHm78XLDovV11RAQt2W-ujENi4b_frOa87Jv14=.45b6d8a1-cb76-49ac-8048-429916bc9c6c@github.com> References: <1tazzYHm78XLDovV11RAQt2W-ujENi4b_frOa87Jv14=.45b6d8a1-cb76-49ac-8048-429916bc9c6c@github.com> Message-ID: On Thu, 27 Nov 2025 18:58:59 GMT, Chen Liang wrote: >> src/java.base/share/classes/jdk/internal/vm/annotation/TrustFinalFields.java line 61: >> >>> 59: /// fields in classes specified by this annotation. >>> 60: /// >>> 61: /// This annotation is only recognized on privileged code and is ignored elsewhere. >> >> "privileged code" hints of protection domains, permissions or security manager. Some of the annotations are limited to classes defined by the boot loader, is it the case here too? > > I took this sentence from `@AOTSafeClassInitializer`. 
The term "privileged" comes from this variable in `classFileParser.cpp`: > https://github.com/openjdk/jdk/blob/d94c52ccf2fed3fc66d25a34254c9b581c175fa1/src/hotspot/share/classfile/classFileParser.cpp#L1818-L1820 > > The other annotations have this note, which seems incorrect from the hotspot excerpt: > > @implNote > This annotation only takes effect for fields of classes loaded by the boot > loader. Annotations on fields of classes loaded outside of the boot loader > are ignored. > > > This behavior seems to be originally changed by 6964a690ed9a23d4c0692da2dfbced46e1436355, referring to an inaccessible issue. > > What should I do with this? Should I leave this as-is and create a separate patch to update this comment for vm.annotation annotations, or fix this first and have the separate patch fix other annotations later? For this PR, you could just change the last sentence to say that the annotation is only effective for classes defined by the boot class loader or platform class loader. A follow-up PR could propose changes to the other annotation descriptions. As regards background, one of the significant changes in JDK 9 was that java.* modules could be mapped to the platform class loader without giving them "all permission" in the security manager execution mode. If you see JBS issues or comments speaking of "de-privileging" then it's likely related to changes that "moved" modules that were originally mapped to the boot class loader to the platform class loader. Now that the security manager execution mode is gone, we no longer have to deal with these issues.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2571257172 From alanb at openjdk.org Fri Nov 28 10:59:48 2025 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 28 Nov 2025 10:59:48 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 19:08:40 GMT, Chen Liang wrote: >> src/hotspot/share/ci/ciField.cpp line 220: >> >>> 218: return false; >>> 219: // Explicit opt-in from system classes >>> 220: if (holder->trust_final_fields()) >> >> This is definitely nicer than listing specific classes. It would be nicer again once we can make these exceptions go away. > > True, this occupies one of the 16 precious instance klass bits in runtime. I wish we could derive this from our final means final restrictions, but their setup is to permit use-sites to migrate more easily, and is harder for declaration sites to deduce if a declaration is easier to be permitted. We can consider blanket-trust when the JVM uses `--illegal-final-field-mutation=deny` without additional `--enable-final-field-mutation`. This would be the equivalent of running with -XX:+TrustFinalNonStaticFields, which would be nice, but there would be performance surprises as soon as you enable final field mutation for any module (and likely ALL-UNNAMED).
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2571272623 From alanb at openjdk.org Fri Nov 28 11:08:56 2025 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 28 Nov 2025 11:08:56 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting In-Reply-To: References: Message-ID: <27R9PsHG0Jn3Ov39a-G9IXvKoEG01P0mOKMfVVrF4S4=.82593db1-0844-428f-9eff-af1529ff9663@github.com> On Fri, 28 Nov 2025 09:00:46 GMT, Per Minborg wrote: > I wonder if we could enable the new annotation `@TrustFinalFields` on package level as well so we could get rid of _all_ the special handling in `ciField.cpp`. I am not sure this is the best way to do it but it would perhaps be possible to annotate the `package-info.java` file. For example in `java.lang.invoke.package-info.java`: The VM doesn't read/parse the package-info class. It's really only used from APIs to read the annotations. In any case, the long term goal needs to be to remove all special handling. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28540#issuecomment-3588887344 From stefank at openjdk.org Fri Nov 28 12:47:53 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 28 Nov 2025 12:47:53 GMT Subject: RFR: 8372528: Unify atomic exchange and compare exchange [v3] In-Reply-To: <2eO9uSwDm-i18M0ixBHICBJmA2jHZaDRj6kzXwg6IfQ=.c54a4bb8-a8ad-40d1-baf7-c07b22565603@github.com> References: <2eO9uSwDm-i18M0ixBHICBJmA2jHZaDRj6kzXwg6IfQ=.c54a4bb8-a8ad-40d1-baf7-c07b22565603@github.com> Message-ID: On Wed, 26 Nov 2025 10:20:14 GMT, Axel Boldt-Christmas wrote: >> AtomicAccess::xchg is only required to support `4` bytes and `sizeof(intptr_t)` size. >> This restriction added a lot of extra logic to the Atomic implementation because >> we have a set of features we must support (including compare exchange) for `1`, `4` and `8` byte atomics on all platforms.
We have some checks for unsupported `8` byte compare exchange (`VM_Version::supports_cx8()`), but the Atomic class does not try to handle these for generating its supported functions. On such a platform we would more than likely get a linking error. >> >> I propose we change requirement for exchange to `1`, `4` and `8` bytes to achieve parity with compare exchange. Initially by implementing exchange via the `AtomicAccess::XchgUsingCmpxch`. And have follow up RFEs for each applicable platform where we specialize `AtomicAccess::PlatformXchg<1>`. >> >> This enhancement both simplifies the Atomic implementation and provides exchange capabilities for types like `bool` and enums represented by a byte. >> >> _It is a little unclear how we deal with `VM_Version::supports_cx8()`. Its existence makes it impossible to use `compare_exchange` on `int64_t` in general code. Currently the `Atomic` implementation assumes that `exchange` can always be used on `8` byte integers (at least going by the gtest). Even though `AtomicAccess` only specifies `4` bytes and the platform size. This PR changes this to `1`, `4` and `8` bytes. But not sure if the previous behaviour / implicit requirements is an oversight a similar property to `VM_Version::supports_cx8()` should apply here for `exchange`._ >> >> * Testing >> * Extended gtest / (no other users of Atomic byte with exchange exists. >> * GHA >> * Running Tier 1-5 on Oracle supported platforms > > Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: > > - Remove VM_Version::supports_cx8() conditions > - Add AtomicAccessXchgTest for 1 byte Looks good. I think I found one nit that you might want to clean out. test/hotspot/gtest/runtime/test_atomic.cpp line 293: > 291: > 292: TEST_VM(AtomicEnumTest, scoped_enum_64_bit) { > 293: // Check if 64-bit atomics are available on the machine. 
This comment belonged to the supports_cx8 check: // Check if 64-bit atomics are available on the machine. if (!VM_Version::supports_cx8()) return; and when that check has been removed the comment should probably also be removed. Suggestion: There's a similar left over below in the patch. test/hotspot/gtest/runtime/test_atomicAccess.cpp line 144: > 142: > 143: TEST_VM(AtomicAccessCmpxchgTest, int64) { > 144: // Check if 64-bit atomics are available on the machine. Suggestion: ------------- Marked as reviewed by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28498#pullrequestreview-3518855472 PR Review Comment: https://git.openjdk.org/jdk/pull/28498#discussion_r2571557166 PR Review Comment: https://git.openjdk.org/jdk/pull/28498#discussion_r2571557855 From duke at openjdk.org Fri Nov 28 12:57:47 2025 From: duke at openjdk.org (Vishal Chand) Date: Fri, 28 Nov 2025 12:57:47 GMT Subject: RFR: 8372652: Re-enable LocalRandom clinit after monitor pinning improvements In-Reply-To: References: Message-ID: On Fri, 28 Nov 2025 10:04:15 GMT, Alan Bateman wrote: >> This PR re-enables LocalRandom clinit after monitor pinning improvements. >> Enabling this one would start printing random seeds, which is useful for test debugging. > > You'll need to test with JTREG_TEST_THREAD_FACTORY=Virtual to see if there are issues. @AlanBateman I'm assuming running `JTREG_TEST_THREAD_FACTORY=Virtual make test TEST=vmTestbase/` is enough? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28547#issuecomment-3589246903 From azafari at openjdk.org Fri Nov 28 13:06:03 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 28 Nov 2025 13:06:03 GMT Subject: RFR: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer [v7] In-Reply-To: References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> Message-ID: On Thu, 9 Oct 2025 02:52:57 GMT, David Holmes wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed MAX2 template parameter > > Also I think this might need to account for the changes to `Arguments::set_heap_size` being done in https://github.com/openjdk/jdk/pull/27224 Thank you again @dholmes-ora and @jdksjolen for your re-reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26955#issuecomment-3589271181 From azafari at openjdk.org Fri Nov 28 13:06:04 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 28 Nov 2025 13:06:04 GMT Subject: Integrated: 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer In-Reply-To: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> References: <3p8Po-zqSc7uti36zwqJbCeyBA-OqKDV7GfROVzvB9U=.7dfb19fc-946f-4039-90a5-8d63ee421318@github.com> Message-ID: On Wed, 27 Aug 2025 11:24:07 GMT, Afshin Zafari wrote: > The issue happens when the HeapMinBaseAddress option gets 0 as input value. Since this option is used as an address, then using 0 in pointer arithmetics is UB. > The fix is using `uintptr_t` instead of `address`/`char*`, etc. In doing that, it is found that an overflow check does not work in all cases due to checking more conditions. That overflow check is changed too.
We also need to check overflow after aligning addresses and sizes of memory regions in this context. Assertions are added to check these cases. > > Tests: > linux-x64 tier1 This pull request has now been integrated. Changeset: e071afbf Author: Afshin Zafari URL: https://git.openjdk.org/jdk/commit/e071afbfe4507b6b3a306f90bb645465fdab0070 Stats: 32 lines in 3 files changed: 4 ins; 1 del; 27 mod 8351334: [ubsan] memoryReserver.cpp:552:60: runtime error: applying non-zero offset 1073741824 to null pointer Reviewed-by: aboldtch, dholmes, jsjolen ------------- PR: https://git.openjdk.org/jdk/pull/26955 From eosterlund at openjdk.org Fri Nov 28 13:12:54 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 28 Nov 2025 13:12:54 GMT Subject: RFR: 8372039: post_sampled_object_alloc is called while lock is handled [v2] In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 20:53:32 GMT, Leonid Mesnik wrote: >> The AOT allocates objects while holding a lock. The jvmti events can't be posted in such case. The allocation sampling might be just temporarily disabled while AOT objects are allocated. >> >> I prefer to disable jvmti events for allocation only, not for AOT globally. If more events should be generated during AOT initialization, we might want to preserve them and post after initialization is completed. >> >> The existing failure could be reproduced by running tests with jvmti stress agent and ZGC enabled. Like >> make run-test JTREG_JVMTI_STRESS_AGENT=debugger=true TEST=gc/z/TestGarbageCollectorMXBean.java >> >> Note: >> I replaced NoJvmtiVMObjectAllocMark, it was not used. Also it was incorrect. The >> NoJvmtiEventsMark should be set even if jvmti events are not enabled for this thread. Since jvmti events might be enabled just in the middle of the mark.
> > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > Added regression test This is good, but we also use the _is_disable_suspend flag to disable JVMTI events for similar reasons, but only from the AOT thread. Now that we have a better more explicit boolean for disabling JVMTI events, perhaps we should stop toggling the _is_disable_suspend flag on the AOT thread, in favour of using this new more explicit mechanism. Having said that, if we do, then the disabling mechanism needs to be able to cope with nesting. If you agree, then I don't mind if we want to do that refactoring separately from this fix though. The fix looks good otherwise. ------------- Changes requested by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28544#pullrequestreview-3518962513 From jpai at openjdk.org Fri Nov 28 13:32:52 2025 From: jpai at openjdk.org (Jaikiran Pai) Date: Fri, 28 Nov 2025 13:32:52 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting In-Reply-To: References: <1tazzYHm78XLDovV11RAQt2W-ujENi4b_frOa87Jv14=.45b6d8a1-cb76-49ac-8048-429916bc9c6c@github.com> Message-ID: On Fri, 28 Nov 2025 10:52:42 GMT, Alan Bateman wrote: >> I took this sentence from `@AOTSafeClassInitializer`. The term "privileged" comes from this variable in `classFileParser.cpp`: >> https://github.com/openjdk/jdk/blob/d94c52ccf2fed3fc66d25a34254c9b581c175fa1/src/hotspot/share/classfile/classFileParser.cpp#L1818-L1820 >> >> The other annotations have this note, which seems incorrect from the hotspot excerpt: >> >> @implNote >> This annotation only takes effect for fields of classes loaded by the boot >> loader. Annotations on fields of classes loaded outside of the boot loader >> are ignored. >> >> >> This behavior seems to be originally changed by 6964a690ed9a23d4c0692da2dfbced46e1436355, referring to an inaccessible issue. >> >> What should I do with this? 
Should I leave this as-is and create a separate patch to update this comment for vm.annotation annotations, or fix this first and have the separate patch fix other annotations later? > > For this PR, you could just change the last sentence to say that the annotation is only effective for classes defined by the boot class loader or platform class loader. A follow-up PR could propose changes to the other annotation descriptions. > > As regards background, one of the significant changes in JDK 9 was that java.* modules could be mapped to the platform class loader without giving them "all permission" in the security manager execution mode. If you see JBS issues or comments speaking of "de-privileging" then it's likely related to changes that "moved" modules that were originally mapped to the boot class loader to the platform class loader. Now that the security manager execution mode is gone, we no longer have to deal with these issues. Hello Chen, should this annotation also mention what happens if a class annotated with `@TrustFinalFields` has any of its `final` fields updated? For example, `@Stable` has this to say about such unexpected updates: ...It is in general a bad idea to reset such * variables to any other value, since compiled code might have folded * an earlier stored value, and will never detect the reset value. Are there any unexpected consequences of marking a class as `@TrustFinalFields` and having a `@Stable` on any of the final fields (for example an array): @TrustFinalFields class JDKFooBar { private final String reallyFinal; @Stable private final int reallyFinalButAlsoStable; @Stable private final long[] finalAndStableArray; } Finally, would it still be recommended that a class annotated with `@TrustFinalFields` also have a final array field annotated with `@Stable` if that array field elements are initialized to a non-default value only once?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2571701254 From jpai at openjdk.org Fri Nov 28 13:38:47 2025 From: jpai at openjdk.org (Jaikiran Pai) Date: Fri, 28 Nov 2025 13:38:47 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting In-Reply-To: References: <1tazzYHm78XLDovV11RAQt2W-ujENi4b_frOa87Jv14=.45b6d8a1-cb76-49ac-8048-429916bc9c6c@github.com> Message-ID: On Fri, 28 Nov 2025 13:30:33 GMT, Jaikiran Pai wrote: >> For this PR, you could just change the last sentence to say that the annotation is only effective for classes defined by the boot class loader or platform class loader. A follow-up PR could propose changes to the other annotation descriptions. >> >> As regards background, one of the significant changes in JDK 9 was that java.* modules could be mapped to the platform class loader without giving them "all permission" in the security manager execution mode. If you see JBS issues or comments speaking of "de-privileging" then it's likely related to changes that "moved" modules that were originally mapped to the boot class loader to the platform class loader. Now that the security manager execution mode is gone, we no longer have to deal with these issues. > > Hello Chen, should this annotation also mention what happens if a class annotated with `@TrustFinalFields` has any of its `final` fields updated? For example, `@Stable` has this to say about such unexpected updates: > > > ...It is in general a bad idea to reset such > * variables to any other value, since compiled code might have folded > * an earlier stored value, and will never detect the reset value.
> > > Are there any unexpected consequences of marking a class as `@TrustFinalFields` and having a `@Stable` on any of the final fields (for example an array): > > > @TrustFinalFields > class JDKFooBar { > private final String reallyFinal; > > @Stable > private final int reallyFinalButAlsoStable; > > @Stable > private final long[] finalAndStableArray; > > } > > Finally, would it still be recommended that a class annotated with `@TrustFinalFields` also have a final array field annotated with `@Stable` if that array field elements are initialized to a non-default value only once? One other question - if a class/interface is annotated with `@TrustFinalFields`, is that annotation only applicable to that specific class or would it also be applicable for any (final fields in) subclasses of that class or implementations of that interface (does the VM ignore this annotation on an interface, should it)? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2571717831 From jpai at openjdk.org Fri Nov 28 13:47:48 2025 From: jpai at openjdk.org (Jaikiran Pai) Date: Fri, 28 Nov 2025 13:47:48 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 16:16:05 GMT, Chen Liang wrote: > Currently, the hotspot compiler (as in ciField) trusts final fields in hidden classes, record classes, and selected jdk packages. Some classes in the JDK wish to be trusted, but they cannot apply package-wide opt-in due to other legacy classes in the package, such as java.util. > > They currently can use `@Stable` as a workaround, but this is fragile because a stable final field may hold a trusted null, zero, or false value, which is currently treated as non-constant by ciField. > > We should add an annotation to opt-in for a whole class, mainly for legacy packages.
This would benefit greatly some of our classes already using a lot of Stable, such as java.util.Optional, whose empty instance is now constant-foldable, as demonstrated in a new IR test. > > Paging @minborg who requested Optional folding for review. > > I think we can remove redundant Stable in a few other java.util classes after this patch is integrated. I plan to do that in subsequent patches. src/java.base/share/classes/jdk/internal/vm/annotation/TrustFinalFields.java line 49: > 47: /// As a result, this should be used on classes where package-wide trusting is > 48: /// not possible due to backward compatibility concerns, such as for `java.util` > 49: /// classes. Should this sentence be reworded? It's not clear what the backward compatible concerns (for `java.util` package) are. I think it might be better to leave out any backward compatibility part when explaining which classes to use this annotation on. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2571741641 From alanb at openjdk.org Fri Nov 28 14:12:48 2025 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 28 Nov 2025 14:12:48 GMT Subject: RFR: 8372652: Re-enable LocalRandom clinit after monitor pinning improvements In-Reply-To: References: Message-ID: On Fri, 28 Nov 2025 10:04:15 GMT, Alan Bateman wrote: >> This PR re-enables LocalRandom clinit after monitor pinning improvements. >> Enabling this one would start printing random seeds, which is useful for test debugging. > > You'll need to test with JTREG_TEST_THREAD_FACTORY=Virtual to see if there are issues. > @AlanBateman I'm assuming running `JTREG_TEST_THREAD_FACTORY=Virtual make test TEST=vmTestbase/` is enough? I'll defer to @lmesnik for guidance on this. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28547#issuecomment-3589481936 From iwalulya at openjdk.org Fri Nov 28 14:37:56 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Fri, 28 Nov 2025 14:37:56 GMT Subject: RFR: 8372650: Convert GenericWaitBarrier to use Atomic [v2] In-Reply-To: References: Message-ID: On Fri, 28 Nov 2025 08:55:45 GMT, Kim Barrett wrote: >> Please review this change to GenericWaitBarrier to use Atomic rather than >> directly applying AtomicAccess to volatile members. >> >> Testing: mach5 tier1-5 > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > undo unintended load_acquire Looks good! ------------- Marked as reviewed by iwalulya (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28527#pullrequestreview-3519216337 From roland at openjdk.org Fri Nov 28 14:54:52 2025 From: roland at openjdk.org (Roland Westrelin) Date: Fri, 28 Nov 2025 14:54:52 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v13] In-Reply-To: <2wAnS7drj_r3dqsy5CEF9vBG40KizHsQDOxMeNymwhw=.9bc29879-eead-401c-b750-814592feff63@github.com> References: <2oDqUvcW_3hJRPRri4uttpkgfeCovL4ZZkcI0R1bB1A=.173b3a58-d0f1-4b29-94d1-77b0a350c790@github.com> <2wAnS7drj_r3dqsy5CEF9vBG40KizHsQDOxMeNymwhw=.9bc29879-eead-401c-b750-814592feff63@github.com> Message-ID: <-1wiWF_UEvCO6xPuYvIsElBzPPQDejGahm9Xd5YszPU=.cfb41cb1-f681-4e75-8c29-2d928468f53b@github.com> On Fri, 28 Nov 2025 08:54:49 GMT, Zihao Lin wrote: >> Here is a assert failed >> >> command: main -XX:CompileCommand=dontinline,compiler.arraycopy.TestArrayCopyMemoryChain::test* -Xbatch compiler.arraycopy.TestArrayCopyMemoryChain >> reason: User specified action: run main/othervm -XX:CompileCommand=dontinline,compiler.arraycopy.TestArrayCopyMemoryChain::test* -Xbatch compiler.arraycopy.TestArrayCopyMemoryChain >> started: Fri Nov 28 16:36:37.189 CST 2025 >> Mode: othervm [/othervm specified] >> Process id: 16782 >> finished: Fri 
Nov 28 16:36:37.350 CST 2025 >> elapsed time (seconds): 0.161 >> configuration: >> STDOUT: >> CompileCommand: dontinline compiler/arraycopy/TestArrayCopyMemoryChain.test* bool dontinline = true >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (/Users/linzihao/Desktop/jdk-dev/src/hotspot/share/opto/escape.cpp:4184), pid=16782, tid=26115 >> # assert(result != nullptr) failed: new projection should have been allocated >> # >> # JRE version: OpenJDK Runtime Environment (26.0) (fastdebug build 26-internal-adhoc.linzihao.jdk-dev) >> # Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-adhoc.linzihao.jdk-dev, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, bsd-aarch64) >> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again >> # >> # An error report file with more information is saved as: >> # /Users/linzihao/Desktop/jdk-dev/build/macosx-aarch64-server-fastdebug/test-support/jtreg_test_hotspot_jtreg_compiler_arraycopy_TestArrayCopyMemoryChain_java/scratch/0/hs_err_pid16782.log >> # >> # Compiler replay data is saved as: >> # /Users/linzihao/Desktop/jdk-dev/build/macosx-aarch64-server-fastdebug/test-support/jtreg_test_hotspot_jtreg_compiler_arraycopy_TestArrayCopyMemoryChain_java/scratch/0/replay_pid16782.log >> # >> # If you would like to submit a bug report, please visit: >> # https://bugreport.java.com/bugreport/crash.jsp >> # > > The assert failed because `find_inst_mem()` skipped an Initialize memory projection whose `adr_type` was still the general slice, then tried to fetch the instance-specific projection from `_node_map` and got nullptr. That happens when a precise `NarrowMemProj` already exists: the code doesn't create a new one and also never records the mapping, so later lookup fails. > > The fix records the mapping even if the precise `NarrowMemProj` is already present (not newly created).
I had a closer look and I think you ran into an inconsistency. Let me see if I can get it fixed as a separate change. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2571905920 From liach at openjdk.org Fri Nov 28 15:08:56 2025 From: liach at openjdk.org (Chen Liang) Date: Fri, 28 Nov 2025 15:08:56 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting In-Reply-To: References: Message-ID: On Fri, 28 Nov 2025 13:45:19 GMT, Jaikiran Pai wrote: >> Currently, the hotspot compiler (as in ciField) trusts final fields in hidden classes, record classes, and selected jdk packages. Some classes in the JDK wish to be trusted, but they cannot apply package-wide opt-in due to other legacy classes in the package, such as java.util. >> >> They currently can use `@Stable` as a workaround, but this is fragile because a stable final field may hold a trusted null, zero, or false value, which is currently treated as non-constant by ciField. >> >> We should add an annotation to opt-in for a whole class, mainly for legacy packages. This would benefit greatly some of our classes already using a lot of Stable, such as java.util.Optional, whose empty instance is now constant-foldable, as demonstrated in a new IR test. >> >> Paging @minborg who requested Optional folding for review. >> >> I think we can remove redundant Stable in a few other java.util classes after this patch is integrated. I plan to do that in subsequent patches. > > src/java.base/share/classes/jdk/internal/vm/annotation/TrustFinalFields.java line 49: > >> 47: /// As a result, this should be used on classes where package-wide trusting is >> 48: /// not possible due to backward compatibility concerns, such as for `java.util` >> 49: /// classes. > > Should this sentence be reworded? It's not clear what the backward compatible concerns (for `java.util` package) are. 
I think it might be better to leave out any backward compatibility part when explaining which classes to use this annotation on. Existing users have been hacking java.util final fields. I think leaving out the backward compatibility part causes more trouble, because otherwise people can just blanket-approve java.util classes for trusting and break those applications. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2571941033 From liach at openjdk.org Fri Nov 28 15:08:57 2025 From: liach at openjdk.org (Chen Liang) Date: Fri, 28 Nov 2025 15:08:57 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting In-Reply-To: References: <1tazzYHm78XLDovV11RAQt2W-ujENi4b_frOa87Jv14=.45b6d8a1-cb76-49ac-8048-429916bc9c6c@github.com> Message-ID: On Fri, 28 Nov 2025 13:35:51 GMT, Jaikiran Pai wrote: >> Hello Chen, should this annotation also mention what happens if a class annotated with `@TrustFinalFields` has any of its `final` fields updated? For example, `@Stable` has this to say about such unexpected updates: >> >> >> ...It is in general a bad idea to reset such >> * variables to any other value, since compiled code might have folded >> * an earlier stored value, and will never detect the reset value. >> >> >> Are there any unexpected consequences of marking a class as `@TrustFinalFields` and having a `@Stable` on any of the final fields (for example an array): >> >> >> @TrustedFinalFields >> class JDKFooBar { >> private final String reallyFinal; >> >> @Stable >> private final int reallyFinalButAlsoStable; >> >> @Stable >> private final long[] finalAndStableArray; >> >> } >> >> Finally, would it still be recommended that a class annotated with `@TrustFinalFields` also have a final array field annoted with `@Stable` if that array field elements are initialized to a non-default value only once? 
> > One another question - if a class/interface is annotated with `@TargetFinalFields`, is that annotation only applicable to that specific class or would it also be applicable for any (final fields in) subclasses of that class or implementations of that interface (does the VM ignore this annotation on an interface, should it)? I don't think we should mention anything about updating final fields. If you use this field, you intend the fields not to get subsequently updated. Promising the behavior in this case only introduces more trouble and is meaningless for this annotation's readers. For inheritance, we can add a word or two. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2571939030 From cnorrbin at openjdk.org Fri Nov 28 15:11:00 2025 From: cnorrbin at openjdk.org (Casper Norrbin) Date: Fri, 28 Nov 2025 15:11:00 GMT Subject: RFR: 8372615: Many container tests fail when running rootless on cgroup v1 Message-ID: Hi everyone, Many container tests verify that various resource limits work as expected. However, when running containers in rootless mode on both Docker and Podman with cgroup v1, resource limits are not supported. This causes tests to fail with error messages like: `Resource limits are not supported and ignored on cgroups V1 rootless systems`. To address this, we should skip these tests when running on configurations that don't support resource limits, similar to how we already handle other unsupported configurations (e.g., missing container engine or incompatibility with a specific cgroup version or container runtime). To check for this, we now need to use `Metrics.systemMetrics().getProvider()` from `jdk.internal.platform.Metrics` to detect cgroup v1. I've added this functionality to `DockerTestUtils`, which is already used by all container tests. As a result, all container tests now need to include the `java.base/jdk.internal.platform` module, even if they don't directly test resource limits. 
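The skip condition described above can be sketched roughly as follows. This is only an illustration, not the code added to `DockerTestUtils`: the class name `ResourceLimitCheck`, the method `resourceLimitsSupported`, and the assumption that the provider string for cgroup v1 is `"cgroupv1"` are all hypothetical. In an actual test the provider string would come from `Metrics.systemMetrics().getProvider()`, which is why the tests need the `java.base/jdk.internal.platform` module; here it is passed in as a plain parameter so the sketch compiles without internal APIs.

```java
/**
 * Hypothetical helper illustrating the skip decision; not the actual
 * DockerTestUtils implementation.
 */
public class ResourceLimitCheck {

    /** Assumed provider name reported on cgroup v1 hosts. */
    static final String CGROUP_V1 = "cgroupv1";

    /**
     * Returns false only for the combination described in the message:
     * a rootless container engine on a cgroup v1 host.
     *
     * @param cgroupProvider provider name (may be null if no metrics exist)
     * @param rootless       whether the container engine runs rootless
     */
    static boolean resourceLimitsSupported(String cgroupProvider, boolean rootless) {
        boolean isV1 = CGROUP_V1.equals(cgroupProvider); // null-safe comparison
        return !(isV1 && rootless);
    }

    public static void main(String[] args) {
        String[] providers = {"cgroupv1", "cgroupv2"};
        boolean[] modes = {true, false};
        for (String provider : providers) {
            for (boolean rootless : modes) {
                // A test would skip itself when this reports "false".
                System.out.println(provider + " rootless=" + rootless
                        + " limitsSupported="
                        + resourceLimitsSupported(provider, rootless));
            }
        }
    }
}
```

Keeping the decision in one shared helper means each container test only needs a single guard before exercising memory or CPU limits.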
Testing: * Oracle tiers 1-5 * Local testing: - `hotspot/jtreg/containers/` - `jdk/jdk/internal/platform/docker/` on cgroup v1/v2 with Podman and Docker in both rootful and rootless configurations ------------- Commit messages: - resource limit availability checks on container tests Changes: https://git.openjdk.org/jdk/pull/28557/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28557&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372615 Stats: 63 lines in 21 files changed: 32 ins; 5 del; 26 mod Patch: https://git.openjdk.org/jdk/pull/28557.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28557/head:pull/28557 PR: https://git.openjdk.org/jdk/pull/28557 From aph at openjdk.org Fri Nov 28 15:25:01 2025 From: aph at openjdk.org (Andrew Haley) Date: Fri, 28 Nov 2025 15:25:01 GMT Subject: RFR: 8357258: x86: Improve receiver type profiling reliability [v5] In-Reply-To: References: Message-ID: On Wed, 26 Nov 2025 15:55:38 GMT, Aleksey Shipilev wrote: >> See the bug for discussion what issues current machinery has. >> >> This PR executes the plan outlined in the bug: >> 1. Common the receiver type profiling code in interpreter and C1 >> 2. Rewrite receiver type profiling code to only do atomic receiver slot installations >> 3. Trim `C1OptimizeVirtualCallProfiling` to only claim slots when receiver is installed >> >> This PR does _not_ do atomic counter updates themselves, as it may have much wider performance implications, including regressions. This PR should be at least performance neutral. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler/` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 15 commits:
>
>  - Merge branch 'master' into JDK-8357258-x86-c1-optimize-virt-calls
>  - Tighten up some more
>  - Offset is always rscratch1, no need to save it
>  - Grossly simplify register shuffling
>  - More asserts
>  - More comment touchups
>  - Inline code comments
>  - Mention the updater in ReceiverTypeData
>  - type_profile -> profile_receiver_type
>  - Stylistic: remove redundant assert
>  - ... and 5 more: https://git.openjdk.org/jdk/compare/c028369d...c441209a

I'm seeing minor performance regressions in `InterfaceCalls.test2ndInt5Types`, before and after this PR:

Mainline:

Benchmark                                        (randomized)  Mode  Cnt    Score   Error      Units
InterfaceCalls.test2ndInt5Types                         false  avgt    4   28.185 ± 0.538      ns/op
InterfaceCalls.test2ndInt5Types:IPC                     false  avgt         2.232          insns/clk
InterfaceCalls.test2ndInt5Types:branch-misses:u         false  avgt         0.342               #/op
InterfaceCalls.test2ndInt5Types:instructions:u          false  avgt       206.028               #/op

This PR:

Benchmark                                        (randomized)  Mode  Cnt    Score   Error      Units
InterfaceCalls.test2ndInt5Types                         false  avgt    4   32.247 ± 0.109      ns/op
InterfaceCalls.test2ndInt5Types:IPC                     false  avgt         2.231          insns/clk
InterfaceCalls.test2ndInt5Types:branch-misses:u         false  avgt         0.561               #/op
InterfaceCalls.test2ndInt5Types:instructions:u          false  avgt       238.324               #/op

model name : Intel(R) Xeon(R) Platinum 8275CL CPU @ 3.00GHz

java -XX:+UnlockExperimentalVMOptions -XX:ProfileCaptureRatio=1 -jar /home/aph/theRealAph-jdk/build/linux-x86_64-server-release/images/test/micro/benchmarks.jar test2ndInt5Types -p randomized=false -f 1 -jvmArgs ' -XX:TieredStopAtLevel=3' -t 1 -prof perfnorm

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25305#issuecomment-3589699041

From aph at openjdk.org Fri Nov 28 15:31:52 2025
From: aph at openjdk.org (Andrew Haley)
Date: Fri, 28 Nov 2025 15:31:52 GMT
Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub()
In-Reply-To:
References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com>
Message-ID:

On Fri, 28 Nov 2025 10:13:56 GMT, Fei Gao wrote:

> I've updated the patch with a new commit. In the meantime, we also found that patching `isb` to `nop` can work the same as patching `isb` to `b .+4`.

Please don't. The slow path via an indirect jump to the static call stub is not important. It makes no sense to use the Arm memory model in such a subtle way merely to avoid a perfectly-predicted rare branch. The time taken to understand such trickery by reviewers and future maintainers is not justified by the minuscule performance gain.

"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."
- Brian Kernighan ------------- PR Comment: https://git.openjdk.org/jdk/pull/26638#issuecomment-3589721655 From aboldtch at openjdk.org Fri Nov 28 15:33:19 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 28 Nov 2025 15:33:19 GMT Subject: RFR: 8372528: Unify atomic exchange and compare exchange [v4] In-Reply-To: References: Message-ID: > AtomicAccess::xchg is only required to support `4` bytes and `sizeof(intptr_t)` size. > This restriction added a lot of extra logic to the Atomic implementation because > we have a set of features we must support (including compare exchange) for `1`, `4` and `8` byte atomics on all platforms. We have some checks for unsupported `8` byte compare exchange (`VM_Version::supports_cx8()`), but the Atomic class does not try to handle these for generating its supported functions. On such a platform we would more than likely get a linking error. > > I propose we change requirement for exchange to `1`, `4` and `8` bytes to achieve parity with compare exchange. Initially by implementing exchange via the `AtomicAccess::XchgUsingCmpxch`. And have follow up RFEs for each applicable platform where we specialize `AtomicAccess::PlatformXchg<1>`. > > This enhancement both simplifies the Atomic implementation and provides exchange capabilities for types like `bool` and enums represented by a byte. > > _It is a little unclear how we deal with `VM_Version::supports_cx8()`. Its existence makes it impossible to use `compare_exchange` on `int64_t` in general code. Currently the `Atomic` implementation assumes that `exchange` can always be used on `8` byte integers (at least going by the gtest). Even though `AtomicAccess` only specifies `4` bytes and the platform size. This PR changes this to `1`, `4` and `8` bytes. 
But not sure if the previous behaviour / implicit requirements is an oversight a similar property to `VM_Version::supports_cx8()` should apply here for `exchange`._ > > * Testing > * Extended gtest / (no other users of Atomic byte with exchange exists. > * GHA > * Running Tier 1-5 on Oracle supported platforms Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: - Update test/hotspot/gtest/runtime/test_atomicAccess.cpp Co-authored-by: Stefan Karlsson - Update test/hotspot/gtest/runtime/test_atomic.cpp Co-authored-by: Stefan Karlsson ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28498/files - new: https://git.openjdk.org/jdk/pull/28498/files/51a1c84d..9a0e77b4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28498&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28498&range=02-03 Stats: 2 lines in 2 files changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28498.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28498/head:pull/28498 PR: https://git.openjdk.org/jdk/pull/28498 From vpaprotski at openjdk.org Fri Nov 28 16:30:59 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Fri, 28 Nov 2025 16:30:59 GMT Subject: RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements [v3] In-Reply-To: References: Message-ID: On Fri, 28 Nov 2025 08:23:27 GMT, Tobias Hartmann wrote: >> Oh.. realized that I should had checked JBS.. thanks @ascarpino for resolving the bug I caused! At least its just the option.. whew. >> >>> @dholmes-ora Hi David, need some help with this please, don't have access to an ARM system to reproduce (or the ARM expertise).. could you point me at the failing job if thats available? Or some log if not? >>> >>> * Is it an issue with the options (i.e. `-XX:UseAVX=2` perhaps). I probably should had added `-XX:+IgnoreUnrecognizedVMOptions` to it.. >>> * Otherwise, I am stumped.. 
the test case isn't architecture-specific.. it calls two methods (one of which is annotated as an intrinsic..) and expects them to return the same value.. i.e. Java and Intrinsic version should behave the same..
>>> * Only thing I can think of.. The ARM implementation took some shortcuts in the name of optimization. This can be entirely valid if the code calling the intrinsics never should get some specific value (-ranges). i.e. the tests RNG be further restricted..
>>> * Otherwise.. is it possible it's a bug in the ARM intrinsic?
>
> This caused a regression: [JDK-8372703](https://bugs.openjdk.org/browse/JDK-8372703). @vpaprotsk Could you please have a look? Thanks.

@TobiHartmann looking!
- Haven't been able to reproduce yet (and folks with machine access I need are away today, US holiday)
- From the first glance, the error is about code size (and this intrinsic is indeed large..). But that shouldn't be platform-dependent, iirc.. except I see `enum platform_dependent_constants` is no longer just a simple static sum of ints.. hmm.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28136#issuecomment-3589866931

From fgao at openjdk.org Fri Nov 28 18:01:46 2025
From: fgao at openjdk.org (Fei Gao)
Date: Fri, 28 Nov 2025 18:01:46 GMT
Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub()
In-Reply-To:
References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com>
Message-ID:

On Fri, 28 Nov 2025 15:29:10 GMT, Andrew Haley wrote:

> > I've updated the patch with a new commit. In the meantime, we also found that patching `isb` to `nop` can work the same as patching `isb` to `b .+4`.
>
> Please don't. The slow path via an indirect jump to the static call stub is not important. It makes no sense to use the Arm memory model in such a subtle way merely to avoid a perfectly-predicted rare branch.
The time taken to understand such trickery by reviewers and future maintainers is not justified by the minuscule performance gain.
>
> "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." - Brian Kernighan

@theRealAph Thanks for your valuable input!

I'm trying to clarify how I understand your idea of patching `isb` to `b .+4`.

I agree that the correctness of removing the `isb` in the stub is quite subtle, and the change to `set_destination_mt_safe` is rather obscure. Because of this, I initially assumed that the motivation for patching `isb` to `b .+4` was to avoid introducing a tricky change to `set_destination_mt_safe`, thereby keeping the PR self-contained.

Based on that understanding, I thought: if the main reason for patching `isb` to `b .+4` is to avoid additional complexity, and patching `isb` to `nop` can provide the same effect, then why not patch it to `nop` instead? That's what led me to propose the new commit.

However, now I'm a bit confused, and I think I may not have fully understood your original intent behind patching `isb` to `b .+4`. In your original idea, were you suggesting a change like this:

diff --git a/src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp b/src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp
index 6fe3315014b..9bf29ff2b55 100644
--- a/src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp
@@ -100,6 +100,11 @@ void CompiledDirectCall::set_to_interpreted(const methodHandle& callee, address
   method_holder->set_data((intptr_t)callee());
   MacroAssembler::pd_patch_instruction(method_holder->next_instruction_address(), entry);
   ICache::invalidate_range(stub, to_interp_stub_size());
+  // Patch 'isb' to 'b .+4'.
+  CodeBuffer stub_first_instruction(stub, Assembler::instruction_size);
+  Assembler assembler(&stub_first_instruction);
+  assembler.b(assembler.pc() + 4);
+
   // Update jump to call.
   set_destination_mt_safe(stub);
 }
@@ -109,6 +114,10 @@ void CompiledDirectCall::set_stub_to_clean(static_stub_Relocation* static_stub)
   address stub = static_stub->addr();
   assert(stub != nullptr, "stub not found");
   assert(CompiledICLocker::is_safe(stub), "mt unsafe call");
+  // Patch 'b .+4' to 'isb'.
+  CodeBuffer stub_first_instruction(stub, Assembler::instruction_size);
+  Assembler assembler(&stub_first_instruction);
+  assembler.isb();
   // Creation also verifies the object.
   NativeMovConstReg* method_holder = nativeMovConstReg_at(stub + NativeInstruction::instruction_size);

Thank you for your time!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26638#issuecomment-3590049824

From kbarrett at openjdk.org Fri Nov 28 19:47:48 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Fri, 28 Nov 2025 19:47:48 GMT
Subject: RFR: 8372528: Unify atomic exchange and compare exchange [v4]
In-Reply-To:
References:
Message-ID:

On Fri, 28 Nov 2025 15:33:19 GMT, Axel Boldt-Christmas wrote:

>> AtomicAccess::xchg is only required to support `4` bytes and `sizeof(intptr_t)` size.
>> This restriction added a lot of extra logic to the Atomic implementation because
>> we have a set of features we must support (including compare exchange) for `1`, `4` and `8` byte atomics on all platforms. We have some checks for unsupported `8` byte compare exchange (`VM_Version::supports_cx8()`), but the Atomic class does not try to handle these for generating its supported functions. On such a platform we would more than likely get a linking error.
>>
>> I propose we change requirement for exchange to `1`, `4` and `8` bytes to achieve parity with compare exchange. Initially by implementing exchange via the `AtomicAccess::XchgUsingCmpxch`.
And have follow up RFEs for each applicable platform where we specialize `AtomicAccess::PlatformXchg<1>`. >> >> This enhancement both simplifies the Atomic implementation and provides exchange capabilities for types like `bool` and enums represented by a byte. >> >> _It is a little unclear how we deal with `VM_Version::supports_cx8()`. Its existence makes it impossible to use `compare_exchange` on `int64_t` in general code. Currently the `Atomic` implementation assumes that `exchange` can always be used on `8` byte integers (at least going by the gtest). Even though `AtomicAccess` only specifies `4` bytes and the platform size. This PR changes this to `1`, `4` and `8` bytes. But not sure if the previous behaviour / implicit requirements is an oversight a similar property to `VM_Version::supports_cx8()` should apply here for `exchange`._ >> >> * Testing >> * Extended gtest / (no other users of Atomic byte with exchange exists. >> * GHA >> * Running Tier 1-5 on Oracle supported platforms > > Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: > > - Update test/hotspot/gtest/runtime/test_atomicAccess.cpp > > Co-authored-by: Stefan Karlsson > - Update test/hotspot/gtest/runtime/test_atomic.cpp > > Co-authored-by: Stefan Karlsson Still good. Thanks for removing the stale comments. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28498#pullrequestreview-3519801820 From lmesnik at openjdk.org Fri Nov 28 20:30:47 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Fri, 28 Nov 2025 20:30:47 GMT Subject: RFR: 8372652: Re-enable LocalRandom clinit after monitor pinning improvements In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 23:33:56 GMT, Vishal Chand wrote: > This PR re-enables LocalRandom clinit after monitor pinning improvements. > Enabling this one would start printing random seeds, which is useful for test debugging. 
Unfortunately, there might be failures when executing `vmTestbase/` with `JTREG_TEST_THREAD_FACTORY=Virtual`. So it is enough to ensure that your fix doesn't create new failures and timeouts.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28547#issuecomment-3590358830

From lmesnik at openjdk.org Fri Nov 28 20:50:32 2025
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Fri, 28 Nov 2025 20:50:32 GMT
Subject: RFR: 8372039: post_sampled_object_alloc is called while lock is handled [v3]
In-Reply-To:
References:
Message-ID:

> The AOT allocates objects while holding a lock. The jvmti events can't be posted in such a case. The allocation sampling might be just temporarily disabled while AOT objects are allocated.
>
> I prefer to disable jvmti events for allocation only, not for AOT globally. If more events should be generated during AOT initialization, we might want to preserve them and post them after initialization is completed.
>
> The existing failure could be reproduced by running tests with the jvmti stress agent and ZGC enabled. Like
> make run-test JTREG_JVMTI_STRESS_AGENT=debugger=true TEST=gc/z/TestGarbageCollectorMXBean.java
>
> Note:
> I replaced NoJvmtiVMObjectAllocMark, it was not used. Also it was incorrect. The
> NoJvmtiEventsMark should be set even if jvmti events are not enabled for this thread. Since jvmti events might be enabled just in the middle of the mark.
Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:

  update jvmti disabling in the AOT thread

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/28544/files
  - new: https://git.openjdk.org/jdk/pull/28544/files/67bcdf11..1c070dc7

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28544&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28544&range=01-02

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/28544.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28544/head:pull/28544

PR: https://git.openjdk.org/jdk/pull/28544

From lmesnik at openjdk.org Fri Nov 28 21:09:24 2025
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Fri, 28 Nov 2025 21:09:24 GMT
Subject: RFR: 8372039: post_sampled_object_alloc is called while lock is handled [v4]
In-Reply-To:
References:
Message-ID:

> The AOT allocates objects while holding a lock. The jvmti events can't be posted in such a case. The allocation sampling might be just temporarily disabled while AOT objects are allocated.
>
> I prefer to disable jvmti events for allocation only, not for AOT globally. If more events should be generated during AOT initialization, we might want to preserve them and post them after initialization is completed.
>
> The existing failure could be reproduced by running tests with the jvmti stress agent and ZGC enabled. Like
> make run-test JTREG_JVMTI_STRESS_AGENT=debugger=true TEST=gc/z/TestGarbageCollectorMXBean.java
>
> Note:
> I replaced NoJvmtiVMObjectAllocMark, it was not used. Also it was incorrect. The
> NoJvmtiEventsMark should be set even if jvmti events are not enabled for this thread. Since jvmti events might be enabled just in the middle of the mark.
Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:

  made jvmti_events_disalber as counter

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/28544/files
  - new: https://git.openjdk.org/jdk/pull/28544/files/1c070dc7..f7e869b2

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28544&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28544&range=02-03

  Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/28544.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28544/head:pull/28544

PR: https://git.openjdk.org/jdk/pull/28544

From eosterlund at openjdk.org Fri Nov 28 22:50:48 2025
From: eosterlund at openjdk.org (Erik Österlund)
Date: Fri, 28 Nov 2025 22:50:48 GMT
Subject: RFR: 8372039: post_sampled_object_alloc is called while lock is handled [v4]
In-Reply-To:
References:
Message-ID:

On Fri, 28 Nov 2025 21:09:24 GMT, Leonid Mesnik wrote:

>> The AOT allocates objects while holding a lock. The jvmti events can't be posted in such a case. The allocation sampling might be just temporarily disabled while AOT objects are allocated.
>>
>> I prefer to disable jvmti events for allocation only, not for AOT globally. If more events should be generated during AOT initialization, we might want to preserve them and post them after initialization is completed.
>>
>> The existing failure could be reproduced by running tests with the jvmti stress agent and ZGC enabled. Like
>> make run-test JTREG_JVMTI_STRESS_AGENT=debugger=true TEST=gc/z/TestGarbageCollectorMXBean.java
>>
>> Note:
>> I replaced NoJvmtiVMObjectAllocMark, it was not used. Also it was incorrect. The
>> NoJvmtiEventsMark should be set even if jvmti events are not enabled for this thread. Since jvmti events might be enabled just in the middle of the mark.
> > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > made jvmti_events_disalber as counter Looks good. ------------- Marked as reviewed by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28544#pullrequestreview-3520191286 From kbarrett at openjdk.org Fri Nov 28 22:53:58 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 28 Nov 2025 22:53:58 GMT Subject: RFR: 8372650: Convert GenericWaitBarrier to use Atomic [v2] In-Reply-To: References: Message-ID: On Fri, 28 Nov 2025 09:22:55 GMT, Aleksey Shipilev wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> undo unintended load_acquire > > Looks fine now, thanks. Thanks for reviews @shipilev and @walulyai ------------- PR Comment: https://git.openjdk.org/jdk/pull/28527#issuecomment-3590655767 From kbarrett at openjdk.org Fri Nov 28 22:53:59 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 28 Nov 2025 22:53:59 GMT Subject: Integrated: 8372650: Convert GenericWaitBarrier to use Atomic In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 08:36:41 GMT, Kim Barrett wrote: > Please review this change to GenericWaitBarrier to use Atomic rather than > directly applying AtomicAccess to volatile members. > > Testing: mach5 tier1-5 This pull request has now been integrated. 
Changeset: 52568bf4 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/52568bf4832b2bcc5dc547dbdf45a6a7172281fb Stats: 23 lines in 2 files changed: 1 ins; 1 del; 21 mod 8372650: Convert GenericWaitBarrier to use Atomic Reviewed-by: shade, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/28527 From jpai at openjdk.org Sat Nov 29 07:12:53 2025 From: jpai at openjdk.org (Jaikiran Pai) Date: Sat, 29 Nov 2025 07:12:53 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting In-Reply-To: References: Message-ID: On Fri, 28 Nov 2025 15:06:50 GMT, Chen Liang wrote: >> src/java.base/share/classes/jdk/internal/vm/annotation/TrustFinalFields.java line 49: >> >>> 47: /// As a result, this should be used on classes where package-wide trusting is >>> 48: /// not possible due to backward compatibility concerns, such as for `java.util` >>> 49: /// classes. >> >> Should this sentence be reworded? It's not clear what the backward compatible concerns (for `java.util` package) are. I think it might be better to leave out any backward compatibility part when explaining which classes to use this annotation on. > > Existing users have been hacking java.util final fields. I think leaving out the backward compatibility part causes more trouble, because otherwise people can just blanket-approve java.util classes for trusting and break those applications. Hello Chen, > because otherwise people can just blanket-approve java.util classes for trusting and break those applications. This is one of the reasons why I asked some of the questions that I did. We have seen several PRs in the recent past where `@Stable` annotation has been introduced in the core classes of Java SE because it aids constant folding optimizations. Most of those changes have been backed merely by JMH benchmarks. 
It won't be a surprise if we start seeing another round of PRs where the usage of this new `@TrustFinalFields` gets proposed to some of these classes in the JDK because it shows an improvement in some micro benchmark. It also won't be a surprise if those PRs too won't have associated regression tests. Furthermore, unlike `@Stable` which gets applied directly on the field(s) of interest, this new annotation will be applied a bit "far away" from such fields. So it will need additional review cycles to understand if this usage can impact the code functionally in any manner. Specifying the semantics of this annotation in various usage scenarios, in its javadoc, will aid in reviewing such changes in future, instead of having to regularly look into the JVM code to understand how this annotation behaves.

Classes in `java.util` aren't special in any way. So if applications are changing the values of final fields of some of those classes, then the same would be done for other packages of Java SE APIs too. If, like you note, applying `@TrustFinalFields` on such classes is going to break applications, then it will be useful to specify what kind of breakages those will be (in a similar manner to what the `@Stable` annotation's javadoc does).

Very specifically, I think adding a few sentences clarifying the following scenarios in this annotation's javadoc will be useful:

- Will this annotation be honoured only on the specific class that it is applied to? Or will it be taken into consideration for final fields in subclasses too?
- If this annotation gets applied on a class and if that class has some final fields which are already marked `@Stable`, what kind of implications will that have, if any?
- If this annotation is marked on a class which has a `final` array field (for example `final long[] ids`), is it useful to continue placing a `@Stable` annotation on such array fields if the elements of those arrays are going to be initialized to a non-default value just once?
- If after all the precautions are taken, if the final field of a class annotated with `@TrustFinalFields` does get updated to a new value, what kind of impact would it have (stating that such behaviour is unspecified and in general is a bad idea would be enough, if that's all there is to it) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2572850670 From liach at openjdk.org Sun Nov 30 05:24:43 2025 From: liach at openjdk.org (Chen Liang) Date: Sun, 30 Nov 2025 05:24:43 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting [v2] In-Reply-To: References: Message-ID: > Currently, the hotspot compiler (as in ciField) trusts final fields in hidden classes, record classes, and selected jdk packages. Some classes in the JDK wish to be trusted, but they cannot apply package-wide opt-in due to other legacy classes in the package, such as java.util. > > They currently can use `@Stable` as a workaround, but this is fragile because a stable final field may hold a trusted null, zero, or false value, which is currently treated as non-constant by ciField. > > We should add an annotation to opt-in for a whole class, mainly for legacy packages. This would benefit greatly some of our classes already using a lot of Stable, such as java.util.Optional, whose empty instance is now constant-foldable, as demonstrated in a new IR test. > > Paging @minborg who requested Optional folding for review. > > I think we can remove redundant Stable in a few other java.util classes after this patch is integrated. I plan to do that in subsequent patches. 
Chen Liang has updated the pull request incrementally with one additional commit since the last revision: Essay ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28540/files - new: https://git.openjdk.org/jdk/pull/28540/files/f02b9da2..712dbf1c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28540&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28540&range=00-01 Stats: 150 lines in 2 files changed: 130 ins; 8 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/28540.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28540/head:pull/28540 PR: https://git.openjdk.org/jdk/pull/28540 From liach at openjdk.org Sun Nov 30 05:24:44 2025 From: liach at openjdk.org (Chen Liang) Date: Sun, 30 Nov 2025 05:24:44 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting [v2] In-Reply-To: References: Message-ID: On Sat, 29 Nov 2025 07:10:20 GMT, Jaikiran Pai wrote: >> Existing users have been hacking java.util final fields. I think leaving out the backward compatibility part causes more trouble, because otherwise people can just blanket-approve java.util classes for trusting and break those applications. > > Hello Chen, > >> because otherwise people can just blanket-approve java.util classes for trusting and break those applications. > > This is one of the reasons why I asked some of the questions that I did. We have seen several PRs in the recent past where `@Stable` annotation has been introduced in the core classes of Java SE because it aids constant folding optimizations. Most of those changes have been backed merely by JMH benchmarks. It won't be a surprise if we start seeing another round of PRs where the usage of this new `@TrustFinalFields` gets proposed to some of these classes in the JDK because it shows an improvement in some micro benchmark. It also won't be a surprise if those PRs too won't have associated regression tests. 
Furthermore, unlike `@Stable` which gets applied directly on the field(s) of interest, this new annotation will be applied a bit "far away" from such fields. So it will need additional review cycles to understand if this usage can impact the code functionally in any manner. Specifying the semantics of this annotation in various usage scenarios, in its javadoc, will aid in reviewing such changes in future, instead of having to regularly look into the JVM code to understand how this annotation behaves. > > Classes in `java.util` aren't special in any way. So if applications are changing the values of final fields of some of those classes, then the same would be done for other packages of Java SE APIs too. If, like you note, applying `@TrustFinalFields` on such classes is going to break applications, then it will be useful to specify what kind of breakages those will be (in a similar manner to what the `@Stable` annotation's javadoc does). > > Very specifically, I think adding a few sentences clarifying the following scenarios in this annotation's javadoc will be useful: > > - Will this annotation be honoured only on the specific class that it is applied to? Or will it be taken into consideration for final fields in subclasses too? > - If this annotation gets applied on a class and if that class has some final fields which are already marked `@Stable`, what kind of implications will that have, if any? > - If this annotation is marked on a class which has a `final` array field (for example `final long[] ids`), is it useful to continue placing a `@Stable` annotation on such array fields if the elements of those arrays are going to be initialized to a non-default value just once? > - If after all the precautions are taken, if the final field of a class... If you want an essay, I have written one - I just hope whatever bikeshedding for this essay does not affect the progress of Lazy Constant's performance demands. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2573355600 From alanb at openjdk.org Sun Nov 30 07:51:51 2025 From: alanb at openjdk.org (Alan Bateman) Date: Sun, 30 Nov 2025 07:51:51 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting [v2] In-Reply-To: References: Message-ID: On Sun, 30 Nov 2025 05:24:43 GMT, Chen Liang wrote: >> Currently, the hotspot compiler (as in ciField) trusts final fields in hidden classes, record classes, and selected jdk packages. Some classes in the JDK wish to be trusted, but they cannot apply package-wide opt-in due to other legacy classes in the package, such as java.util. >> >> They currently can use `@Stable` as a workaround, but this is fragile because a stable final field may hold a trusted null, zero, or false value, which is currently treated as non-constant by ciField. >> >> We should add an annotation to opt-in for a whole class, mainly for legacy packages. This would benefit greatly some of our classes already using a lot of Stable, such as java.util.Optional, whose empty instance is now constant-foldable, as demonstrated in a new IR test. >> >> Paging @minborg who requested Optional folding for review. >> >> I think we can remove redundant Stable in a few other java.util classes after this patch is integrated. I plan to do that in subsequent patches. > > Chen Liang has updated the pull request incrementally with one additional commit since the last revision: > > Essay src/java.base/share/classes/jdk/internal/vm/annotation/constant-folding.md line 1: > 1: Constant Folding in the Hotspot Compiler I assume any write-up of HotSpot constant folding should move into src/hotspot tree, maybe a block comment in one of the source files? src/java.base/share/classes/jdk/internal/vm/annotation/constant-folding.md line 106: > 104: `trustedFinal` setting. > 105: > 106: ### Make Final Mean Final I think you can drop this section for now. 
It's okay to reference JEP 500 but it will be annoying to have to maintain this text as there are many steps to follow this one. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2573492977 PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2573493426 From alanb at openjdk.org Sun Nov 30 07:54:46 2025 From: alanb at openjdk.org (Alan Bateman) Date: Sun, 30 Nov 2025 07:54:46 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting [v2] In-Reply-To: References: Message-ID: On Sun, 30 Nov 2025 05:19:22 GMT, Chen Liang wrote: >> Hello Chen, >> >>> because otherwise people can just blanket-approve java.util classes for trusting and break those applications. >> >> This is one of the reasons why I asked some of the questions that I did. We have seen several PRs in the recent past where `@Stable` annotation has been introduced in the core classes of Java SE because it aids constant folding optimizations. Most of those changes have been backed merely by JMH benchmarks. It won't be a surprise if we start seeing another round of PRs where the usage of this new `@TrustFinalFields` gets proposed to some of these classes in the JDK because it shows an improvement in some micro benchmark. It also won't be a surprise if those PRs too won't have associated regression tests. Furthermore, unlike `@Stable` which gets applied directly on the field(s) of interest, this new annotation will be applied a bit "far away" from such fields. So it will need additional review cycles to understand if this usage can impact the code functionally in any manner. Specifying the semantics of this annotation in various usage scenarios, in its javadoc, will aid in reviewing such changes in future, instead of having to regularly look into the JVM code to understand how this annotation behaves. >> >> Classes in `java.util` aren't special in any way. 
So if applications are changing the values of final fields of some of those classes, then the same would be done for other packages of Java SE APIs too. If, like you note, applying `@TrustFinalFields` on such classes is going to break applications, then it will be useful to specify what kind of breakages those will be (in a similar manner to what the `@Stable` annotation's javadoc does). >> >> Very specifically, I think adding a few sentences clarifying the following scenarios in this annotation's javadoc will be useful: >> >> - Will this annotation be honoured only on the specific class that it is applied to? Or will it be taken into consideration for final fields in subclasses too? >> - If this annotation gets applied on a class and if that class has some final fields which are already marked `@Stable`, what kind of implications will that have, if any? >> - If this annotation is marked on a class which has a `final` array field (for example `final long[] ids`), is it useful to continue placing a `@Stable` annotation on such array fields if the elements of those arrays are going to be initialized to a non-default value just once? >> - If after all the precautions are taken, if... > > If you want an essay, I have written one - I just hope whatever bikeshedding for this essay does not affect the progress of Lazy Constant's performance demands. > * If after all the precautions are taken, if the final field of a class annotated with `@TrustFinalFields` does get updated to a new value, what kind of impact would it have (stating that such behaviour is unspecified and in general is a bad idea would be enough, if that's all there is to it) Field.set, which is probably the API that these libraries are using, already includes a warning about "unpredictable effects, including cases in which other parts of a program continue to use the original value of this field", so I think that is okay for now. 
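To make the `Field.set` point concrete, here is a minimal sketch of the kind of reflective write being discussed. The `Holder` and `FinalFieldMutation` classes are hypothetical and not from the JDK, and the sketch assumes a JDK on which reflective mutation of a non-static final field in an ordinary class is still permitted after `setAccessible(true)` (a later default under JEP 500 may reject it). It only shows that the reflective read-back sees the new value; it makes no claim about what compiled code that has already folded the field would observe, which is exactly the "unpredictable effects" caveat.

```java
import java.lang.reflect.Field;

// Hypothetical holder class for illustration only; not a JDK class.
final class Holder {
    final int value;

    Holder(int value) {
        this.value = value;
    }
}

final class FinalFieldMutation {
    // Overwrites Holder.value reflectively and returns what a reflective
    // read of the same field reports afterwards.
    static int mutateAndReadBack(Holder h, int newValue) {
        try {
            Field f = Holder.class.getDeclaredField("value");
            f.setAccessible(true);   // required before writing a final instance field
            f.setInt(h, newValue);   // the "hack" libraries perform on final fields
            return f.getInt(h);
        } catch (ReflectiveOperationException e) {
            // Assumption: thrown if the runtime refuses final-field mutation.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Holder h = new Holder(1);
        System.out.println("reflective read after set: " + mutateAndReadBack(h, 2));
        // A direct read may or may not agree once the JIT trusts the field.
        System.out.println("direct read after set: " + h.value);
    }
}
```

If the field's class were trusted by the compiler, a direct read in hot compiled code could legitimately keep returning the original value, which is why the thread treats such writes as unspecified behaviour rather than something to document in detail.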
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2573494589 From duke at openjdk.org Sun Nov 30 08:05:52 2025 From: duke at openjdk.org (Zihao Lin) Date: Sun, 30 Nov 2025 08:05:52 GMT Subject: RFR: 8344116: C2: remove slice parameter from LoadNode::make [v13] In-Reply-To: <-1wiWF_UEvCO6xPuYvIsElBzPPQDejGahm9Xd5YszPU=.cfb41cb1-f681-4e75-8c29-2d928468f53b@github.com> References: <2oDqUvcW_3hJRPRri4uttpkgfeCovL4ZZkcI0R1bB1A=.173b3a58-d0f1-4b29-94d1-77b0a350c790@github.com> <2wAnS7drj_r3dqsy5CEF9vBG40KizHsQDOxMeNymwhw=.9bc29879-eead-401c-b750-814592feff63@github.com> <-1wiWF_UEvCO6xPuYvIsElBzPPQDejGahm9Xd5YszPU=.cfb41cb1-f681-4e75-8c29-2d928468f53b@github.com> Message-ID: On Fri, 28 Nov 2025 14:51:49 GMT, Roland Westrelin wrote: >> The assert failed because `find_inst_mem()` skipped an Initialize memory projection whose `adr_type` was still the general slice, then tried to fetch the instance-specific projection from `_node_map` and got nullptr. That happens when a precise `NarrowMemProj` already exists: the code doesn't create a new one and also never records the mapping, so later lookup fails. >> >> The fix records the mapping even if the precise `NarrowMemProj` is already present (not newly created). > > I had a closer look and I think you ran into an inconsistency. Let me see if I can get it fixed as a separate change. Sure, it's better to separate this into another change. I am not familiar with this part, so please ping me if you have a better solution. Thanks!
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24258#discussion_r2573499061 From alanb at openjdk.org Sun Nov 30 08:07:46 2025 From: alanb at openjdk.org (Alan Bateman) Date: Sun, 30 Nov 2025 08:07:46 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting [v2] In-Reply-To: References: Message-ID: On Sun, 30 Nov 2025 07:51:51 GMT, Alan Bateman wrote: >> If you want an essay, I have written one - I just hope whatever bikeshedding for this essay does not affect the progress of Lazy Constant's performance demands. > >> * If after all the precautions are taken, if the final field of a class annotated with `@TrustFinalFields` does get updated to a new value, what kind of impact would it have (stating that such behaviour is unspecified and in general is a bad idea would be enough, if that's all there is to it) > > Field.set, which is probably the API that these libraries are using, already includes a warning about "unpredictable effects, including cases in which other parts of a program continue to use the original value of this field", so I think that is okay for now. > Existing users have been hacking java.util final fields. I think leaving out the backward compatibility part causes more trouble, because otherwise people can just blanket-approve java.util classes for trusting and break those applications. I don't think we have a lot of data on this as it doesn't lend itself to static analysis. Aside from serialization libraries, it's possible the hacking of finals is ad hoc and in random areas (someone pointed to something in Netty hacking a final field in a class in sun.nio.ch at one point). So I probably wouldn't call out java.util specifically but maybe you brought that up specifically as there are so many performance critical classes there. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2573499552 From aph at openjdk.org Sun Nov 30 11:42:51 2025 From: aph at openjdk.org (Andrew Haley) Date: Sun, 30 Nov 2025 11:42:51 GMT Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub() [v2] In-Reply-To: References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> Message-ID: On Fri, 28 Nov 2025 10:17:50 GMT, Fei Gao wrote: >> In the existing implementation, the static call stub typically emits a sequence like: >> `isb; movk; movz; movz; movk; movz; movz; br`. >> >> This patch reimplements it using a more compact and patch-friendly sequence: >> >> ldr x12, Label_data >> ldr x8, Label_entry >> br x8 >> Label_data: >> 0x00000000 >> 0x00000000 >> Label_entry: >> 0x00000000 >> 0x00000000 >> >> The new approach places the target addresses adjacent to the code and loads them dynamically. This allows us to update the call target by modifying only the data in memory, without changing any instructions. This avoids the need for I-cache flushes or issuing an `isb`[1], which are both relatively expensive operations. >> >> While emitting direct branches in static stubs for small code caches can save 2 instructions compared to the new implementation, modifying those branches still requires I-cache flushes or an `isb`. This patch unifies the code generation by emitting the same static stubs for both small and large code caches. >> >> A microbenchmark (StaticCallStub.java) demonstrates a performance uplift of approximately 43%. 
>> >> >> Benchmark (length) Mode Cnt Master Patch Units >> StaticCallStubFar.callCompiled 1000 avgt 5 39.346 22.474 us/op >> StaticCallStubFar.callCompiled 10000 avgt 5 390.05 218.478 us/op >> StaticCallStubFar.callCompiled 100000 avgt 5 3869.264 2174.001 us/op >> StaticCallStubNear.callCompiled 1000 avgt 5 39.093 22.582 us/op >> StaticCallStubNear.callCompiled 10000 avgt 5 387.319 217.398 us/op >> StaticCallStubNear.callCompiled 100000 avgt 5 3855.825 2206.923 us/op >> >> >> All tests in Tier1 to Tier3, under both release and debug builds, have passed. >> >> [1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-working-with-threads > > Fei Gao has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Patch 'isb' to 'nop' > - Merge branch 'master' into reimplement-static-call-stub > - 8363620: AArch64: reimplement emit_static_call_stub() > > In the existing implementation, the static call stub typically > emits a sequence like: > `isb; movk; movz; movz; movk; movz; movz; br`. > > This patch reimplements it using a more compact and patch-friendly > sequence: > ``` > ldr x12, Label_data > ldr x8, Label_entry > br x8 > Label_data: > 0x00000000 > 0x00000000 > Label_entry: > 0x00000000 > 0x00000000 > ``` > The new approach places the target addresses adjacent to the code > and loads them dynamically. This allows us to update the call > target by modifying only the data in memory, without changing any > instructions. This avoids the need for I-cache flushes or > issuing an `isb`[1], which are both relatively expensive > operations. 
> > While emitting direct branches in static stubs for small code > caches can save 2 bytes compared to the new implementation, > modifying those branches still requires I-cache flushes or an > `isb`. This patch unifies the code generation by emitting the > same static stubs for both small and large code caches. > > A microbenchmark (StaticCallStub.java) demonstrates a performance > uplift of approximately 43%. > > Benchmark (length) Mode Cnt Master Patch Units > StaticCallStubFar.callCompiled 1000 avgt 5 39.346 22.474 us/op > StaticCallStubFar.callCompiled 10000 avgt 5 390.05 218.478 us/op > StaticCallStubFar.callCompiled 100000 avgt 5 3869.264 2174.001 us/op > StaticCallStubNear.callCompiled 1000 avgt 5 39.093 22.582 us/op > StaticCallStubNear.callCompiled 10000 avgt 5 387.319 217.398 us/op > StaticCallStubNear.callCompiled 100000 avgt 5 3855.825 2206.923 us/op > > All tests in Tier1 to Tier3, under both release and debug builds, > have passed. > > [1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-working-with-threads I think I'd do something like this. It does mean that we're executing an unnecessary jump+1 when we jump directly to the stub, but it maintains the invariant that the trampoline destination and the call destination are the same, so it does not matter how a call reaches the static call stub. I think this invariant is worth keeping. Remember that we're jumping from compiled code to the _interpreter_, which does thousands of jumps! A single extra well-predicted branch won't hurt.
diff --git a/src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp b/src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp index 3f3b8d28408..87887bb0a25 100644 --- a/src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp +++ b/src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp @@ -168,12 +168,16 @@ void CompiledDirectCall::set_to_interpreted(const methodHandle& callee, address // | B end ; // |end: ; // forall (1:X0=1 / 1:X0=3) We can't use `Assembler` to do this patching because it's not atomic. - CodeBuffer stub_first_instruction(stub, Assembler::instruction_size); - Assembler assembler(&stub_first_instruction); - assembler.nop(); + + NativeJump::insert(stub, stub + NativeJump::instruction_size); + + address trampoline_stub_addr = _call->get_trampoline(); + if (trampoline_stub_addr != nullptr) { + nativeCallTrampolineStub_at(trampoline_stub_addr)->set_destination(stub); + } // Update jump to call. - set_destination_mt_safe(stub); + _call->set_destination(stub); } void CompiledDirectCall::set_stub_to_clean(static_stub_Relocation* static_stub) { diff --git a/src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp b/src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp index f2003dd9b55..22e7dcc2552 100644 --- a/src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp +++ b/src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp @@ -238,6 +238,14 @@ void NativeJump::set_jump_destination(address dest) { ICache::invalidate_range(instruction_address(), instruction_size); }; +// Atomic insertion of jump to target. 
+void NativeJump::insert(address code_pos, address target) { + intptr_t offset = target - code_pos; + uint32_t insn = 0b000101 << 26; + Instruction_aarch64::spatch((address)&insn, 25, 0, offset >> 2); + AtomicAccess::store((volatile uint32_t*)code_pos, insn); +} + //------------------------------------------------------------------- address NativeGeneralJump::jump_destination() const { ------------- PR Comment: https://git.openjdk.org/jdk/pull/26638#issuecomment-3592480392 From fgao at openjdk.org Sun Nov 30 13:15:52 2025 From: fgao at openjdk.org (Fei Gao) Date: Sun, 30 Nov 2025 13:15:52 GMT Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub() [v2] In-Reply-To: References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> Message-ID: On Sun, 30 Nov 2025 11:40:18 GMT, Andrew Haley wrote: >> Fei Gao has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Patch 'isb' to 'nop' >> - Merge branch 'master' into reimplement-static-call-stub >> - 8363620: AArch64: reimplement emit_static_call_stub() >> >> In the existing implementation, the static call stub typically >> emits a sequence like: >> `isb; movk; movz; movz; movk; movz; movz; br`. >> >> This patch reimplements it using a more compact and patch-friendly >> sequence: >> ``` >> ldr x12, Label_data >> ldr x8, Label_entry >> br x8 >> Label_data: >> 0x00000000 >> 0x00000000 >> Label_entry: >> 0x00000000 >> 0x00000000 >> ``` >> The new approach places the target addresses adjacent to the code >> and loads them dynamically. This allows us to update the call >> target by modifying only the data in memory, without changing any >> instructions. This avoids the need for I-cache flushes or >> issuing an `isb`[1], which are both relatively expensive >> operations. 
>> >> While emitting direct branches in static stubs for small code >> caches can save 2 bytes compared to the new implementation, >> modifying those branches still requires I-cache flushes or an >> `isb`. This patch unifies the code generation by emitting the >> same static stubs for both small and large code caches. >> >> A microbenchmark (StaticCallStub.java) demonstrates a performance >> uplift of approximately 43%. >> >> Benchmark (length) Mode Cnt Master Patch Units >> StaticCallStubFar.callCompiled 1000 avgt 5 39.346 22.474 us/op >> StaticCallStubFar.callCompiled 10000 avgt 5 390.05 218.478 us/op >> StaticCallStubFar.callCompiled 100000 avgt 5 3869.264 2174.001 us/op >> StaticCallStubNear.callCompiled 1000 avgt 5 39.093 22.582 us/op >> StaticCallStubNear.callCompiled 10000 avgt 5 387.319 217.398 us/op >> StaticCallStubNear.callCompiled 100000 avgt 5 3855.825 2206.923 us/op >> >> All tests in Tier1 to Tier3, under both release and debug builds, >> have passed. >> >> [1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-working-with-threads > > I think I'd do something like this. It does mean that we're executing an unnecessary jump+1 when we jump directly to the stub, but it maintains the invariant that the trampoline destination and the call destination are the same, so it does not matter how a call reaches the static call stub. I think this invariant is worth keeping. > > Remember that we're jumping from compiled code to the _interpreter_, which does thousands of jumps! A single extra well-predicted branch won't hurt.
> > > diff --git a/src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp b/src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp > index 3f3b8d28408..87887bb0a25 100644 > --- a/src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp > +++ b/src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp > @@ -168,12 +168,16 @@ void CompiledDirectCall::set_to_interpreted(const methodHandle& callee, address > // | B end ; > // |end: ; > // forall (1:X0=1 / 1:X0=3) > > We can't use `Assembler` to do this patching because it's not atomic. > > - CodeBuffer stub_first_instruction(stub, Assembler::instruction_size); > - Assembler assembler(&stub_first_instruction); > - assembler.nop(); > + > + NativeJump::insert(stub, stub + NativeJump::instruction_size); > + > + address trampoline_stub_addr = _call->get_trampoline(); > + if (trampoline_stub_addr != nullptr) { > + nativeCallTrampolineStub_at(trampoline_stub_addr)->set_destination(stub); > + } > > // Update jump to call. > - set_destination_mt_safe(stub); > + _call->set_destination(stub); > } > > void CompiledDirectCall::set_stub_to_clean(static_stub_Relocation* static_stub) { > diff --git a/src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp b/src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp > index f2003dd9b55..22e7dcc2552 100644 > --- a/src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp > +++ b/src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp > @@ -238,6 +238,14 @@ void NativeJump::set_jump_destination(address dest) { > ICache::invalidate_range(instruction_address(), instruction_size); > }; > > +// Atomic insertion of jump to target. 
> +void NativeJump::insert(address code_pos, address target) { > + intptr_t offset = target - code_pos; > + uint32_t insn = 0b000101 << 26; > + Instruction_aarch64::spatch((address)&insn, 25, 0, offset >> 2); > + AtomicAccess::store((volatile uint32_t*)code_pos, insn); > +} > + > //------------------------------------------------------------------- > > address NativeGeneralJump::jump_destination() const { @theRealAph thanks a lot for your explanation! I'll update it soon. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26638#issuecomment-3592540875 From liach at openjdk.org Sun Nov 30 14:52:52 2025 From: liach at openjdk.org (Chen Liang) Date: Sun, 30 Nov 2025 14:52:52 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting [v2] In-Reply-To: References: Message-ID: On Sun, 30 Nov 2025 07:47:39 GMT, Alan Bateman wrote: >> Chen Liang has updated the pull request incrementally with one additional commit since the last revision: >> >> Essay > > src/java.base/share/classes/jdk/internal/vm/annotation/constant-folding.md line 1: > >> 1: Constant Folding in the Hotspot Compiler > > I assume any write-up of HotSpot constant folding should move into src/hotspot tree, maybe a block comment in one of the source files? I intend this to be a user-oriented guide on constant folding. I should just call it constant folding. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2573814710 From liach at openjdk.org Sun Nov 30 14:52:53 2025 From: liach at openjdk.org (Chen Liang) Date: Sun, 30 Nov 2025 14:52:53 GMT Subject: RFR: 8372696: Allow boot classes to explicitly opt-in for final field trusting [v2] In-Reply-To: References: Message-ID: On Sun, 30 Nov 2025 14:50:23 GMT, Chen Liang wrote: >> src/java.base/share/classes/jdk/internal/vm/annotation/constant-folding.md line 1: >> >>> 1: Constant Folding in the Hotspot Compiler >> >> I assume any write-up of HotSpot constant folding should move into src/hotspot tree, maybe a block comment in one of the source files? > > I intend this to be a user-oriented guide on constant folding. I should just call it constant folding. I intend this to be a user-oriented guide on constant folding. I should just call it constant folding. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28540#discussion_r2573814847 From aph at openjdk.org Sun Nov 30 17:10:53 2025 From: aph at openjdk.org (Andrew Haley) Date: Sun, 30 Nov 2025 17:10:53 GMT Subject: RFR: 8363620: AArch64: reimplement emit_static_call_stub() [v2] In-Reply-To: References: <4QaWBuyp2crNgc4QfWw3l4oUbtCxizQaTm6LndmoydQ=.c8d81eba-eab5-4b3d-b272-c958e5237601@github.com> Message-ID: On Fri, 28 Nov 2025 10:17:50 GMT, Fei Gao wrote: >> In the existing implementation, the static call stub typically emits a sequence like: >> `isb; movk; movz; movz; movk; movz; movz; br`. >> >> This patch reimplements it using a more compact and patch-friendly sequence: >> >> ldr x12, Label_data >> ldr x8, Label_entry >> br x8 >> Label_data: >> 0x00000000 >> 0x00000000 >> Label_entry: >> 0x00000000 >> 0x00000000 >> >> The new approach places the target addresses adjacent to the code and loads them dynamically. This allows us to update the call target by modifying only the data in memory, without changing any instructions. 
This avoids the need for I-cache flushes or issuing an `isb`[1], which are both relatively expensive operations. >> >> While emitting direct branches in static stubs for small code caches can save 2 instructions compared to the new implementation, modifying those branches still requires I-cache flushes or an `isb`. This patch unifies the code generation by emitting the same static stubs for both small and large code caches. >> >> A microbenchmark (StaticCallStub.java) demonstrates a performance uplift of approximately 43%. >> >> >> Benchmark (length) Mode Cnt Master Patch Units >> StaticCallStubFar.callCompiled 1000 avgt 5 39.346 22.474 us/op >> StaticCallStubFar.callCompiled 10000 avgt 5 390.05 218.478 us/op >> StaticCallStubFar.callCompiled 100000 avgt 5 3869.264 2174.001 us/op >> StaticCallStubNear.callCompiled 1000 avgt 5 39.093 22.582 us/op >> StaticCallStubNear.callCompiled 10000 avgt 5 387.319 217.398 us/op >> StaticCallStubNear.callCompiled 100000 avgt 5 3855.825 2206.923 us/op >> >> >> All tests in Tier1 to Tier3, under both release and debug builds, have passed. >> >> [1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-working-with-threads > > Fei Gao has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Patch 'isb' to 'nop' > - Merge branch 'master' into reimplement-static-call-stub > - 8363620: AArch64: reimplement emit_static_call_stub() > > In the existing implementation, the static call stub typically > emits a sequence like: > `isb; movk; movz; movz; movk; movz; movz; br`. 
> > This patch reimplements it using a more compact and patch-friendly > sequence: > ``` > ldr x12, Label_data > ldr x8, Label_entry > br x8 > Label_data: > 0x00000000 > 0x00000000 > Label_entry: > 0x00000000 > 0x00000000 > ``` > The new approach places the target addresses adjacent to the code > and loads them dynamically. This allows us to update the call > target by modifying only the data in memory, without changing any > instructions. This avoids the need for I-cache flushes or > issuing an `isb`[1], which are both relatively expensive > operations. > > While emitting direct branches in static stubs for small code > caches can save 2 bytes compared to the new implementation, > modifying those branches still requires I-cache flushes or an > `isb`. This patch unifies the code generation by emitting the > same static stubs for both small and large code caches. > > A microbenchmark (StaticCallStub.java) demonstrates a performance > uplift of approximately 43%. > > Benchmark (length) Mode Cnt Master Patch Units > StaticCallStubFar.callCompiled 1000 avgt 5 39.346 22.474 us/op > StaticCallStubFar.callCompiled 10000 avgt 5 390.05 218.478 us/op > StaticCallStubFar.callCompiled 100000 avgt 5 3869.264 2174.001 us/op > StaticCallStubNear.callCompiled 1000 avgt 5 39.093 22.582 us/op > StaticCallStubNear.callCompiled 10000 avgt 5 387.319 217.398 us/op > StaticCallStubNear.callCompiled 100000 avgt 5 3855.825 2206.923 us/op > > All tests in Tier1 to Tier3, under both release and debug builds, > have passed. > > [1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-working-with-threads Your benchmark doesn't work. Please don't fix it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26638#issuecomment-3592855369
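For readers following the `NativeJump::insert` suggestion earlier in this thread, the following is a rough sketch of the bit layout that the proposed helper builds: an AArch64 unconditional `B` instruction with opcode `0b000101` in bits 31:26 and a signed word offset in bits 25:0. It is written in Java purely for illustration; the `BranchEncoding` class and its method names are hypothetical, and it models only the encoding, not the single aligned 32-bit atomic store that the HotSpot patch relies on.

```java
// Hypothetical illustration of the AArch64 "B" (unconditional branch) encoding.
final class BranchEncoding {
    private static final int B_OPCODE = 0b000101 << 26;   // bits 31:26
    private static final int IMM26_MASK = (1 << 26) - 1;  // bits 25:0

    // Encodes a branch to (pc + byteOffset). The offset must be a multiple
    // of 4 and fit in a signed 26-bit word offset (+/-128 MiB).
    static int encodeB(long byteOffset) {
        if ((byteOffset & 3) != 0) {
            throw new IllegalArgumentException("offset must be a multiple of 4");
        }
        long words = byteOffset >> 2;
        if (words < -(1L << 25) || words >= (1L << 25)) {
            throw new IllegalArgumentException("offset out of range for B");
        }
        return B_OPCODE | ((int) words & IMM26_MASK);
    }

    // Recovers the byte offset by sign-extending the 26-bit immediate.
    static long decodeB(int insn) {
        int imm26 = (insn << 6) >> 6;   // arithmetic shift sign-extends
        return ((long) imm26) << 2;
    }

    public static void main(String[] args) {
        // Branch to the next instruction, i.e. code_pos + 4.
        System.out.println(Integer.toHexString(encodeB(4)));
        // Branch backwards by one instruction.
        System.out.println(Integer.toHexString(encodeB(-4)));
    }
}
```

Because the whole instruction fits in one 32-bit word, the patch can publish it with a single store, which is the property the suggested `insert` helper depends on; the range check above is an addition of this sketch and is not part of the quoted diff.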