From kvn at openjdk.org Sun Jun 1 00:30:50 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Sun, 1 Jun 2025 00:30:50 GMT
Subject: RFR: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686
In-Reply-To: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com>
References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com>
Message-ID:

On Sat, 31 May 2025 22:18:33 GMT, Martin Doerr wrote:

> Trivial build fix for PPC64 and s390. I haven't seen more affected platforms.

Good.

-------------

Marked as reviewed by kvn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25568#pullrequestreview-2884826176

From jbechberger at openjdk.org Sun Jun 1 07:13:00 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Sun, 1 Jun 2025 07:13:00 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v25]
In-Reply-To:
References:

Message-ID: <2nYqo0wpUrLLJV9iDRLwj5xjV06waCzu8Ma8YSAToIY=.1059ee96-77f8-47e6-8797-3f2b47783311@github.com>

On Sat, 31 May 2025 10:37:29 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Remove debug printf
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.hpp line 139:
>
>> 137:
>> 138: // Trigger sampling while a thread is not in a safepoint, from a seperate thread
>> 139: static void trigger_is_thread_in_native_stackwalking();
>
> Is it sampling that is triggered? Sampling refers to the asynchronous signal received from the operating system (OS).
>
> You are asking for the sampler thread to process already taken JFR Sample Requests in the queue, right?

Yes and I like your implied name better.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2118819169

From jbechberger at openjdk.org Sun Jun 1 07:17:02 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Sun, 1 Jun 2025 07:17:02 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v25]
In-Reply-To:
References:

Message-ID:

On Sat, 31 May 2025 10:09:15 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Remove debug printf
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.hpp line 36:
>
>> 34: #if defined(LINUX)
>> 35:
>> 36: #include "memory/padded.hpp"
>
> What is padded? If not, this should go.

Good catch.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2118820425

From jbechberger at openjdk.org Sun Jun 1 07:22:58 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Sun, 1 Jun 2025 07:22:58 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v24]
In-Reply-To: <-QiSWEqppeW60aedVbLA3WTmnba7Fry53Qr86wE2EPs=.7a6327ce-7ef0-4b1c-bc68-0421ba3fd46f@github.com>
References:

 <-QiSWEqppeW60aedVbLA3WTmnba7Fry53Qr86wE2EPs=.7a6327ce-7ef0-4b1c-bc68-0421ba3fd46f@github.com>
Message-ID:

On Fri, 30 May 2025 09:19:47 GMT, Johannes Bechberger wrote:

>> src/hotspot/share/jfr/metadata/metadata.xml line 975:
>>
>>> 973:
>>> 974:
>>> 975:
>>
>> I'm not a reviewer, but I just wanted to comment on something I noticed.
>> The JEP document says CPUTimeSampleLos'**t**', but the implementation says CPUTimeSampleLos'**s**'. Which one is correct?
>> A sentence from the JEP document:
>>
>> Another new event, `jdk.CPUTimeSampleLost`, is emitted when samples are lost ...
>
> Thanks for catching this mistake. I'll fix it this afternoon.

I fixed it by changing the JEP.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2118825477

From jbechberger at openjdk.org Sun Jun 1 07:26:19 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Sun, 1 Jun 2025 07:26:19 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
>
> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This profiles the [Renaissance](https://renaissance.dev/) benchmark with
> - ... different heap sizes
> - ... different GCs
> - ... different samplers (the standard JFR and the new CPU Time Sampler and both)
> - ... different JFR recording durations
> - ... different chunk-sizes

Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:

 - Refactoring
 - Remove convoluted native trace logic

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/25302/files
 - new: https://git.openjdk.org/jdk/pull/25302/files/3a10d552..439763a3

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=25
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=24-25

 Stats: 56 lines in 5 files changed: 3 ins; 27 del; 26 mod
 Patch: https://git.openjdk.org/jdk/pull/25302.diff
 Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302

PR: https://git.openjdk.org/jdk/pull/25302

From mgronlun at openjdk.org Sun Jun 1 13:04:00 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 13:04:00 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:

Message-ID:

On Sun, 1 Jun 2025 13:19:48 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Refactoring
>> - Remove convoluted native trace logic
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 558:
>
>> 556: jt->is_JfrRecorder_thread()) {
>> 557: queue.increment_lost_samples();
>> 558: tl->set_do_async_processing_of_cpu_time_jfr_requests(false);
>
> Why is this restored here?

Because I shouldn't sample if the thread isn't in native state anymore. The thread is probably sampled anyway on the outgoing safepoint.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119157906

From mgronlun at openjdk.org Sun Jun 1 15:07:00 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 15:07:00 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:

Message-ID: <62JxxY-xn3fwz0PnhcnIH6DOWBQUPIq_fhDD_7YrSmA=.bfbb317a-403e-4826-a3ed-c364882e821b@github.com>

On Sun, 1 Jun 2025 15:01:06 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Refactoring
>> - Remove convoluted native trace logic
>
> src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 362:
>
>> 360: drain_enqueued_requests(now, tl, jt, current);
>> 361: #ifdef LINUX
>> 362: if (tl->has_cpu_time_jfr_requests()) {
>
> You are having all threads traverse over this lock, even though the cpu time sampler is disabled by default. Can it be improved?

Not without allocating in the signal handler

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119243238

From mgronlun at openjdk.org Sun Jun 1 15:27:06 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 15:27:06 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:

Message-ID:

On Sun, 1 Jun 2025 15:18:52 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Refactoring
>> - Remove convoluted native trace logic
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 574:
>
>> 572:
>> 573: if (queue.enqueue(request)) {
>> 574: tl->set_has_cpu_time_jfr_requests(true);
>
> This should only need to be set when enqueuing the first entry.

You're right

> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 581:
>
>> 579:
>> 580: if (jt->thread_state() == _thread_in_native &&
>> 581: queue.size() > queue.capacity() * 2 / 3) {
>
> Is this logic still valid? You are only asking for async processing depending on the load factor of the queue?

Yes, so I only start the thread walking if necessary

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119248709
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119250511

From mgronlun at openjdk.org Sun Jun 1 15:35:01 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 15:35:01 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:

Message-ID: <6Idy8j9wbNr9udYMhsW0BQmhb8dQvc_p20vCYtg5kZc=.6380eee6-bd1b-45d0-bca8-c8068e59bd36@github.com>

On Sun, 1 Jun 2025 15:32:08 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Refactoring
>> - Remove convoluted native trace logic
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 536:
>
>> 534: }
>> 535:
>> 536: volatile size_t count__ = 0;
>
> unused?

Yes.

> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 586:
>
>> 584: JfrCPUTimeThreadSampling::trigger_async_processing_of_cpu_time_jfr_requests();
>> 585: } else {
>> 586: tl->set_do_async_processing_of_cpu_time_jfr_requests(false);
>
> Was it true before and needed a reset?

I could check this before setting

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119260755
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119261558

From mgronlun at openjdk.org Sun Jun 1 15:43:06 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 15:43:06 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:

Message-ID: <66tRvhjE2LrwccsAYmRycS6QLF2KdRg-XHfk-scr-wg=.c7f269f0-301a-4da3-ae54-7f6bc7a440b1@github.com>

On Sun, 1 Jun 2025 07:26:19 GMT, Johannes Bechberger wrote:

>> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
>>
>> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This profiles the [Renaissance](https://renaissance.dev/) benchmark with
>> - ... different heap sizes
>> - ... different GCs
>> - ... different samplers (the standard JFR and the new CPU Time Sampler and both)
>> - ... different JFR recording durations
>> - ... different chunk-sizes
>
> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>
> - Refactoring
> - Remove convoluted native trace logic

src/hotspot/share/jfr/support/jfrThreadLocal.cpp line 587:

> 585: }
> 586:
> 587: bool JfrThreadLocal::acquire_cpu_time_jfr_native_lock() {

It appears that the lock state 'NATIVE' is redundant; an asynchronous request for queue drainage only requires the dequeue lock state. NATIVE can be removed to simplify the lock protocol.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119268003

From shade at openjdk.org Sun Jun 1 16:14:50 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Sun, 1 Jun 2025 16:14:50 GMT
Subject: RFR: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686
In-Reply-To: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com>
References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com>
Message-ID: <31NqA7K-ur9Y9SJ5jIHiPuG4KHm_GWMyYU79aCYbAsQ=.16bca797-7a68-41fd-88f9-c9afce90a247@github.com>

On Sat, 31 May 2025 22:18:33 GMT, Martin Doerr wrote:

> Trivial build fix for PPC64 and s390. I haven't seen more affected platforms.

AFAICS with my builds that invoke CDS `-Xshare:dump` on cross-compiled binaries, ARM32 is failing the same way. I think we need to add a case here:
https://github.com/openjdk/jdk/blob/c1b5f62a8c30038d3b1a14d184535ba0642d51c9/src/hotspot/cpu/arm/templateInterpreterGenerator_arm.cpp#L175-L179

-------------

PR Review: https://git.openjdk.org/jdk/pull/25568#pullrequestreview-2885791890

From mdoerr at openjdk.org Sun Jun 1 17:11:05 2025
From: mdoerr at openjdk.org (Martin Doerr)
Date: Sun, 1 Jun 2025 17:11:05 GMT
Subject: RFR: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686 [v2]
In-Reply-To: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com>
References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com>
Message-ID:

> Trivial build fix for PPC64 and s390. Added arm32.

Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:

 Add arm32 fix.

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/25568/files
 - new: https://git.openjdk.org/jdk/pull/25568/files/f5df2535..25fb16bf

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25568&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25568&range=00-01

 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod
 Patch: https://git.openjdk.org/jdk/pull/25568.diff
 Fetch: git fetch https://git.openjdk.org/jdk.git pull/25568/head:pull/25568

PR: https://git.openjdk.org/jdk/pull/25568

From mgronlun at openjdk.org Sun Jun 1 18:12:58 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 18:12:58 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:

Message-ID:

On Sun, 1 Jun 2025 15:24:17 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Refactoring
>> - Remove convoluted native trace logic
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 344:
>
>> 342:
>> 343: // equals operator for JfrSampleRequest
>> 344: inline bool operator==(const JfrSampleRequest& lhs, const JfrSampleRequest& rhs) {
>
> Can be removed.

Unless you still want to try the ljf JfrSampleRequest optimization for the native ljf, which I kind of like now that I understand it.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119386104

From mgronlun at openjdk.org Sun Jun 1 18:13:00 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 18:13:00 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:

Message-ID:

On Sun, 1 Jun 2025 15:23:06 GMT, Johannes Bechberger wrote:

>> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 581:
>>
>>> 579:
>>> 580: if (jt->thread_state() == _thread_in_native &&
>>> 581: queue.size() > queue.capacity() * 2 / 3) {
>>
>> Is this logic still valid? You are only asking for async processing assistance depending on the load factor of the queue?
>
> Yes, so I only start the thread walking if necessary

I see. With a bounded queue as used in this solution, it can work quite nicely, provided the thread is actually on CPU in native and not just waiting. If it is waiting (which is most likely), pending requests could take a long time to be sent to consumers.

I also understand better the optimization you tried as part of async walk in native and frames. Also quite nice, to walk from the last JfrSampleRequest and do equals to "batch" the top JFR sample requests that are the same (i.e. taken for the ljf). Maybe you can retry that, but then you need to save the sid AND the tid to be reused for the top equal requests (you only need stacktrace.record_inner() for one request). It's a nice optimization.

>> src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 362:
>>
>>> 360: drain_enqueued_requests(now, tl, jt, current);
>>> 361: #ifdef LINUX
>>> 362: if (tl->has_cpu_time_jfr_requests()) {
>>
>> You are having all threads traverse over this test, even though the cpu time sampler is disabled by default. Can it be improved?
>
> Not without allocating in the signal handler

How so?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119385303
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119389715

From mgronlun at openjdk.org Sun Jun 1 18:25:00 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 18:25:00 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:

Message-ID:

On Sun, 1 Jun 2025 18:22:10 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Refactoring
>> - Remove convoluted native trace logic
>
> src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 250:
>
>> 248: }
>> 249:
>> 250: biased = true;
>
> Perhaps set on entry, and only keep the single biased = false below?

Also, note you have a direct hit in line 221--222 above - it's biased = false.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119404072

From iveresov at openjdk.org Sun Jun 1 19:05:01 2025
From: iveresov at openjdk.org (Igor Veresov)
Date: Sun, 1 Jun 2025 19:05:01 GMT
Subject: RFR: 8358236: [AOT] Graal crashes when trying to use persisted MDOs
Message-ID:

Forgot to null out MethodData::_failed_speculations before snapshotting. As a result it gets restored with a dangling pointer.
Testing looks clean.

-------------

Commit messages:
 - Null out MethodData::_failed_speculations before snapshot

Changes: https://git.openjdk.org/jdk/pull/25570/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25570&range=00
 Issue: https://bugs.openjdk.org/browse/JDK-8358236
 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod
 Patch: https://git.openjdk.org/jdk/pull/25570.diff
 Fetch: git fetch https://git.openjdk.org/jdk.git pull/25570/head:pull/25570

PR: https://git.openjdk.org/jdk/pull/25570

From mgronlun at openjdk.org Sun Jun 1 20:38:29 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 20:38:29 GMT
Subject: RFR: 8357962: JFR Cooperative Sampling reveals inconsistent interpreter frames as part of JVMTI PopFrame
Message-ID:

Greetings,

Please see the JIRA issue for a detailed description.

Fix only applies to platforms that issue a save_bcp() as part of InterpreterMacroAssembler::unlock_object().

Testing: jdk_jfr, JVMTI PopFrame tests

Thanks
Markus

-------------

Commit messages:
 - 8357962

Changes: https://git.openjdk.org/jdk/pull/25571/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25571&range=00
 Issue: https://bugs.openjdk.org/browse/JDK-8357962
 Stats: 3 lines in 3 files changed: 3 ins; 0 del; 0 mod
 Patch: https://git.openjdk.org/jdk/pull/25571.diff
 Fetch: git fetch https://git.openjdk.org/jdk.git pull/25571/head:pull/25571

PR: https://git.openjdk.org/jdk/pull/25571

From kvn at openjdk.org Sun Jun 1 21:23:53 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Sun, 1 Jun 2025 21:23:53 GMT
Subject: RFR: 8358236: [AOT] Graal crashes when trying to use persisted MDOs
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 19:01:27 GMT, Igor Veresov wrote:

> Forgot to null out MethodData::_failed_speculations before snapshotting. As a result it gets restored with a dangling pointer.
> Testing looks clean.

Trivial.

-------------

Marked as reviewed by kvn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25570#pullrequestreview-2886119546

From iveresov at openjdk.org Sun Jun 1 21:23:54 2025
From: iveresov at openjdk.org (Igor Veresov)
Date: Sun, 1 Jun 2025 21:23:54 GMT
Subject: Integrated: 8358236: [AOT] Graal crashes when trying to use persisted MDOs
In-Reply-To:
References:
Message-ID: <2VQGaTWxeSr29uU3Ih3S5kF9l70w3xwlkHNG_pVFr7U=.3279eb7c-5bf8-4df1-8405-61b1678552d5@github.com>

On Sun, 1 Jun 2025 19:01:27 GMT, Igor Veresov wrote:

> Forgot to null out MethodData::_failed_speculations before snapshotting. As a result it gets restored with a dangling pointer.
> Testing looks clean.

This pull request has now been integrated.

Changeset: 85e36d79
Author: Igor Veresov
URL: https://git.openjdk.org/jdk/commit/85e36d79246913abb8b85c2be719670655d619ab
Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod

8358236: [AOT] Graal crashes when trying to use persisted MDOs

Reviewed-by: kvn

-------------

PR: https://git.openjdk.org/jdk/pull/25570

From dholmes at openjdk.org Mon Jun 2 02:11:57 2025
From: dholmes at openjdk.org (David Holmes)
Date: Mon, 2 Jun 2025 02:11:57 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v6]
In-Reply-To:
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>
Message-ID:

On Fri, 30 May 2025 19:34:16 GMT, Mohamed Issa wrote:

>> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges to help prevent regressions in the future.
>>
>> The command to run all range specific micro-benchmarks is posted below.
>>
>> `make test TEST="micro:CbrtPerf.CbrtPerfRanges"`
>>
>> The results of all tests posted below were captured with an [Intel® Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version.
>>
>> For performance data collected with the new built-in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
>>
>> | Input range(s) | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
>> | :-------------------------------------: | :-------------------------------: | :-------------------------------: | :---------: |
>> | [-2^(-1022), 2^(-1022)] | 6568 | 17678 | 2.69x |
>> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932 | 200897 | 1.45x |
>>
>> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes.
>
> Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision:
>
> Set address attributes in movapd assembly instruction function definition

This change also broke most of the non-x86 platforms, due to the new intrinsic not being implemented on those platforms.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24470#issuecomment-2928415483

From amitkumar at openjdk.org Mon Jun 2 03:26:58 2025
From: amitkumar at openjdk.org (Amit Kumar)
Date: Mon, 2 Jun 2025 03:26:58 GMT
Subject: RFR: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686 [v2]
In-Reply-To:
References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com>
Message-ID:

On Sun, 1 Jun 2025 17:11:05 GMT, Martin Doerr wrote:

>> Trivial build fix for PPC64 and s390. Added arm32.
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
>
> Add arm32 fix.

Thanks Martin, for fixing it.

-------------

Marked as reviewed by amitkumar (Committer).

PR Review: https://git.openjdk.org/jdk/pull/25568#pullrequestreview-2886565212

From duke at openjdk.org Mon Jun 2 03:52:07 2025
From: duke at openjdk.org (Mohamed Issa)
Date: Mon, 2 Jun 2025 03:52:07 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v6]
In-Reply-To:
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>

Message-ID:

On Mon, 2 Jun 2025 02:08:55 GMT, David Holmes wrote:

> This change also broke most of the non-x86 platforms, due to the new intrinsic not being implemented on those platforms.

When you say "most of the non-x86 platforms", are you referring to the ones with processor types listed below?

1. jdk/src/hotspot/cpu/**arm**
2. jdk/src/hotspot/cpu/**ppc**
3. jdk/src/hotspot/cpu/**s390**

I don't see a cbrt intrinsic implementation in the non-x86 platforms. However, the ones listed above appear to get to the _ShouldNotReachHere_ error state if a particular intrinsic isn't found in `TemplateInterpreterGenerator::generate_math_entry` (`templateInterpreterGenerator_*.cpp`). It looks like aarch64 and riscv don't take that route and would fall back to the default cbrt implementation.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24470#issuecomment-2928618217

From dholmes at openjdk.org Mon Jun 2 04:35:02 2025
From: dholmes at openjdk.org (David Holmes)
Date: Mon, 2 Jun 2025 04:35:02 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:

Message-ID:

On Sun, 1 Jun 2025 07:26:19 GMT, Johannes Bechberger wrote:

>> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
>>
>> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This profiles the [Renaissance](https://renaissance.dev/) benchmark with
>> - ... different heap sizes
>> - ... different GCs
>> - ... different samplers (the standard JFR and the new CPU Time Sampler and both)
>> - ... different JFR recording durations
>> - ... different chunk-sizes
>
> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>
> - Refactoring
> - Remove convoluted native trace logic

Just some drive-by comments mainly on your acquire/release usage. I'm not at all clear what memory accesses you are trying to coordinate with those.

src/hotspot/share/jfr/jni/jfrJniMethod.cpp line 176:

> 174: JfrEventSetting::set_enabled(JfrCPUTimeSampleEvent, rate > 0);
> 175: JfrCPUTimeThreadSampling::set_rate(rate, autoadapt == JNI_TRUE);
> 176: return JNI_TRUE;

What is the point of having a boolean return type if you always return true?

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 59:

> 57: Thread* raw_thread = Thread::current_or_null_safe();
> 58: JavaThread* jt;
> 59: if (raw_thread == nullptr || !raw_thread->is_Java_thread()) { // this can happen due to the high level of parralelism

Suggestion:

    if (raw_thread == nullptr || !raw_thread->is_Java_thread()) { // this can happen due to the high level of parallelism

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 119:

> 117: _data = new_data;
> 118: _capacity = capacity;
> 119: }

I assume there is a lock protecting this so it happens atomically?

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 122:

> 120:
> 121: bool JfrCPUTimeTraceQueue::is_full() const {
> 122: return Atomic::load_acquire(&_head) >= _capacity;

I don't see why acquire semantics would be needed here. Also how can it be > capacity?

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 126:

> 124:
> 125: bool JfrCPUTimeTraceQueue::is_empty() const {
> 126: return Atomic::load_acquire(&_head) == 0;

Acquire semantics are definitely not needed here.

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 130:

> 128:
> 129: s4 JfrCPUTimeTraceQueue::lost_samples() const {
> 130: return Atomic::load_acquire(&_lost_samples);

Again acquire semantics seem highly dubious here - what loads are you synchronizing with?

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 139:

> 137:
> 138: u4 JfrCPUTimeTraceQueue::get_and_reset_lost_samples() {
> 139: s4 lost_samples = Atomic::load_acquire(&_lost_samples);

Again acquire semantics seem highly dubious here - what loads are you synchronizing with?

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 151:

> 149: set_capacity(capacity);
> 150: }
> 151: }

Seems an odd definition - typically `ensure_capacity` will grow a data structure to ensure it has sufficient capacity, and if already larger than needed that is fine. Suggestion `change_capacity`, or more traditionally `resize`?

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 237:

> 235:
> 236: void JfrCPUTimeThreadSampler::trigger_async_processing_of_cpu_time_jfr_requests() {
> 237: Atomic::release_store(&_is_async_processing_of_cpu_time_jfr_requests_triggered, true);

What prior stores are you ensuring should be visible by using release semantics here?

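The acquire/release questions above come down to whether each atomic access has a partner on another thread. A minimal sketch of that pairing follows, using standard C++ `std::atomic` as a stand-in for HotSpot's `Atomic::` wrappers. `Mailbox`, `publish`, `consume`, `read_lost_samples` and `run_handoff` are hypothetical names for illustration and are not taken from the PR. Whether the JFR queue itself needs the stronger ordering depends on code not shown in this thread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in types; these are NOT the JFR data structures.
struct Mailbox {
  int payload = 0;                   // plain (non-atomic) data being handed over
  std::atomic<bool> ready{false};    // publication flag
  std::atomic<int> lost_samples{0};  // independent counter, guards no other data
};

// Writer side: the release store orders the earlier plain store to payload
// before the flag becomes visible to a thread that acquire-loads the flag.
inline void publish(Mailbox& m, int value) {
  m.payload = value;
  m.ready.store(true, std::memory_order_release);
}

// Reader side: the acquire load pairs with the release store in publish(),
// so once the flag is observed, the payload written before it is visible.
inline int consume(Mailbox& m) {
  while (!m.ready.load(std::memory_order_acquire)) {
    std::this_thread::yield();
  }
  return m.payload;
}

// A counter that publishes nothing else only needs atomicity, so a relaxed
// load is enough here.
inline int read_lost_samples(const Mailbox& m) {
  return m.lost_samples.load(std::memory_order_relaxed);
}

// Runs one writer thread against the calling (reader) thread.
inline int run_handoff(int value) {
  Mailbox m;
  std::thread writer([&m, value] { publish(m, value); });
  int seen = consume(m);
  writer.join();
  assert(m.ready.load(std::memory_order_relaxed));
  return seen;
}
```

Here the release store in `publish` pairs with the acquire load in `consume`, which is what makes the plain read of `payload` safe. An acquire load with no matching release store, or a release store with no prior writes to publish, adds cost without adding any ordering guarantee a reader relies on.
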
-------------

PR Review: https://git.openjdk.org/jdk/pull/25302#pullrequestreview-2886627655
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119983062
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119983911
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120016607
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120011705
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120012200
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120014449
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120014541
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120020174
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120021034

From dholmes at openjdk.org Mon Jun 2 04:35:02 2025
From: dholmes at openjdk.org (David Holmes)
Date: Mon, 2 Jun 2025 04:35:02 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v5]
In-Reply-To:
References: <6hGNW2D3_VuD-2WN0eTLYdEJoNu_9rPLu-dH-InGSK4=.64de8bc8-a98f-400f-a5e3-885dbd84d901@github.com>

Message-ID: <7wOUvZZtjrX3TpgT9JQLm-8qTAax6PrXtfHwMJpNX4M=.13a7c6cc-e037-4108-b392-7ff30d279c05@github.com>

On Mon, 26 May 2025 06:29:03 GMT, Johannes Bechberger wrote:

>> Also, is raw_thread == nullptr even possible? For the same reasons.
>
> `!raw_thread->is_Java_thread()` I found it during testing.

What thread was it, and how did it reach this code?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119984783

From dholmes at openjdk.org Mon Jun 2 04:44:57 2025
From: dholmes at openjdk.org (David Holmes)
Date: Mon, 2 Jun 2025 04:44:57 GMT
Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v6]
In-Reply-To:
References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com>

Message-ID:

On Mon, 2 Jun 2025 03:49:42 GMT, Mohamed Issa wrote:

> When you say "most of the non-x86 platforms", are you referring to the ones with processor types listed below?

Yes - 3 of the 5 non-x86 platforms.

> It looks like aarch64 and riscv don't take that route and would fall back to the default cbrt implementation.

I was wondering why Aarch64 didn't fail. I guess the other platforms may use this to detect new intrinsics being added.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24470#issuecomment-2928722575

From dholmes at openjdk.org Mon Jun 2 04:50:57 2025
From: dholmes at openjdk.org (David Holmes)
Date: Mon, 2 Jun 2025 04:50:57 GMT
Subject: RFR: 8357576: FieldInfo::_index is not initialized by the constructor
In-Reply-To: <_9Nvx68w_0Ly5NgPGzGci6Uf9Si0AM1N3eQ_e-5hBR8=.1f055ae3-8cd7-4ae2-ae17-3722dc4b7427@github.com>
References: <_9Nvx68w_0Ly5NgPGzGci6Uf9Si0AM1N3eQ_e-5hBR8=.1f055ae3-8cd7-4ae2-ae17-3722dc4b7427@github.com>
Message-ID:

On Fri, 30 May 2025 19:07:24 GMT, Matias Saavedra Silva wrote:

> FieldInfo::_index is not initialized in either of the FieldInfo constructors so this patch adds initialization to both constructors. Verified with tier 1-5 tests

Good and trivial, but does need copyright year update.

Thanks.

-------------

Marked as reviewed by dholmes (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25554#pullrequestreview-2886701153

From kbarrett at openjdk.org Mon Jun 2 05:33:41 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Mon, 2 Jun 2025 05:33:41 GMT
Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept
Message-ID: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com>

Please review this change to permit the use of `noexcept` under certain circumstances in HotSpot code.

http://wg21.link/n3050

Testing:
JDK-8316930 (HotSpot should use noexcept instead of throw()) showed what the conversion would look like. It will need to be brought up to current mainline, possibly with modifications.

This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 16-June-2025 at 12h00 UTC.

Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV.

-------------

Commit messages:
 - add noexcept

Changes: https://git.openjdk.org/jdk/pull/25574/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25574&range=00
 Issue: https://bugs.openjdk.org/browse/JDK-8255082
 Stats: 104 lines in 2 files changed: 104 ins; 0 del; 0 mod
 Patch: https://git.openjdk.org/jdk/pull/25574.diff
 Fetch: git fetch https://git.openjdk.org/jdk.git pull/25574/head:pull/25574

PR: https://git.openjdk.org/jdk/pull/25574

From kbarrett at openjdk.org Mon Jun 2 05:48:59 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Mon, 2 Jun 2025 05:48:59 GMT
Subject: RFR: 8358205: Remove unused JFR array allocation code
In-Reply-To: <4iPujAp0lL_pVhcjlfMX42dIqE7Aw5X8FZr2k5cSFGo=.139bdd20-c798-4335-9ebd-cf0748e7d339@github.com>
References: <4iPujAp0lL_pVhcjlfMX42dIqE7Aw5X8FZr2k5cSFGo=.139bdd20-c798-4335-9ebd-cf0748e7d339@github.com>
Message-ID:

On Fri, 30 May 2025 18:10:07 GMT, Coleen Phillimore wrote:

> The JFR code is using ObjArray->allocate() directly rather than going through oopFactory. In Valhalla, the oopFactory code is being changed to account for new array shapes and attributes, so all code should call that instead. Turns out this function is unused, so this change removes it. Tested with tier1-7 with a ShouldNotReachHere(), then jdk/jfr tests with the removal.

Looks good.

------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25553#pullrequestreview-2886834584 From dbriemann at openjdk.org Mon Jun 2 05:53:50 2025 From: dbriemann at openjdk.org (David Briemann) Date: Mon, 2 Jun 2025 05:53:50 GMT Subject: RFR: 8357981: [PPC64] Remove old instructions from VM_Version::determine_features() In-Reply-To: References: Message-ID: On Wed, 28 May 2025 14:31:40 GMT, Martin Doerr wrote: > Simple cleanup after [JDK-8331859](https://bugs.openjdk.org/browse/JDK-8331859). The old instructions are always available and don't need to be tried in `VM_Version::determine_features()`. > > On Power10: > > -------------------------------------------------------------------------------- > Decoding cpu-feature detection stub at 0x000079b9203c0380 after execution: > -------------------------------------------------------------------------------- > 0x000079b9203c0380: darn r7,1 > 0x000079b9203c0384: brw r5,r6 > 0x000079b9203c0388: blr bo=0b10100,bh=0b00[subroutine_return] > 0x000079b9203c038c: dcbz 0,r3 > 0x000079b9203c0390: blr bo=0b10100,bh=0b00[subroutine_return] > > > Also tested on older processors: On Power9, `brw` gets zeroed out. On Power8, `darn` also gets zeroed out. LGTM, Thank you! ------------- Marked as reviewed by dbriemann (Author). PR Review: https://git.openjdk.org/jdk/pull/25495#pullrequestreview-2886849573 From eosterlund at openjdk.org Mon Jun 2 06:27:50 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 2 Jun 2025 06:27:50 GMT Subject: RFR: 8357962: JFR Cooperative Sampling reveals inconsistent interpreter frames as part of JVMTI PopFrame In-Reply-To: References: Message-ID: On Sun, 1 Jun 2025 20:33:50 GMT, Markus Grönlund wrote: > Greetings, > > Please see the JIRA issue for a detailed description. > > Fix only applies to platforms that issue a save_bcp() as part of InterpreterMacroAssembler::unlock_object().
> > Testing: jdk_jfr, JVMTI PopFrame tests > > Thanks > Markus Looks good. ------------- Marked as reviewed by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25571#pullrequestreview-2886927096 From fyang at openjdk.org Mon Jun 2 06:41:55 2025 From: fyang at openjdk.org (Fei Yang) Date: Mon, 2 Jun 2025 06:41:55 GMT Subject: RFR: 8357962: JFR Cooperative Sampling reveals inconsistent interpreter frames as part of JVMTI PopFrame In-Reply-To: References: Message-ID: <1Y5-9j2Z4EIDS0Ftrkr8S-KT1MlrtB9jYwjzX72adrs=.d4f6f733-13cf-4473-b63a-c42c46beffd3@github.com> On Sun, 1 Jun 2025 20:33:50 GMT, Markus Grönlund wrote: > Greetings, > > Please see the JIRA issue for a detailed description. > > Fix only applies to platforms that issue a save_bcp() as part of InterpreterMacroAssembler::unlock_object(). > > Testing: jdk_jfr, JVMTI PopFrame tests > > Thanks > Markus FYI: `hotspot_serviceability` and `jdk_svc` test good on linux-riscv64 platform. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25571#issuecomment-2929043313 From dholmes at openjdk.org Mon Jun 2 07:02:52 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 2 Jun 2025 07:02:52 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept In-Reply-To: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: <8ueGNCZGkc0fbJHYg8l2XPSG0w2DAxKf4e59ClyXhGw=.5497fc78-f598-4af4-b745-d05f7115e953@github.com> On Mon, 2 Jun 2025 05:28:17 GMT, Kim Barrett wrote: > Please review this change to permit the use of `noexcept` under certain > circumstances in HotSpot code. > > http://wg21.link/n3050 > > Testing: > > JDK-8316930 (HotSpot should use noexcept instead of throw()) showed what the > conversion would look like. It will need to be brought up to current mainline, > possibly with modifications.
> > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 16-June-2025 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. I approve of this change. A couple of minor tweaks to the text suggested. Thanks doc/hotspot-style.md line 1114: > 1112: > 1113: * Only the abbreviated form of `noexcept` exception specifications are > 1114: permitted. `noexcept` exception specifications with arguments are forbidden. Suggestion: * Only the argument-less form of `noexcept` exception specifications is permitted. doc/hotspot-style.md line 1131: > 1129: > 1130: The second is to allow the compiler and library code to choose different > 1131: algorithms, depending on whether a some function may throw exceptions. This is Suggestion: algorithms, depending on whether some function may throw exceptions. This is doc/hotspot-style.md line 1139: > 1137: such a function `noexcept` informs the compiler that `nullptr` is a possible > 1138: result. If an allocation function is not declared `noexcept` then the compiler > 1139: may elide that checking and handling for a using `new` expression. Suggestion: may elide that checking and handling for a `new` expression. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25574#pullrequestreview-2887010579 PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2120226615 PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2120229061 PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2120234324 From eosterlund at openjdk.org Mon Jun 2 07:31:54 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 2 Jun 2025 07:31:54 GMT Subject: RFR: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent In-Reply-To: References:

Message-ID: On Fri, 30 May 2025 09:40:21 GMT, Andrew Haley wrote: > > > It would surely be better if this evil were expunged from JDK 21 as well, lest it also confuse a backporter. > > > > > > Maybe a "here be dragons" warning would suffice. > > If you add the following comment above every call to `do_oop_store()` I'll approve this patch: > > `// Clobbers: r10, r11, r3` Hmm yes that feels like a good compromise. I added the comment. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25483#issuecomment-2929209038 From mbaesken at openjdk.org Mon Jun 2 07:33:27 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 07:33:27 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured Message-ID: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . Those fail when the address sanitizer is configured ( --enable-asan ). The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . While at it, also same is also added for ubsan . 
------------- Commit messages: - remove zgc change - JDK-8357826 Changes: https://git.openjdk.org/jdk/pull/25575/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8357826 Stats: 56 lines in 12 files changed: 54 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25575.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25575/head:pull/25575 PR: https://git.openjdk.org/jdk/pull/25575 From mbaesken at openjdk.org Mon Jun 2 07:33:27 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 07:33:27 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured In-Reply-To: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Mon, 2 Jun 2025 07:25:22 GMT, Matthias Baesken wrote: > There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . > Those fail when the address sanitizer is configured ( --enable-asan ). > The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. > Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . > While at it, also same is also added for ubsan . The change to src/hotspot/cpu/x86/gc/z/zAddress_x86.cpp was added because of zgc issues with ASAN but we will address this in another change so I remove it from here. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2929201143 From rvansa at openjdk.org Mon Jun 2 07:36:51 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 2 Jun 2025 07:36:51 GMT Subject: RFR: 8352075: Perf regression accessing fields [v16] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min … max): 45.1 ms …
53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min … max): 73.8 ms … 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min … max): 37.7 ms … 42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Add type cast ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/70f62460..9cba2d4a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=15 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=14-15 Stats: 5 lines in 1 file changed: 3 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From jbhateja at openjdk.org Mon Jun 2 07:44:58 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 2 Jun 2025 07:44:58 GMT Subject: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API In-Reply-To: References: Message-ID: On Fri, 9 May 2025 07:35:41 GMT, Xiaohong Gong wrote: > JDK-8318650 introduced hotspot intrinsification of subword gather load APIs for X86 platforms [1].
However, the current implementation is not optimal for AArch64 SVE platform, which natively supports vector instructions for subword gather load operations using an int vector for indices (see [2][3]). > > Two key areas require improvement: > 1. At the Java level, vector indices generated for range validation could be reused for the subsequent gather load operation on architectures with native vector instructions like AArch64 SVE. However, the current implementation prevents compiler reuse of these index vectors due to divergent control flow, potentially impacting performance. > 2. At the compiler IR level, the additional `offset` input for `LoadVectorGather`/`LoadVectorGatherMasked` with subword types increases IR complexity and complicates backend implementation. Furthermore, generating `add` instructions before each memory access negatively impacts performance. > > This patch refactors the implementation at both the Java level and compiler mid-end to improve efficiency and maintainability across different architectures. > > Main changes: > 1. Java-side API refactoring: > - Explicitly passes generated index vectors to hotspot, eliminating duplicate index vectors for gather load instructions on > architectures like AArch64. > 2. C2 compiler IR refactoring: > - Refactors `LoadVectorGather`/`LoadVectorGatherMasked` IR for subword types by removing the memory offset input and incorporating it into the memory base `addr` at the IR level. This simplifies backend implementation, reduces add operations, and unifies the IR across all types. > 3. Backend changes: > - Streamlines X86 implementation of subword gather operations following the removal of the offset input from the IR level. > > Performance: > The performance of the relative JMH improves up to 27% on a X86 AVX512 system. 
Please see the data below: > > Benchmark Mode Cnt Unit SIZE Before After Gain > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 64 53682.012 52650.325 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 256 14484.252 14255.156 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 1024 3664.900 3595.615 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 4096 908.312 935.269 1.02 > GatherOperationsBenchmark.micr... Hi @XiaohongGong , Looks good to me, thanks again for this re-factor !! Best Regards, Jatin ------------- Marked as reviewed by jbhateja (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25138#pullrequestreview-2887157235 From eosterlund at openjdk.org Mon Jun 2 07:48:39 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 2 Jun 2025 07:48:39 GMT Subject: RFR: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent [v2] In-Reply-To: References: Message-ID: > The optimized fast_aputfield bytecode on AArch64 stores the field flags in r3, and performs the leading and trailing fencing depending on its volatile bit being set or not. However, r3 is also the last temp register passed in to the barrier set for reference stores, and G1 clobbers it in a way that may clear the volatile bit. Then the trailing fence won't get executed, and sequential consistency is broken. > > My fix puts the flags in r5 instead, which is the register that was used by normal aputfield bytecodes. This way, barriers don't clobber the volatile bits. > > This bug has been observed to mess up a classic Dekker duality in the java.util.concurrent.Exchanger class, leading to a hang in the test/jdk/java/util/concurrent/Exchanger/ExchangeLoops.java test that exercises it. Using G1 and -Xint a reproducer hangs 30/100 times in mach5. With the fix, the same reproducer hangs 0/100 times. 
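A toy model of the hazard described above may help readers who do not know the interpreter code. It is not HotSpot code: the register names `r3` and `r5` come from the description, while `Regs`, `kVolatileBit`, and all function names are hypothetical, and the barrier is reduced to "overwrites r3".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit position; the real flag layout is not given in the thread.
constexpr std::uint32_t kVolatileBit = 1u << 0;

// Two "registers" of a toy register file.
struct Regs {
  std::uint32_t r3 = 0;
  std::uint32_t r5 = 0;
};

// Stand-in for a GC store barrier that is handed r3 as a temp register
// and is therefore free to overwrite it.
void barrier_clobbering_r3(Regs& regs) { regs.r3 = 0; }

// Flags kept in r3: the barrier may wipe the volatile bit before the
// trailing-fence check. Returns true if the trailing fence would run.
bool store_with_flags_in_r3(std::uint32_t flags) {
  Regs regs;
  regs.r3 = flags;
  barrier_clobbering_r3(regs);
  return (regs.r3 & kVolatileBit) != 0;
}

// Flags kept in r5: the barrier never touches them.
bool store_with_flags_in_r5(std::uint32_t flags) {
  Regs regs;
  regs.r5 = flags;
  barrier_clobbering_r3(regs);
  return (regs.r5 & kVolatileBit) != 0;
}

// Small self-check covering the buggy and the fixed variant.
void demo_clobber() {
  assert(!store_with_flags_in_r3(kVolatileBit));  // fence wrongly skipped
  assert(store_with_flags_in_r5(kVolatileBit));   // fence kept
}
```

In this sketch the volatile store loses its trailing fence only in the `r3` variant, which mirrors the failure mode the description attributes to the clobbered flags register.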
Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Add comment about clobbered registers ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25483/files - new: https://git.openjdk.org/jdk/pull/25483/files/44f7e092..c9440f68 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25483&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25483&range=00-01 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25483.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25483/head:pull/25483 PR: https://git.openjdk.org/jdk/pull/25483 From mbaesken at openjdk.org Mon Jun 2 08:07:38 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 08:07:38 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: > There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . > Those fail when the address sanitizer is configured ( --enable-asan ). > The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. > Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . > While at it, also same is also added for ubsan .
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: TestBreakSignalThreadDump has issues with asan ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25575/files - new: https://git.openjdk.org/jdk/pull/25575/files/3ad0d93a..aa796c8a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=00-01 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25575.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25575/head:pull/25575 PR: https://git.openjdk.org/jdk/pull/25575 From mbaesken at openjdk.org Mon Jun 2 08:07:38 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 08:07:38 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured In-Reply-To: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: <4CZpPTh4S1qjEkxVcHZ-J8bxpkI4iTsOtX4iCG5M2Cw=.8c1f2e8e-02c1-4691-8d6f-aa362dd54932@github.com> On Mon, 2 Jun 2025 07:25:22 GMT, Matthias Baesken wrote: > There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . > Those fail when the address sanitizer is configured ( --enable-asan ). > The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. > Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . > While at it, also same is also added for ubsan . 
TestBreakSignalThreadDump does not work well with asan either; it shows this: stdout: []; stderr: [==12484==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2929322761 From rvansa at openjdk.org Mon Jun 2 08:14:48 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 2 Jun 2025 08:14:48 GMT Subject: RFR: 8352075: Perf regression accessing fields [v17] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either.
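The lookup scheme described above (sorted fields plus a sparse table with one entry per 16 fields) can be sketched as follows. This is only an illustration under simplifying assumptions: the real field stream is variable-length encoded and stores byte offsets, whereas this sketch uses a plain vector of unique names and stores indices. `FieldTable`, `build_field_table`, `find_field`, and `kStride` are hypothetical names, not HotSpot identifiers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One jump-table entry per 16 fields, as in the description.
constexpr std::size_t kStride = 16;

struct FieldTable {
  std::vector<std::string> names;  // field names, sorted
  std::vector<std::size_t> jump;   // positions 16, 32, ... into names
};

// Sort the names and record every kStride-th position.
FieldTable build_field_table(std::vector<std::string> names) {
  std::sort(names.begin(), names.end());
  FieldTable table;
  for (std::size_t pos = kStride; pos < names.size(); pos += kStride) {
    table.jump.push_back(pos);
  }
  table.names = std::move(names);
  return table;
}

// Returns the index of target in the sorted names, or -1 if absent.
long find_field(const FieldTable& table, const std::string& target) {
  assert(std::is_sorted(table.names.begin(), table.names.end()));
  // Walk the sparse table to find the last recorded field <= target.
  std::size_t start = 0;
  for (std::size_t pos : table.jump) {
    if (table.names[pos] > target) break;
    start = pos;
  }
  // Linear scan of at most kStride fields from that record.
  std::size_t end = std::min(start + kStride, table.names.size());
  for (std::size_t i = start; i < end; ++i) {
    if (table.names[i] == target) return static_cast<long>(i);
  }
  return -1;
}
```

With 40 fields, for example, the table holds two entries (positions 16 and 32), and any lookup compares against at most those two names plus at most 16 fields in the final scan.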
> > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min … max): 45.1 ms … 53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min … max): 73.8 ms … 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min … max): 37.7 ms … 42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.
The pull request contains two new commits since the last revision: - Add type cast - Fix static_assert ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/9cba2d4a..c592ea59 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=16 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=15-16 Stats: 53 lines in 4 files changed: 0 ins; 47 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From shade at openjdk.org Mon Jun 2 08:16:54 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Jun 2025 08:16:54 GMT Subject: RFR: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686 [v2] In-Reply-To: References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com> Message-ID: On Sun, 1 Jun 2025 17:11:05 GMT, Martin Doerr wrote: >> Trivial build fix for PPC64 and s390. Added arm32. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Add arm32 fix. Looks good, thanks! ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25568#pullrequestreview-2887258170 From mbaesken at openjdk.org Mon Jun 2 08:20:53 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 08:20:53 GMT Subject: RFR: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686 [v2] In-Reply-To: References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com> Message-ID: On Sun, 1 Jun 2025 17:11:05 GMT, Martin Doerr wrote: >> Trivial build fix for PPC64 and s390. Added arm32. 
> > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Add arm32 fix. Marked as reviewed by mbaesken (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25568#pullrequestreview-2887272244 From kbarrett at openjdk.org Mon Jun 2 08:21:34 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 2 Jun 2025 08:21:34 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v2] In-Reply-To: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: > Please review this change to permit the use of `noexcept` under certain > circumstances in HotSpot code. > > http://wg21.link/n3050 > > Testing: > > JDK-8316930 (HotSpot should use noexcept instead of throw()) showed what the > conversion would look like. It will need to be brought up to current mainline, > possibly with modifications. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 16-June-2025 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. 
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: dholmes review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25574/files - new: https://git.openjdk.org/jdk/pull/25574/files/6364b3d4..e6decd1f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25574&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25574&range=00-01 Stats: 8 lines in 2 files changed: 1 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/25574.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25574/head:pull/25574 PR: https://git.openjdk.org/jdk/pull/25574 From kbarrett at openjdk.org Mon Jun 2 08:21:34 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 2 Jun 2025 08:21:34 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v2] In-Reply-To: <8ueGNCZGkc0fbJHYg8l2XPSG0w2DAxKf4e59ClyXhGw=.5497fc78-f598-4af4-b745-d05f7115e953@github.com> References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> <8ueGNCZGkc0fbJHYg8l2XPSG0w2DAxKf4e59ClyXhGw=.5497fc78-f598-4af4-b745-d05f7115e953@github.com> Message-ID: On Mon, 2 Jun 2025 06:58:39 GMT, David Holmes wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> dholmes review > > doc/hotspot-style.md line 1139: > >> 1137: such a function `noexcept` informs the compiler that `nullptr` is a possible >> 1138: result. If an allocation function is not declared `noexcept` then the compiler >> 1139: may elide that checking and handling for a using `new` expression. > > Suggestion: > > may elide that checking and handling for a `new` expression. Instead changed to "may elide that checking and handling for a `new` expression calling that function." It's not _any_ `new` expression that might have stuff elided, only one that calls the not-nothrow allocation function. 
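The point about `new` expressions and non-throwing allocation functions can be illustrated with a small standalone sketch. It is not HotSpot code: `Node`, `fail_next`, `try_make_node`, and `demo_nothrow_new` are hypothetical names, and the failure is simulated with a flag rather than real memory exhaustion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical class with its own allocation function.
struct Node {
  int value;
  explicit Node(int v) : value(v) {}

  // Declared noexcept: failure is reported by returning nullptr, so a new
  // expression that calls this function must null-check the result before
  // running the constructor.
  static void* operator new(std::size_t size) noexcept {
    if (fail_next) {
      fail_next = false;
      return nullptr;  // simulated allocation failure
    }
    return std::malloc(size);
  }

  static void operator delete(void* p) noexcept { std::free(p); }

  static bool fail_next;  // test hook, assumption of this sketch
};

bool Node::fail_next = false;

// The new expression yields nullptr, without constructing a Node,
// when Node::operator new returns nullptr.
Node* try_make_node(int v, bool simulate_failure) {
  Node::fail_next = simulate_failure;
  return new Node(v);
}

// Small self-check exercising both the success and the failure path.
void demo_nothrow_new() {
  Node* ok = try_make_node(7, false);
  assert(ok != nullptr && ok->value == 7);
  delete ok;
  assert(try_make_node(7, true) == nullptr);
}
```

If `Node::operator new` were not declared `noexcept`, the compiler would be allowed to assume it never returns `nullptr` and could drop the check, in which case the simulated-failure path would have undefined behavior.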
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2120385617 From kbarrett at openjdk.org Mon Jun 2 08:24:01 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 2 Jun 2025 08:24:01 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept In-Reply-To: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Mon, 2 Jun 2025 05:28:17 GMT, Kim Barrett wrote: > Please review this change to permit the use of `noexcept` under certain > circumstances in HotSpot code. > > http://wg21.link/n3050 > > Testing: > > JDK-8316930 (HotSpot should use noexcept instead of throw()) showed what the > conversion would look like. It will need to be brought up to current mainline, > possibly with modifications. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 16-June-2025 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. I forgot to mention that of course the current code is out of conformance with this, since we're currently using `throw()` to declare allocation functions as being nothrow. Once this style guide is approved, we (probably meaning I) will need to update the code accordingly. Probably not as a big query-replace either, as I've already found one mistake. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25574#issuecomment-2929385129 From shade at openjdk.org Mon Jun 2 08:25:00 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Jun 2025 08:25:00 GMT Subject: RFR: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent [v2] In-Reply-To: References:

Message-ID: On Mon, 2 Jun 2025 07:48:39 GMT, Erik Österlund wrote: >> The optimized fast_aputfield bytecode on AArch64 stores the field flags in r3, and performs the leading and trailing fencing depending on its volatile bit being set or not. However, r3 is also the last temp register passed in to the barrier set for reference stores, and G1 clobbers it in a way that may clear the volatile bit. Then the trailing fence won't get executed, and sequential consistency is broken. >> >> My fix puts the flags in r5 instead, which is the register that was used by normal aputfield bytecodes. This way, barriers don't clobber the volatile bits. >> >> This bug has been observed to mess up a classic Dekker duality in the java.util.concurrent.Exchanger class, leading to a hang in the test/jdk/java/util/concurrent/Exchanger/ExchangeLoops.java test that exercises it. Using G1 and -Xint a reproducer hangs 30/100 times in mach5. With the fix, the same reproducer hangs 0/100 times. > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Add comment about clobbered registers Well, since we are introducing the hunks near `do_oop_store`-s, and thus extending the scope of the patch, at this point we can just inline `do_oop_store` (and maybe `do_oop_load`?), like Andrew initially suggested.
This will also match what RISC-V already did: https://github.com/openjdk/jdk/commit/c5a1543ee3e68775f09ca29fb07efd9aebfdb33e ------------- PR Review: https://git.openjdk.org/jdk/pull/25483#pullrequestreview-2887283595 From mdoerr at openjdk.org Mon Jun 2 08:31:57 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 2 Jun 2025 08:31:57 GMT Subject: RFR: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686 [v2] In-Reply-To: References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com> Message-ID: On Sun, 1 Jun 2025 17:11:05 GMT, Martin Doerr wrote: >> Trivial build fix for PPC64 and s390. Added arm32. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Add arm32 fix. Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25568#issuecomment-2929404503 From mdoerr at openjdk.org Mon Jun 2 08:31:57 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 2 Jun 2025 08:31:57 GMT Subject: Integrated: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686 In-Reply-To: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com> References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com> Message-ID: <4qHafyELt_8KULAwgyl9NSO8VGsIlEAxQp7XCFCFVb8=.f57fa1e6-8b54-4f88-b052-0cfd1b0114d9@github.com> On Sat, 31 May 2025 22:18:33 GMT, Martin Doerr wrote: > Trivial build fix for PPC64 and s390. Added arm32. This pull request has now been integrated. 
Changeset: 40ce05d4 Author: Martin Doerr URL: https://git.openjdk.org/jdk/commit/40ce05d4080a9a2b4876c21f83a184f9b8a580a2 Stats: 3 lines in 3 files changed: 3 ins; 0 del; 0 mod 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686 Reviewed-by: shade, amitkumar, mbaesken, kvn ------------- PR: https://git.openjdk.org/jdk/pull/25568 From ayang at openjdk.org Mon Jun 2 08:42:02 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 2 Jun 2025 08:42:02 GMT Subject: RFR: 8358294: Remove unnecessary GenAlignment Message-ID: Simple replacement of `GenAlignment` with `SpaceAlignment`, because they always have the same value. Removing the former to reduce complexity. Test: tier1-3 ------------- Commit messages: - remove-gen-alignment Changes: https://git.openjdk.org/jdk/pull/25577/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25577&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358294 Stats: 105 lines in 16 files changed: 0 ins; 46 del; 59 mod Patch: https://git.openjdk.org/jdk/pull/25577.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25577/head:pull/25577 PR: https://git.openjdk.org/jdk/pull/25577 From jbechberger at openjdk.org Mon Jun 2 08:44:01 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 08:44:01 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: References:

Message-ID: On Sun, 1 Jun 2025 13:01:23 GMT, Markus Grönlund wrote: >> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: >> >> - Refactoring >> - Remove convoluted native trace logic > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 200: > >> 198: void sample_thread(JfrSampleRequest& request, void* ucontext, JavaThread* jt, JfrThreadLocal* tl); >> 199: >> 200: // sample all threads that are in native state (and requested to be sampled) > > We are not really "sampling", but processing their queues, no? You're correct. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120450563 From jwaters at openjdk.org Mon Jun 2 08:46:52 2025 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 2 Jun 2025 08:46:52 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Mon, 2 Jun 2025 08:20:57 GMT, Kim Barrett wrote: > I forgot to mention that of course the current code is out of conformance with this, since we're currently using `throw()` to declare allocation functions as being nothrow. Once this style guide is approved, we (probably meaning I) will need to update the code accordingly. Probably not as a big query-replace either, as I've already found one mistake. If it's easier I can bring the original change to noexcept Pull Request back from the dead and remove the merge mistakes that leaked in from my other branch, which shouldn't really be that difficult to do. Not sure which code is potentially marked throw() wrongly though. Alternatively, we could just keep throw() alongside noexcept for code that already uses it, to avoid code churn.
They do mean the same thing in C++17, after all (I was going to mention that there are papers for static exception specifications that propose reintroducing throw() back into C++ last I remembered, but realized that this likely doesn't mean much for us now, so this point can be ignored) ------------- PR Comment: https://git.openjdk.org/jdk/pull/25574#issuecomment-2929473632 From jbechberger at openjdk.org Mon Jun 2 08:47:01 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 08:47:01 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: References:

Message-ID: <3d549Fxkhzd6v0fAVFEBOcxZ7hBKI1ZAUafLClp7Npw=.70183618-7dbf-4e05-bcc8-fd1216741c66@github.com> On Sun, 1 Jun 2025 13:05:44 GMT, Markus Grönlund wrote: >> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: >> >> - Refactoring >> - Remove convoluted native trace logic > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 367: > >> 365: JfrCPUTimeSampleRequest& request = queue.at(i); >> 366: JfrStackTrace stacktrace; >> 367: traceid tid = JfrThreadLocal::thread_id(thread); > > Check the tid as a function of the JfrSampleRequest, like we do in JFR Cooperative Sampling. You mean ` const traceid tid = in_continuation ? tl->vthread_id_with_epoch_update(jt) : JfrThreadLocal::jvm_thread_id(jt);`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120458307 From jbechberger at openjdk.org Mon Jun 2 08:53:02 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 08:53:02 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: <3d549Fxkhzd6v0fAVFEBOcxZ7hBKI1ZAUafLClp7Npw=.70183618-7dbf-4e05-bcc8-fd1216741c66@github.com> References:

<3d549Fxkhzd6v0fAVFEBOcxZ7hBKI1ZAUafLClp7Npw=.70183618-7dbf-4e05-bcc8-fd1216741c66@github.com> Message-ID: On Mon, 2 Jun 2025 08:44:01 GMT, Johannes Bechberger wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 367: >> >>> 365: JfrCPUTimeSampleRequest& request = queue.at(i); >>> 366: JfrStackTrace stacktrace; >>> 367: traceid tid = JfrThreadLocal::thread_id(thread); >> >> Check the tid as a function of the JfrSampleRequest, like we do in JFR Cooperative Sampling. > > You mean ` const traceid tid = in_continuation ? tl->vthread_id_with_epoch_update(jt) : JfrThreadLocal::jvm_thread_id(jt);`? I implemented this in this function now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120473792 From eosterlund at openjdk.org Mon Jun 2 08:56:56 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 2 Jun 2025 08:56:56 GMT Subject: RFR: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent [v2] In-Reply-To: References:

Message-ID: On Sun, 1 Jun 2025 13:41:44 GMT, Johannes Bechberger wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 558: >> >>> 556: jt->is_JfrRecorder_thread()) { >>> 557: queue.increment_lost_samples(); >>> 558: tl->set_do_async_processing_of_cpu_time_jfr_requests(false); >> >> Why is this restored here? > > Because I shouldn't sample if the thread isn't in native state anymore. The thread is probably sampled anyway on the outgoing safepoint. But you might be right, I removed it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120481274 From aboldtch at openjdk.org Mon Jun 2 08:59:29 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 2 Jun 2025 08:59:29 GMT Subject: RFR: 8358310: ZGC: riscv, ppc ZPlatformAddressOffsetBits may return a too large value Message-ID: <6j_zozeh-Vwu3tRHRlJ5h_mhcMFsNm_OMUinAosz8fU=.d51c8c95-aad1-4566-a23b-8da5b521aa90@github.com> The way that ZPlatformAddressOffsetBits is implemented on riscv and ppc may result in a return value of 45. This is larger than the max supported value of 44 (because of other internal data structures). This was fixed in [JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275) for aarch64. Before [JDK-8350441](https://bugs.openjdk.org/browse/JDK-8350441) the issue only manifested if one tried to select a heap larger than 16 TB (not supported), but after [JDK-8350441](https://bugs.openjdk.org/browse/JDK-8350441) we try to double the heap address space when running on a NUMA machine. So we may now encounter this bug for heaps larger than 8TB (which is supported). ZPlatformAddressOffsetBits needs an overhaul. (It was written for non-generational ZGC, where we had the three color bits inside the address.)
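The arithmetic behind the 44-bit limit can be sketched as follows. This is only an illustration under stated assumptions: it assumes the fix amounts to clamping the probed value to 44, and `kMaxSupportedOffsetBits`, `clamp_offset_bits` and `addressable_bytes` are hypothetical names that do not appear in the ZGC sources.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical constant: the largest offset width the internal data
// structures are said to support in the description above.
constexpr std::size_t kMaxSupportedOffsetBits = 44;

// Hypothetical helper: clamp a probed offset width to the supported maximum.
constexpr std::size_t clamp_offset_bits(std::size_t probed_bits) {
  return std::min(probed_bits, kMaxSupportedOffsetBits);
}

// Number of bytes addressable with the given number of offset bits.
constexpr std::uint64_t addressable_bytes(std::size_t bits) {
  return std::uint64_t{1} << bits;
}

constexpr std::uint64_t TiB = std::uint64_t{1} << 40;

// 44 offset bits cover exactly 16 TiB, so doubling the address space of a
// heap above 8 TB no longer fits in 44 bits.
static_assert(addressable_bytes(44) == 16 * TiB, "44 bits span 16 TiB");
// A probed value of 45 is clamped; smaller values pass through unchanged.
static_assert(clamp_offset_bits(45) == 44, "45 is clamped to 44");
static_assert(clamp_offset_bits(42) == 42, "42 is left alone");
```

The actual change in the PR may be structured differently; the sketch is only meant to show why 45 is one bit too many and what a clamp would return.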
The proposal is that we solve this for ppc and riscv by doing the same thing we did for aarch64 in [JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275) ------------- Commit messages: - 8358310: ZGC: riscv, ppc ZPlatformAddressOffsetBits may return a too large value Changes: https://git.openjdk.org/jdk/pull/25578/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25578&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358310 Stats: 10 lines in 2 files changed: 4 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/25578.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25578/head:pull/25578 PR: https://git.openjdk.org/jdk/pull/25578 From jbechberger at openjdk.org Mon Jun 2 09:01:05 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 09:01:05 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: References:

Message-ID: On Sun, 1 Jun 2025 18:10:15 GMT, Markus Grönlund wrote: >> Not without allocating in the signal handler > > How so? Because we need to add the threads from the signal handler. So any kind of growing array or set would not work, especially if we want to remove the threads from within the signal handler again. This is certainly an area of future optimization, albeit this doesn't seem to have any measurable performance impact in my renaissance benchmark runs. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120492743 From mbaesken at openjdk.org Mon Jun 2 09:03:55 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 09:03:55 GMT Subject: RFR: 8357155: [asan] ZGC does not work In-Reply-To: References:

Message-ID: <_4nt7X3dG4RfwD7R_no-3YCTNIUkWh0s6o4-eFQjHJw=.98f7be0d-b7ae-4a14-b4b8-459b6ed2c615@github.com> On Fri, 30 May 2025 15:00:53 GMT, Axel Boldt-Christmas wrote: > I was hoping this could work for Linux with 47/48 bit aarch64 VMA. But it is unclear how ASAN selects its mappings on such platforms. > > On 39/42 bit VMA returning `MIN2(valid_max_address_offset_bits, 44)` as I suggested in the PPC function may be a better best effort (as we are using addresses where we actually probed that reservations could be possible). Or even `MIN2(valid_max_address_offset_bits - 1, 44)`. Feel free to try it out, but I think this is otherwise an alright approach until we implement a better heap base selection strategy where we can test multiple base candidates. Thanks for the aarch64 related suggestions, unfortunately both do not work. So I changed only the files for x86_64 and ppc64. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25549#discussion_r2120500717 From jbechberger at openjdk.org Mon Jun 2 09:05:02 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 09:05:02 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: References:

Message-ID: On Sun, 1 Jun 2025 18:00:55 GMT, Markus Grönlund wrote: >> Yes, so I only start the thread walking if necessary > > I see. With a bounded queue as used in this solution, it can work quite nicely, that is, if the thread is actually on CPU in native, and just not waiting - if waiting (which is most likely) then pending requests could take a long time to be sent to consumers. > > I also understand better the optimization you tried as part of async walk in native and frames. Also quite nice, to walk from the last JfrSampleRequest and do equals to "batch" the top JFR sample requests that are the same (i.e., taken for the ljf). Maybe you can retry that again, but then you need to save the sid AND the tid to be reused for the top equal requests (you only need stacktrace.record_inner() for one request). It's a nice optimization. The problem is when in between queue processing a new JFR chunk is started. This caused problems before. I would leave these kinds of optimizations for later. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120501728 From aph at openjdk.org Mon Jun 2 09:06:59 2025 From: aph at openjdk.org (Andrew Haley) Date: Mon, 2 Jun 2025 09:06:59 GMT Subject: RFR: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent [v2] In-Reply-To: References:

Message-ID: On Sun, 1 Jun 2025 18:03:15 GMT, Markus Grönlund wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 344: >> >>> 342: >>> 343: // equals operator for JfrSampleRequest >>> 344: inline bool operator==(const JfrSampleRequest& lhs, const JfrSampleRequest& rhs) { >> >> Can be removed. > > Unless you still want to try the ljf JfrSampleRequest optimization for the native ljf, which I kind of like now that I understand it. As I said, it's a great optimization. But it needs some work. I therefore remove this method for now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120511048 From mbaesken at openjdk.org Mon Jun 2 09:11:05 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 09:11:05 GMT Subject: RFR: 8357155: [asan] ZGC does not work [v2] In-Reply-To: References: Message-ID: > Many (all?) ZGC related jtreg tests do not work when the JDK is built with address sanitizer asan enabled (configure flag --enable-asan). > This can be seen on SUSE Linux x86_64 and also on ppc64le , opt binaries were used. > It has been suggested to do a workaround - 'But I think that simply adapting the zAddress_[...].cpp implementations to always select the largest heap base would go a long way for providing ASAN compatibility.' > This seems to work nicely on x86_64 and ppc64le, however the zgc related tests still fail on Linux aarch64 (should I exclude this platform from my patch?) .
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: remove aarch64 from the change, adjust ppc64 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25549/files - new: https://git.openjdk.org/jdk/pull/25549/files/ed2885ff..82a11f9b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25549&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25549&range=00-01 Stats: 5 lines in 2 files changed: 0 ins; 4 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25549.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25549/head:pull/25549 PR: https://git.openjdk.org/jdk/pull/25549 From mbaesken at openjdk.org Mon Jun 2 09:11:05 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 09:11:05 GMT Subject: RFR: 8357155: [asan] ZGC does not work In-Reply-To: References: Message-ID: On Fri, 30 May 2025 12:18:46 GMT, Matthias Baesken wrote: > Many (all?) ZGC related jtreg tests do not work when the JDK is built with address sanitizer asan enabled (configure flag --enable-asan). > This can be seen on SUSE Linux x86_64 and also on ppc64le , opt binaries were used. > It has been suggested to do a workaround - 'But I think that simply adapting the zAddress_[...].cpp implementations to always select the largest heap base would go a long way for providing ASAN compatibility.' > This seems to work nicely on x86_64 and ppc64le, however the zgc related tests still fail on Linux aarch64 (should I exclude this platform from my patch?) . I think we handle just x86_64 and ppc64 in this change. Should I adjust the subject ? Btw Axel, should I add you as contributor, makes probably sense ? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25549#issuecomment-2929574262 From shade at openjdk.org Mon Jun 2 09:11:55 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Jun 2025 09:11:55 GMT Subject: RFR: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent [v2] In-Reply-To: References:

Message-ID: On Mon, 2 Jun 2025 09:08:32 GMT, Matthias Baesken wrote: > I think we handle just x86_64 and ppc64 in this change. Should I adjust the subject ? Sounds good. We should probably make this explicit in the title. > Btw Axel, should I add you as contributor, makes probably sense ? Yeah, you can add me as a contributor. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25549#issuecomment-2929615698 From mbaesken at openjdk.org Mon Jun 2 09:23:56 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 09:23:56 GMT Subject: RFR: 8357981: [PPC64] Remove old instructions from VM_Version::determine_features() In-Reply-To: References: Message-ID: On Wed, 28 May 2025 14:31:40 GMT, Martin Doerr wrote: > Simple cleanup after [JDK-8331859](https://bugs.openjdk.org/browse/JDK-8331859). The old instructions are always available and don't need to be tried in `VM_Version::determine_features()`. > > On Power10: > > -------------------------------------------------------------------------------- > Decoding cpu-feature detection stub at 0x000079b9203c0380 after execution: > -------------------------------------------------------------------------------- > 0x000079b9203c0380: darn r7,1 > 0x000079b9203c0384: brw r5,r6 > 0x000079b9203c0388: blr bo=0b10100,bh=0b00[subroutine_return] > 0x000079b9203c038c: dcbz 0,r3 > 0x000079b9203c0390: blr bo=0b10100,bh=0b00[subroutine_return] > > > Also tested on older processors: On Power9, `brw` gets zeroed out. On Power8, `darn` also gets zeroed out. Marked as reviewed by mbaesken (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25495#pullrequestreview-2887491611 From mdoerr at openjdk.org Mon Jun 2 09:23:56 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 2 Jun 2025 09:23:56 GMT Subject: RFR: 8357981: [PPC64] Remove old instructions from VM_Version::determine_features() In-Reply-To: References: Message-ID: On Wed, 28 May 2025 14:31:40 GMT, Martin Doerr wrote: > Simple cleanup after [JDK-8331859](https://bugs.openjdk.org/browse/JDK-8331859). The old instructions are always available and don't need to be tried in `VM_Version::determine_features()`. > > On Power10: > > -------------------------------------------------------------------------------- > Decoding cpu-feature detection stub at 0x000079b9203c0380 after execution: > -------------------------------------------------------------------------------- > 0x000079b9203c0380: darn r7,1 > 0x000079b9203c0384: brw r5,r6 > 0x000079b9203c0388: blr bo=0b10100,bh=0b00[subroutine_return] > 0x000079b9203c038c: dcbz 0,r3 > 0x000079b9203c0390: blr bo=0b10100,bh=0b00[subroutine_return] > > > Also tested on older processors: On Power9, `brw` gets zeroed out. On Power8, `darn` also gets zeroed out. Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25495#issuecomment-2929629790 From mdoerr at openjdk.org Mon Jun 2 09:23:57 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 2 Jun 2025 09:23:57 GMT Subject: Integrated: 8357981: [PPC64] Remove old instructions from VM_Version::determine_features() In-Reply-To: References: Message-ID: On Wed, 28 May 2025 14:31:40 GMT, Martin Doerr wrote: > Simple cleanup after [JDK-8331859](https://bugs.openjdk.org/browse/JDK-8331859). The old instructions are always available and don't need to be tried in `VM_Version::determine_features()`. 
> > On Power10: > > -------------------------------------------------------------------------------- > Decoding cpu-feature detection stub at 0x000079b9203c0380 after execution: > -------------------------------------------------------------------------------- > 0x000079b9203c0380: darn r7,1 > 0x000079b9203c0384: brw r5,r6 > 0x000079b9203c0388: blr bo=0b10100,bh=0b00[subroutine_return] > 0x000079b9203c038c: dcbz 0,r3 > 0x000079b9203c0390: blr bo=0b10100,bh=0b00[subroutine_return] > > > Also tested on older processors: On Power9, `brw` gets zeroed out. On Power8, `darn` also gets zeroed out. This pull request has now been integrated. Changeset: 612f2c0c Author: Martin Doerr URL: https://git.openjdk.org/jdk/commit/612f2c0c0b75466c60d4b54dab6aa793a810c846 Stats: 75 lines in 2 files changed: 0 ins; 71 del; 4 mod 8357981: [PPC64] Remove old instructions from VM_Version::determine_features() Reviewed-by: dbriemann, mbaesken ------------- PR: https://git.openjdk.org/jdk/pull/25495 From aboldtch at openjdk.org Mon Jun 2 09:24:52 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 2 Jun 2025 09:24:52 GMT Subject: RFR: 8357155: [asan] ZGC does not work (x86_64 and ppc64) [v2] In-Reply-To: References:

Message-ID: On Mon, 2 Jun 2025 04:28:02 GMT, David Holmes wrote: >> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: >> >> - Refactoring >> - Remove convoluted native trace logic > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 119: > >> 117: _data = new_data; >> 118: _capacity = capacity; >> 119: } > > I assume there is a lock protecting this so it happens atomically? This happens before the signal handler is attached to thread. So it does happen before any parallelism is introduced on thread creation. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120557327 From mbaesken at openjdk.org Mon Jun 2 09:32:56 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 09:32:56 GMT Subject: RFR: 8357155: [asan] ZGC does not work (x86_64 and ppc64) [v2] In-Reply-To: References:

Message-ID: <32poF3-6QghOwLYJ6GBMsAmGx8xcFOE9g5vqmoqzNJ0=.11438af8-f402-45e9-b74b-fcc963b2d169@github.com> On Mon, 2 Jun 2025 09:11:05 GMT, Matthias Baesken wrote: >> Many (all?) ZGC related jtreg tests do not work when the JDK is built with address sanitizer asan enabled (configure flag --enable-asan). >> This can be seen on SUSE Linux x86_64 and also on ppc64le , opt binaries were used. >> It has been suggested to do a workaround - 'But I think that simply adapting the zAddress_[...].cpp implementations to always select the largest heap base would go a long way for providing ASAN compatibility.' >> This seems to work nicely on x86_64 and ppc64le, however the zgc related tests still fail on Linux aarch64 (should I exclude this platform from my patch?) . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > remove aarch64 from the change, adjust ppc64 contributor add xmas92 ------------- PR Comment: https://git.openjdk.org/jdk/pull/25549#issuecomment-2929667209 From mbaesken at openjdk.org Mon Jun 2 09:35:51 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 09:35:51 GMT Subject: RFR: 8357155: [asan] ZGC does not work (x86_64 and ppc64) [v2] In-Reply-To: References:

Message-ID: On Mon, 2 Jun 2025 09:11:05 GMT, Matthias Baesken wrote: >> Many (all?) ZGC related jtreg tests do not work when the JDK is built with address sanitizer asan enabled (configure flag --enable-asan). >> This can be seen on SUSE Linux x86_64 and also on ppc64le , opt binaries were used. >> It has been suggested to do a workaround - 'But I think that simply adapting the zAddress_[...].cpp implementations to always select the largest heap base would go a long way for providing ASAN compatibility.' >> This seems to work nicely on x86_64 and ppc64le, however the zgc related tests still fail on Linux aarch64 (should I exclude this platform from my patch?) . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > remove aarch64 from the change, adjust ppc64 contributor /add xmas92 ------------- PR Comment: https://git.openjdk.org/jdk/pull/25549#issuecomment-2929681609 From duke at openjdk.org Mon Jun 2 09:43:08 2025 From: duke at openjdk.org (Anton Artemov) Date: Mon, 2 Jun 2025 09:43:08 GMT Subject: RFR: 8284017: Improve handshake filtering mechanism Message-ID: Hi, please consider the following enhancement: In this PR a new way of supplying multiple arguments to filter out / skip operations in handshake/safepoint poll is given. Multiple boolean arguments are combined in a hash table, where keys are taken from a new enum `HandshakeOperationProperty`, which is to be modified when there is a need for a new argument. Tested in GHA and tiers 1 - 3. ------------- Commit messages: - 8284017: Changed variable name to operation_filter. - 8284017: Added typedef. - Merge remote-tracking branch 'origin/master' into JDK-8284017-handshake-filtering - 8284017: Added missed include statement. - 8284017: Changed to enum class for filter operation value. 
- 8284017: Added resource mark.s - 8284017: Combined bool params into resourceHashTable for filtering Changes: https://git.openjdk.org/jdk/pull/25497/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25497&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8284017 Stats: 66 lines in 9 files changed: 38 ins; 2 del; 26 mod Patch: https://git.openjdk.org/jdk/pull/25497.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25497/head:pull/25497 PR: https://git.openjdk.org/jdk/pull/25497 From mdoerr at openjdk.org Mon Jun 2 09:47:28 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 2 Jun 2025 09:47:28 GMT Subject: RFR: 8358013: [PPC64] VSX has poor performance on Power8 [v3] In-Reply-To: <6lRLaDtZkFd5zdOobo1RnSODoZk3r7T-sgjfpcnUVwU=.ad525055-0f15-4866-a295-20e2183eaf7b@github.com> References: <6lRLaDtZkFd5zdOobo1RnSODoZk3r7T-sgjfpcnUVwU=.ad525055-0f15-4866-a295-20e2183eaf7b@github.com> Message-ID: > Power8 only has limited VSX instructions for the superword optimization and the Vector API and the performance is bad. Let's only use it on Power9 and newer by default. This change excludes the VSX registers from C2 register allocation for Power8. VSX instruction usage gets limited to a few places like intrinsics. > > Note: Power8 is an old processor and performance optimizations for it are no longer planned. Martin Doerr has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Merge remote-tracking branch 'origin' into PPC64_disable_SuperwordUseVSX_Power8 - Improve description of 8358013: [PPC64] VSXSuperwordUseVSX. 
- 8358013: [PPC64] VSX has poor performance on Power8 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25514/files - new: https://git.openjdk.org/jdk/pull/25514/files/1f8b0e91..599a4f36 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25514&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25514&range=01-02 Stats: 32865 lines in 385 files changed: 12812 ins; 12713 del; 7340 mod Patch: https://git.openjdk.org/jdk/pull/25514.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25514/head:pull/25514 PR: https://git.openjdk.org/jdk/pull/25514 From eosterlund at openjdk.org Mon Jun 2 10:11:53 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 2 Jun 2025 10:11:53 GMT Subject: RFR: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent [v2] In-Reply-To: References:

Message-ID: On Fri, 30 May 2025 14:37:56 GMT, Axel Boldt-Christmas wrote: >> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: >> >> remove aarch64 from the change, adjust ppc64 > > src/hotspot/cpu/ppc/gc/z/zAddress_ppc.cpp line 95: > >> 93: const size_t max_address_offset_bits = valid_max_address_offset_bits - 3; >> 94: #ifdef ADDRESS_SANITIZER >> 95: return max_address_offset_bits; > > I think this actually has to be > ```c++ > return MIN2(valid_max_address_offset_bits, 44); > ``` > > Because the way we probe we may otherwise return 45 here. Which could result in more than 44 bits in a ZOffset which our internal data structures cannot handle. Hopefully this still works for ASAN on PPC. (The `-3` is a left over from non-generational ZGC). Aarch64 could do the same, but it does not have this issue as it starts its probing at bit 46, not bit 47. > > _Side note: This makes me realise that there probably is a bug here on PPC and RISCV if running on a NUMA machine with more than 8 TB heap. As after ZGlobalsPointers::min_address_offset_request() was introduced we can return 45 from this function._ @xmas92: Thanks for looking into this! Should we set `DEFAULT_MAX_ADDRESS_BIT = 44` and use the constant? Or maybe file a separate issue for fixing that on aarch64, PPC64 and riscv? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25549#discussion_r2120738138 From epeter at openjdk.org Mon Jun 2 10:50:54 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 10:50:54 GMT Subject: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API In-Reply-To: References:

Message-ID: On Fri, 30 May 2025 08:15:22 GMT, Xiaohong Gong wrote: >>> @XiaohongGong Thanks for splitting this one out, and for investigating the regressions here. >>> >>> Putting the permalink here, fixed to the current change (the link you pasted will always refer to the newest, which may later on point to the wrong line when lines above are inserted / deleted): >>> >>> https://github.com/openjdk/jdk/blob/7077535c0b0a6ea0a2a167f9135b1504a3d71fb3/src/hotspot/share/opto/loopnode.cpp#L1659-L1661 >>> >>> I wonder if we should just use `Node::uncast` there? But I'm quite unsure about that. >> >> Sounds good to me. I will investigate it in depth. Thanks! >> >> >> >>> > Yes, I also observed such regression. >>> > It would be nice if you proactively mentioned regressions, so it does not have to be pointed out by reviewers. >>> >>> For me, it could be ok to fix it in a follow-up patch. I think we are too close to RDP1 for JDK25 now anyway, and so we could push this patch here into JDK26, and then we have enough time in JDK26 to investigate the regression. Even better would be if we could do the other patch first, so we never even encounter a regression. >> >> Sounds good to me. Thanks! > >> > @XiaohongGong Thanks for splitting this one out, and for investigating the regressions here. >> > Putting the permalink here, fixed to the current change (the link you pasted will always refer to the newest, which may later on point to the wrong line when lines above are inserted / deleted): >> > https://github.com/openjdk/jdk/blob/7077535c0b0a6ea0a2a167f9135b1504a3d71fb3/src/hotspot/share/opto/loopnode.cpp#L1659-L1661 >> > >> > I wonder if we should just use `Node::uncast` there? But I'm quite unsure about that. >> >> Sounds good to me. I will investigate it in depth. Thanks! > > Hi @eme64 @jatin-bhateja, I've created a PR https://github.com/openjdk/jdk/pull/25539 to fix this issue. With this change, the performance regression can be fixed as well. 
Could you please take a look at that change and help to run the test on different X86 machines? Thanks a lot! @XiaohongGong I reviewed https://github.com/openjdk/jdk/pull/25539. Since it is a relatively simple patch, I suggest that we integrate that one first, and come back to this here later. Is that ok for you? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25138#issuecomment-2930007655 From ayang at openjdk.org Mon Jun 2 10:51:06 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 2 Jun 2025 10:51:06 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v9] In-Reply-To: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: <-mRIrbyrBpxq1lZ2tfcxIuxRLh5lcoURlM-woAXM45k=.7c152a76-e34f-42ba-b9a7-323102b19371@github.com> > This patch refines Parallel's sizing strategy to improve overall memory management and performance. > > The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. > > `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. > > GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. 
> > ## Performance evaluation > > - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). > - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). > - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. > > PS: I have opportunistically set the obsolete/expired version to 25/26 for now. I will update them accordingly before merging. > > Test: tier1-8 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: - merge - merge-fix - merge - Merge branch 'master' into pgc-size-policy - Merge branch 'master' into pgc-size-policy - review - Merge branch 'master' into pgc-size-policy - review - Merge branch 'master' into pgc-size-policy - review - ... 
and 2 more: https://git.openjdk.org/jdk/compare/83cb0c6d...08bc74e1 ------------- Changes: https://git.openjdk.org/jdk/pull/25000/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25000&range=08 Stats: 4375 lines in 31 files changed: 522 ins; 3454 del; 399 mod Patch: https://git.openjdk.org/jdk/pull/25000.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25000/head:pull/25000 PR: https://git.openjdk.org/jdk/pull/25000 From mgronlun at openjdk.org Mon Jun 2 11:06:31 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 2 Jun 2025 11:06:31 GMT Subject: RFR: 8357962: JFR Cooperative Sampling reveals inconsistent interpreter frames as part of JVMTI PopFrame [v2] In-Reply-To: References: Message-ID: > Greetings, > > Please see the JIRA issue for a detailed description. > > Fix only applies to platforms that issue a save_bcp() as part of InterpreterMacroAssembler::unlock_object(). > > Testing: jdk_jfr, JVMTI PopFrame tests > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: more precise comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25571/files - new: https://git.openjdk.org/jdk/pull/25571/files/b48c0635..70f75414 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25571&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25571&range=00-01 Stats: 3 lines in 3 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25571.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25571/head:pull/25571 PR: https://git.openjdk.org/jdk/pull/25571 From mgronlun at openjdk.org Mon Jun 2 11:26:00 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 2 Jun 2025 11:26:00 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: References:

Message-ID: <45mCuuxToelhOdhbJlap5NCUMfgDBrVGIUDGJHAk2Rg=.1dd9d5a6-f2b5-4214-8815-d0a9f0cbddbb@github.com> On Mon, 2 Jun 2025 08:58:28 GMT, Johannes Bechberger wrote: >> How so? > > Because we need to add the threads from the signal handler. So any kind of growing array or set would not work, especially if we want to remove the threads from within the signal handler again. > > This is certainly an area of future optimization, albeit this doesn't seem to have any measurable performance impact in my renaissance benchmark runs. I don't understand what allocation has to do with anything. I'm talking about code branch layout to avoid having to test "has_cpu_time_jfr_requests()" when we know it will be false by default. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120846868 From mgronlun at openjdk.org Mon Jun 2 11:28:59 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 2 Jun 2025 11:28:59 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: References:

Message-ID: <7Cy88EZJj1ZgHXaAoCY9m1PnB6UAGDJxgK9PI3BVYBQ=.a4fbad7a-19fa-4e1e-999e-8773d2fd7fb1@github.com> On Mon, 2 Jun 2025 09:02:05 GMT, Johannes Bechberger wrote: >> I see. With a bounded queue as used in this solution, it can work quite nicely, that is, if the thread is actually on CPU in native, and just not waiting - if waiting (which is most likely) then pending requests could take a long time to be sent to consumers. >> >> I also understand better the optimization you tried as part of async walk in native and frames. Also quite nice, to walk from the last JfrSampleRequest and do equals to "batch" the top JFR sample requests that are the same (i.e., taken for the ljf). Maybe you can retry that, but then you need to save the sid AND the tid to be reused for the top equal requests (you only need stacktrace.record_inner() for one request). It's a nice optimization. > > The problem is when in between queue processing a new JFR chunk is started. This caused problems before. > > I would leave these kinds of optimizations for later. Then I would recommend you drain immediately when the thread is in native, not waiting for the queue to fill up to 2/3. The reason is that the solution is based on CPU time samples and most threads that are _thread_in_native are waiting (i.e. they will not get their queues filled while in native). I would recommend dropping the second clause about testing the queue size altogether. That way you will not get threads stuck in native for a long time with a lot of undelivered events. Revive it later when you begin to attack the optimizations. 
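To make the suggestion concrete, here is a minimal sketch that only contrasts the two drain conditions discussed above. It assumes a hypothetical `ThreadState` enum and a hypothetical `SampleQueue` snapshot type; neither is the actual HotSpot or JFR definition, and the function names are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical thread states, named after the discussion; not HotSpot's JavaThreadState.
enum class ThreadState { in_java, in_native, in_vm };

// Hypothetical snapshot of a thread's bounded sample-request queue.
struct SampleQueue {
  std::size_t size;      // requests currently queued
  std::size_t capacity;  // fixed upper bound of the queue
};

// Condition as described in the thread: drain a native thread's queue
// only once it is at least two-thirds full.
inline bool should_drain_when_two_thirds_full(ThreadState state, const SampleQueue& q) {
  assert(q.size <= q.capacity);
  return state == ThreadState::in_native && q.size * 3 >= q.capacity * 2;
}

// Suggested condition: the queue-size clause is dropped entirely, so a
// thread in native is always drained, however few requests it holds.
inline bool should_drain_immediately(ThreadState state) {
  return state == ThreadState::in_native;
}
```

With the first condition, a thread that blocks in native with a single queued request is never drained until it accumulates more samples, which it will not do while waiting; with the second, that request is delivered on the next pass.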
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120855119 From jbechberger at openjdk.org Mon Jun 2 11:32:27 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 11:32:27 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v27] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: - Tiny fixes - Minor changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/439763a3..6a83d759 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=26 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=25-26 Stats: 90 lines in 9 files changed: 24 ins; 29 del; 37 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From jbechberger at openjdk.org Mon Jun 2 11:40:00 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 11:40:00 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: <45mCuuxToelhOdhbJlap5NCUMfgDBrVGIUDGJHAk2Rg=.1dd9d5a6-f2b5-4214-8815-d0a9f0cbddbb@github.com> References:

<45mCuuxToelhOdhbJlap5NCUMfgDBrVGIUDGJHAk2Rg=.1dd9d5a6-f2b5-4214-8815-d0a9f0cbddbb@github.com> Message-ID: On Mon, 2 Jun 2025 11:22:45 GMT, Markus Grönlund wrote: >> Because we need to add the threads from the signal handler. So any kind of growing array or set would not work, especially if we want to remove the threads from within the signal handler again. >> >> This is certainly an area of future optimization, albeit this doesn't seem to have any measurable performance impact in my renaissance benchmark runs. > > I don't understand what allocation has to do with anything. I'm talking about code branch layout to avoid having to test "has_cpu_time_jfr_requests()" when we know it will be false by default. Ah. Sorry. Is it about reading the atomic boolean flag again? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120882396 From mgronlun at openjdk.org Mon Jun 2 11:40:02 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 2 Jun 2025 11:40:02 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v27] In-Reply-To: References:

Message-ID: On Mon, 2 Jun 2025 11:32:27 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: > > - Tiny fixes > - Minor changes src/hotspot/share/runtime/thread.hpp line 59: > 57: class SafeThreadsListPtr; > 58: class ThreadClosure; > 59: class ThreadCrashProtection; Should not be needed. src/jdk.jfr/share/classes/jdk/jfr/internal/JVM.java line 276: > 274: * Set the maximum event emission rate for the CPU time sampler > 275: * > 276: * Setting rate to 0 turns off the CPU time method sampler. "CPU time method sampler" -> "CPU time sampler" ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120878701 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120882161 From jbechberger at openjdk.org Mon Jun 2 11:51:26 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 11:51:26 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v28] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... 
different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with three additional commits since the last revision: - Remove header includes - Always trigger async processing - Remove one atomic read ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/6a83d759..e482ad37 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=27 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=26-27 Stats: 21 lines in 6 files changed: 3 ins; 6 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From mgronlun at openjdk.org Mon Jun 2 11:51:27 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 2 Jun 2025 11:51:27 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v27] In-Reply-To: References:

<45mCuuxToelhOdhbJlap5NCUMfgDBrVGIUDGJHAk2Rg=.1dd9d5a6-f2b5-4214-8815-d0a9f0cbddbb@github.com> Message-ID: On Mon, 2 Jun 2025 11:37:23 GMT, Johannes Bechberger wrote: >> I don't understand what allocation has to do with anything. I'm talking about code branch layout to avoid having to test "has_cpu_time_jfr_requests()" when we know it will be false by default. > > Ah. Sorry. Is it about reading the atomic boolean flag again? Right. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120897042 From jbechberger at openjdk.org Mon Jun 2 11:51:27 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 11:51:27 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: