From kvn at openjdk.org Sun Jun 1 00:30:50 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Sun, 1 Jun 2025 00:30:50 GMT
Subject: RFR: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686
In-Reply-To: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com>
References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com>
Message-ID:

On Sat, 31 May 2025 22:18:33 GMT, Martin Doerr wrote:

> Trivial build fix for PPC64 and s390. I haven't seen more affected platforms.

Good.

-------------

Marked as reviewed by kvn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25568#pullrequestreview-2884826176

From jbechberger at openjdk.org Sun Jun 1 07:13:00 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Sun, 1 Jun 2025 07:13:00 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v25]
In-Reply-To:
References:
Message-ID: <2nYqo0wpUrLLJV9iDRLwj5xjV06waCzu8Ma8YSAToIY=.1059ee96-77f8-47e6-8797-3f2b47783311@github.com>

On Sat, 31 May 2025 10:37:29 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>>
>>  Remove debug printf
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.hpp line 139:
>
>> 137:
>> 138: // Trigger sampling while a thread is not in a safepoint, from a seperate thread
>> 139: static void trigger_is_thread_in_native_stackwalking();
>
> Is it sampling that is triggered? Sampling refers to the asynchronous signal received from the operating system (OS).
>
> You are asking for the sampler thread to process already taken JFR Sample Requests in the queue, right?

Yes and I like your implied name better.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2118819169

From jbechberger at openjdk.org Sun Jun 1 07:17:02 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Sun, 1 Jun 2025 07:17:02 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v25]
In-Reply-To:
References:
Message-ID:

On Sat, 31 May 2025 10:09:15 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>>
>>  Remove debug printf
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.hpp line 36:
>
>> 34: #if defined(LINUX)
>> 35:
>> 36: #include "memory/padded.hpp"
>
> What is padded? If not, this should go.

Good catch.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2118820425

From jbechberger at openjdk.org Sun Jun 1 07:22:58 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Sun, 1 Jun 2025 07:22:58 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v24]
In-Reply-To: <-QiSWEqppeW60aedVbLA3WTmnba7Fry53Qr86wE2EPs=.7a6327ce-7ef0-4b1c-bc68-0421ba3fd46f@github.com>
References: <-QiSWEqppeW60aedVbLA3WTmnba7Fry53Qr86wE2EPs=.7a6327ce-7ef0-4b1c-bc68-0421ba3fd46f@github.com>
Message-ID:

On Fri, 30 May 2025 09:19:47 GMT, Johannes Bechberger wrote:

>> src/hotspot/share/jfr/metadata/metadata.xml line 975:
>>
>>> 973:
>>> 974:
>>> 975:
>>
>> I'm not a reviewer, but I just wanted to comment something I noticed.
>> The JEP document says CPUTimeSampleLos'**t**', but the implementation says CPUTimeSampleLos'**s**'. Which one is correct?
>> A sentence from the JEP document:
>>
>> Another new event, `jdk.CPUTimeSampleLost`, is emitted when samples are lost ...
>
> Thanks for catching this mistake. I'll fix it this afternoon.

I fixed it by changing the JEP.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2118825477

From jbechberger at openjdk.org Sun Jun 1 07:26:19 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Sun, 1 Jun 2025 07:26:19 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
>
> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with
> - ... different heap sizes
> - ... different GCs
> - ... different samplers (the standard JFR and the new CPU Time Sampler and both)
> - ... different JFR recording durations
> - ... different chunk-sizes

Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:

 - Refactoring
 - Remove convoluted native trace logic

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/25302/files
 - new: https://git.openjdk.org/jdk/pull/25302/files/3a10d552..439763a3

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=25
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=24-25

 Stats: 56 lines in 5 files changed: 3 ins; 27 del; 26 mod
 Patch: https://git.openjdk.org/jdk/pull/25302.diff
 Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302

PR: https://git.openjdk.org/jdk/pull/25302

From mgronlun at openjdk.org Sun Jun 1 13:04:00 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 13:04:00 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 07:26:19 GMT, Johannes Bechberger wrote:

>> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
>>
>> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with
>> - ... different heap sizes
>> - ... different GCs
>> - ... different samplers (the standard JFR and the new CPU Time Sampler and both)
>> - ... different JFR recording durations
>> - ... different chunk-sizes
>
> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Refactoring
>  - Remove convoluted native trace logic

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 42:

> 40: #include "runtime/javaThread.hpp"
> 41: #include "runtime/osThread.hpp"
> 42: #include "runtime/safepointMechanism.hpp"

Not needed, since you have the .inline.hpp

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 102:

> 100:
> 101: u4 JfrCPUTimeTraceQueue::size() const {
> 102: return Atomic::load(&_head);

Is this read from multiple threads? In that case, load_acquire().

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 200:

> 198: void sample_thread(JfrSampleRequest& request, void* ucontext, JavaThread* jt, JfrThreadLocal* tl);
> 199:
> 200: // sample all threads that are in native state (and requested to be sampled)

We are not really "sampling", but processing their queues, no?
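
[Editorial sketch, not part of the original mail.] The load_acquire() remark on JfrCPUTimeTraceQueue::size() may be easier to follow with a small standalone example. The code below is only an illustration, assuming a single producer and using standard C++ `std::atomic` as a stand-in for HotSpot's `Atomic::load` / `Atomic::load_acquire`; the class `BoundedQueue` and all of its members are hypothetical and are not taken from the PR. It also shows the two-thirds load-factor check that comes up later in the thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a per-thread bounded request queue.
// This is NOT the HotSpot JfrCPUTimeTraceQueue; names are illustrative only.
class BoundedQueue {
 public:
  explicit BoundedQueue(uint32_t capacity) : _capacity(capacity), _head(0) {
    assert(capacity > 0);
  }

  // Single producer (assumption): write the element, then publish the new
  // head with release semantics so a reader that sees the new head also
  // sees the element data written before it.
  bool enqueue() {
    const uint32_t head = _head.load(std::memory_order_relaxed);
    if (head >= _capacity) {
      return false;  // full; a real sampler would count a lost sample here
    }
    // ... the element at index `head` would be written here ...
    _head.store(head + 1, std::memory_order_release);
    return true;
  }

  // Relaxed read: sufficient if only the owning thread inspects the size.
  uint32_t size_relaxed() const {
    return _head.load(std::memory_order_relaxed);
  }

  // Acquire read: pairs with the release store in enqueue(), which is what
  // another thread needs before touching the published elements.
  uint32_t size_acquire() const {
    return _head.load(std::memory_order_acquire);
  }

  uint32_t capacity() const { return _capacity; }

  // Load-factor test in the style of `size() > capacity() * 2 / 3`.
  bool above_two_thirds() const {
    return size_acquire() > capacity() * 2 / 3;
  }

 private:
  const uint32_t _capacity;
  std::atomic<uint32_t> _head;
};
```

Both reads return the same number; the difference is only in which earlier writes of the producer are guaranteed to be visible to the reader afterwards.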
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119128911
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119129239
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119129708

From mgronlun at openjdk.org Sun Jun 1 13:08:00 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 13:08:00 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 07:26:19 GMT, Johannes Bechberger wrote:

>> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
>>
>> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with
>> - ... different heap sizes
>> - ... different GCs
>> - ... different samplers (the standard JFR and the new CPU Time Sampler and both)
>> - ... different JFR recording durations
>> - ... different chunk-sizes
>
> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Refactoring
>  - Remove convoluted native trace logic

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 367:

> 365: JfrCPUTimeSampleRequest& request = queue.at(i);
> 366: JfrStackTrace stacktrace;
> 367: traceid tid = JfrThreadLocal::thread_id(thread);

Check the tid as a function of the JfrSampleRequest, like we do in JFR Cooperative Sampling.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119130991

From mgronlun at openjdk.org Sun Jun 1 13:12:01 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 13:12:01 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 07:26:19 GMT, Johannes Bechberger wrote:

>> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
>>
>> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with
>> - ... different heap sizes
>> - ... different GCs
>> - ... different samplers (the standard JFR and the new CPU Time Sampler and both)
>> - ... different JFR recording durations
>> - ... different chunk-sizes
>
> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Refactoring
>  - Remove convoluted native trace logic

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 413:

> 411: }
> 412: if (Atomic::load(&count) % 1000 == 0) {
> 413: log_info(jfr)("CPU thread sampler sent %zu events, lost %d, biased %zu\n", Atomic::load(&count), Atomic::load(&_lost_samples_sum), Atomic::load(&biased_count));

put this logging under jfr+debug or log+trace please

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119137014

From mgronlun at openjdk.org Sun Jun 1 13:23:00 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 13:23:00 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 07:26:19 GMT, Johannes Bechberger wrote:

>> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
>>
>> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with
>> - ... different heap sizes
>> - ... different GCs
>> - ... different samplers (the standard JFR and the new CPU Time Sampler and both)
>> - ... different JFR recording durations
>> - ... different chunk-sizes
>
> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Refactoring
>  - Remove convoluted native trace logic

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 557:

> 555: if (!check_state(jt) ||
> 556: jt->is_JfrRecorder_thread()) {
> 557: queue.increment_lost_samples();

is_JfrRecorder_thread() will not appear here since it's excluded and would have returned nullptr from get_java_thread_if_valid().

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 558:

> 556: jt->is_JfrRecorder_thread()) {
> 557: queue.increment_lost_samples();
> 558: tl->set_do_async_processing_of_cpu_time_jfr_requests(false);

Why is this restored here?
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119142346
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119142510

From jbechberger at openjdk.org Sun Jun 1 13:43:58 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Sun, 1 Jun 2025 13:43:58 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 13:19:48 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>>
>>  - Refactoring
>>  - Remove convoluted native trace logic
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 558:
>
>> 556: jt->is_JfrRecorder_thread()) {
>> 557: queue.increment_lost_samples();
>> 558: tl->set_do_async_processing_of_cpu_time_jfr_requests(false);
>
> Why is this restored here?

Because I shouldn't sample if the thread isn't in native state anymore. The thread is probably sampled anyway on the outgoing safepoint.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119157906

From mgronlun at openjdk.org Sun Jun 1 15:07:00 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 15:07:00 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 07:26:19 GMT, Johannes Bechberger wrote:

>> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
>>
>> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with
>> - ... different heap sizes
>> - ... different GCs
>> - ... different samplers (the standard JFR and the new CPU Time Sampler and both)
>> - ... different JFR recording durations
>> - ... different chunk-sizes
>
> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Refactoring
>  - Remove convoluted native trace logic

src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 362:

> 360: drain_enqueued_requests(now, tl, jt, current);
> 361: #ifdef LINUX
> 362: if (tl->has_cpu_time_jfr_requests()) {

You are having all threads traverse over this lock, even though the cpu time sampler is disabled by default. Can it be improved?

src/hotspot/share/jfr/support/jfrThreadLocal.cpp line 604:

> 602:
> 603: bool JfrThreadLocal::has_cpu_time_jfr_requests() {
> 604: return Atomic::load(&_has_cpu_time_jfr_requests);

Atomic::load_acquire()

src/hotspot/share/jfr/support/jfrThreadLocal.cpp line 620:

> 618:
> 619: bool JfrThreadLocal::wants_async_processing_of_cpu_time_jfr_requests() {
> 620: return Atomic::load(&_do_async_processing_of_cpu_time_jfr_requests);

Atomic::load_acquire()

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119242319
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119243305
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119243393

From jbechberger at openjdk.org Sun Jun 1 15:07:00 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Sun, 1 Jun 2025 15:07:00 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID: <62JxxY-xn3fwz0PnhcnIH6DOWBQUPIq_fhDD_7YrSmA=.bfbb317a-403e-4826-a3ed-c364882e821b@github.com>

On Sun, 1 Jun 2025 15:01:06 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>>
>>  - Refactoring
>>  - Remove convoluted native trace logic
>
> src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 362:
>
>> 360: drain_enqueued_requests(now, tl, jt, current);
>> 361: #ifdef LINUX
>> 362: if (tl->has_cpu_time_jfr_requests()) {
>
> You are having all threads traverse over this lock, even though the cpu time sampler is disabled by default. Can it be improved?

Not without allocating in the signal handler

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119243238

From mgronlun at openjdk.org Sun Jun 1 15:27:06 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 15:27:06 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 07:26:19 GMT, Johannes Bechberger wrote:

>> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
>>
>> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with
>> - ... different heap sizes
>> - ... different GCs
>> - ... different samplers (the standard JFR and the new CPU Time Sampler and both)
>> - ... different JFR recording durations
>> - ... different chunk-sizes
>
> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Refactoring
>  - Remove convoluted native trace logic

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 332:

> 330: JavaThread* jt = tlh.list()->thread_at(i);
> 331: JfrThreadLocal* tl = jt->jfr_thread_local();
> 332: if (tl != nullptr && tl->wants_async_processing_of_cpu_time_jfr_requests()) {

tl is never nullptr.

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 344:

> 342:
> 343: // equals operator for JfrSampleRequest
> 344: inline bool operator==(const JfrSampleRequest& lhs, const JfrSampleRequest& rhs) {

Can be removed.

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 574:

> 572:
> 573: if (queue.enqueue(request)) {
> 574: tl->set_has_cpu_time_jfr_requests(true);

This should only need to be set when enqueuing the first entry.

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 581:

> 579:
> 580: if (jt->thread_state() == _thread_in_native &&
> 581: queue.size() > queue.capacity() * 2 / 3) {

Is this logic still valid? You are only asking for a async processing depending on the load factor of the queue?

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 586:

> 584: JfrCPUTimeThreadSampling::trigger_async_processing_of_cpu_time_jfr_requests();
> 585: } else {
> 586: tl->set_do_async_processing_of_cpu_time_jfr_requests(false);

Was it true before and needed a reset?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119250661
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119250887
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119248176
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119248824
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119249381

From jbechberger at openjdk.org Sun Jun 1 15:27:06 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Sun, 1 Jun 2025 15:27:06 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 15:18:52 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>>
>>  - Refactoring
>>  - Remove convoluted native trace logic
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 574:
>
>> 572:
>> 573: if (queue.enqueue(request)) {
>> 574: tl->set_has_cpu_time_jfr_requests(true);
>
> This should only need to be set when enqueuing the first entry.

You're right

> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 581:
>
>> 579:
>> 580: if (jt->thread_state() == _thread_in_native &&
>> 581: queue.size() > queue.capacity() * 2 / 3) {
>
> Is this logic still valid? You are only asking for a async processing depending on the load factor of the queue?

Yes, so I only start the thread walking if necessary

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119248709
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119250511

From mgronlun at openjdk.org Sun Jun 1 15:35:01 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 15:35:01 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 07:26:19 GMT, Johannes Bechberger wrote:

>> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
>>
>> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with
>> - ... different heap sizes
>> - ... different GCs
>> - ... different samplers (the standard JFR and the new CPU Time Sampler and both)
>> - ... different JFR recording durations
>> - ... different chunk-sizes
>
> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Refactoring
>  - Remove convoluted native trace logic

src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 536:

> 534: }
> 535:
> 536: volatile size_t count__ = 0;

unused?
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119258988

From jbechberger at openjdk.org Sun Jun 1 15:39:00 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Sun, 1 Jun 2025 15:39:00 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID: <6Idy8j9wbNr9udYMhsW0BQmhb8dQvc_p20vCYtg5kZc=.6380eee6-bd1b-45d0-bca8-c8068e59bd36@github.com>

On Sun, 1 Jun 2025 15:32:08 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>>
>>  - Refactoring
>>  - Remove convoluted native trace logic
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 536:
>
>> 534: }
>> 535:
>> 536: volatile size_t count__ = 0;
>
> unused?

Yes.

> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 586:
>
>> 584: JfrCPUTimeThreadSampling::trigger_async_processing_of_cpu_time_jfr_requests();
>> 585: } else {
>> 586: tl->set_do_async_processing_of_cpu_time_jfr_requests(false);
>
> Was it true before and needed a reset?

I could check this before setting

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119260755
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119261558

From mgronlun at openjdk.org Sun Jun 1 15:43:06 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 15:43:06 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID: <66tRvhjE2LrwccsAYmRycS6QLF2KdRg-XHfk-scr-wg=.c7f269f0-301a-4da3-ae54-7f6bc7a440b1@github.com>

On Sun, 1 Jun 2025 07:26:19 GMT, Johannes Bechberger wrote:

>> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
>>
>> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with
>> - ... different heap sizes
>> - ... different GCs
>> - ... different samplers (the standard JFR and the new CPU Time Sampler and both)
>> - ... different JFR recording durations
>> - ... different chunk-sizes
>
> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Refactoring
>  - Remove convoluted native trace logic

src/hotspot/share/jfr/support/jfrThreadLocal.cpp line 587:

> 585: }
> 586:
> 587: bool JfrThreadLocal::acquire_cpu_time_jfr_native_lock() {

It appears that the lock state 'NATIVE' is redundant; an asynchronous request for queue drainage only requires the dequeue lock state. NATIVE can be removed to simplify the lock protocol.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119268003

From shade at openjdk.org Sun Jun 1 16:14:50 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Sun, 1 Jun 2025 16:14:50 GMT
Subject: RFR: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686
In-Reply-To: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com>
References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com>
Message-ID: <31NqA7K-ur9Y9SJ5jIHiPuG4KHm_GWMyYU79aCYbAsQ=.16bca797-7a68-41fd-88f9-c9afce90a247@github.com>

On Sat, 31 May 2025 22:18:33 GMT, Martin Doerr wrote:

> Trivial build fix for PPC64 and s390. I haven't seen more affected platforms.

AFAICS with my builds that invoke CDS `-Xshare:dump` on cross-compiled binaries, ARM32 is failing the same way.
I think we need to add a case here: https://github.com/openjdk/jdk/blob/c1b5f62a8c30038d3b1a14d184535ba0642d51c9/src/hotspot/cpu/arm/templateInterpreterGenerator_arm.cpp#L175-L179

-------------

PR Review: https://git.openjdk.org/jdk/pull/25568#pullrequestreview-2885791890

From mdoerr at openjdk.org Sun Jun 1 17:11:05 2025
From: mdoerr at openjdk.org (Martin Doerr)
Date: Sun, 1 Jun 2025 17:11:05 GMT
Subject: RFR: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686 [v2]
In-Reply-To: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com>
References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com>
Message-ID:

> Trivial build fix for PPC64 and s390. Added arm32.

Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:

 Add arm32 fix.

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/25568/files
 - new: https://git.openjdk.org/jdk/pull/25568/files/f5df2535..25fb16bf

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25568&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25568&range=00-01

 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod
 Patch: https://git.openjdk.org/jdk/pull/25568.diff
 Fetch: git fetch https://git.openjdk.org/jdk.git pull/25568/head:pull/25568

PR: https://git.openjdk.org/jdk/pull/25568

From mgronlun at openjdk.org Sun Jun 1 18:12:58 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 18:12:58 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 15:24:17 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>>
>>  - Refactoring
>>  - Remove convoluted native trace logic
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 344:
>
>> 342:
>> 343: // equals operator for JfrSampleRequest
>> 344: inline bool operator==(const JfrSampleRequest& lhs, const JfrSampleRequest& rhs) {
>
> Can be removed.

Unless you still want to try the ljf JfrSampleRequest optimization for the native ljf, which I kind of like now that I understand it.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119386104

From mgronlun at openjdk.org Sun Jun 1 18:13:00 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 18:13:00 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 15:23:06 GMT, Johannes Bechberger wrote:

>> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 581:
>>
>>> 579:
>>> 580: if (jt->thread_state() == _thread_in_native &&
>>> 581: queue.size() > queue.capacity() * 2 / 3) {
>>
>> Is this logic still valid? You are only asking for async processing assistance depending on the load factor of the queue?
>
> Yes, so I only start the thread walking if necessary

I see. With a bounded queue as used in this solution, it can work quite nicely, that is, if the thread is actually on CPU in native, and just not waiting - if waiting (which is most likely) then pending requests could take a long time to be sent to consumers.

I also understand better the optimization you tried as part of async walk in native and frames. Also quite nice, to walk from the last JfrSampleRequest and do equals to "batch" the top JFR sample requests that are the same (i.e., taken for the ljf). Maybe you can retry that again, but then you need to save the sid AND the tid to be reused for the top equal requests (you only need stacktrace.record_inner() for one request). It's a nice optimization.
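
[Editorial sketch, not part of the original mail.] The batching idea described above can be illustrated roughly as follows. This is only a sketch under stated assumptions: `SampleRequest`, `Event`, `StackRecorder` and `drain` are hypothetical names invented for the example, the request fields are simplified, and `StackRecorder::record` merely stands in for the expensive stack walk (the mail mentions `stacktrace.record_inner()`, whose real signature is not shown in this thread).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified sample request; not the real JfrSampleRequest.
struct SampleRequest {
  uintptr_t sp;
  uintptr_t pc;
  uintptr_t bcp;
};

inline bool operator==(const SampleRequest& lhs, const SampleRequest& rhs) {
  return lhs.sp == rhs.sp && lhs.pc == rhs.pc && lhs.bcp == rhs.bcp;
}

// Stand-ins for the stack trace id ("sid") and thread id ("tid").
struct Event {
  uint64_t stack_id;
  uint64_t thread_id;
};

// Stand-in for the expensive stack walk; it only counts how often it runs
// and hands out a fresh id per walk.
struct StackRecorder {
  size_t walks = 0;
  uint64_t record(const SampleRequest&) { return ++walks; }
};

// Walk the queue from the newest request. The trailing run of requests that
// compare equal to the newest one shares a single walk: the saved sid and
// tid are reused. Older, different requests are walked one by one.
std::vector<Event> drain(const std::vector<SampleRequest>& queue,
                         uint64_t tid, StackRecorder& recorder) {
  std::vector<Event> events;
  if (queue.empty()) {
    return events;
  }
  const SampleRequest& top = queue.back();
  const uint64_t top_sid = recorder.record(top);  // one walk for the batch
  size_t i = queue.size();
  while (i > 0 && queue[i - 1] == top) {
    events.push_back({top_sid, tid});
    --i;
  }
  while (i > 0) {
    events.push_back({recorder.record(queue[i - 1]), tid});
    --i;
  }
  return events;
}
```

With three identical requests on top of one older request, the sketch performs two walks instead of four while still emitting four events.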

>> src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 362:
>>
>>> 360: drain_enqueued_requests(now, tl, jt, current);
>>> 361: #ifdef LINUX
>>> 362: if (tl->has_cpu_time_jfr_requests()) {
>>
>> You are having all threads traverse over this test, even though the cpu time sampler is disabled by default. Can it be improved?
>
> Not without allocating in the signal handler

How so?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119385303
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119389715

From mgronlun at openjdk.org Sun Jun 1 18:25:00 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 18:25:00 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 07:26:19 GMT, Johannes Bechberger wrote:

>> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
>>
>> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with
>> - ... different heap sizes
>> - ... different GCs
>> - ... different samplers (the standard JFR and the new CPU Time Sampler and both)
>> - ... different JFR recording durations
>> - ... different chunk-sizes
>
> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Refactoring
>  - Remove convoluted native trace logic

src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 250:

> 248: }
> 249:
> 250: biased = true;

Perhaps set on entry, and only keep the single biased = false below?
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119396997

From mgronlun at openjdk.org Sun Jun 1 18:31:58 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 18:31:58 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 18:22:10 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>>
>>  - Refactoring
>>  - Remove convoluted native trace logic
>
> src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 250:
>
>> 248: }
>> 249:
>> 250: biased = true;
>
> Perhaps set on entry, and only keep the single biased = false below?

Also, note you have a direct hit in line 221--222 above - it's biased = false.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119404072

From iveresov at openjdk.org Sun Jun 1 19:05:01 2025
From: iveresov at openjdk.org (Igor Veresov)
Date: Sun, 1 Jun 2025 19:05:01 GMT
Subject: RFR: 8358236: [AOT] Graal crashes when trying to use persisted MDOs
Message-ID:

Forgot to null out MethodData::_failed_speculations before snapshotting. As a result it gets restored with a dangling pointer.
Testing looks clean.
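
[Editorial sketch, not part of the original mail.] The class of bug described above, a process-local pointer surviving into a persisted snapshot, can be shown with a small standalone example. Everything here is hypothetical: `MethodProfile`, `FailedSpeculation` and `snapshot` are made-up names, and the sketch clears the pointer on the archived copy, which is not necessarily how the actual three-line fix in the PR does it.

```cpp
#include <cassert>

// Opaque, hypothetical runtime-only data; it lives only in the process
// that created it.
struct FailedSpeculation;

// Hypothetical stand-in for a profile object that gets written to an archive.
struct MethodProfile {
  int invocation_count;
  FailedSpeculation* failed_speculations;  // meaningless in another process
};

// Produce the image that would be persisted. The process-local pointer is
// cleared so that a later run restores nullptr instead of a dangling address.
MethodProfile snapshot(const MethodProfile& live) {
  MethodProfile archived = live;           // plain data is copied as-is
  archived.failed_speculations = nullptr;  // never persist a raw address
  return archived;
}
```

Clearing the field on the copy keeps the live object usable in the dumping process; whether the real code nulls the field in place or on a copy is not stated in the mail.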

-------------

Commit messages:
 - Null out MethodData::_failed_speculations before snapshot

Changes: https://git.openjdk.org/jdk/pull/25570/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25570&range=00
 Issue: https://bugs.openjdk.org/browse/JDK-8358236
 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod
 Patch: https://git.openjdk.org/jdk/pull/25570.diff
 Fetch: git fetch https://git.openjdk.org/jdk.git pull/25570/head:pull/25570

PR: https://git.openjdk.org/jdk/pull/25570

From mgronlun at openjdk.org Sun Jun 1 20:38:29 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Sun, 1 Jun 2025 20:38:29 GMT
Subject: RFR: 8357962: JFR Cooperative Sampling reveals inconsistent interpreter frames as part of JVMTI PopFrame
Message-ID:

Greetings,

Please see the JIRA issue for a detailed description.

Fix only applies to platforms that issue a save_bcp() as part of InterpreterMacroAssembler::unlock_object().

Testing: jdk_jfr, JVMTI PopFrame tests

Thanks
Markus

-------------

Commit messages:
 - 8357962

Changes: https://git.openjdk.org/jdk/pull/25571/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25571&range=00
 Issue: https://bugs.openjdk.org/browse/JDK-8357962
 Stats: 3 lines in 3 files changed: 3 ins; 0 del; 0 mod
 Patch: https://git.openjdk.org/jdk/pull/25571.diff
 Fetch: git fetch https://git.openjdk.org/jdk.git pull/25571/head:pull/25571

PR: https://git.openjdk.org/jdk/pull/25571

From kvn at openjdk.org Sun Jun 1 21:23:53 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Sun, 1 Jun 2025 21:23:53 GMT
Subject: RFR: 8358236: [AOT] Graal crashes when trying to use persisted MDOs
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 19:01:27 GMT, Igor Veresov wrote:

> Forgot to null out MethodData::_failed_speculations before snapshotting. As a result it gets restored with a dangling pointer.
> Testing looks clean.

Trivial.

-------------

Marked as reviewed by kvn (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25570#pullrequestreview-2886119546 From iveresov at openjdk.org Sun Jun 1 21:23:54 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Sun, 1 Jun 2025 21:23:54 GMT Subject: Integrated: 8358236: [AOT] Graal crashes when trying to use persisted MDOs In-Reply-To: References: Message-ID: <2VQGaTWxeSr29uU3Ih3S5kF9l70w3xwlkHNG_pVFr7U=.3279eb7c-5bf8-4df1-8405-61b1678552d5@github.com> On Sun, 1 Jun 2025 19:01:27 GMT, Igor Veresov wrote: > Forgot to null out MethodData::_failed_speculations before snapshotting. As a result it gets restored with a dangling pointer. > Testing looks clean. This pull request has now been integrated. Changeset: 85e36d79 Author: Igor Veresov URL: https://git.openjdk.org/jdk/commit/85e36d79246913abb8b85c2be719670655d619ab Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod 8358236: [AOT] Graal crashes when trying to use persisted MDOs Reviewed-by: kvn ------------- PR: https://git.openjdk.org/jdk/pull/25570 From dholmes at openjdk.org Mon Jun 2 02:11:57 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 2 Jun 2025 02:11:57 GMT Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v6] In-Reply-To: References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com> Message-ID: On Fri, 30 May 2025 19:34:16 GMT, Mohamed Issa wrote: >> The goal of this PR is to implement an x86_64 intrinsic for java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is included to check the performance of specific input value ranges to help prevent regressions in the future. >> >> The command to run all range specific micro-benchmarks is posted below. >> >> `make test TEST="micro:CbrtPerf.CbrtPerfRanges"` >> >> The results of all tests posted below were captured with an [Intel®
Xeon 6761P](https://www.intel.com/content/www/us/en/products/sku/241842/intel-xeon-6761p-processor-336m-cache-2-50-ghz/specifications.html) using [OpenJDK v25-b21](https://github.com/openjdk/jdk/releases/tag/jdk-25%2B21) as the baseline version. >> >> For performance data collected with the new built in range micro-benchmark, see the table below. Each result is the mean of 8 individual runs, and the input ranges used match those from the original Java implementation. Overall, the intrinsic provides a major uplift of 169% when very small inputs are used and a more modest uplift of 45% for all other inputs.
>>
>> | Input range(s) | Baseline throughput (ops/ms) | Intrinsic throughput (ops/ms) | Speedup |
>> | :---: | :---: | :---: | :---: |
>> | [-2^(-1022), 2^(-1022)] | 6568 | 17678 | 2.69x |
>> | (-INF, -2^(-1022)], [2^(-1022), INF) | 138932 | 200897 | 1.45x |
>>
>> Finally, the `jtreg:test/jdk/java/lang/Math/CubeRootTests.java` test passed with the changes. > > Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision: > > Set address attributes in movapd assembly instruction function definition This change also broke most of the non-x86 platforms, due to the new intrinsic not being implemented on those platforms. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24470#issuecomment-2928415483 From amitkumar at openjdk.org Mon Jun 2 03:26:58 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 2 Jun 2025 03:26:58 GMT Subject: RFR: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686 [v2] In-Reply-To: References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com> Message-ID: On Sun, 1 Jun 2025 17:11:05 GMT, Martin Doerr wrote: >> Trivial build fix for PPC64 and s390. Added arm32.
> > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Add arm32 fix. Thanks Martin, for fixing it. ------------- Marked as reviewed by amitkumar (Committer). PR Review: https://git.openjdk.org/jdk/pull/25568#pullrequestreview-2886565212 From duke at openjdk.org Mon Jun 2 03:52:07 2025 From: duke at openjdk.org (Mohamed Issa) Date: Mon, 2 Jun 2025 03:52:07 GMT Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v6] In-Reply-To: References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com> Message-ID: On Mon, 2 Jun 2025 02:08:55 GMT, David Holmes wrote: > This change also broke most of the non-x86 platforms, due to the new intrinsic not being implemented on those platforms. When you say "most of the non-x86 platforms", are you referring to the ones with processor types listed below? 1. jdk/src/hotspot/cpu/**arm** 2. jdk/src/hotspot/cpu/**ppc** 3. jdk/src/hotspot/cpu/**s390** I don't see a cbrt intrinsic implementation in the non-x86 platforms. However, the ones listed above appear to get to the _ShouldNotReachHere_ error state if a particular intrinsic isn't found in `TemplateInterpreterGenerator::generate_math_entry` (`templateInterpreterGenerator_*.cpp`). It looks like aarch64 and riscv don't take that route and would fall back to the default cbrt implementation. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24470#issuecomment-2928618217 From dholmes at openjdk.org Mon Jun 2 04:35:02 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 2 Jun 2025 04:35:02 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: References: Message-ID: On Sun, 1 Jun 2025 07:26:19 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). 
This profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: > > - Refactoring > - Remove convoluted native trace logic Just some drive-by comments mainly on your acquire/release usage. I'm not at all clear what memory accesses you are trying to coordinate with those. src/hotspot/share/jfr/jni/jfrJniMethod.cpp line 176: > 174: JfrEventSetting::set_enabled(JfrCPUTimeSampleEvent, rate > 0); > 175: JfrCPUTimeThreadSampling::set_rate(rate, autoadapt == JNI_TRUE); > 176: return JNI_TRUE; What is the point of having a boolean return type if you always return true? src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 59: > 57: Thread* raw_thread = Thread::current_or_null_safe(); > 58: JavaThread* jt; > 59: if (raw_thread == nullptr || !raw_thread->is_Java_thread()) { // this can happen due to the high level of parralelism Suggestion: if (raw_thread == nullptr || !raw_thread->is_Java_thread()) { // this can happen due to the high level of parallelism src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 119: > 117: _data = new_data; > 118: _capacity = capacity; > 119: } I assume there is a lock protecting this so it happens atomically? src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 122: > 120: > 121: bool JfrCPUTimeTraceQueue::is_full() const { > 122: return Atomic::load_acquire(&_head) >= _capacity; I don't see why acquire semantics would be needed here. Also how can it be > capacity?
src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 126: > 124: > 125: bool JfrCPUTimeTraceQueue::is_empty() const { > 126: return Atomic::load_acquire(&_head) == 0; Acquire semantics are definitely not needed here. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 130: > 128: > 129: s4 JfrCPUTimeTraceQueue::lost_samples() const { > 130: return Atomic::load_acquire(&_lost_samples); Again acquire semantics seem highly dubious here - what loads are you synchronizing with? src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 139: > 137: > 138: u4 JfrCPUTimeTraceQueue::get_and_reset_lost_samples() { > 139: s4 lost_samples = Atomic::load_acquire(&_lost_samples); Again acquire semantics seem highly dubious here - what loads are you synchronizing with? src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 151: > 149: set_capacity(capacity); > 150: } > 151: } Seems an odd definition - typically `ensure_capacity` will grow a data structure to ensure it has sufficient capacity, and if already larger than needed that is fine. Suggestion `change_capacity`, or more traditionally `resize`? src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 237: > 235: > 236: void JfrCPUTimeThreadSampler::trigger_async_processing_of_cpu_time_jfr_requests() { > 237: Atomic::release_store(&_is_async_processing_of_cpu_time_jfr_requests_triggered, true); What prior stores are you ensuring should be visible by using release semantics here? 
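For readers following the acquire/release questions above, here is a minimal sketch of the distinction being drawn. It uses standard C++ `std::atomic` rather than HotSpot's `Atomic` class, and the `Handoff` and `LostCounter` types are hypothetical illustrations, not the JFR queue code. A release store only earns its keep when it publishes earlier plain stores that a paired acquire load then reads; a counter that is only read for its own value can usually use relaxed ordering.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical one-shot handoff: publish() is meant to run on one thread and
// try_consume() on another. This is an illustration, not HotSpot code.
struct Handoff {
  int payload = 0;                 // plain, non-atomic data being published
  std::atomic<bool> ready{false};  // flag that publishes the payload

  void publish(int value) {
    payload = value;                               // the "prior store"
    ready.store(true, std::memory_order_release);  // makes it visible to an acquirer
  }

  // Returns true and fills 'out' only once the payload has been published.
  bool try_consume(int& out) {
    if (!ready.load(std::memory_order_acquire)) {  // pairs with the release store
      return false;
    }
    out = payload;  // ordered after the publish by the acquire/release pair
    return true;
  }
};

// A statistics counter read only for its own value: no other memory is
// published through it, so relaxed ordering is sufficient in this sketch.
struct LostCounter {
  std::atomic<int> lost{0};

  void record_lost() { lost.fetch_add(1, std::memory_order_relaxed); }

  // Atomically fetches the current count and resets it to zero.
  int get_and_reset() { return lost.exchange(0, std::memory_order_relaxed); }
};
```

Whether the JFR queue fields fall into the first or the second category is exactly what the review comments ask the author to justify.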
------------- PR Review: https://git.openjdk.org/jdk/pull/25302#pullrequestreview-2886627655 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119983062 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119983911 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120016607 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120011705 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120012200 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120014449 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120014541 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120020174 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120021034 From dholmes at openjdk.org Mon Jun 2 04:35:02 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 2 Jun 2025 04:35:02 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v5] In-Reply-To: References: <6hGNW2D3_VuD-2WN0eTLYdEJoNu_9rPLu-dH-InGSK4=.64de8bc8-a98f-400f-a5e3-885dbd84d901@github.com> Message-ID: <7wOUvZZtjrX3TpgT9JQLm-8qTAax6PrXtfHwMJpNX4M=.13a7c6cc-e037-4108-b392-7ff30d279c05@github.com> On Mon, 26 May 2025 06:29:03 GMT, Johannes Bechberger wrote: >> Also, is raw_thread == nullptr even possible? For the same reasons. > > `!raw_thread->is_Java_thread()` I found it during testing. What thread was it, and how did it reach this code? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119984783 From dholmes at openjdk.org Mon Jun 2 04:44:57 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 2 Jun 2025 04:44:57 GMT Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v6] In-Reply-To: References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com> Message-ID: On Mon, 2 Jun 2025 03:49:42 GMT, Mohamed Issa wrote: > When you say "most of the non-x86 platforms", are you referring to the ones with processor types listed below? Yes - 3 of the 5 non-x86 platforms. > It looks like aarch64 and riscv don't take that route and would fall back to the default cbrt implementation. I was wondering why Aarch64 didn't fail. I guess the other platforms may use this to detect new intrinsics being added. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24470#issuecomment-2928722575 From dholmes at openjdk.org Mon Jun 2 04:50:57 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 2 Jun 2025 04:50:57 GMT Subject: RFR: 8357576: FieldInfo::_index is not initialized by the constructor In-Reply-To: <_9Nvx68w_0Ly5NgPGzGci6Uf9Si0AM1N3eQ_e-5hBR8=.1f055ae3-8cd7-4ae2-ae17-3722dc4b7427@github.com> References: <_9Nvx68w_0Ly5NgPGzGci6Uf9Si0AM1N3eQ_e-5hBR8=.1f055ae3-8cd7-4ae2-ae17-3722dc4b7427@github.com> Message-ID: On Fri, 30 May 2025 19:07:24 GMT, Matias Saavedra Silva wrote: > FieldInfo::_index is not initialized in either of the FieldInfo constructors so this patch adds initialization to both constructors. Verified with tier 1-5 tests Good and trivial, but does need copyright year update. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25554#pullrequestreview-2886701153 From kbarrett at openjdk.org Mon Jun 2 05:33:41 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 2 Jun 2025 05:33:41 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept Message-ID: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Please review this change to permit the use of `noexcept` under certain circumstances in HotSpot code. http://wg21.link/n3050 Testing: JDK-8316930 (HotSpot should use noexcept instead of throw()) showed what the conversion would look like. It will need to be brought up to current mainline, possibly with modifications. This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 16-June-2025 at 12h00 UTC. Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. 
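As background for the proposal, one place where `noexcept` changes behaviour rather than just documenting intent is a class-specific allocation function. The sketch below is standard C++ with a hypothetical `Node` class and `fail_next` switch (neither is HotSpot code): because `operator new` is declared `noexcept`, a `new` expression checks its result for null and skips the constructor when allocation fails.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical class with a non-throwing, class-specific allocator.
struct Node {
  int value;
  explicit Node(int v) : value(v) {}

  // Test hook (an assumption for this sketch): force the next allocation to fail.
  static bool fail_next;

  // Declared noexcept, so failure is reported by returning nullptr.
  static void* operator new(std::size_t size) noexcept {
    if (fail_next) {
      fail_next = false;
      return nullptr;
    }
    return std::malloc(size);
  }

  static void operator delete(void* p) noexcept { std::free(p); }
};

bool Node::fail_next = false;

// The noexcept operator reports the exception specification at compile time.
static_assert(noexcept(Node::operator new(sizeof(Node))),
              "allocation function is non-throwing");
```

With this declaration, `new Node(1)` yields a null pointer after a failed allocation instead of running the constructor on a null address.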
------------- Commit messages: - add noexcept Changes: https://git.openjdk.org/jdk/pull/25574/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25574&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8255082 Stats: 104 lines in 2 files changed: 104 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25574.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25574/head:pull/25574 PR: https://git.openjdk.org/jdk/pull/25574 From kbarrett at openjdk.org Mon Jun 2 05:48:59 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 2 Jun 2025 05:48:59 GMT Subject: RFR: 8358205: Remove unused JFR array allocation code In-Reply-To: <4iPujAp0lL_pVhcjlfMX42dIqE7Aw5X8FZr2k5cSFGo=.139bdd20-c798-4335-9ebd-cf0748e7d339@github.com> References: <4iPujAp0lL_pVhcjlfMX42dIqE7Aw5X8FZr2k5cSFGo=.139bdd20-c798-4335-9ebd-cf0748e7d339@github.com> Message-ID: On Fri, 30 May 2025 18:10:07 GMT, Coleen Phillimore wrote: > The JFR code is using ObjArray->allocate() directly rather than going through oopFactory. In Valhalla, the oopFactory code is being changed to account for new array shapes and attributes, so all code should call that instead. Turns out this function is unused, so this change removes it. Tested with tier1-7 with a ShouldNotReachHere(), then jdk/jfr tests with the removal. Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25553#pullrequestreview-2886834584 From dbriemann at openjdk.org Mon Jun 2 05:53:50 2025 From: dbriemann at openjdk.org (David Briemann) Date: Mon, 2 Jun 2025 05:53:50 GMT Subject: RFR: 8357981: [PPC64] Remove old instructions from VM_Version::determine_features() In-Reply-To: References: Message-ID: On Wed, 28 May 2025 14:31:40 GMT, Martin Doerr wrote: > Simple cleanup after [JDK-8331859](https://bugs.openjdk.org/browse/JDK-8331859). The old instructions are always available and don't need to be tried in `VM_Version::determine_features()`. 
> > On Power10: > > -------------------------------------------------------------------------------- > Decoding cpu-feature detection stub at 0x000079b9203c0380 after execution: > -------------------------------------------------------------------------------- > 0x000079b9203c0380: darn r7,1 > 0x000079b9203c0384: brw r5,r6 > 0x000079b9203c0388: blr bo=0b10100,bh=0b00[subroutine_return] > 0x000079b9203c038c: dcbz 0,r3 > 0x000079b9203c0390: blr bo=0b10100,bh=0b00[subroutine_return] > > > Also tested on older processors: On Power9, `brw` gets zeroed out. On Power8, `darn` also gets zeroed out. LGTM, Thank you! ------------- Marked as reviewed by dbriemann (Author). PR Review: https://git.openjdk.org/jdk/pull/25495#pullrequestreview-2886849573 From eosterlund at openjdk.org Mon Jun 2 06:27:50 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 2 Jun 2025 06:27:50 GMT Subject: RFR: 8357962: JFR Cooperative Sampling reveals inconsistent interpreter frames as part of JVMTI PopFrame In-Reply-To: References: Message-ID: On Sun, 1 Jun 2025 20:33:50 GMT, Markus Grönlund wrote: > Greetings, > > Please see the JIRA issue for a detailed description. > > Fix only applies to platforms that issue a save_bcp() as part of InterpreterMacroAssembler::unlock_object(). > > Testing: jdk_jfr, JVMTI PopFrame tests > > Thanks > Markus Looks good. ------------- Marked as reviewed by eosterlund (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25571#pullrequestreview-2886927096 From fyang at openjdk.org Mon Jun 2 06:41:55 2025 From: fyang at openjdk.org (Fei Yang) Date: Mon, 2 Jun 2025 06:41:55 GMT Subject: RFR: 8357962: JFR Cooperative Sampling reveals inconsistent interpreter frames as part of JVMTI PopFrame In-Reply-To: References: Message-ID: <1Y5-9j2Z4EIDS0Ftrkr8S-KT1MlrtB9jYwjzX72adrs=.d4f6f733-13cf-4473-b63a-c42c46beffd3@github.com> On Sun, 1 Jun 2025 20:33:50 GMT, Markus Grönlund wrote: > Greetings, > > Please see the JIRA issue for a detailed description. > > Fix only applies to platforms that issue a save_bcp() as part of InterpreterMacroAssembler::unlock_object(). > > Testing: jdk_jfr, JVMTI PopFrame tests > > Thanks > Markus FYI: `hotspot_serviceability` and `jdk_svc` test good on linux-riscv64 platform. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25571#issuecomment-2929043313 From dholmes at openjdk.org Mon Jun 2 07:02:52 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 2 Jun 2025 07:02:52 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept In-Reply-To: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: <8ueGNCZGkc0fbJHYg8l2XPSG0w2DAxKf4e59ClyXhGw=.5497fc78-f598-4af4-b745-d05f7115e953@github.com> On Mon, 2 Jun 2025 05:28:17 GMT, Kim Barrett wrote: > Please review this change to permit the use of `noexcept` under certain > circumstances in HotSpot code. > > http://wg21.link/n3050 > > Testing: > > JDK-8316930 (HotSpot should use noexcept instead of throw()) showed what the > conversion would look like. It will need to be brought up to current mainline, > possibly with modifications. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change.
Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 16-June-2025 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. I approve of this change. A couple of minor tweaks to the text suggested. Thanks doc/hotspot-style.md line 1114: > 1112: > 1113: * Only the abbreviated form of `noexcept` exception specifications are > 1114: permitted. `noexcept` exception specifications with arguments are forbidden. Suggestion: * Only the argument-less form of `noexcept` exception specifications is permitted. doc/hotspot-style.md line 1131: > 1129: > 1130: The second is to allow the compiler and library code to choose different > 1131: algorithms, depending on whether a some function may throw exceptions. This is Suggestion: algorithms, depending on whether some function may throw exceptions. This is doc/hotspot-style.md line 1139: > 1137: such a function `noexcept` informs the compiler that `nullptr` is a possible > 1138: result. If an allocation function is not declared `noexcept` then the compiler > 1139: may elide that checking and handling for a using `new` expression. Suggestion: may elide that checking and handling for a `new` expression. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25574#pullrequestreview-2887010579 PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2120226615 PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2120229061 PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2120234324 From eosterlund at openjdk.org Mon Jun 2 07:31:54 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 2 Jun 2025 07:31:54 GMT Subject: RFR: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent In-Reply-To: References: Message-ID: On Fri, 30 May 2025 09:40:21 GMT, Andrew Haley wrote: > > > It would surely be better if this evil were expunged from JDK 21 as well, lest it also confuse a backporter. > > > > > > Maybe a "here be dragons" warning would suffice. > > If you add the following comment above every call to `do_oop_store()` I'll approve this patch: > > `// Clobbers: r10, r11, r3` Hmm yes that feels like a good compromise. I added the comment. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25483#issuecomment-2929209038 From mbaesken at openjdk.org Mon Jun 2 07:33:27 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 07:33:27 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured Message-ID: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . Those fail when the address sanitizer is configured ( --enable-asan ). The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . While at it, also same is also added for ubsan .
------------- Commit messages: - remove zgc change - JDK-8357826 Changes: https://git.openjdk.org/jdk/pull/25575/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8357826 Stats: 56 lines in 12 files changed: 54 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25575.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25575/head:pull/25575 PR: https://git.openjdk.org/jdk/pull/25575 From mbaesken at openjdk.org Mon Jun 2 07:33:27 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 07:33:27 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured In-Reply-To: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Mon, 2 Jun 2025 07:25:22 GMT, Matthias Baesken wrote: > There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . > Those fail when the address sanitizer is configured ( --enable-asan ). > The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. > Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . > While at it, also same is also added for ubsan . The change to src/hotspot/cpu/x86/gc/z/zAddress_x86.cpp was added because of zgc issues with ASAN but we will address this in another change so I remove it from here. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2929201143 From rvansa at openjdk.org Mon Jun 2 07:36:51 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 2 Jun 2025 07:36:51 GMT Subject: RFR: 8352075: Perf regression accessing fields [v16] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min … max): 45.1 ms …
53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min … max): 73.8 ms … 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min … max): 37.7 ms … 42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Add type cast ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/70f62460..9cba2d4a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=15 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=14-15 Stats: 5 lines in 1 file changed: 3 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From jbhateja at openjdk.org Mon Jun 2 07:44:58 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 2 Jun 2025 07:44:58 GMT Subject: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API In-Reply-To: References: Message-ID: On Fri, 9 May 2025 07:35:41 GMT, Xiaohong Gong wrote: > JDK-8318650 introduced hotspot intrinsification of subword gather load APIs for X86 platforms [1].
However, the current implementation is not optimal for AArch64 SVE platform, which natively supports vector instructions for subword gather load operations using an int vector for indices (see [2][3]). > > Two key areas require improvement: > 1. At the Java level, vector indices generated for range validation could be reused for the subsequent gather load operation on architectures with native vector instructions like AArch64 SVE. However, the current implementation prevents compiler reuse of these index vectors due to divergent control flow, potentially impacting performance. > 2. At the compiler IR level, the additional `offset` input for `LoadVectorGather`/`LoadVectorGatherMasked` with subword types increases IR complexity and complicates backend implementation. Furthermore, generating `add` instructions before each memory access negatively impacts performance. > > This patch refactors the implementation at both the Java level and compiler mid-end to improve efficiency and maintainability across different architectures. > > Main changes: > 1. Java-side API refactoring: > - Explicitly passes generated index vectors to hotspot, eliminating duplicate index vectors for gather load instructions on > architectures like AArch64. > 2. C2 compiler IR refactoring: > - Refactors `LoadVectorGather`/`LoadVectorGatherMasked` IR for subword types by removing the memory offset input and incorporating it into the memory base `addr` at the IR level. This simplifies backend implementation, reduces add operations, and unifies the IR across all types. > 3. Backend changes: > - Streamlines X86 implementation of subword gather operations following the removal of the offset input from the IR level. > > Performance: > The performance of the relative JMH improves up to 27% on a X86 AVX512 system. 
Please see the data below: > > Benchmark Mode Cnt Unit SIZE Before After Gain > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 64 53682.012 52650.325 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 256 14484.252 14255.156 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 1024 3664.900 3595.615 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 4096 908.312 935.269 1.02 > GatherOperationsBenchmark.micr... Hi @XiaohongGong , Looks good to me, thanks again for this re-factor !! Best Regards, Jatin ------------- Marked as reviewed by jbhateja (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25138#pullrequestreview-2887157235 From eosterlund at openjdk.org Mon Jun 2 07:48:39 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 2 Jun 2025 07:48:39 GMT Subject: RFR: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent [v2] In-Reply-To: References: Message-ID: > The optimized fast_aputfield bytecode on AArch64 stores the field flags in r3, and performs the leading and trailing fencing depending on its volatile bit being set or not. However, r3 is also the last temp register passed in to the barrier set for reference stores, and G1 clobbers it in a way that may clear the volatile bit. Then the trailing fence won't get executed, and sequential consistency is broken. > > My fix puts the flags in r5 instead, which is the register that was used by normal aputfield bytecodes. This way, barriers don't clobber the volatile bits. > > This bug has been observed to mess up a classic Dekker duality in the java.util.concurrent.Exchanger class, leading to a hang in the test/jdk/java/util/concurrent/Exchanger/ExchangeLoops.java test that exercises it. Using G1 and -Xint a reproducer hangs 30/100 times in mach5. With the fix, the same reproducer hangs 0/100 times. 
Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Add comment about clobbered registers ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25483/files - new: https://git.openjdk.org/jdk/pull/25483/files/44f7e092..c9440f68 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25483&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25483&range=00-01 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25483.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25483/head:pull/25483 PR: https://git.openjdk.org/jdk/pull/25483 From mbaesken at openjdk.org Mon Jun 2 08:07:38 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 08:07:38 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: > There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . > Those fail when the address sanitizer is configured ( --enable-asan ). > The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. > Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . > While at it, also same is also added for ubsan .
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: TestBreakSignalThreadDump has issues with asan ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25575/files - new: https://git.openjdk.org/jdk/pull/25575/files/3ad0d93a..aa796c8a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=00-01 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25575.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25575/head:pull/25575 PR: https://git.openjdk.org/jdk/pull/25575 From mbaesken at openjdk.org Mon Jun 2 08:07:38 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 08:07:38 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured In-Reply-To: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: <4CZpPTh4S1qjEkxVcHZ-J8bxpkI4iTsOtX4iCG5M2Cw=.8c1f2e8e-02c1-4691-8d6f-aa362dd54932@github.com> On Mon, 2 Jun 2025 07:25:22 GMT, Matthias Baesken wrote: > There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . > Those fail when the address sanitizer is configured ( --enable-asan ). > The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. > Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . > While at it, also same is also added for ubsan . 
TestBreakSignalThreadDump shows this, so it does not work well with asan too stdout: []; stderr: [==12484==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2929322761 From rvansa at openjdk.org Mon Jun 2 08:14:48 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 2 Jun 2025 08:14:48 GMT Subject: RFR: 8352075: Perf regression accessing fields [v17] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. 
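The lookup scheme described above can be illustrated with a minimal, self-contained sketch. It is not the HotSpot implementation: it uses plain strings and vector indexes instead of a compressed field-info stream with variable-length-encoded offsets, and `FieldTable`, `build`, and `find` are hypothetical names chosen for this illustration. It only shows the idea of keeping fields sorted by name, recording every 16th position in a side table, and scanning at most 16 entries once the right block has been picked.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the field stream: field names sorted
// lexicographically, plus a side table with the position of every 16th field.
struct FieldTable {
  static constexpr std::size_t kStride = 16;
  std::vector<std::string> names;    // sorted field names
  std::vector<std::size_t> anchors;  // positions 16, 32, ... into `names`
};

// Sorts the names and records one anchor per full block of 16 fields.
FieldTable build(std::vector<std::string> names) {
  std::sort(names.begin(), names.end());
  FieldTable t;
  t.names = std::move(names);
  for (std::size_t i = FieldTable::kStride; i < t.names.size();
       i += FieldTable::kStride) {
    t.anchors.push_back(i);
  }
  return t;
}

// Returns the index of `name` in the sorted table, or -1 if it is absent.
long find(const FieldTable& t, const std::string& name) {
  assert(std::is_sorted(t.names.begin(), t.names.end()));
  // Walk the anchor table: remember the last anchor whose name is <= `name`.
  std::size_t start = 0;
  for (std::size_t a : t.anchors) {
    if (t.names[a] <= name) {
      start = a;
    } else {
      break;
    }
  }
  // Field-by-field scan of a single block, hence at most 16 comparisons.
  const std::size_t end = std::min(start + FieldTable::kStride, t.names.size());
  for (std::size_t i = start; i < end; ++i) {
    if (t.names[i] == name) {
      return static_cast<long>(i);
    }
  }
  return -1;
}
```

With 40 fields, for example, the side table holds two anchors (positions 16 and 32), so a lookup compares against at most two anchor names and then at most 16 field names.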
>
> My measurements on the attached reproducer
>
> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC'
> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC
> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms]
> Range (min … max): 45.1 ms … 53.9 ms 100 runs
>
> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC'
> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC
> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms]
> Range (min … max): 73.8 ms … 79.7 ms 100 runs
>
> (the jdk25-master above already contains JDK-8353175)
>
> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC'
> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC
> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms]
> Range (min … max): 37.7 ms … 42.1 ms 100 runs
>
> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement:
>
> JDK 17: 1.6 s
> JDK 21 (no patches): 22 s
> JDK25-master: 12.3 s
> JDK25-this-pr: 0.5 s

Radim Vansa has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.
The pull request contains two new commits since the last revision: - Add type cast - Fix static_assert ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/9cba2d4a..c592ea59 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=16 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=15-16 Stats: 53 lines in 4 files changed: 0 ins; 47 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From shade at openjdk.org Mon Jun 2 08:16:54 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Jun 2025 08:16:54 GMT Subject: RFR: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686 [v2] In-Reply-To: References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com> Message-ID: On Sun, 1 Jun 2025 17:11:05 GMT, Martin Doerr wrote: >> Trivial build fix for PPC64 and s390. Added arm32. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Add arm32 fix. Looks good, thanks! ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25568#pullrequestreview-2887258170 From mbaesken at openjdk.org Mon Jun 2 08:20:53 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 08:20:53 GMT Subject: RFR: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686 [v2] In-Reply-To: References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com> Message-ID: On Sun, 1 Jun 2025 17:11:05 GMT, Martin Doerr wrote: >> Trivial build fix for PPC64 and s390. Added arm32. 
> > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Add arm32 fix. Marked as reviewed by mbaesken (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25568#pullrequestreview-2887272244 From kbarrett at openjdk.org Mon Jun 2 08:21:34 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 2 Jun 2025 08:21:34 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v2] In-Reply-To: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: > Please review this change to permit the use of `noexcept` under certain > circumstances in HotSpot code. > > http://wg21.link/n3050 > > Testing: > > JDK-8316930 (HotSpot should use noexcept instead of throw()) showed what the > conversion would look like. It will need to be brought up to current mainline, > possibly with modifications. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 16-June-2025 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. 
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: dholmes review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25574/files - new: https://git.openjdk.org/jdk/pull/25574/files/6364b3d4..e6decd1f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25574&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25574&range=00-01 Stats: 8 lines in 2 files changed: 1 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/25574.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25574/head:pull/25574 PR: https://git.openjdk.org/jdk/pull/25574 From kbarrett at openjdk.org Mon Jun 2 08:21:34 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 2 Jun 2025 08:21:34 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v2] In-Reply-To: <8ueGNCZGkc0fbJHYg8l2XPSG0w2DAxKf4e59ClyXhGw=.5497fc78-f598-4af4-b745-d05f7115e953@github.com> References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> <8ueGNCZGkc0fbJHYg8l2XPSG0w2DAxKf4e59ClyXhGw=.5497fc78-f598-4af4-b745-d05f7115e953@github.com> Message-ID: On Mon, 2 Jun 2025 06:58:39 GMT, David Holmes wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> dholmes review > > doc/hotspot-style.md line 1139: > >> 1137: such a function `noexcept` informs the compiler that `nullptr` is a possible >> 1138: result. If an allocation function is not declared `noexcept` then the compiler >> 1139: may elide that checking and handling for a using `new` expression. > > Suggestion: > > may elide that checking and handling for a `new` expression. Instead changed to "may elide that checking and handling for a `new` expression calling that function." It's not _any_ `new` expression that might have stuff elided, only one that calls the not-nothrow allocation function. 
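The point about nothrow allocation functions can be illustrated with a small standalone sketch. It is not HotSpot code; `Node`, `make_node`, and `constructed` are hypothetical names used only here. The class-specific `operator new` is declared `noexcept` and, for demonstration purposes, always reports failure by returning `nullptr`. Because of the `noexcept`, a `new` expression that calls it has to check the result and must skip the constructor when the result is null.

```cpp
#include <cassert>
#include <cstddef>

// Counts constructor calls so the effect of a failed allocation is observable.
static int constructed = 0;

struct Node {
  int value;
  explicit Node(int v) : value(v) { ++constructed; }

  // Declared noexcept: nullptr is a legitimate "allocation failed" result.
  // This demo version fails unconditionally.
  static void* operator new(std::size_t) noexcept { return nullptr; }
  static void operator delete(void*) noexcept {}
};

// The new expression below calls Node::operator new. Since that function is
// non-throwing, the expression itself evaluates to nullptr on failure.
Node* make_node(int v) {
  const int before = constructed;
  Node* n = new Node(v);
  // A failed allocation must not have run the constructor.
  assert(n != nullptr || constructed == before);
  return n;
}
```

If the same `operator new` were declared without `noexcept`, returning `nullptr` from it would not be a valid way to signal failure, and the compiler would be free to drop the null check in the `new` expression.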
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2120385617 From kbarrett at openjdk.org Mon Jun 2 08:24:01 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 2 Jun 2025 08:24:01 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept In-Reply-To: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Mon, 2 Jun 2025 05:28:17 GMT, Kim Barrett wrote: > Please review this change to permit the use of `noexcept` under certain > circumstances in HotSpot code. > > http://wg21.link/n3050 > > Testing: > > JDK-8316930 (HotSpot should use noexcept instead of throw()) showed what the > conversion would look like. It will need to be brought up to current mainline, > possibly with modifications. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 16-June-2025 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. I forgot to mention that of course the current code is out of conformance with this, since we're currently using `throw()` to declare allocation functions as being nothrow. Once this style guide is approved, we (probably meaning I) will need to update the code accordingly. Probably not as a big query-replace either, as I've already found one mistake. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/25574#issuecomment-2929385129

From shade at openjdk.org Mon Jun 2 08:25:00 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Mon, 2 Jun 2025 08:25:00 GMT
Subject: RFR: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 2 Jun 2025 07:48:39 GMT, Erik Österlund wrote:

>> The optimized fast_aputfield bytecode on AArch64 stores the field flags in r3, and performs the leading and trailing fencing depending on its volatile bit being set or not. However, r3 is also the last temp register passed in to the barrier set for reference stores, and G1 clobbers it in a way that may clear the volatile bit. Then the trailing fence won't get executed, and sequential consistency is broken.
>>
>> My fix puts the flags in r5 instead, which is the register that was used by normal aputfield bytecodes. This way, barriers don't clobber the volatile bits.
>>
>> This bug has been observed to mess up a classic Dekker duality in the java.util.concurrent.Exchanger class, leading to a hang in the test/jdk/java/util/concurrent/Exchanger/ExchangeLoops.java test that exercises it. Using G1 and -Xint a reproducer hangs 30/100 times in mach5. With the fix, the same reproducer hangs 0/100 times.
>
> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision:
>
>   Add comment about clobbered registers

Well, since we are introducing the hunks near `do_oop_store`-s, and thus extending the scope of the patch. At this point, we can just inline `do_oop_store` (and maybe `do_oop_load`?), like Andrew initially suggested.
This will also match what RISC-V already did: https://github.com/openjdk/jdk/commit/c5a1543ee3e68775f09ca29fb07efd9aebfdb33e ------------- PR Review: https://git.openjdk.org/jdk/pull/25483#pullrequestreview-2887283595 From mdoerr at openjdk.org Mon Jun 2 08:31:57 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 2 Jun 2025 08:31:57 GMT Subject: RFR: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686 [v2] In-Reply-To: References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com> Message-ID: On Sun, 1 Jun 2025 17:11:05 GMT, Martin Doerr wrote: >> Trivial build fix for PPC64 and s390. Added arm32. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Add arm32 fix. Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25568#issuecomment-2929404503 From mdoerr at openjdk.org Mon Jun 2 08:31:57 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 2 Jun 2025 08:31:57 GMT Subject: Integrated: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686 In-Reply-To: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com> References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com> Message-ID: <4qHafyELt_8KULAwgyl9NSO8VGsIlEAxQp7XCFCFVb8=.f57fa1e6-8b54-4f88-b052-0cfd1b0114d9@github.com> On Sat, 31 May 2025 22:18:33 GMT, Martin Doerr wrote: > Trivial build fix for PPC64 and s390. Added arm32. This pull request has now been integrated. 
Changeset: 40ce05d4
Author: Martin Doerr
URL: https://git.openjdk.org/jdk/commit/40ce05d4080a9a2b4876c21f83a184f9b8a580a2
Stats: 3 lines in 3 files changed: 3 ins; 0 del; 0 mod

8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686

Reviewed-by: shade, amitkumar, mbaesken, kvn

-------------

PR: https://git.openjdk.org/jdk/pull/25568

From ayang at openjdk.org Mon Jun 2 08:42:02 2025
From: ayang at openjdk.org (Albert Mingkun Yang)
Date: Mon, 2 Jun 2025 08:42:02 GMT
Subject: RFR: 8358294: Remove unnecessary GenAlignment
Message-ID:

Simple replacement of `GenAlignment` with `SpaceAlignment`, because they always have the same value. Removing the former to reduce complexity.

Test: tier1-3

-------------

Commit messages:
  - remove-gen-alignment

Changes: https://git.openjdk.org/jdk/pull/25577/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25577&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8358294
Stats: 105 lines in 16 files changed: 0 ins; 46 del; 59 mod
Patch: https://git.openjdk.org/jdk/pull/25577.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/25577/head:pull/25577

PR: https://git.openjdk.org/jdk/pull/25577

From jbechberger at openjdk.org Mon Jun 2 08:44:01 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Mon, 2 Jun 2025 08:44:01 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 13:01:23 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Refactoring
>> - Remove convoluted native trace logic
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 200:
>
>> 198: void sample_thread(JfrSampleRequest& request, void* ucontext, JavaThread* jt, JfrThreadLocal* tl);
>> 199:
>> 200: // sample all threads that are in native state (and requested to be sampled)
>
> We are not really
"sampling", but processing their queues, no? You're correct. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120450563 From jwaters at openjdk.org Mon Jun 2 08:46:52 2025 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 2 Jun 2025 08:46:52 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Mon, 2 Jun 2025 08:20:57 GMT, Kim Barrett wrote: > I forgot to mention that of course the current code is out of conformance with this, since we're currently using `throw()` to declare allocation functions as being nothrow. Once this style guide is approved, we (probably meaning I) will need to update the code accordingly. Probably not as a big query-replace either, as I've already found one mistake. If it's easier I can bring the original change to noexcept Pull Request back from the dead and remove the merge mistakes that leaked in from my other branch, which shouldn't really be that difficult to do. Not sure which code is potentially marked throw() wrongly though. Alternatively, we could just keep throw() alongside noexcept for code that already uses it, to avoid code churn. 
They do mean the same thing in C++17, after all (I was going to mention that there are papers for static exception specifications that propose reintroducing throw() back into C++ last I remembered, but realized that this likely doesn't mean much for us now, so this point can be ignored)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25574#issuecomment-2929473632

From jbechberger at openjdk.org Mon Jun 2 08:47:01 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Mon, 2 Jun 2025 08:47:01 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID: <3d549Fxkhzd6v0fAVFEBOcxZ7hBKI1ZAUafLClp7Npw=.70183618-7dbf-4e05-bcc8-fd1216741c66@github.com>

On Sun, 1 Jun 2025 13:05:44 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Refactoring
>> - Remove convoluted native trace logic
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 367:
>
>> 365: JfrCPUTimeSampleRequest& request = queue.at(i);
>> 366: JfrStackTrace stacktrace;
>> 367: traceid tid = JfrThreadLocal::thread_id(thread);
>
> Check the tid as a function of the JfrSampleRequest, like we do in JFR Cooperative Sampling.

You mean ` const traceid tid = in_continuation ? tl->vthread_id_with_epoch_update(jt) : JfrThreadLocal::jvm_thread_id(jt);`?
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120458307

From jbechberger at openjdk.org Mon Jun 2 08:53:02 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Mon, 2 Jun 2025 08:53:02 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To: <3d549Fxkhzd6v0fAVFEBOcxZ7hBKI1ZAUafLClp7Npw=.70183618-7dbf-4e05-bcc8-fd1216741c66@github.com>
References: <3d549Fxkhzd6v0fAVFEBOcxZ7hBKI1ZAUafLClp7Npw=.70183618-7dbf-4e05-bcc8-fd1216741c66@github.com>
Message-ID:

On Mon, 2 Jun 2025 08:44:01 GMT, Johannes Bechberger wrote:

>> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 367:
>>
>>> 365: JfrCPUTimeSampleRequest& request = queue.at(i);
>>> 366: JfrStackTrace stacktrace;
>>> 367: traceid tid = JfrThreadLocal::thread_id(thread);
>>
>> Check the tid as a function of the JfrSampleRequest, like we do in JFR Cooperative Sampling.
>
> You mean ` const traceid tid = in_continuation ? tl->vthread_id_with_epoch_update(jt) : JfrThreadLocal::jvm_thread_id(jt);`?

I implemented this in this function now.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120473792

From eosterlund at openjdk.org Mon Jun 2 08:56:56 2025
From: eosterlund at openjdk.org (Erik Österlund)
Date: Mon, 2 Jun 2025 08:56:56 GMT
Subject: RFR: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 2 Jun 2025 07:48:39 GMT, Erik Österlund wrote:

>> The optimized fast_aputfield bytecode on AArch64 stores the field flags in r3, and performs the leading and trailing fencing depending on its volatile bit being set or not. However, r3 is also the last temp register passed in to the barrier set for reference stores, and G1 clobbers it in a way that may clear the volatile bit. Then the trailing fence won't get executed, and sequential consistency is broken.
>>
>> My fix puts the flags in r5 instead, which is the register that was used by normal aputfield bytecodes. This way, barriers don't clobber the volatile bits.
>>
>> This bug has been observed to mess up a classic Dekker duality in the java.util.concurrent.Exchanger class, leading to a hang in the test/jdk/java/util/concurrent/Exchanger/ExchangeLoops.java test that exercises it. Using G1 and -Xint a reproducer hangs 30/100 times in mach5. With the fix, the same reproducer hangs 0/100 times.
>
> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision:
>
>   Add comment about clobbered registers

> Well, since we are introducing the hunks near `do_oop_store`-s, and thus extending the scope of the patch. At this point, we can just inline `do_oop_store` (and maybe `do_oop_load`?), like Andrew initially suggested. This will also match what RISC-V already did: [c5a1543](https://github.com/openjdk/jdk/commit/c5a1543ee3e68775f09ca29fb07efd9aebfdb33e)

RISC-V doesn't really have the problem of backporting all the way to JDK 8. I'd really like to make that cosmetic change in the next follow-up PR instead, as previously discussed. The comments hold true all the way back to JDK 8 and don't change the logic, so I can go along with that. And I'd rather take the risk of getting some comment wrong on the way back to JDK 8 than fiddle with the guts of all this unrelated code, which has changed substantially since back then. Does that sound okay?
-------------

PR Comment: https://git.openjdk.org/jdk/pull/25483#issuecomment-2929515436

From jbechberger at openjdk.org Mon Jun 2 08:57:04 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Mon, 2 Jun 2025 08:57:04 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 13:41:44 GMT, Johannes Bechberger wrote:

>> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 558:
>>
>>> 556: jt->is_JfrRecorder_thread()) {
>>> 557: queue.increment_lost_samples();
>>> 558: tl->set_do_async_processing_of_cpu_time_jfr_requests(false);
>>
>> Why is this restored here?
>
> Because I shouldn't sample if the thread isn't in native state anymore. The thread is probably sampled anyway on the outgoing safepoint.

But you might be right, I removed it.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120481274

From aboldtch at openjdk.org Mon Jun 2 08:59:29 2025
From: aboldtch at openjdk.org (Axel Boldt-Christmas)
Date: Mon, 2 Jun 2025 08:59:29 GMT
Subject: RFR: 8358310: ZGC: riscv, ppc ZPlatformAddressOffsetBits may return a too large value
Message-ID: <6j_zozeh-Vwu3tRHRlJ5h_mhcMFsNm_OMUinAosz8fU=.d51c8c95-aad1-4566-a23b-8da5b521aa90@github.com>

The way that ZPlatformAddressOffsetBits is implemented on riscv and ppc may result in a return value of 45. This is larger than the max supported value of 44 (because of other internal data structures). This was fixed in [JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275) for aarch64. Before [JDK-8350441](https://bugs.openjdk.org/browse/JDK-8350441) the issue only manifested if one tried to select a heap larger than 16 TB (not supported), but after [JDK-8350441](https://bugs.openjdk.org/browse/JDK-8350441) we try to double the heap address space when running on a NUMA machine. So we may now encounter this bug for heaps larger than 8 TB (which is supported).
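The arithmetic behind the 44-bit limit can be sketched as follows. This is a simplified illustration, not the ZGC code: `offset_bits_for` and `clamp_offset_bits` are hypothetical helpers, and the sketch assumes that the number of offset bits needed is simply the base-2 logarithm of the reserved address space, rounded up. The only facts taken from the message above are the maximum of 44 offset bits and the doubling of the heap address space on NUMA machines.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Largest supported number of address offset bits: 2^44 bytes = 16 TB.
constexpr unsigned kMaxOffsetBits = 44;

// Smallest number of bits b such that 2^b >= bytes (assumed model, see above).
unsigned offset_bits_for(std::uint64_t bytes) {
  assert(bytes > 0 && bytes <= (std::uint64_t{1} << 63));
  unsigned bits = 0;
  while ((std::uint64_t{1} << bits) < bytes) {
    ++bits;
  }
  return bits;
}

// Never report more offset bits than the rest of the collector can handle.
unsigned clamp_offset_bits(unsigned probed_bits) {
  return std::min(probed_bits, kMaxOffsetBits);
}
```

Under this model a 12 TB heap whose address space is doubled needs 24 TB, which rounds up to 45 bits, one more than the supported maximum; an 8 TB heap doubled to 16 TB still fits in exactly 44 bits.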
While ZPlatformAddressOffsetBits needs an overhaul (it was written for non-generational ZGC, where we had the three color bits inside the address), the proposal is that we solve this for ppc and riscv by doing the same thing we did for aarch64 in [JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275).

-------------

Commit messages:
  - 8358310: ZGC: riscv, ppc ZPlatformAddressOffsetBits may return a too large value

Changes: https://git.openjdk.org/jdk/pull/25578/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25578&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8358310
Stats: 10 lines in 2 files changed: 4 ins; 0 del; 6 mod
Patch: https://git.openjdk.org/jdk/pull/25578.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/25578/head:pull/25578

PR: https://git.openjdk.org/jdk/pull/25578

From jbechberger at openjdk.org Mon Jun 2 09:01:05 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Mon, 2 Jun 2025 09:01:05 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 18:10:15 GMT, Markus Grönlund wrote:

>> Not without allocating in the signal handler
>
> How so?

Because we need to add the threads from the signal handler. So any kind of growing array or set would not work, especially if we want to remove the threads from within the signal handler again. This is certainly an area of future optimization, albeit this doesn't seem to have any measurable performance impact in my Renaissance benchmark runs.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120492743

From mbaesken at openjdk.org Mon Jun 2 09:03:55 2025
From: mbaesken at openjdk.org (Matthias Baesken)
Date: Mon, 2 Jun 2025 09:03:55 GMT
Subject: RFR: 8357155: [asan] ZGC does not work
In-Reply-To:
References:
Message-ID: <_4nt7X3dG4RfwD7R_no-3YCTNIUkWh0s6o4-eFQjHJw=.98f7be0d-b7ae-4a14-b4b8-459b6ed2c615@github.com>

On Fri, 30 May 2025 15:00:53 GMT, Axel Boldt-Christmas wrote:

> I was hoping this could work for Linux with 47/48 bit aarch64 VMA. But it is unclear how ASAN selects its mappings on such platforms.
>
> On 39/42 bit VMA returning `MIN2(valid_max_address_offset_bits, 44)` as I suggested in the PPC function may be a better best effort, as we are using addresses where we actually probed that reservations could be possible). Or even `MIN2(valid_max_address_offset_bits - 1, 44)`. Feel free to try it out, but I think this is otherwise an alright approach until we implement a better heap base selection strategy where we can test multiple base candidates.

Thanks for the aarch64-related suggestions; unfortunately, neither of them works. So I changed only the files for x86_64 and ppc64.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25549#discussion_r2120500717

From jbechberger at openjdk.org Mon Jun 2 09:05:02 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Mon, 2 Jun 2025 09:05:02 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 18:00:55 GMT, Markus Grönlund wrote:

>> Yes, so I only start the thread walking if necessary
>
> I see. With a bounded queue as used in this solution, it can work quite nicely, that is, if the thread is actually on CPU in native, and just not waiting - if waiting (which is most likely) then pending requests could take a long time to be sent to consumers.
>
> I also understand better the optimization you tried as part of async walk in native and frames. Also quite nice, to walk from the last JfrSampleRequest and do equals to "batch" the top JFR sample requests that are the same (i.e. taken for the ljf). Maybe you can retry that again, but then you need to save the sid AND the tid to be reused for the top equal requests (you only need stacktrace.record_inner() for one request).

It's a nice optimization. The problem is when in between queue processing a new JFR chunk is started. This caused problems before. I would leave these kinds of optimizations for later.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120501728

From aph at openjdk.org Mon Jun 2 09:06:59 2025
From: aph at openjdk.org (Andrew Haley)
Date: Mon, 2 Jun 2025 09:06:59 GMT
Subject: RFR: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 2 Jun 2025 07:48:39 GMT, Erik Österlund wrote:

>> The optimized fast_aputfield bytecode on AArch64 stores the field flags in r3, and performs the leading and trailing fencing depending on its volatile bit being set or not. However, r3 is also the last temp register passed in to the barrier set for reference stores, and G1 clobbers it in a way that may clear the volatile bit. Then the trailing fence won't get executed, and sequential consistency is broken.
>>
>> My fix puts the flags in r5 instead, which is the register that was used by normal aputfield bytecodes. This way, barriers don't clobber the volatile bits.
>>
>> This bug has been observed to mess up a classic Dekker duality in the java.util.concurrent.Exchanger class, leading to a hang in the test/jdk/java/util/concurrent/Exchanger/ExchangeLoops.java test that exercises it. Using G1 and -Xint a reproducer hangs 30/100 times in mach5. With the fix, the same reproducer hangs 0/100 times.
>
> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision:
>
>   Add comment about clobbered registers

Marked as reviewed by aph (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/25483#pullrequestreview-2887436174

From jbechberger at openjdk.org Mon Jun 2 09:09:04 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Mon, 2 Jun 2025 09:09:04 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]
In-Reply-To:
References:
Message-ID:

On Sun, 1 Jun 2025 18:03:15 GMT, Markus Grönlund wrote:

>> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 344:
>>
>>> 342:
>>> 343: // equals operator for JfrSampleRequest
>>> 344: inline bool operator==(const JfrSampleRequest& lhs, const JfrSampleRequest& rhs) {
>>
>> Can be removed.
>
> Unless you still want to try the ljf JfrSampleRequest optimization for the native ljf, which I kind of like now that I understand it.

As I said, it's a great optimization. But it needs some work. I have therefore removed this method for now.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120511048

From mbaesken at openjdk.org Mon Jun 2 09:11:05 2025
From: mbaesken at openjdk.org (Matthias Baesken)
Date: Mon, 2 Jun 2025 09:11:05 GMT
Subject: RFR: 8357155: [asan] ZGC does not work [v2]
In-Reply-To:
References:
Message-ID:

> Many (all?) ZGC related jtreg tests do not work when the JDK is built with address sanitizer asan enabled (configure flag --enable-asan).
> This can be seen on SUSE Linux x86_64 and also on ppc64le , opt binaries were used.
> It has been suggested to do a workaround - 'But I think that simply adapting the zAddress_[...].cpp implementations to always select the largest heap base would go a long way for providing ASAN compatibility.'
> This seems to work nicely on x86_64 and ppc64le, however the zgc related tests still fail on Linux aarch64 (should I exclude this platform from my patch?) . Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: remove aarch64 from the change, adjust ppc64 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25549/files - new: https://git.openjdk.org/jdk/pull/25549/files/ed2885ff..82a11f9b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25549&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25549&range=00-01 Stats: 5 lines in 2 files changed: 0 ins; 4 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25549.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25549/head:pull/25549 PR: https://git.openjdk.org/jdk/pull/25549 From mbaesken at openjdk.org Mon Jun 2 09:11:05 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 09:11:05 GMT Subject: RFR: 8357155: [asan] ZGC does not work In-Reply-To: References: Message-ID: On Fri, 30 May 2025 12:18:46 GMT, Matthias Baesken wrote: > Many (all?) ZGC related jtreg tests do not work when the JDK is built with address sanitizer asan enabled (configure flag --enable-asan). > This can be seen on SUSE Linux x86_64 and also on ppc64le , opt binaries were used. > It has been suggested to do a workaround - 'But I think that simply adapting the zAddress_[...].cpp implementations to always select the largest heap base would go a long way for providing ASAN compatibility.' > This seems to work nicely on x86_64 and ppc64le, however the zgc related tests still fail on Linux aarch64 (should I exclude this platform from my patch?) . I think we handle just x86_64 and ppc64 in this change. Should I adjust the subject ? Btw Axel, should I add you as contributor, makes probably sense ? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25549#issuecomment-2929574262 From shade at openjdk.org Mon Jun 2 09:11:55 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Jun 2025 09:11:55 GMT Subject: RFR: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent [v2] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 07:48:39 GMT, Erik Österlund wrote: >> The optimized fast_aputfield bytecode on AArch64 stores the field flags in r3, and performs the leading and trailing fencing depending on its volatile bit being set or not. However, r3 is also the last temp register passed in to the barrier set for reference stores, and G1 clobbers it in a way that may clear the volatile bit. Then the trailing fence won't get executed, and sequential consistency is broken. >> >> My fix puts the flags in r5 instead, which is the register that was used by normal aputfield bytecodes. This way, barriers don't clobber the volatile bits. >> >> This bug has been observed to mess up a classic Dekker duality in the java.util.concurrent.Exchanger class, leading to a hang in the test/jdk/java/util/concurrent/Exchanger/ExchangeLoops.java test that exercises it. Using G1 and -Xint a reproducer hangs 30/100 times in mach5. With the fix, the same reproducer hangs 0/100 times. > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Add comment about clobbered registers For me, straight-up inlining: __ store_heap_oop(dst, val, r10, r11, r3, decorators); ...conveys the similar message as `// Clobbers: r10, r11, r3`. But I shall not quibble. ------------- Marked as reviewed by shade (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25483#pullrequestreview-2887454204 From aboldtch at openjdk.org Mon Jun 2 09:20:51 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 2 Jun 2025 09:20:51 GMT Subject: RFR: 8357155: [asan] ZGC does not work In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 09:08:32 GMT, Matthias Baesken wrote: > I think we handle just x86_64 and ppc64 in this change. Should I adjust the subject ? Sounds good. We should probably make this explicit in the title. > Btw Axel, should I add you as contributor, makes probably sense ? Yeah, you can add me as a contributor. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25549#issuecomment-2929615698 From mbaesken at openjdk.org Mon Jun 2 09:23:56 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 09:23:56 GMT Subject: RFR: 8357981: [PPC64] Remove old instructions from VM_Version::determine_features() In-Reply-To: References: Message-ID: On Wed, 28 May 2025 14:31:40 GMT, Martin Doerr wrote: > Simple cleanup after [JDK-8331859](https://bugs.openjdk.org/browse/JDK-8331859). The old instructions are always available and don't need to be tried in `VM_Version::determine_features()`. > > On Power10: > > -------------------------------------------------------------------------------- > Decoding cpu-feature detection stub at 0x000079b9203c0380 after execution: > -------------------------------------------------------------------------------- > 0x000079b9203c0380: darn r7,1 > 0x000079b9203c0384: brw r5,r6 > 0x000079b9203c0388: blr bo=0b10100,bh=0b00[subroutine_return] > 0x000079b9203c038c: dcbz 0,r3 > 0x000079b9203c0390: blr bo=0b10100,bh=0b00[subroutine_return] > > > Also tested on older processors: On Power9, `brw` gets zeroed out. On Power8, `darn` also gets zeroed out. Marked as reviewed by mbaesken (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25495#pullrequestreview-2887491611 From mdoerr at openjdk.org Mon Jun 2 09:23:56 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 2 Jun 2025 09:23:56 GMT Subject: RFR: 8357981: [PPC64] Remove old instructions from VM_Version::determine_features() In-Reply-To: References: Message-ID: On Wed, 28 May 2025 14:31:40 GMT, Martin Doerr wrote: > Simple cleanup after [JDK-8331859](https://bugs.openjdk.org/browse/JDK-8331859). The old instructions are always available and don't need to be tried in `VM_Version::determine_features()`. > > On Power10: > > -------------------------------------------------------------------------------- > Decoding cpu-feature detection stub at 0x000079b9203c0380 after execution: > -------------------------------------------------------------------------------- > 0x000079b9203c0380: darn r7,1 > 0x000079b9203c0384: brw r5,r6 > 0x000079b9203c0388: blr bo=0b10100,bh=0b00[subroutine_return] > 0x000079b9203c038c: dcbz 0,r3 > 0x000079b9203c0390: blr bo=0b10100,bh=0b00[subroutine_return] > > > Also tested on older processors: On Power9, `brw` gets zeroed out. On Power8, `darn` also gets zeroed out. Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25495#issuecomment-2929629790 From mdoerr at openjdk.org Mon Jun 2 09:23:57 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 2 Jun 2025 09:23:57 GMT Subject: Integrated: 8357981: [PPC64] Remove old instructions from VM_Version::determine_features() In-Reply-To: References: Message-ID: On Wed, 28 May 2025 14:31:40 GMT, Martin Doerr wrote: > Simple cleanup after [JDK-8331859](https://bugs.openjdk.org/browse/JDK-8331859). The old instructions are always available and don't need to be tried in `VM_Version::determine_features()`. 
> > On Power10: > > -------------------------------------------------------------------------------- > Decoding cpu-feature detection stub at 0x000079b9203c0380 after execution: > -------------------------------------------------------------------------------- > 0x000079b9203c0380: darn r7,1 > 0x000079b9203c0384: brw r5,r6 > 0x000079b9203c0388: blr bo=0b10100,bh=0b00[subroutine_return] > 0x000079b9203c038c: dcbz 0,r3 > 0x000079b9203c0390: blr bo=0b10100,bh=0b00[subroutine_return] > > > Also tested on older processors: On Power9, `brw` gets zeroed out. On Power8, `darn` also gets zeroed out. This pull request has now been integrated. Changeset: 612f2c0c Author: Martin Doerr URL: https://git.openjdk.org/jdk/commit/612f2c0c0b75466c60d4b54dab6aa793a810c846 Stats: 75 lines in 2 files changed: 0 ins; 71 del; 4 mod 8357981: [PPC64] Remove old instructions from VM_Version::determine_features() Reviewed-by: dbriemann, mbaesken ------------- PR: https://git.openjdk.org/jdk/pull/25495 From aboldtch at openjdk.org Mon Jun 2 09:24:52 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 2 Jun 2025 09:24:52 GMT Subject: RFR: 8357155: [asan] ZGC does not work (x86_64 and ppc64) [v2] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 09:11:05 GMT, Matthias Baesken wrote: >> Many (all?) ZGC related jtreg tests do not work when the JDK is built with address sanitizer asan enabled (configure flag --enable-asan). >> This can be seen on SUSE Linux x86_64 and also on ppc64le , opt binaries were used. >> It has been suggested to do a workaround - 'But I think that simply adapting the zAddress_[...].cpp implementations to always select the largest heap base would go a long way for providing ASAN compatibility.' >> This seems to work nicely on x86_64 and ppc64le, however the zgc related tests still fail on Linux aarch64 (should I exclude this platform from my patch?) . 
> > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > remove aarch64 from the change, adjust ppc64 Marked as reviewed by aboldtch (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25549#pullrequestreview-2887498234 From jbechberger at openjdk.org Mon Jun 2 09:28:01 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 09:28:01 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 04:28:02 GMT, David Holmes wrote: >> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: >> >> - Refactoring >> - Remove convoluted native trace logic > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 119: > >> 117: _data = new_data; >> 118: _capacity = capacity; >> 119: } > > I assume there is a lock protecting this so it happens atomically? This happens before the signal handler is attached to thread. So it does happen before any parallelism is introduced on thread creation. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120557327 From mbaesken at openjdk.org Mon Jun 2 09:32:56 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 09:32:56 GMT Subject: RFR: 8357155: [asan] ZGC does not work (x86_64 and ppc64) [v2] In-Reply-To: References: Message-ID: <32poF3-6QghOwLYJ6GBMsAmGx8xcFOE9g5vqmoqzNJ0=.11438af8-f402-45e9-b74b-fcc963b2d169@github.com> On Mon, 2 Jun 2025 09:11:05 GMT, Matthias Baesken wrote: >> Many (all?) ZGC related jtreg tests do not work when the JDK is built with address sanitizer asan enabled (configure flag --enable-asan). >> This can be seen on SUSE Linux x86_64 and also on ppc64le , opt binaries were used. 
>> It has been suggested to do a workaround - 'But I think that simply adapting the zAddress_[...].cpp implementations to always select the largest heap base would go a long way for providing ASAN compatibility.' >> This seems to work nicely on x86_64 and ppc64le, however the zgc related tests still fail on Linux aarch64 (should I exclude this platform from my patch?) . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > remove aarch64 from the change, adjust ppc64 contributor add xmas92 ------------- PR Comment: https://git.openjdk.org/jdk/pull/25549#issuecomment-2929667209 From mbaesken at openjdk.org Mon Jun 2 09:35:51 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 2 Jun 2025 09:35:51 GMT Subject: RFR: 8357155: [asan] ZGC does not work (x86_64 and ppc64) [v2] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 09:11:05 GMT, Matthias Baesken wrote: >> Many (all?) ZGC related jtreg tests do not work when the JDK is built with address sanitizer asan enabled (configure flag --enable-asan). >> This can be seen on SUSE Linux x86_64 and also on ppc64le , opt binaries were used. >> It has been suggested to do a workaround - 'But I think that simply adapting the zAddress_[...].cpp implementations to always select the largest heap base would go a long way for providing ASAN compatibility.' >> This seems to work nicely on x86_64 and ppc64le, however the zgc related tests still fail on Linux aarch64 (should I exclude this platform from my patch?) . 
> > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > remove aarch64 from the change, adjust ppc64 contributor /add xmas92 ------------- PR Comment: https://git.openjdk.org/jdk/pull/25549#issuecomment-2929681609 From duke at openjdk.org Mon Jun 2 09:43:08 2025 From: duke at openjdk.org (Anton Artemov) Date: Mon, 2 Jun 2025 09:43:08 GMT Subject: RFR: 8284017: Improve handshake filtering mechanism Message-ID: Hi, please consider the following enhancement: In this PR a new way of supplying multiple arguments to filter out / skip operations in handshake/safepoint poll is given. Multiple boolean arguments are combined in a hash table, where keys are taken from a new enum `HandshakeOperationProperty`, which is to be modified when there is a need for a new argument. Tested in GHA and tiers 1 - 3. ------------- Commit messages: - 8284017: Changed variable name to operation_filter. - 8284017: Added typedef. - Merge remote-tracking branch 'origin/master' into JDK-8284017-handshake-filtering - 8284017: Added missed include statement. - 8284017: Changed to enum class for filter operation value. 
- 8284017: Added resource mark.s - 8284017: Combined bool params into resourceHashTable for filtering Changes: https://git.openjdk.org/jdk/pull/25497/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25497&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8284017 Stats: 66 lines in 9 files changed: 38 ins; 2 del; 26 mod Patch: https://git.openjdk.org/jdk/pull/25497.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25497/head:pull/25497 PR: https://git.openjdk.org/jdk/pull/25497 From mdoerr at openjdk.org Mon Jun 2 09:47:28 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 2 Jun 2025 09:47:28 GMT Subject: RFR: 8358013: [PPC64] VSX has poor performance on Power8 [v3] In-Reply-To: <6lRLaDtZkFd5zdOobo1RnSODoZk3r7T-sgjfpcnUVwU=.ad525055-0f15-4866-a295-20e2183eaf7b@github.com> References: <6lRLaDtZkFd5zdOobo1RnSODoZk3r7T-sgjfpcnUVwU=.ad525055-0f15-4866-a295-20e2183eaf7b@github.com> Message-ID: > Power8 only has limited VSX instructions for the superword optimization and the Vector API and the performance is bad. Let's only use it on Power9 and newer by default. This change excludes the VSX registers from C2 register allocation for Power8. VSX instruction usage gets limited to a few places like intrinsics. > > Note: Power8 is an old processor and performance optimizations for it are no longer planned. Martin Doerr has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Merge remote-tracking branch 'origin' into PPC64_disable_SuperwordUseVSX_Power8 - Improve description of 8358013: [PPC64] VSXSuperwordUseVSX. 
- 8358013: [PPC64] VSX has poor performance on Power8 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25514/files - new: https://git.openjdk.org/jdk/pull/25514/files/1f8b0e91..599a4f36 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25514&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25514&range=01-02 Stats: 32865 lines in 385 files changed: 12812 ins; 12713 del; 7340 mod Patch: https://git.openjdk.org/jdk/pull/25514.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25514/head:pull/25514 PR: https://git.openjdk.org/jdk/pull/25514 From eosterlund at openjdk.org Mon Jun 2 10:11:53 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 2 Jun 2025 10:11:53 GMT Subject: RFR: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent [v2] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 07:48:39 GMT, Erik Österlund wrote: >> The optimized fast_aputfield bytecode on AArch64 stores the field flags in r3, and performs the leading and trailing fencing depending on its volatile bit being set or not. However, r3 is also the last temp register passed in to the barrier set for reference stores, and G1 clobbers it in a way that may clear the volatile bit. Then the trailing fence won't get executed, and sequential consistency is broken. >> >> My fix puts the flags in r5 instead, which is the register that was used by normal aputfield bytecodes. This way, barriers don't clobber the volatile bits. >> >> This bug has been observed to mess up a classic Dekker duality in the java.util.concurrent.Exchanger class, leading to a hang in the test/jdk/java/util/concurrent/Exchanger/ExchangeLoops.java test that exercises it. Using G1 and -Xint a reproducer hangs 30/100 times in mach5. With the fix, the same reproducer hangs 0/100 times.
> > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Add comment about clobbered registers Thanks for the reviews everyone! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25483#issuecomment-2929825222 From mgronlun at openjdk.org Mon Jun 2 10:21:50 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 2 Jun 2025 10:21:50 GMT Subject: RFR: 8358205: Remove unused JFR array allocation code In-Reply-To: <4iPujAp0lL_pVhcjlfMX42dIqE7Aw5X8FZr2k5cSFGo=.139bdd20-c798-4335-9ebd-cf0748e7d339@github.com> References: <4iPujAp0lL_pVhcjlfMX42dIqE7Aw5X8FZr2k5cSFGo=.139bdd20-c798-4335-9ebd-cf0748e7d339@github.com> Message-ID: On Fri, 30 May 2025 18:10:07 GMT, Coleen Phillimore wrote: > The JFR code is using ObjArray->allocate() directly rather than going through oopFactory. In Valhalla, the oopFactory code is being changed to account for new array shapes and attributes, so all code should call that instead. Turns out this function is unused, so this change removes it. Tested with tier1-7 with a ShouldNotReachHere(), then jdk/jfr tests with the removal. Marked as reviewed by mgronlun (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/25553#pullrequestreview-2887708018 From mdoerr at openjdk.org Mon Jun 2 10:40:54 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 2 Jun 2025 10:40:54 GMT Subject: RFR: 8357155: [asan] ZGC does not work (x86_64 and ppc64) [v2] In-Reply-To: References: Message-ID: On Fri, 30 May 2025 14:37:56 GMT, Axel Boldt-Christmas wrote: >> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: >> >> remove aarch64 from the change, adjust ppc64 > > src/hotspot/cpu/ppc/gc/z/zAddress_ppc.cpp line 95: > >> 93: const size_t max_address_offset_bits = valid_max_address_offset_bits - 3; >> 94: #ifdef ADDRESS_SANITIZER >> 95: return max_address_offset_bits; > > I think this actually has to be > ```c++ > return MIN2(valid_max_address_offset_bits, 44); > > > Because the way we probe we may otherwise return 45 here. Which could result in more than 44 bits in a ZOffset which our internal data structures cannot handle. Hopefully this still works for ASAN on PPC. (The `-3` is a left over from non-generational ZGC). Aarch64 could do the same, but it does not have this issue as it starts its probing at bit 46, not bit 47. > > _Side note: This makes me realise that there probably is a bug here on PPC and RISCV if running on a NUMA machine with more than 8 TB heap. As after ZGlobalsPointers::min_address_offset_request() was introduced we can return 45 from this function._ @xmas92: Thanks for looking into this! Should we set `DEFAULT_MAX_ADDRESS_BIT = 44` and use the constant? Or maybe file a separate issue for fixing that on aarch64, PPC64 and riscv? 
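For readers following the thread, the clamp being discussed can be sketched in isolation. This is only an illustration under stated assumptions: `kMaxSupportedOffsetBits` and `clamp_offset_bits` are hypothetical names rather than HotSpot identifiers, and `std::min` stands in for HotSpot's `MIN2`. The value 44 is taken from the review comment quoted above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical constant: the widest offset (in bits) that, per the review
// comment, ZGC's internal data structures can handle.
constexpr std::size_t kMaxSupportedOffsetBits = 44;

// Hypothetical helper: limit a probed address-offset width to the supported
// maximum, mirroring the suggested MIN2(valid_max_address_offset_bits, 44).
inline std::size_t clamp_offset_bits(std::size_t probed_bits) {
  assert(probed_bits > 0 && "sketch assumes probing found a usable width");
  return std::min(probed_bits, kMaxSupportedOffsetBits);
}
```

With such a helper, a probe result of 45 would be reduced to 44, while smaller results pass through unchanged. Whether the limit lives in a shared constant or per platform is exactly the open question raised above.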
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25549#discussion_r2120738138 From epeter at openjdk.org Mon Jun 2 10:50:54 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 10:50:54 GMT Subject: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API In-Reply-To: References: Message-ID: On Fri, 30 May 2025 08:15:22 GMT, Xiaohong Gong wrote: >>> @XiaohongGong Thanks for splitting this one out, and for investigating the regressions here. >>> >>> Putting the permalink here, fixed to the current change (the link you pasted will always refer to the newest, which may later on point to the wrong line when lines above are inserted / deleted): >>> >>> https://github.com/openjdk/jdk/blob/7077535c0b0a6ea0a2a167f9135b1504a3d71fb3/src/hotspot/share/opto/loopnode.cpp#L1659-L1661 >>> >>> I wonder if we should just use `Node::uncast` there? But I'm quite unsure about that. >> >> Sounds good to me. I will have a deep investigation for it. Thanks! >> >> >> >>> > Yes, I also observed such regression. >>> > It would be nice if you proactively mentioned regressions, so it does not have to be pointed out by reviewers. >>> >>> For me, it could be ok to fix it in a follow-up patch. I think we are too close to RDP1 for JDK25 now anyway, and so we could push this patch here into JDK26, and then we have enough time in JDK26 to investigate the regression. Even better would be if we could do the other patch first, so we never even encounter a regression. >> >> Sounds good to me. Thanks! > >> > @XiaohongGong Thanks for splitting this one out, and for investigating the regressions here. 
>> > Putting the permalink here, fixed to the current change (the link you pasted will always refer to the newest, which may later on point to the wrong line when lines above are inserted / deleted): >> > https://github.com/openjdk/jdk/blob/7077535c0b0a6ea0a2a167f9135b1504a3d71fb3/src/hotspot/share/opto/loopnode.cpp#L1659-L1661 >> > >> > I wonder if we should just use `Node::uncast` there? But I'm quite unsure about that. >> >> Sounds good to me. I will have a deep investigation for it. Thanks! > > Hi @eme64 @jatin-bhateja, I've created a PR https://github.com/openjdk/jdk/pull/25539 to fix this issue. With this change, the performance regression can be fixed as well. Could you please take a look at that change and help to run the test on different X86 machines? Thanks a lot! @XiaohongGong I reviewed https://github.com/openjdk/jdk/pull/25539. Since it is a relatively simple patch, I suggest that we integrate that one first, and come back to this here later. Is that ok for you? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25138#issuecomment-2930007655 From ayang at openjdk.org Mon Jun 2 10:51:06 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 2 Jun 2025 10:51:06 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v9] In-Reply-To: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: <-mRIrbyrBpxq1lZ2tfcxIuxRLh5lcoURlM-woAXM45k=.7c152a76-e34f-42ba-b9a7-323102b19371@github.com> > This patch refines Parallel's sizing strategy to improve overall memory management and performance. > > The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order.
This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. > > `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. > > GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. > > ## Performance evaluation > > - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). > - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). > - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. > > PS: I have opportunistically set the obsolete/expired version to 25/26 for now. I will update them accordingly before merging. > > Test: tier1-8 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 12 commits: - merge - merge-fix - merge - Merge branch 'master' into pgc-size-policy - Merge branch 'master' into pgc-size-policy - review - Merge branch 'master' into pgc-size-policy - review - Merge branch 'master' into pgc-size-policy - review - ... and 2 more: https://git.openjdk.org/jdk/compare/83cb0c6d...08bc74e1 ------------- Changes: https://git.openjdk.org/jdk/pull/25000/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25000&range=08 Stats: 4375 lines in 31 files changed: 522 ins; 3454 del; 399 mod Patch: https://git.openjdk.org/jdk/pull/25000.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25000/head:pull/25000 PR: https://git.openjdk.org/jdk/pull/25000 From mgronlun at openjdk.org Mon Jun 2 11:06:31 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 2 Jun 2025 11:06:31 GMT Subject: RFR: 8357962: JFR Cooperative Sampling reveals inconsistent interpreter frames as part of JVMTI PopFrame [v2] In-Reply-To: References: Message-ID: > Greetings, > > Please see the JIRA issue for a detailed description. > > Fix only applies to platforms that issue a save_bcp() as part of InterpreterMacroAssembler::unlock_object().
> > Testing: jdk_jfr, JVMTI PopFrame tests > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: more precise comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25571/files - new: https://git.openjdk.org/jdk/pull/25571/files/b48c0635..70f75414 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25571&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25571&range=00-01 Stats: 3 lines in 3 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25571.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25571/head:pull/25571 PR: https://git.openjdk.org/jdk/pull/25571 From mgronlun at openjdk.org Mon Jun 2 11:26:00 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 2 Jun 2025 11:26:00 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: References: Message-ID: <45mCuuxToelhOdhbJlap5NCUMfgDBrVGIUDGJHAk2Rg=.1dd9d5a6-f2b5-4214-8815-d0a9f0cbddbb@github.com> On Mon, 2 Jun 2025 08:58:28 GMT, Johannes Bechberger wrote: >> How so? > > Because we need to add the threads from the signal handler. So any kind of growing array or set would not work, especially if we want to remove the threads from within the signal handler again. > > This is certainly an area of future optimization, albeit this doesn't seem to have any measurable performance impact in my renaissance benchmark runs. I don't understand what allocation has to do with anything. I'm talking about code branch layout to avoid having to test "has_cpu_time_jfr_requests()" when we know it will be false by default.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120846868 From mgronlun at openjdk.org Mon Jun 2 11:28:59 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 2 Jun 2025 11:28:59 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: References: Message-ID: <7Cy88EZJj1ZgHXaAoCY9m1PnB6UAGDJxgK9PI3BVYBQ=.a4fbad7a-19fa-4e1e-999e-8773d2fd7fb1@github.com> On Mon, 2 Jun 2025 09:02:05 GMT, Johannes Bechberger wrote: >> I see. With a bounded queue as used in this solution, it can work quite nicely, that is, if the thread is actually on CPU in native, and just not waiting - if waiting (which is most likely) then pending requests could take a long time to be sent to consumers. >> >> I also understand better the optimization you tried as part of async walk in native and frames. Also quite nice, to walk from the last JfrSampleRequest and do equals to "batch" the top JFR sample requests that are the same (i.e., taken for the ljf). Maybe you can retry that again, but then you need to save the sid AND the tid to be reused for the top equal requests (you only need stacktrace.record_inner() for one request). It's a nice optimization. > > The problem is when in between queue processing a new JFR chunk is started. This caused problems before. > > I would leave these kinds of optimizations for later. Then I would recommend you drain immediately when the thread is in native, not waiting for the queue to fill up to 2/3. The reason is because the solution is based on CPU time samples and most threads that are _thread_in_native are waiting (i.e. they will not get their queues filled while in native). I would recommend dropping the second clause about testing the queue size altogether. That way you will not get threads stuck with a lot of events a long time in native, not being delivered. Revive it later when you begin to attack the optimizations.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120855119 From jbechberger at openjdk.org Mon Jun 2 11:32:27 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 11:32:27 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v27] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: - Tiny fixes - Minor changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/439763a3..6a83d759 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=26 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=25-26 Stats: 90 lines in 9 files changed: 24 ins; 29 del; 37 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From jbechberger at openjdk.org Mon Jun 2 11:40:00 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 11:40:00 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: <45mCuuxToelhOdhbJlap5NCUMfgDBrVGIUDGJHAk2Rg=.1dd9d5a6-f2b5-4214-8815-d0a9f0cbddbb@github.com> References: <45mCuuxToelhOdhbJlap5NCUMfgDBrVGIUDGJHAk2Rg=.1dd9d5a6-f2b5-4214-8815-d0a9f0cbddbb@github.com> Message-ID: On Mon, 2 Jun 2025 11:22:45 GMT, Markus Grönlund wrote: >> 
Because we need to add the threads from the signal handler. So any kind of growing array or set would not work, especially if we want to remove the threads from within the signal handler again. >> >> This is certainly an area of future optimization, albeit this doesn't seem to have any measurable performance impact in my renaissance benchmark runs. > > I don't understand what allocation has to do with anything. I'm talking about code branch layout to avoid having to test "has_cpu_time_jfr_requests()" when we know it will be false by default. Ah. Sorry. Is it about reading the atomic boolean flag again? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120882396 From mgronlun at openjdk.org Mon Jun 2 11:40:02 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 2 Jun 2025 11:40:02 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v27] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 11:32:27 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: > > - Tiny fixes > - Minor changes src/hotspot/share/runtime/thread.hpp line 59: > 57: class SafeThreadsListPtr; > 58: class ThreadClosure; > 59: class ThreadCrashProtection; Should not be needed.
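To make the recommendation concrete, the two drain conditions can be contrasted in a small standalone sketch. All names here (`ThreadState`, `should_drain_with_threshold`, `should_drain_immediately`) are hypothetical and not taken from the patch, and the exact form of the 2/3 comparison is an assumption based only on the wording above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the thread states relevant to this discussion.
enum class ThreadState { in_java, in_native, in_vm };

// Assumed shape of the condition being questioned: a thread in native is
// only drained once its request queue is at least two-thirds full.
inline bool should_drain_with_threshold(ThreadState state,
                                        std::size_t size,
                                        std::size_t capacity) {
  assert(size <= capacity);
  return state == ThreadState::in_native && size * 3 >= capacity * 2;
}

// Recommended shape: drop the queue-size clause, so any pending request of
// a thread in native is processed right away.
inline bool should_drain_immediately(ThreadState state, std::size_t size) {
  return state == ThreadState::in_native && size > 0;
}
```

Under the threshold variant, a thread that blocks in native with a single queued request would keep that request undelivered until it returns, because no further CPU-time samples arrive to fill the queue. The second variant avoids that at the cost of possibly more frequent draining.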
src/jdk.jfr/share/classes/jdk/jfr/internal/JVM.java line 276: > 274: * Set the maximum event emission rate for the CPU time sampler > 275: * > 276: * Setting rate to 0 turns off the CPU time method sampler. "CPU time method sampler" -> "CPU time sampler" ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120878701 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120882161 From jbechberger at openjdk.org Mon Jun 2 11:51:26 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 11:51:26 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v28] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with three additional commits since the last revision: - Remove header includes - Always trigger async processing - Remove one atomic read ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/6a83d759..e482ad37 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=27 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=26-27 Stats: 21 lines in 6 files changed: 3 ins; 6 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From mgronlun at openjdk.org Mon Jun 2 11:51:27 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 2 Jun 2025 11:51:27 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v27] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 11:32:27 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: > > - Tiny fixes > - Minor changes src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 35: > 33: > 34: #include "jfr/recorder/jfrRecorder.hpp" > 35: #include "jfr/recorder/service/jfrRecorderService.hpp" The two includes above are not needed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120890097 From mgronlun at openjdk.org Mon Jun 2 11:51:27 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 2 Jun 2025 11:51:27 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: References: <45mCuuxToelhOdhbJlap5NCUMfgDBrVGIUDGJHAk2Rg=.1dd9d5a6-f2b5-4214-8815-d0a9f0cbddbb@github.com> Message-ID: On Mon, 2 Jun 2025 11:37:23 GMT, Johannes Bechberger wrote: >> I don't understand what allocation has to do with anything. I'm talking about code branch layout to avoid having to test "has_cpu_time_jfr_requests()" when we know it will be false by default. > > Ah. Sorry. Is it about reading the atomic boolean flag again? Right. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120897042 From jbechberger at openjdk.org Mon Jun 2 11:51:27 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 11:51:27 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: References: <45mCuuxToelhOdhbJlap5NCUMfgDBrVGIUDGJHAk2Rg=.1dd9d5a6-f2b5-4214-8815-d0a9f0cbddbb@github.com> Message-ID: On Mon, 2 Jun 2025 11:43:54 GMT, Markus Grönlund wrote: >> Ah. Sorry. Is it about reading the atomic boolean flag again? > > Right. I pass it through now. 
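A minimal sketch of what "passing it through" could look like, assuming the point is to load the atomic flag once on the hot path and hand the result to the callee. `ThreadState`, `process_requests`, `check_and_process` and the `flag_reads` counter are hypothetical names for this illustration only, not the PR's code or HotSpot types:

```cpp
#include <atomic>

// Hypothetical stand-in for per-thread sampler state; not a HotSpot type.
struct ThreadState {
  std::atomic<bool> cpu_time_requests{false};
  int flag_reads = 0;  // instrumentation for this example only

  // Counts how often the atomic flag is actually loaded.
  bool has_cpu_time_requests() {
    flag_reads++;
    return cpu_time_requests.load(std::memory_order_acquire);
  }
};

// The callee receives the value that was already read instead of
// loading the flag a second time. Returns true if requests were handled.
bool process_requests(ThreadState& ts, bool has_cpu_time_request) {
  if (!has_cpu_time_request) {
    return false;  // common case: nothing queued
  }
  ts.cpu_time_requests.store(false, std::memory_order_release);
  return true;
}

// Hot-path entry point: exactly one load of the flag per call.
bool check_and_process(ThreadState& ts) {
  const bool has_cpu_time_request = ts.has_cpu_time_requests();
  return process_requests(ts, has_cpu_time_request);
}
```

Whether the saved load matters in practice would have to be measured; the sketch only shows the shape of the change.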
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120906973 From coleenp at openjdk.org Mon Jun 2 11:54:00 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 2 Jun 2025 11:54:00 GMT Subject: RFR: 8358205: Remove unused JFR array allocation code In-Reply-To: <4iPujAp0lL_pVhcjlfMX42dIqE7Aw5X8FZr2k5cSFGo=.139bdd20-c798-4335-9ebd-cf0748e7d339@github.com> References: <4iPujAp0lL_pVhcjlfMX42dIqE7Aw5X8FZr2k5cSFGo=.139bdd20-c798-4335-9ebd-cf0748e7d339@github.com> Message-ID: On Fri, 30 May 2025 18:10:07 GMT, Coleen Phillimore wrote: > The JFR code is using ObjArray->allocate() directly rather than going through oopFactory. In Valhalla, the oopFactory code is being changed to account for new array shapes and attributes, so all code should call that instead. Turns out this function is unused, so this change removes it. Tested with tier1-7 with a ShouldNotReachHere(), then jdk/jfr tests with the removal. Thank you for reviewing, Kim and Markus. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25553#issuecomment-2930287718 From coleenp at openjdk.org Mon Jun 2 11:54:00 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 2 Jun 2025 11:54:00 GMT Subject: Integrated: 8358205: Remove unused JFR array allocation code In-Reply-To: <4iPujAp0lL_pVhcjlfMX42dIqE7Aw5X8FZr2k5cSFGo=.139bdd20-c798-4335-9ebd-cf0748e7d339@github.com> References: <4iPujAp0lL_pVhcjlfMX42dIqE7Aw5X8FZr2k5cSFGo=.139bdd20-c798-4335-9ebd-cf0748e7d339@github.com> Message-ID: On Fri, 30 May 2025 18:10:07 GMT, Coleen Phillimore wrote: > The JFR code is using ObjArray->allocate() directly rather than going through oopFactory. In Valhalla, the oopFactory code is being changed to account for new array shapes and attributes, so all code should call that instead. Turns out this function is unused, so this change removes it. Tested with tier1-7 with a ShouldNotReachHere(), then jdk/jfr tests with the removal. This pull request has now been integrated. 
Changeset: c22af0c2 Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/c22af0c29ea89857c5cf57dd127b5c739130b2f1 Stats: 50 lines in 5 files changed: 0 ins; 45 del; 5 mod 8358205: Remove unused JFR array allocation code Reviewed-by: kbarrett, mgronlun ------------- PR: https://git.openjdk.org/jdk/pull/25553 From eosterlund at openjdk.org Mon Jun 2 12:23:51 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 2 Jun 2025 12:23:51 GMT Subject: RFR: 8358310: ZGC: riscv, ppc ZPlatformAddressOffsetBits may return a too large value In-Reply-To: <6j_zozeh-Vwu3tRHRlJ5h_mhcMFsNm_OMUinAosz8fU=.d51c8c95-aad1-4566-a23b-8da5b521aa90@github.com> References: <6j_zozeh-Vwu3tRHRlJ5h_mhcMFsNm_OMUinAosz8fU=.d51c8c95-aad1-4566-a23b-8da5b521aa90@github.com> Message-ID: On Mon, 2 Jun 2025 08:55:02 GMT, Axel Boldt-Christmas wrote: > The way that ZPlatformAddressOffsetBits is implemented on riscv and ppc may result in a return value of 45. This is larger than the max supported value of 44 (because of other internal data structures). This was fixed in [JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275) for aarch64. > > Before [JDK-8350441](https://bugs.openjdk.org/browse/JDK-8350441) the issue only manifested if one tried to select a heap larger than 16 TB (not supported), but after [JDK-8350441](https://bugs.openjdk.org/browse/JDK-8350441) we try to double the heap address space when running on a NUMA machine. So we may now encounter this bug for heaps larger than 8 TB (which is supported). > > While ZPlatformAddressOffsetBits needs an overhaul (it was written for non-generational ZGC, where we had the three color bits inside the address), the proposal is that we solve this for ppc and riscv by doing the same thing we did for aarch64 in [JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275). Looks reasonable. ------------- Marked as reviewed by eosterlund (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25578#pullrequestreview-2888145463 From eosterlund at openjdk.org Mon Jun 2 12:28:58 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 2 Jun 2025 12:28:58 GMT Subject: Integrated: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent In-Reply-To: References: Message-ID: On Wed, 28 May 2025 08:49:17 GMT, Erik Österlund wrote: > The optimized fast_aputfield bytecode on AArch64 stores the field flags in r3, and performs the leading and trailing fencing depending on its volatile bit being set or not. However, r3 is also the last temp register passed in to the barrier set for reference stores, and G1 clobbers it in a way that may clear the volatile bit. Then the trailing fence won't get executed, and sequential consistency is broken. > > My fix puts the flags in r5 instead, which is the register that was used by normal aputfield bytecodes. This way, barriers don't clobber the volatile bits. > > This bug has been observed to mess up a classic Dekker duality in the java.util.concurrent.Exchanger class, leading to a hang in the test/jdk/java/util/concurrent/Exchanger/ExchangeLoops.java test that exercises it. Using G1 and -Xint a reproducer hangs 30/100 times in mach5. With the fix, the same reproducer hangs 0/100 times. This pull request has now been integrated. 
Changeset: 83b15da2 Author: Erik Österlund URL: https://git.openjdk.org/jdk/commit/83b15da2eb3cb6c8937f517c9b75eaa9eeece314 Stats: 8 lines in 1 file changed: 4 ins; 0 del; 4 mod 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent Reviewed-by: shade, aph, fbredberg ------------- PR: https://git.openjdk.org/jdk/pull/25483 From rvansa at openjdk.org Mon Jun 2 13:09:31 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 2 Jun 2025 13:09:31 GMT Subject: RFR: 8352075: Perf regression accessing fields [v18] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: <5wG8n_0XjBYjFprdBfdLMIj17sBHnJEtPdBdbi-5yxg=.6896113b-ef76-4a5b-973c-3c286554205f@github.com> > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. 
In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min … max): 45.1 ms … 53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min … max): 73.8 ms … 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min … max): 37.7 ms … 42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! 
This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with two additional commits since the last revision: - Rename pivot -> key, payload -> value, add comments - Add gtest ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/c592ea59..456e1505 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=17 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=16-17 Stats: 193 lines in 4 files changed: 131 ins; 5 del; 57 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From shade at openjdk.org Mon Jun 2 13:10:58 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Jun 2025 13:10:58 GMT Subject: Integrated: 8357481: Excessive CompileTask wait/notify monitor creation In-Reply-To: References: Message-ID: On Wed, 21 May 2025 18:40:24 GMT, Aleksey Shipilev wrote: > See bug for rationale. > > This PR implements the 2nd solution from the bug: lift the lock to be global. As described in the bug, excess locking work would realistically affect Xcomp, and only in a minor way. But we will reap a minor footprint/latency benefit by not constructing the lock for every `CompileTask`. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `compiler` > - [x] Linux x86_64 server fastdebug, `all` > - [x] Linux AArch64 server fastdebug, `all` This pull request has now been integrated. 
Changeset: b3594c9e Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/b3594c9e5508101a39d10099830f04b0c09ad41f Stats: 26 lines in 5 files changed: 5 ins; 10 del; 11 mod 8357481: Excessive CompileTask wait/notify monitor creation Reviewed-by: vlivanov, kvn ------------- PR: https://git.openjdk.org/jdk/pull/25364 From shade at openjdk.org Mon Jun 2 13:10:57 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Jun 2025 13:10:57 GMT Subject: RFR: 8357481: Excessive CompileTask wait/notify monitor creation [v3] In-Reply-To: References: Message-ID: On Wed, 28 May 2025 19:27:46 GMT, Aleksey Shipilev wrote: >> See bug for rationale. >> >> This PR implements the 2nd solution from the bug: lift the lock to be global. As described in the bug, excess locking work would realistically affect Xcomp, and only in a minor way. But we will reap a minor footprint/latency benefit by not constructing the lock for every `CompileTask`. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler` >> - [x] Linux x86_64 server fastdebug, `all` >> - [x] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8357481-compile-task-lock > - Merge branch 'master' into JDK-8357481-compile-task-lock > - Fix Thanks for testing! I remerged locally with current master, ran `tier1` and `compiler` tests, and there are no troubles. So I am integrating. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25364#issuecomment-2930637724 From rvansa at openjdk.org Mon Jun 2 13:19:50 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 2 Jun 2025 13:19:50 GMT Subject: RFR: 8352075: Perf regression accessing fields [v19] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min … max): 45.1 ms … 
53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min … max): 73.8 ms … 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min … max): 37.7 ms … 42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Add gtests for number of bytes used ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/456e1505..e214a8ec Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=18 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=17-18 Stats: 36 lines in 1 file changed: 35 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From rvansa at openjdk.org Mon Jun 2 13:31:57 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 2 Jun 2025 13:31:57 GMT Subject: RFR: 8352075: Perf regression accessing fields [v19] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Mon, 2 Jun 2025 13:19:50 GMT, Radim Vansa wrote: >> This optimization is a 
followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 
0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Add gtests for number of bytes used Fixed the CI failure, and added a gtest for all allowed bit widths and sizes of table from 0 to 99 and 10000. For better testability and reusability (how do I allocate the Array without a classloader?) I've replaced this with pointer + length argument. @rose00 While your suggestion makes sense, when there's a working implementation I would leave it this way for now and leave reading with a different offset up for future improvement: we can have a microbenchmark that would justify this. I would guess that CPU caches would hide multiple memory accesses, and the loop would be unrolled (maybe to even form 4-byte access instead of 4 1-byte...). Also when not using `Array` we can no longer rely on having the 4-byte header. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2930732550 From jbechberger at openjdk.org Mon Jun 2 13:50:49 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 13:50:49 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... 
different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Fix bug related to async stack walking ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/e482ad37..09ca4fed Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=28 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=27-28 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From mdoerr at openjdk.org Mon Jun 2 13:56:31 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 2 Jun 2025 13:56:31 GMT Subject: RFR: 8358013: [PPC64] VSX has poor performance on Power8 [v4] In-Reply-To: <6lRLaDtZkFd5zdOobo1RnSODoZk3r7T-sgjfpcnUVwU=.ad525055-0f15-4866-a295-20e2183eaf7b@github.com> References: <6lRLaDtZkFd5zdOobo1RnSODoZk3r7T-sgjfpcnUVwU=.ad525055-0f15-4866-a295-20e2183eaf7b@github.com> Message-ID: > Power8 only has limited VSX instructions for the superword optimization and the Vector API and the performance is bad. Let's only use it on Power9 and newer by default. This change excludes the VSX registers from C2 register allocation for Power8. VSX instruction usage gets limited to a few places like intrinsics. > > Note: Power8 is an old processor and performance optimizations for it are no longer planned. Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Disable some IR rules for Power8. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25514/files - new: https://git.openjdk.org/jdk/pull/25514/files/599a4f36..2014cb21 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25514&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25514&range=02-03 Stats: 4 lines in 2 files changed: 3 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25514.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25514/head:pull/25514 PR: https://git.openjdk.org/jdk/pull/25514 From mdoerr at openjdk.org Mon Jun 2 14:14:29 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 2 Jun 2025 14:14:29 GMT Subject: RFR: 8358013: [PPC64] VSX has poor performance on Power8 [v5] In-Reply-To: <6lRLaDtZkFd5zdOobo1RnSODoZk3r7T-sgjfpcnUVwU=.ad525055-0f15-4866-a295-20e2183eaf7b@github.com> References: <6lRLaDtZkFd5zdOobo1RnSODoZk3r7T-sgjfpcnUVwU=.ad525055-0f15-4866-a295-20e2183eaf7b@github.com> Message-ID: > Power8 only has limited VSX instructions for the superword optimization and the Vector API and the performance is bad. Let's only use it on Power9 and newer by default. This change excludes the VSX registers from C2 register allocation for Power8. VSX instruction usage gets limited to a few places like intrinsics. > > Note: Power8 is an old processor and performance optimizations for it are no longer planned. Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Beautify @requires statement. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25514/files - new: https://git.openjdk.org/jdk/pull/25514/files/2014cb21..77da0573 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25514&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25514&range=03-04 Stats: 4 lines in 1 file changed: 1 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25514.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25514/head:pull/25514 PR: https://git.openjdk.org/jdk/pull/25514 From mgronlun at openjdk.org Mon Jun 2 14:52:03 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 2 Jun 2025 14:52:03 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 13:50:49 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Fix bug related to async stack walking src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 349: > 347: const frame top_frame = thread->last_frame(); > 348: bool in_continuation = is_in_continuation(top_frame, thread); > 349: for (u4 i = 0; i < queue.size(); i++) { Realized this drainage is entirely wrong! You are not using the sample requests in the queue to build individual stack traces for events; instead, you are using the same top frame (the last Java frame) for all of them. 
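The concern about draining the queue can be illustrated with a small, self-contained sketch. All types and function names here (`SampleRequest`, `Event`, `drain_with_shared_top_frame`, `drain_per_request`) are hypothetical stand-ins chosen for the illustration; they are not HotSpot types and this is not the PR's code:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical: state captured when the sampling signal arrived.
struct SampleRequest {
  std::uint64_t sample_pc;
};

// Hypothetical: the top frame recorded for one emitted event.
struct Event {
  std::uint64_t top_pc;
};

// Problematic shape: every queued request is reported with the thread's
// current top frame, so all emitted events look identical.
std::vector<Event> drain_with_shared_top_frame(const std::vector<SampleRequest>& queue,
                                               std::uint64_t current_top_pc) {
  std::vector<Event> events;
  for (std::size_t i = 0; i < queue.size(); i++) {
    events.push_back(Event{current_top_pc});
  }
  return events;
}

// Intended shape: each event is derived from its own queued request.
std::vector<Event> drain_per_request(const std::vector<SampleRequest>& queue) {
  std::vector<Event> events;
  for (const SampleRequest& request : queue) {
    events.push_back(Event{request.sample_pc});
  }
  return events;
}
```

With three requests captured at different program counters, the first function yields three identical events while the second preserves the three distinct sample points.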
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121391177 From jbechberger at openjdk.org Mon Jun 2 15:04:13 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 15:04:13 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 14:57:22 GMT, Johannes Bechberger wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 349: >> >>> 347: const frame top_frame = thread->last_frame(); >>> 348: bool in_continuation = is_in_continuation(top_frame, thread); >>> 349: for (u4 i = 0; i < queue.size(); i++) { >> >> Realized this drainage is entirely wrong! >> >> You are not using the sample requests in the queue to build individual stack traces for events; instead, you are using the same top frame (the last Java frame) for all of them. > > Can I export compute_top_frame and use it here? Or just create a `Jfr::drain_cpu_time_queue` method? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121426469 From jbechberger at openjdk.org Mon Jun 2 15:04:13 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Mon, 2 Jun 2025 15:04:13 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 14:57:47 GMT, Markus Grönlund wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix bug related to async stack walking > > src/hotspot/share/jfr/jfr.inline.hpp line 41: > >> 39: inline void Jfr::check_and_process_sample_request(JavaThread* jt) { >> 40: JfrThreadLocal* tl = jt->jfr_thread_local(); >> 41: bool has_cpu_time_sample_request = tl->has_cpu_time_jfr_requests(); > > Why this change? 
So I don't read the ` tl->has_cpu_time_jfr_requests()` twice on the hot-path > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 349: > >> 347: const frame top_frame = thread->last_frame(); >> 348: bool in_continuation = is_in_continuation(top_frame, thread); >> 349: for (u4 i = 0; i < queue.size(); i++) { > > Realized this drainage is entirely wrong! > > You are not using the sample requests in the queue to build individual stack traces for events; instead, you are using the same top frame (the last Java frame) for all of them. Can I export compute_top_frame and use it here? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121424752 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121413413 From mgronlun at openjdk.org Mon Jun 2 15:04:12 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 2 Jun 2025 15:04:12 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 13:50:49 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... 
different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Fix bug related to async stack walking src/hotspot/share/jfr/jfr.inline.hpp line 41: > 39: inline void Jfr::check_and_process_sample_request(JavaThread* jt) { > 40: JfrThreadLocal* tl = jt->jfr_thread_local(); > 41: bool has_cpu_time_sample_request = tl->has_cpu_time_jfr_requests(); Why this change? src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 191: > 189: void sample_thread(JfrSampleRequest& request, void* ucontext, JavaThread* jt, JfrThreadLocal* tl); > 190: > 191: // process the queues for all threads that are in native state (and requested to be sampled) "requested to be processed" I guess. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 270: > 268: void JfrCPUTimeThreadSampler::enroll() { > 269: if (Atomic::cmpxchg(&_disenrolled, true, false)) { > 270: log_info(jfr)("Enrolling CPU thread sampler"); log_trace, please. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 279: > 277: void JfrCPUTimeThreadSampler::disenroll() { > 278: if (!Atomic::cmpxchg(&_disenrolled, false, true)) { > 279: log_info(jfr)("Disenrolling CPU thread sampler"); log_trace, please. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121414317 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121416556 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121426574 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121428073 From mgronlun at openjdk.org Mon Jun 2 15:12:04 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 2 Jun 2025 15:12:04 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: References: Message-ID: <_CAWRT6nKdljf9SDRnD-SfdXP9L9S6Y9f6I1nGB-4q8=.eb524157-7c00-4f01-8d8a-9e9c60ef4dc7@github.com> On Mon, 2 Jun 2025 15:01:39 GMT, Johannes Bechberger wrote: >> Can I export compute_top_frame and use it here? > > Or just create a `Jfr::drain_cpu_time_queue` method? Try to move the entire: void JfrCPUTimeThreadSampler::stackwalk_thread_in_native(JavaThread* thread) { } Into JfrThreadSampling.hpp / jfrThreadSampling.cpp - you can send your JfrCPUTimeThreadSampler events from there. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121456081 From matsaave at openjdk.org Mon Jun 2 15:14:32 2025 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 2 Jun 2025 15:14:32 GMT Subject: RFR: 8357576: FieldInfo::_index is not initialized by the constructor [v2] In-Reply-To: <_9Nvx68w_0Ly5NgPGzGci6Uf9Si0AM1N3eQ_e-5hBR8=.1f055ae3-8cd7-4ae2-ae17-3722dc4b7427@github.com> References: <_9Nvx68w_0Ly5NgPGzGci6Uf9Si0AM1N3eQ_e-5hBR8=.1f055ae3-8cd7-4ae2-ae17-3722dc4b7427@github.com> Message-ID: > FieldInfo::_index is not initialized in either of the FieldInfo constructors so this patch adds initialization to both constructors. 
Verified with tier 1-5 tests Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Updated copyright ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25554/files - new: https://git.openjdk.org/jdk/pull/25554/files/e059e29a..c40a1222 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25554&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25554&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25554.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25554/head:pull/25554 PR: https://git.openjdk.org/jdk/pull/25554 From mgronlun at openjdk.org Mon Jun 2 15:18:11 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 2 Jun 2025 15:18:11 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 15:01:15 GMT, Johannes Bechberger wrote: >> src/hotspot/share/jfr/jfr.inline.hpp line 41: >> >>> 39: inline void Jfr::check_and_process_sample_request(JavaThread* jt) { >>> 40: JfrThreadLocal* tl = jt->jfr_thread_local(); >>> 41: bool has_cpu_time_sample_request = tl->has_cpu_time_jfr_requests(); >> >> Why this change? > > So I don't read the ` tl->has_cpu_time_jfr_requests()` twice on the hot-path Ok, for now. We should try to come up with a better split. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121466027 From mgronlun at openjdk.org Mon Jun 2 15:18:12 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 2 Jun 2025 15:18:12 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: <_CAWRT6nKdljf9SDRnD-SfdXP9L9S6Y9f6I1nGB-4q8=.eb524157-7c00-4f01-8d8a-9e9c60ef4dc7@github.com> References: <_CAWRT6nKdljf9SDRnD-SfdXP9L9S6Y9f6I1nGB-4q8=.eb524157-7c00-4f01-8d8a-9e9c60ef4dc7@github.com> Message-ID: On Mon, 2 Jun 2025 15:09:30 GMT, Markus Grönlund wrote: >> Or just create a `Jfr::drain_cpu_time_queue` method? > > Try to move the entire: > > void JfrCPUTimeThreadSampler::stackwalk_thread_in_native(JavaThread* thread) { > } > > Into JfrThreadSampling.hpp / jfrThreadSampling.cpp - you can send your JfrCPUTimeThreadSampler events from there. Of course, rename the routine to something appropriate. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121469433 From coleenp at openjdk.org Mon Jun 2 15:19:56 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 2 Jun 2025 15:19:56 GMT Subject: RFR: 8357576: FieldInfo::_index is not initialized by the constructor [v2] In-Reply-To: References: <_9Nvx68w_0Ly5NgPGzGci6Uf9Si0AM1N3eQ_e-5hBR8=.1f055ae3-8cd7-4ae2-ae17-3722dc4b7427@github.com> Message-ID: On Mon, 2 Jun 2025 15:14:32 GMT, Matias Saavedra Silva wrote: >> FieldInfo::_index is not initialized in either of the FieldInfo constructors so this patch adds initialization to both constructors. Verified with tier 1-5 tests > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Updated copyright Looks good! ------------- Marked as reviewed by coleenp (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25554#pullrequestreview-2888908353 From mgronlun at openjdk.org Mon Jun 2 15:22:03 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 2 Jun 2025 15:22:03 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 13:50:49 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Fix bug related to async stack walking src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 250: > 248: break; > 249: } else { > 250: biased = false; Not correct. There is a top_frame = *current - >biased = true below. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121482514 From rvansa at openjdk.org Mon Jun 2 15:31:46 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 2 Jun 2025 15:31:46 GMT Subject: RFR: 8352075: Perf regression accessing fields [v20] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . 
Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min … max): 45.1 ms … 53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min … max): 73.8 ms … 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min … max): 37.7 ms …
42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Fix error on windows ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/e214a8ec..7d8b4a19 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=19 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=18-19 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From matsaave at openjdk.org Mon Jun 2 15:31:58 2025 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 2 Jun 2025 15:31:58 GMT Subject: RFR: 8357576: FieldInfo::_index is not initialized by the constructor [v2] In-Reply-To: References: <_9Nvx68w_0Ly5NgPGzGci6Uf9Si0AM1N3eQ_e-5hBR8=.1f055ae3-8cd7-4ae2-ae17-3722dc4b7427@github.com> Message-ID: On Sat, 31 May 2025 03:19:19 GMT, SendaoYan wrote: >> Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: >> >> Updated copyright > > Should we update the copyright year to 2025 Thank you @sendaoYan @coleenp and @dholmes-ora for the reviews! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25554#issuecomment-2931262041 From matsaave at openjdk.org Mon Jun 2 15:31:59 2025 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 2 Jun 2025 15:31:59 GMT Subject: Integrated: 8357576: FieldInfo::_index is not initialized by the constructor In-Reply-To: <_9Nvx68w_0Ly5NgPGzGci6Uf9Si0AM1N3eQ_e-5hBR8=.1f055ae3-8cd7-4ae2-ae17-3722dc4b7427@github.com> References: <_9Nvx68w_0Ly5NgPGzGci6Uf9Si0AM1N3eQ_e-5hBR8=.1f055ae3-8cd7-4ae2-ae17-3722dc4b7427@github.com> Message-ID: On Fri, 30 May 2025 19:07:24 GMT, Matias Saavedra Silva wrote: > FieldInfo::_index is not initialized in either of the FieldInfo constructors so this patch adds initialization to both constructors. Verified with tier 1-5 tests This pull request has now been integrated. Changeset: 1b6ae205 Author: Matias Saavedra Silva URL: https://git.openjdk.org/jdk/commit/1b6ae2059b0475ec78559d2d6612f3b6ec68309f Stats: 4 lines in 1 file changed: 2 ins; 0 del; 2 mod 8357576: FieldInfo::_index is not initialized by the constructor Reviewed-by: coleenp, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/25554 From dnsimon at openjdk.org Mon Jun 2 15:53:02 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 2 Jun 2025 15:53:02 GMT Subject: RFR: 8358254: [AOT] runtime/cds/appcds/applications/JavacBench.java#aot crashes with SEGV in ClassLoaderData::holder Message-ID: JVMCI needs to be aware of unloaded classes in type profiles just like [CI does](https://github.com/openjdk/jdk/pull/24886/files#diff-cda53c3ed39c4e59f73f3298933ebed1912daeaf854f0b31f40332be109f6c30R317). 
------------- Commit messages: - support unloaded classes in type profiles in AOT mode - convert RawItemProfile to a record Changes: https://git.openjdk.org/jdk/pull/25592/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25592&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358254 Stats: 33 lines in 4 files changed: 13 ins; 16 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25592.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25592/head:pull/25592 PR: https://git.openjdk.org/jdk/pull/25592 From duke at openjdk.org Mon Jun 2 16:36:56 2025 From: duke at openjdk.org (Mohamed Issa) Date: Mon, 2 Jun 2025 16:36:56 GMT Subject: RFR: 8358231: Template interpreter generator crashes with ShouldNotReachHere on some platforms after 8353686 [v2] In-Reply-To: References: <8Xnq0jvMBRkxOk4-gheVgeDGuIPhXXlZ8Yt-NO3izhQ=.2ff06a32-52b1-4829-9c19-0106ef733399@github.com> Message-ID: On Sun, 1 Jun 2025 17:11:05 GMT, Martin Doerr wrote: >> Trivial build fix for PPC64 and s390. Added arm32. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Add arm32 fix. @eme64 @dholmes-ora This resolves the crash discussed in #24470. @TheRealMDoerr Thank you for this. I was about to create a PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25568#issuecomment-2931518930 From duke at openjdk.org Mon Jun 2 17:04:07 2025 From: duke at openjdk.org (Mohamed Issa) Date: Mon, 2 Jun 2025 17:04:07 GMT Subject: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v6] In-Reply-To: References: <1NsI0OGP9RcnbEwlJwDj1dZ3w7zCP4DxJhEmO1quSgo=.3b1e3da9-9aa9-4221-a73a-e2f3ec5f456b@github.com> Message-ID: On Mon, 2 Jun 2025 04:42:03 GMT, David Holmes wrote: > > When you say "most of the non-x86 platforms", are you referring to the ones with processor types listed below? > > Yes - 3 of the 5 non-x86 platforms. 
> > > It looks like aarch64 and riscv don't take that route and would fall back to the default cbrt implementation. > > I was wondering why Aarch64 didn't fail. I guess the other platforms may use this to detect new intrinsics being added. The arm, ppc, and s390 breaks are resolved by #25568. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24470#issuecomment-2931613343 From mgronlun at openjdk.org Mon Jun 2 17:29:04 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 2 Jun 2025 17:29:04 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 13:50:49 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Fix bug related to async stack walking src/hotspot/share/memory/resourceArea.hpp line 46: > 44: // A ResourceArea is an Arena that supports safe usage of ResourceMark. > 45: class ResourceArea: public Arena { > 46: Changes in this file are unrelated, so revert this entire file. src/hotspot/share/prims/forte.cpp line 575: > 573: extern "C" { > 574: JNIEXPORT > 575: void AsyncGetCallTrace(ASGCT_CallTrace *trace, jint depth, void* ucontext) { Unrelated changes, please revert file. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121757461 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121757998 From dnsimon at openjdk.org Mon Jun 2 18:21:53 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 2 Jun 2025 18:21:53 GMT Subject: RFR: 8358254: [AOT] runtime/cds/appcds/applications/JavacBench.java#aot crashes with SEGV in ClassLoaderData::holder In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 15:47:29 GMT, Doug Simon wrote: > JVMCI needs to be aware of unloaded classes in type profiles just like [CI does](https://github.com/openjdk/jdk/pull/24886/files#diff-cda53c3ed39c4e59f73f3298933ebed1912daeaf854f0b31f40332be109f6c30R317). src/hotspot/share/oops/trainingData.hpp line 286: > 284: static bool assembling_data() { return have_data() && CDSConfig::is_dumping_final_static_archive() && CDSConfig::is_dumping_aot_linked_classes(); } > 285: > 286: static bool is_klass_loaded(Klass* k) { This code was moved unmodified from ciMethodData.cpp. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25592#discussion_r2121856022 From shade at openjdk.org Mon Jun 2 18:46:02 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Jun 2025 18:46:02 GMT Subject: RFR: 8358339: Handle MethodCounters::_method backlinks after JDK-8355003 Message-ID: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> Found this when reading mainline-vs-premain webrev. [JDK-8355003](https://bugs.openjdk.org/browse/JDK-8355003) introduced a backlink to `Method*` in `MethodCounters`. I believe we need to handle that backlink at least in `CodeBuffer::finalize_oop_references()`. premain does this, while mainline does not. Also, amusingly, we have `MethodCounters::is_methodCounters`, but not the super-class `Metadata::is_methodCounters`. I pulled in the hunks that use `is_methodCounters()` and `MethodCounters::method()` from premain into this PR. 
Additional testing: - [x] Linux x86_64 server fastdebug, `runtime/cds` - [ ] Linux x86_64 server fastdebug, `tier1` - [ ] Linux x86_64 server fastdebug, `all` ------------- Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/25599/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25599&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358339 Stats: 10 lines in 3 files changed: 10 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25599.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25599/head:pull/25599 PR: https://git.openjdk.org/jdk/pull/25599 From shade at openjdk.org Mon Jun 2 18:47:28 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Jun 2025 18:47:28 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 13:50:49 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Fix bug related to async stack walking Scanned this briefly, would do another pass tomorrow. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 117: > 115: > 116: bool JfrCPUTimeTraceQueue::is_empty() const { > 117: return Atomic::load(&_head) == 0; Not entirely clear what is the memory semantics for accessing `_head`. Does it need to be acq/rel? If so, this one should be `::load_acquire`? 
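As background for the memory-ordering question above, here is a minimal C++ sketch of the usual release/acquire pairing on a published index. It is an assumption-laden illustration, not the `JfrCPUTimeTraceQueue` code: it uses `std::atomic` rather than HotSpot's `Atomic` class, `SketchQueue` and its members are hypothetical, and it assumes a single producer. Whether the real `_head` needs this ordering depends on how the queue is actually written and read, which the review leaves open.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical single-producer queue for illustration only.
class SketchQueue {
  static constexpr uint32_t kCapacity = 8;
  int _data[kCapacity] = {};
  std::atomic<uint32_t> _head{0};  // number of published elements

 public:
  // Producer: fill the slot first, then publish it with a release store,
  // so the slot contents become visible before the new head value does.
  bool enqueue(int value) {
    uint32_t head = _head.load(std::memory_order_relaxed);
    if (head == kCapacity) {
      return false;  // full, caller would count a lost sample
    }
    _data[head] = value;
    _head.store(head + 1, std::memory_order_release);
    return true;
  }

  // Consumer: the acquire load pairs with the release store above, so a
  // reader that observes head == n can safely read slots [0, n).
  uint32_t size() const { return _head.load(std::memory_order_acquire); }
  bool is_empty() const { return size() == 0; }

  int at(uint32_t index) const {
    assert(index < size());
    return _data[index];
  }
};
```

If a reader only tests emptiness and never touches the slots afterwards, a relaxed load could be enough; the acquire matters once the observed head value is used to decide which elements are safe to read.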
src/hotspot/share/memory/resourceArea.hpp line 46: > 44: // A ResourceArea is an Arena that supports safe usage of ResourceMark. > 45: class ResourceArea: public Arena { > 46: All the changes in this file are unnecessary, please revert. src/jdk.jfr/share/classes/jdk/jfr/internal/JVM.java line 281: > 279: * @param autoadapt true if the rate should be adapted automatically > 280: */ > 281: public static native void setCPUThrottle(double rate, boolean autoadapt); Suggestion: public static native void setCPUThrottle(double rate, boolean autoAdapt); test/jdk/jdk/jfr/event/profiling/TestSamplingLongPeriod.java line 42: > 40: public class TestSamplingLongPeriod { > 41: > 42: static String sampleEvent = EventNames.ExecutionSample; Does not look necessary to change? ------------- PR Review: https://git.openjdk.org/jdk/pull/25302#pullrequestreview-2888004951 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121900364 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121610476 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121587105 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2121584954 From shade at openjdk.org Mon Jun 2 18:47:28 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Jun 2025 18:47:28 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v24] In-Reply-To: References: <-QiSWEqppeW60aedVbLA3WTmnba7Fry53Qr86wE2EPs=.7a6327ce-7ef0-4b1c-bc68-0421ba3fd46f@github.com> Message-ID: On Sun, 1 Jun 2025 07:19:54 GMT, Johannes Bechberger wrote: >> Thanks for catching this mistake. I'll fix it this afternoon. > > I fixed it by changing the JEP. Hold on, shouldn't this really be "Lost"? @egahlin and @mgronlun need to chime in here. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120893338 From shade at openjdk.org Mon Jun 2 18:47:30 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Jun 2025 18:47:30 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v27] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 11:32:27 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: > > - Tiny fixes > - Minor changes src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 30: > 28: #include "runtime/orderAccess.hpp" > 29: #include "utilities/ticks.hpp" > 30: #include "jfr/periodic/sampling/jfrCPUTimeThreadSampler.hpp" Include order? src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 60: > 58: assert(raw_thread->is_Java_thread(), "invariant"); > 59: JavaThread* jt; > 60: if ((jt = JavaThread::cast(raw_thread))->is_exiting()) { I see no point to be extra-smart with inline assignments here: Suggestion: JavaThread* jt = JavaThread::cast(raw_thread); if (jt->is_exiting()) { src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 115: > 113: JfrCPUTimeSampleRequest* new_data = JfrCHeapObj::new_array(capacity); > 114: JfrCHeapObj::free(_data, _capacity * sizeof(JfrCPUTimeSampleRequest)); > 115: _data = new_data; A bit of peak memory consumption improvement: don't have two things live at once. 
Plus, give the native allocator a chance to reuse the same location. Suggestion: JfrCHeapObj::free(_data, _capacity * sizeof(JfrCPUTimeSampleRequest)); _data = JfrCHeapObj::new_array(capacity); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120895107 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120897472 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2120909443 From shade at openjdk.org Mon Jun 2 18:51:51 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Jun 2025 18:51:51 GMT Subject: RFR: 8358339: Handle MethodCounters::_method backlinks after JDK-8355003 In-Reply-To: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> References: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> Message-ID: On Mon, 2 Jun 2025 18:41:42 GMT, Aleksey Shipilev wrote: > Found this when reading mainline-vs-premain webrev. [JDK-8355003](https://bugs.openjdk.org/browse/JDK-8355003) introduced a backlink to `Method*` in `MethodCounters`. I believe we need to handle that backlink at least in `CodeBuffer::finalize_oop_references()`. premain does this, while mainline does not. Also, amusingly, we have `MethodCounters::is_methodCounters`, but not the super-class `Metadata::is_methodCounters`. > > I pulled in the hunks that use `is_methodCounters()` and `MethodCounters::method()` from premain into this PR. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `runtime/cds` > - [ ] Linux x86_64 server fastdebug, `tier1` > - [ ] Linux x86_64 server fastdebug, `all` Actually, I am not sure if it is even a bug, because mainline uses `MethodCounters::method()` in any reasonable way only in `MethodCounters::metaspace_pointers_do()`. But I guess it would be good to make sure we handle this backlink consistently.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25599#issuecomment-2932037261 From cjplummer at openjdk.org Mon Jun 2 19:07:54 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 2 Jun 2025 19:07:54 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: <9JQNK3tYLfg04pRpUiGpPYWoSunSfqWB61lkLxSPxwk=.a781defd-ea0e-4ebf-aa7f-01fff2e63101@github.com> On Mon, 2 Jun 2025 08:07:38 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes. >> Those fail when the address sanitizer is configured (--enable-asan). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later). >> While at it, the same is also added for ubsan. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > TestBreakSignalThreadDump has issues with asan Can you document why each test fails so we have it on record? Can be done in the PR or the CR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2932080104 From dcubed at openjdk.org Mon Jun 2 19:41:54 2025 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Mon, 2 Jun 2025 19:41:54 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v2] In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Mon, 2 Jun 2025 08:21:34 GMT, Kim Barrett wrote: >> Please review this change to permit the use of `noexcept` under certain >> circumstances in HotSpot code.
>> >> http://wg21.link/n3050 >> >> Testing: >> >> JDK-8316930 (HotSpot should use noexcept instead of throw()) showed what the >> conversion would look like. It will need to be brought up to current mainline, >> possibly with modifications. >> >> This is a modification of the Style Guide, so rough consensus among the >> HotSpot Group members is required to make this change. Only Group members >> should vote for approval (via the github PR), though reasoned objections or >> comments from anyone will be considered. A decision on this proposal will not >> be made before Friday 16-June-2025 at 12h00 UTC. >> >> Since we're piggybacking on github PRs here, please use the PR review process >> to approve (click on Review Changes > Approve), rather than sending a "vote: >> yes" email reply that would be normal for a CFV. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > dholmes review Thumbs up. I do have a query about whether the mention of `nothrow` should be `noexcept`. doc/hotspot-style.html line 1153: > 1151: different guarantees for some operations (and may choose different > 1152: algorithms to implement those operations), depending on whether certain > 1153: functions (constructors, copy/move operations, swap) are nothrow or not. `nothrow` here or `noexcept`? doc/hotspot-style.md line 1145: > 1143: guarantees for some operations (and may choose different algorithms to > 1144: implement those operations), depending on whether certain functions > 1145: (constructors, copy/move operations, swap) are nothrow or not. They detect `nothrow` here or `noexcept`? ------------- Marked as reviewed by dcubed (Reviewer). 
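For context on the quoted Style Guide sentence about the standard library choosing behavior based on whether certain functions can throw, here is a minimal C++ sketch. The types `NoexceptMove` and `ThrowingMove` and the helper names are hypothetical and exist only for this illustration; the only library facilities used are `std::is_nothrow_move_constructible` and `std::move_if_noexcept`. It is not taken from the Style Guide or from HotSpot.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical type whose move constructor promises not to throw.
struct NoexceptMove {
  NoexceptMove() = default;
  NoexceptMove(const NoexceptMove&) = default;
  NoexceptMove(NoexceptMove&&) noexcept {}
};

// Hypothetical type whose move constructor makes no such promise.
struct ThrowingMove {
  ThrowingMove() = default;
  ThrowingMove(const ThrowingMove&) = default;
  ThrowingMove(ThrowingMove&&) {}  // user-provided, not declared noexcept
};

// The trait a library can query at compile time.
template <typename T>
constexpr bool move_is_nothrow() {
  return std::is_nothrow_move_constructible<T>::value;
}

// std::move_if_noexcept yields an rvalue reference only when moving cannot
// throw (or when the type is not copyable); otherwise it yields a const
// lvalue reference, which selects the copy constructor instead.
template <typename T>
constexpr bool move_if_noexcept_yields_rvalue() {
  return std::is_rvalue_reference<
      decltype(std::move_if_noexcept(std::declval<T&>()))>::value;
}

// Runtime restatement of the compile-time facts above.
inline void verify_sketch() {
  assert(move_is_nothrow<NoexceptMove>());
  assert(!move_is_nothrow<ThrowingMove>());
}
```

In this sketch the detection is driven by the `noexcept` specifier on the move constructor, which is the property the trait reports as "nothrow".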
PR Review: https://git.openjdk.org/jdk/pull/25574#pullrequestreview-2889700427 PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2122001750 PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2122004882 From never at openjdk.org Mon Jun 2 20:05:51 2025 From: never at openjdk.org (Tom Rodriguez) Date: Mon, 2 Jun 2025 20:05:51 GMT Subject: RFR: 8358254: [AOT] runtime/cds/appcds/applications/JavacBench.java#aot crashes with SEGV in ClassLoaderData::holder In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 15:47:29 GMT, Doug Simon wrote: > JVMCI needs to be aware of unloaded classes in type profiles just like [CI does](https://github.com/openjdk/jdk/pull/24886/files#diff-cda53c3ed39c4e59f73f3298933ebed1912daeaf854f0b31f40332be109f6c30R317). Marked as reviewed by never (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25592#pullrequestreview-2889770600 From mgronlun at openjdk.org Mon Jun 2 20:07:02 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 2 Jun 2025 20:07:02 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 13:50:49 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... 
different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Fix bug related to async stack walking src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 49: > 47: > 48: static bool is_excluded(JavaThread* thread) { > 49: return thread->is_hidden_from_external_view() || thread->jfr_thread_local()->is_excluded(); I think I misled you saying that JfrRecorder_thread would be excluded by the above expression. That was true - but not anymore. Our exclusion test looks like: static inline bool is_excluded(JavaThread* jt) { assert(jt != nullptr, "invariant"); return jt->is_Compiler_thread() || jt->is_hidden_from_external_view() || jt->is_JfrRecorder_thread() || jt->jfr_thread_local()->is_excluded(); } I like you could fold jt->is_Compiler_thread() into jt->is_hidden_from_external_view() - good!. But can you please again list the condition jt->is_JfrRecorder_thread() ? Sorry, I forgot we had removed it from being considered excluded on the JfrThreadLocal level. Thanks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2122045043 From dholmes at openjdk.org Mon Jun 2 21:10:51 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 2 Jun 2025 21:10:51 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v2] In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Mon, 2 Jun 2025 08:21:34 GMT, Kim Barrett wrote: >> Please review this change to permit the use of `noexcept` under certain >> circumstances in HotSpot code. >> >> http://wg21.link/n3050 >> >> Testing: >> >> JDK-8316930 (HotSpot should use noexcept instead of throw()) showed what the >> conversion would look like. It will need to be brought up to current mainline, >> possibly with modifications. 
>> >> This is a modification of the Style Guide, so rough consensus among the >> HotSpot Group members is required to make this change. Only Group members >> should vote for approval (via the github PR), though reasoned objections or >> comments from anyone will be considered. A decision on this proposal will not >> be made before Friday 16-June-2025 at 12h00 UTC. >> >> Since we're piggybacking on github PRs here, please use the PR review process >> to approve (click on Review Changes > Approve), rather than sending a "vote: >> yes" email reply that would be normal for a CFV. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > dholmes review Marked as reviewed by dholmes (Reviewer). doc/hotspot-style.html line 1121: > 1119:
Only the argument-less form of noexcept exception > 1120: specifications are permitted. noexcept exception > 1121: specifications with arguments are forbidden.
  • I was suggesting dropping the second sentence as it is implied by the first. ------------- PR Review: https://git.openjdk.org/jdk/pull/25574#pullrequestreview-2889941827 PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2122157846 From dholmes at openjdk.org Mon Jun 2 21:56:02 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 2 Jun 2025 21:56:02 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 18:37:14 GMT, Aleksey Shipilev wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix bug related to async stack walking > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 117: > >> 115: >> 116: bool JfrCPUTimeTraceQueue::is_empty() const { >> 117: return Atomic::load(&_head) == 0; > > Not entirely clear what is the memory semantics for accessing `_head`. Does it need to be acq/rel? If so, this one should be `::load_acquire`? Many of the accesses to head do not appear to synchronize with anything and so do not need acquire semantics. But the overall concurrency properties of this code are very unclear to me. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2122228261 From coleenp at openjdk.org Tue Jun 3 00:14:57 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 3 Jun 2025 00:14:57 GMT Subject: RFR: 8352075: Perf regression accessing fields [v20] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Mon, 2 Jun 2025 15:31:46 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . 
Based on both performance and memory consumption, it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup, we sort the fields based on the field name. The stream includes an extra table after the field information: for the fields at positions 16, 32, ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for the given fields, and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of the stream. In classes with > 16 fields we add an extra 4 bytes with the offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms …
42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Fix error on windows It all seems reasonable until I got to the packing code and it'll take a long time to figure out how it works. Maybe some comments would help. I have 3 general comments though: 1. The coding style guide somewhere says that the * belongs with the type and not the name. This is inconsistent in this code. Can you fix it? 2. Block comments (except copyright) should use // not /* */ 3. The jtreg test directory name should be not the bugid. I think this test can go in directory runtime/FieldLayout. src/hotspot/share/utilities/packedTable.hpp line 38: > 36: uint32_t _key_mask; > 37: unsigned int _value_shift; > 38: uint32_t _value_mask; Aren't all 4 of these types the same? can you make them all uint32_t or all unsigned int? (former preferred). 
------------- PR Review: https://git.openjdk.org/jdk/pull/24847#pullrequestreview-2890214085 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2122347635 From coleenp at openjdk.org Tue Jun 3 00:14:58 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 3 Jun 2025 00:14:58 GMT Subject: RFR: 8352075: Perf regression accessing fields [v20] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Mon, 2 Jun 2025 23:49:51 GMT, Coleen Phillimore wrote: >> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix error on windows > > src/hotspot/share/utilities/packedTable.hpp line 38: > >> 36: uint32_t _key_mask; >> 37: unsigned int _value_shift; >> 38: uint32_t _value_mask; > > Aren't all 4 of these types the same? can you make them all uint32_t or all unsigned int? (former preferred). Can you explain somewhere how fields are mapped to this? I assume they're sorted, for some reason I expected the packed table to be {name-cp-index, sig-cp-index, offset-in-fieldstream-for-direct-access}. Does every field get 4 ints ? So why is it packed into ```Array``` rather than just use ```Array```? So much packing code that I don't know how anyone could ever debug it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2122360613 From dholmes at openjdk.org Tue Jun 3 00:16:03 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 3 Jun 2025 00:16:03 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 09:24:53 GMT, Johannes Bechberger wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 119: >> >>> 117: _data = new_data; >>> 118: _capacity = capacity; >>> 119: } >> >> I assume there is a lock protecting this so it happens atomically? 
> > This happens before the signal handler is attached to thread. So it does happen before any parallelism is introduced on thread creation. I'm missing the big picture here unfortunately. This looks like it can get called repeatedly as needed to change capacity. Are you saying it only gets called once before we create the sampler thread? Is the concurrency model described somewhere? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2122365626 From dholmes at openjdk.org Tue Jun 3 00:28:05 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 3 Jun 2025 00:28:05 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: References: Message-ID: <_Q0iW6TuzM0P1qeE2XsMZbTx3lfCgW9QDEsf3-FlRYE=.b6707a06-3d91-4764-a8d8-7eaa76680584@github.com> On Mon, 2 Jun 2025 21:53:38 GMT, David Holmes wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 117: >> >>> 115: >>> 116: bool JfrCPUTimeTraceQueue::is_empty() const { >>> 117: return Atomic::load(&_head) == 0; >> >> Not entirely clear what is the memory semantics for accessing `_head`. Does it need to be acq/rel? If so, this one should be `::load_acquire`? > > Many of the accesses to head do not appear to synchronize with anything and so do not need acquire semantics. But the overall concurrency properties of this code are very unclear to me. To be clear, you only need acquire semantics here if after seeing the value 0 you need to access fields that were written before `_head` was set to 0. Similarly for most of the other access to `_head`. 
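The distinction drawn above can be sketched with standard C++ atomics. This is an illustration only: it uses `std::atomic` rather than HotSpot's `Atomic` API, and `SketchQueue` with all of its members is a hypothetical single-producer structure, not `JfrCPUTimeTraceQueue`. The assumption is that a reader needs acquire semantics only when it goes on to read data that was written before the counter was published.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical single-producer queue for illustration; NOT the JFR queue.
struct SketchQueue {
  static constexpr uint32_t kCapacity = 4;
  int _data[kCapacity] = {0, 0, 0, 0};
  std::atomic<uint32_t> _head{0};

  // Producer: write the payload first, then publish it with a release store.
  bool push(int value) {
    uint32_t h = _head.load(std::memory_order_relaxed);
    if (h >= kCapacity) {
      return false;  // full, nothing is published
    }
    _data[h] = value;
    _head.store(h + 1, std::memory_order_release);
    return true;
  }

  // Only the counter value itself is inspected, so a relaxed load suffices.
  bool is_empty() const {
    return _head.load(std::memory_order_relaxed) == 0;
  }

  // The reader goes on to read _data, so the load must be an acquire that
  // pairs with the producer's release store.
  bool peek_last(int* out) const {
    uint32_t h = _head.load(std::memory_order_acquire);
    if (h == 0) {
      return false;
    }
    *out = _data[h - 1];
    return true;
  }
};
```

In this sketch `is_empty()` never touches `_data`, which is why it can stay relaxed, while `peek_last()` depends on the payload being visible and therefore uses acquire.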
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2122374152 From dholmes at openjdk.org Tue Jun 3 00:53:57 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 3 Jun 2025 00:53:57 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Mon, 2 Jun 2025 08:07:38 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > TestBreakSignalThreadDump has issues with asan Changes look fine but I agree with Chris that we need to document why these tests don't work with ASAN, though I think I'd prefer to see an `@comment` before the `@requires !vm.asan` in the actual test files - assuming the reason can be stated clearly and succinctly. ------------- PR Review: https://git.openjdk.org/jdk/pull/25575#pullrequestreview-2890276148 From xgong at openjdk.org Tue Jun 3 01:49:07 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Tue, 3 Jun 2025 01:49:07 GMT Subject: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API In-Reply-To: References: Message-ID: On Fri, 30 May 2025 08:15:22 GMT, Xiaohong Gong wrote: >>> @XiaohongGong Thanks for splitting this one out, and for investigating the regressions here. 
>>> >>> Putting the permalink here, fixed to the current change (the link you pasted will always refer to the newest, which may later on point to the wrong line when lines above are inserted / deleted): >>> >>> https://github.com/openjdk/jdk/blob/7077535c0b0a6ea0a2a167f9135b1504a3d71fb3/src/hotspot/share/opto/loopnode.cpp#L1659-L1661 >>> >>> I wonder if we should just use `Node::uncast` there? But I'm quite unsure about that. >> >> Sounds good to me. I will have a deep investigation for it. Thanks! >> >> >> >>> > Yes, I also observed such regression. >>> > It would be nice if you proactively mentioned regressions, so it does not have to be pointed out by reviewers. >>> >>> For me, it could be ok to fix it in a follow-up patch. I think we are too close to RDP1 for JDK25 now anyway, and so we could push this patch here into JDK26, and then we have enough time in JDK26 to investigate the regression. Even better would be if we could do the other patch first, so we never even encounter a regression. >> >> Sounds good to me. Thanks! > >> > @XiaohongGong Thanks for splitting this one out, and for investigating the regressions here. >> > Putting the permalink here, fixed to the current change (the link you pasted will always refer to the newest, which may later on point to the wrong line when lines above are inserted / deleted): >> > https://github.com/openjdk/jdk/blob/7077535c0b0a6ea0a2a167f9135b1504a3d71fb3/src/hotspot/share/opto/loopnode.cpp#L1659-L1661 >> > >> > I wonder if we should just use `Node::uncast` there? But I'm quite unsure about that. >> >> Sounds good to me. I will have a deep investigation for it. Thanks! > > Hi @eme64 @jatin-bhateja, I'v created a PR https://github.com/openjdk/jdk/pull/25539 to fix this issue. With this change, the performance regression can be fixed as well. Could you please take a look at that change and help to run the test on different X86 machines? Thanks a lot! > @XiaohongGong I reviewed #25539. 
Since it is a relatively simple patch, I suggest that we integrate that one first, and come back to this here later. Is that ok for you? That's fine to me. Thanks for your review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25138#issuecomment-2933082670 From xgong at openjdk.org Tue Jun 3 01:49:07 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Tue, 3 Jun 2025 01:49:07 GMT Subject: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 01:45:57 GMT, Xiaohong Gong wrote: >>> > @XiaohongGong Thanks for splitting this one out, and for investigating the regressions here. >>> > Putting the permalink here, fixed to the current change (the link you pasted will always refer to the newest, which may later on point to the wrong line when lines above are inserted / deleted): >>> > https://github.com/openjdk/jdk/blob/7077535c0b0a6ea0a2a167f9135b1504a3d71fb3/src/hotspot/share/opto/loopnode.cpp#L1659-L1661 >>> > >>> > I wonder if we should just use `Node::uncast` there? But I'm quite unsure about that. >>> >>> Sounds good to me. I will have a deep investigation for it. Thanks! >> >> Hi @eme64 @jatin-bhateja, I'v created a PR https://github.com/openjdk/jdk/pull/25539 to fix this issue. With this change, the performance regression can be fixed as well. Could you please take a look at that change and help to run the test on different X86 machines? Thanks a lot! > >> @XiaohongGong I reviewed #25539. Since it is a relatively simple patch, I suggest that we integrate that one first, and come back to this here later. Is that ok for you? > > That's fine to me. Thanks for your review! > Hi @XiaohongGong , Looks good to me, thanks again for this re-factor !! > > Best Regards, Jatin Thanks so much for your review @jatin-bhateja ! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25138#issuecomment-2933083694 From kvn at openjdk.org Tue Jun 3 02:06:07 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Jun 2025 02:06:07 GMT Subject: RFR: 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder Message-ID: There is a difference between the AdapterFingerPrint allocation size, [compute_size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2227), which may not be aligned to the HeapWord size, and the [size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2421) used for copying during the AOT cache build, which is aligned and can be bigger than the allocation size. I added asserts to `AdapterFingerPrint` and `AdapterHandlerEntry` to make sure the sizes are correct. Both are used in the AOT cache build. I also moved `FreeHeap()` from `~AdapterFingerPrint()` to enforce the comment and simplify the executed code. Thanks to @MBaesken for finding the issue and @iklam for pointing out the cause. Testing: tier1-3, xcomp, stress. Higher tiers are still running.
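The size mismatch described above can be sketched in isolation. Everything in this block is an assumption made for illustration: the 8-byte word size, the header-plus-4-byte-entries layout, and the helper names are hypothetical and are not taken from sharedRuntime.cpp.

```cpp
#include <cassert>
#include <cstddef>

// Assumed word size for the sketch (stands in for the HeapWord size).
constexpr std::size_t kWordSize = 8;

// Round n up to the next multiple of kWordSize.
constexpr std::size_t align_up_to_word(std::size_t n) {
  return (n + kWordSize - 1) / kWordSize * kWordSize;
}

// Hypothetical allocation size: a fixed header plus `len` 4-byte entries.
// The result is not necessarily a multiple of the word size.
constexpr std::size_t allocation_size(std::size_t header, std::size_t len) {
  return header + len * 4;
}

// Size a copier would use if it rounds the object up to whole words.
constexpr std::size_t copy_size(std::size_t header, std::size_t len) {
  return align_up_to_word(allocation_size(header, len));
}

// True when copying copy_size() bytes would read past the allocation,
// which is the kind of access a sanitizer reports as a buffer overflow.
constexpr bool copy_overruns(std::size_t header, std::size_t len) {
  return copy_size(header, len) > allocation_size(header, len);
}
```

With an 8-byte header and one entry, the sketch allocates 12 bytes but would copy 16; with two entries both sizes are 16 and there is no overrun.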
------------- Commit messages: - 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder Changes: https://git.openjdk.org/jdk/pull/25604/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25604&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358289 Stats: 7 lines in 2 files changed: 3 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25604.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25604/head:pull/25604 PR: https://git.openjdk.org/jdk/pull/25604 From kvn at openjdk.org Tue Jun 3 02:12:50 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Jun 2025 02:12:50 GMT Subject: RFR: 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 02:01:02 GMT, Vladimir Kozlov wrote: > There is difference between AdapterFingerPrint allocation size [compute_size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2227) which may not be aligned to HeapWord size and [size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2421) used for copying during AOT cache build which is aligned and can be bigger than allocation size. > > I added asserts to `AdapterFingerPrint` and `AdapterHandlerEntry` to make sure sizes are correct. Both are used in AOT cache build. > > I also moved `FreeHeap()` from `~AdapterFingerPrint()` to enforce the comment and simplify executed code. > > Thanks to @MBaesken for finding the issue and @iklam for pointing the cause. > > Testing tier1-3, xcomp, stress. Higher tiers are still running. @MBaesken please test this. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25604#issuecomment-2933119129 From asmehra at openjdk.org Tue Jun 3 03:48:50 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Tue, 3 Jun 2025 03:48:50 GMT Subject: RFR: 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 02:01:02 GMT, Vladimir Kozlov wrote: > There is difference between AdapterFingerPrint allocation size [compute_size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2227) which may not be aligned to HeapWord size and [size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2421) used for copying during AOT cache build which is aligned and can be bigger than allocation size. > > I added asserts to `AdapterFingerPrint` and `AdapterHandlerEntry` to make sure sizes are correct. Both are used in AOT cache build. > > I also moved `FreeHeap()` from `~AdapterFingerPrint()` to enforce the comment and simplify executed code. > > Thanks to @MBaesken for finding the issue and @iklam for pointing the cause. > > Testing tier1-3, xcomp, stress. Higher tiers are still running. @iklam @MBaesken Nice catch. @vnkozlov thanks for fixing it. I realized `compute_size()` does not use `sig_bt` parameter. Since you are touching this code, can you please remove it as well. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25604#issuecomment-2933306923 From xpeng at openjdk.org Tue Jun 3 05:36:23 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 3 Jun 2025 05:36:23 GMT Subject: RFR: 8354555: Add generic JFR events for TaskTerminator [v6] In-Reply-To: <_7FP2wNe8p3N8SxKdmCN1x4zKO8TT5JWRcWEt51i35c=.4fbac292-3cb7-48b9-922e-1114f74e0549@github.com> References: <_7FP2wNe8p3N8SxKdmCN1x4zKO8TT5JWRcWEt51i35c=.4fbac292-3cb7-48b9-922e-1114f74e0549@github.com> Message-ID: > The purpose of the PR is to add generic JFR events for TaskTerminator to track the attempts and timings that GC threads have tried to terminate GC tasks. > > Today only G1 emits JFR event with name `Termination` from [G1ParEvacuateFollowersClosure](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/g1/g1YoungCollector.cpp#L555-L563), all other garbage collectors don't emit any JFR event for the termination attempt at all. > > By adding this, it gives performance engineers the visibility to the termination attempts and termination time when GC threads trying to finish GC tasks, we could build tool to analyze the jfr events to determine if there is potential data structure issue in application code, e.g. very large LinkedList or LinkedBlockingQueue. 
> > For the test, I have manually tested different GCs with Flight Recording enabled and verified the events: > G1: > > jdk.GCPhaseParallel { > startTime = 23:09:34.124 (2025-05-22) > duration = 0.0108 ms > gcId = 0 > gcWorkerId = 8 > name = "Termination" > eventThread = "GC Thread#4" (osThreadId = 20483) > } > > jdk.GCPhaseParallel { > startTime = 23:09:34.124 (2025-05-22) > duration = 0.0467 ms > gcId = 0 > gcWorkerId = 2 > name = "Termination" > eventThread = "GC Thread#2" (osThreadId = 21251) > } > > jdk.GCPhaseParallel { > startTime = 23:09:34.124 (2025-05-22) > duration = 0.0474 ms > gcId = 0 > gcWorkerId = 1 > name = "Termination" > eventThread = "GC Thread#8" (osThreadId = 36359) > } > jdk.GCPhaseParallel { > startTime = 23:09:41.925 (2025-05-22) > duration = 0.000834 ms > gcId = 14 > gcWorkerId = 7 > name = "Termination: Parallel Marking" > eventThread = "GC Thread#1" (osThreadId = 21507) > } > > jdk.GCPhaseParallel { > startTime = 23:09:41.925 (2025-05-22) > duration = 0.000166 ms > gcId = 14 > gcWorkerId = 7 > name = "Termination: Parallel Marking" > eventThread = "GC Thread#1" (osThreadId = 21507) > } > > > Shenandoah: > > jdk.GCPhaseParallel { > startTime = 23:39:58.890 (2025-05-22) > duration = 0.0202 ms > gcId = 0 > gcWorkerId = 0 > name = "Termination: Concurrent Mark" > eventThread = "Shenandoah GC Threads#3" (osThreadId = 13827) > } > > jdk.GCPhaseParallel { > startTime = 23:39:58.890 (2025-05-22) > duration = 0.0205 ms > gcId = 0 > gcWorkerId = 1 > name = "Termination: Concurrent Mark" > eventThread = "Shenandoah GC Threads#1" (osThreadId = 14339) > } > > jdk.GCPhaseParallel { > startTime = 23:39:58.890 (2025-05-22) > duration = 0.0127 ms > gcId = 0 > gcWorkerId = 5 > name = "Termination: Final Mark" > eventThread = "Shenandoah G... Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 30 commits: - Merge branch 'openjdk:master' into JDK-8354555 - Merge branch 'openjdk:master' into JDK-8354555 - Fix jft test failure - Merge branch 'master' into JDK-8354555 - Patch to fix the PR concerns - Emit exact same events for G1 as G1 is emitting today from G1EvacuateRegionsBaseTask and G1STWRefProcProxyTask - Add include "workerThread.hpp" - Touch up - Move TERMINATION_EVENT_NAME_PREFIX_ASSERT to taskTerminator.cpp - Fix ident - ... and 20 more: https://git.openjdk.org/jdk/compare/832c5b06...8fb9a402 ------------- Changes: https://git.openjdk.org/jdk/pull/24676/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24676&range=05 Stats: 90 lines in 10 files changed: 68 ins; 7 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/24676.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24676/head:pull/24676 PR: https://git.openjdk.org/jdk/pull/24676 From cslucas at openjdk.org Tue Jun 3 05:40:09 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 3 Jun 2025 05:40:09 GMT Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum instead of "const char*" [v2] In-Reply-To: References: Message-ID: > Please review this refactor to transform the reasons for making an nmethod not entrant from `const char*` into enum values. > > Tested on Linux x64 with JTREG tier1-3 in fastdebug and release mode. Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: Address PR feedback: modify emum to be scoped. 
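A minimal sketch of the kind of refactoring this PR describes, replacing free-form `const char*` reasons with a scoped enum. The enum name, the enumerators, and the `reason_name` helper are hypothetical and chosen for illustration; the names used in the PR itself may differ.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical reasons. A scoped enum keeps the enumerator names out of the
// enclosing scope and prevents implicit conversion to int.
enum class NotEntrantReason {
  unknown,
  deoptimize,
  class_unloading
};

// Map each reason back to a printable string, e.g. for logging.
inline const char* reason_name(NotEntrantReason r) {
  switch (r) {
    case NotEntrantReason::deoptimize:      return "deoptimize";
    case NotEntrantReason::class_unloading: return "class unloading";
    case NotEntrantReason::unknown:         break;
  }
  return "unknown";
}
```

A caller would then pass `NotEntrantReason::deoptimize` instead of a string literal, so a misspelled reason becomes a compile error rather than a silently different log message.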
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25338/files - new: https://git.openjdk.org/jdk/pull/25338/files/933b958d..b3bb4365 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25338&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25338&range=00-01 Stats: 83 lines in 13 files changed: 55 ins; 4 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/25338.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25338/head:pull/25338 PR: https://git.openjdk.org/jdk/pull/25338 From rvansa at openjdk.org Tue Jun 3 05:53:55 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 3 Jun 2025 05:53:55 GMT Subject: RFR: 8352075: Perf regression accessing fields [v20] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Tue, 3 Jun 2025 00:05:35 GMT, Coleen Phillimore wrote: >> src/hotspot/share/utilities/packedTable.hpp line 38: >> >>> 36: uint32_t _key_mask; >>> 37: unsigned int _value_shift; >>> 38: uint32_t _value_mask; >> >> Aren't all 4 of these types the same? can you make them all uint32_t or all unsigned int? (former preferred). > > Can you explain somewhere how fields are mapped to this? I assume they're sorted, for some reason I expected the packed table to be {name-cp-index, sig-cp-index, offset-in-fieldstream-for-direct-access}. Does every field get 4 ints ? So why is it packed into ```Array``` rather than just use ```Array```? So much packing code that I don't know how anyone could ever debug it. Yes, in practice these all are of the same size, but in case of the masks (as well as in case of arguments in API) I want to stress out that these are 32 bit numbers. The `unsigned int`s are just 'some not too big number'. Is there any general guidance on deciding between `unsigned int` (I suppose just `unsigned` is not recommended), `uint32_t` and `u4`? 
I was hoping that the comment on line 68 explains the intended use, but I can be more verbose and document each method. When the packed table is used for fieldinfo, it's { offset-in-fieldstream, index-in-fieldstream }. The Comparator implementation can translate offset-in-fieldstream -> { name, signature } and then do the comparison. The `index-in-fieldstream` is kind of second-class citizen; we need to fill it into `FieldInfo` and it is not encoded in the stream, therefore we need to encode it in the packed table. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2122780819 From aboldtch at openjdk.org Tue Jun 3 06:00:57 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 3 Jun 2025 06:00:57 GMT Subject: RFR: 8357155: [asan] ZGC does not work (x86_64 and ppc64) [v2] In-Reply-To: References: Message-ID: <_p5h0MfOc1LQ2g30xDHYJf9v_B2QbJmJ0El0vc_u6zM=.6461af5c-c4cd-442c-a16e-c9578484f10c@github.com> On Mon, 2 Jun 2025 10:38:18 GMT, Martin Doerr wrote: >> src/hotspot/cpu/ppc/gc/z/zAddress_ppc.cpp line 95: >> >>> 93: const size_t max_address_offset_bits = valid_max_address_offset_bits - 3; >>> 94: #ifdef ADDRESS_SANITIZER >>> 95: return max_address_offset_bits; >> >> I think this actually has to be >> ```c++ >> return MIN2(valid_max_address_offset_bits, 44); >> >> >> Because the way we probe we may otherwise return 45 here. Which could result in more than 44 bits in a ZOffset which our internal data structures cannot handle. Hopefully this still works for ASAN on PPC. (The `-3` is a left over from non-generational ZGC). Aarch64 could do the same, but it does not have this issue as it starts its probing at bit 46, not bit 47. >> >> _Side note: This makes me realise that there probably is a bug here on PPC and RISCV if running on a NUMA machine with more than 8 TB heap. As after ZGlobalsPointers::min_address_offset_request() was introduced we can return 45 from this function._ > > @xmas92: Thanks for looking into this! 
Should we set `DEFAULT_MAX_ADDRESS_BIT = 44` and use the constant? > Or maybe file a separate issue for fixing that on aarch64, PPC64 and riscv (and also remove the -3 from the `max_address_offset_bits computation`)? [JDK-8358310](https://bugs.openjdk.org/browse/JDK-8358310) / #25578 is open right now as a quick fix for returning a too large value without cleaning up the implementation. (As a fix for 25) This was noted back in https://github.com/openjdk/jdk/pull/18941#issuecomment-2079316745 ([JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275)), but I think fixing this fell through the cracks. I currently have a rewrite in the works which overhauls the heap base selection, which I plan to get into 26. In that patch all the non-generational legacy is removed. So we no longer probe based on the assumption that we need 3 extra high order bits. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25549#discussion_r2122787593 From aboldtch at openjdk.org Tue Jun 3 06:00:58 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 3 Jun 2025 06:00:58 GMT Subject: RFR: 8357155: [asan] ZGC does not work (x86_64 and ppc64) [v2] In-Reply-To: <_p5h0MfOc1LQ2g30xDHYJf9v_B2QbJmJ0El0vc_u6zM=.6461af5c-c4cd-442c-a16e-c9578484f10c@github.com> References: <_p5h0MfOc1LQ2g30xDHYJf9v_B2QbJmJ0El0vc_u6zM=.6461af5c-c4cd-442c-a16e-c9578484f10c@github.com> Message-ID: On Tue, 3 Jun 2025 05:56:46 GMT, Axel Boldt-Christmas wrote: >> @xmas92: Thanks for looking into this! Should we set `DEFAULT_MAX_ADDRESS_BIT = 44` and use the constant? >> Or maybe file a separate issue for fixing that on aarch64, PPC64 and riscv (and also remove the -3 from the `max_address_offset_bits computation`)? > > [JDK-8358310](https://bugs.openjdk.org/browse/JDK-8358310) / #25578 is open right now as a quick fix for returning a too large value without cleaning up the implementation. 
(As a fix for 25) > > This was noted back in https://github.com/openjdk/jdk/pull/18941#issuecomment-2079316745 ([JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275)), but I think fixing this fell through the cracks. > > I currently have a rewrite in the works which overhauls the heap base selection, which I plan to get into 26. In that patch all the non-generational legacy is removed. So we no longer probe based on the assumption that we need 3 extra high order bits. But I will make sure to create an issue for this overhaul, so it does not get lost. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25549#discussion_r2122788828 From dnsimon at openjdk.org Tue Jun 3 06:21:55 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 3 Jun 2025 06:21:55 GMT Subject: RFR: 8358254: [AOT] runtime/cds/appcds/applications/JavacBench.java#aot crashes with SEGV in ClassLoaderData::holder In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 15:47:29 GMT, Doug Simon wrote: > JVMCI needs to be aware of unloaded classes in type profiles just like [CI does](https://github.com/openjdk/jdk/pull/24886/files#diff-cda53c3ed39c4e59f73f3298933ebed1912daeaf854f0b31f40332be109f6c30R317). Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25592#issuecomment-2933645017 From dnsimon at openjdk.org Tue Jun 3 06:21:56 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 3 Jun 2025 06:21:56 GMT Subject: Integrated: 8358254: [AOT] runtime/cds/appcds/applications/JavacBench.java#aot crashes with SEGV in ClassLoaderData::holder In-Reply-To: References: Message-ID: <7NMm1SDt9UaKrkgEPeFaSbkz97Lwqof1TVjyAKEyGY4=.d4792765-e2e2-4e0c-8a28-b5583cfed394@github.com> On Mon, 2 Jun 2025 15:47:29 GMT, Doug Simon wrote: > JVMCI needs to be aware of unloaded classes in type profiles just like [CI does](https://github.com/openjdk/jdk/pull/24886/files#diff-cda53c3ed39c4e59f73f3298933ebed1912daeaf854f0b31f40332be109f6c30R317). 
This pull request has now been integrated. Changeset: 497a1822 Author: Doug Simon URL: https://git.openjdk.org/jdk/commit/497a1822cabcc0475ce0495d56430f1e99b1fb13 Stats: 33 lines in 4 files changed: 13 ins; 16 del; 4 mod 8358254: [AOT] runtime/cds/appcds/applications/JavacBench.java#aot crashes with SEGV in ClassLoaderData::holder Reviewed-by: never ------------- PR: https://git.openjdk.org/jdk/pull/25592 From jbechberger at openjdk.org Tue Jun 3 06:58:03 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 06:58:03 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 13:50:49 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Fix bug related to async stack walking Regarding https://github.com/openjdk/jdk/pull/25302#discussion_r2119984783 raw_thread == nullptr This seems to happen rarely on (abrupt) shutdowns. 
I attached an hs_err file: [hs_err_pid1688961.log](https://github.com/user-attachments/files/20563594/hs_err_pid1688961.log) ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2933774229 From jbechberger at openjdk.org Tue Jun 3 07:05:47 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 07:05:47 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v30] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: - Check for raw_thread == nullptr - Move async stackwalking to JFR ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/09ca4fed..bef52132 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=29 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=28-29 Stats: 89 lines in 3 files changed: 37 ins; 49 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From jbechberger at openjdk.org Tue Jun 3 07:12:46 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 07:12:46 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v31] In-Reply-To: References: Message-ID: 
<_4vRA_P9_dLG022vs8ZinaZmqC48drRAwdOSiDG9Wjk=.25880197-6c87-4faf-8259-12d6c0f10f2e@github.com>

> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
>
> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with
> - ... different heap sizes
> - ... different GCs
> - ... different samplers (the standard JFR and the new CPU Time Sampler and both)
> - ... different JFR recording durations
> - ... different chunk-sizes

Johannes Bechberger has updated the pull request incrementally with three additional commits since the last revision:

 - Update src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp

   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp

   Co-authored-by: Aleksey Shipilëv
 - Small fixes

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25302/files
  - new: https://git.openjdk.org/jdk/pull/25302/files/bef52132..c3dedefb

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=30
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=29-30

  Stats: 17 lines in 4 files changed: 2 ins; 5 del; 10 mod
  Patch: https://git.openjdk.org/jdk/pull/25302.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302

PR: https://git.openjdk.org/jdk/pull/25302

From rvansa at openjdk.org Tue Jun 3 07:16:47 2025
From: rvansa at openjdk.org (Radim Vansa)
Date: Tue, 3 Jun 2025 07:16:47 GMT
Subject: RFR: 8352075: Perf regression accessing fields [v21]
In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com>
References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com>
Message-ID:

> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some
scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 .
>
> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields).
>
> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either.
>
> My measurements on the attached reproducer
>
> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC'
> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC
> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms]
> Range (min … max): 45.1 ms … 53.9 ms 100 runs
>
> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC'
> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC
> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms]
> Range (min … max): 73.8 ms … 79.7 ms 100 runs
>
> (the jdk25-master above already contains JDK-8353175)
>
> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC'
> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC
> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms]
> Range (min … max): 37.7 ms …
42.1 ms 100 runs
>
> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement:
>
> JDK 17: 1.6 s
> JDK 21 (no patches): 22 s
> JDK25-master: 12.3 s
> JDK25-this-pr: 0.5 s

Radim Vansa has updated the pull request incrementally with three additional commits since the last revision:

 - Moved jtreg test
 - Improved documentation
 - Fix coding style (asterisk placement)

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24847/files
  - new: https://git.openjdk.org/jdk/pull/24847/files/7d8b4a19..862b264b

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=20
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=19-20

  Stats: 99 lines in 11 files changed: 31 ins; 2 del; 66 mod
  Patch: https://git.openjdk.org/jdk/pull/24847.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847

PR: https://git.openjdk.org/jdk/pull/24847

From jbechberger at openjdk.org Tue Jun 3 07:17:09 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Tue, 3 Jun 2025 07:17:09 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29]
In-Reply-To:
References:
Message-ID:

On Mon, 2 Jun 2025 20:02:15 GMT, Markus Grönlund wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Fix bug related to async stack walking
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 49:
>
>> 47:
>> 48: static bool is_excluded(JavaThread* thread) {
>> 49: return thread->is_hidden_from_external_view() || thread->jfr_thread_local()->is_excluded();
>
> I think I misled you saying that JfrRecorder_thread would be excluded by the above expression. That was true - but not anymore.
> > Our exclusion test looks like: > > static inline bool is_excluded(JavaThread* jt) { > assert(jt != nullptr, "invariant"); > return jt->is_Compiler_thread() || jt->is_hidden_from_external_view() || jt->is_JfrRecorder_thread() || jt->jfr_thread_local()->is_excluded(); > } > > I like you could fold jt->is_Compiler_thread() into jt->is_hidden_from_external_view() - good!. > > But can you please again list the condition jt->is_JfrRecorder_thread()? Sorry, I forgot we had removed it from being considered excluded on a per JfrThreadLocal level. Thanks. No problem. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2122934873 From rvansa at openjdk.org Tue Jun 3 07:18:58 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 3 Jun 2025 07:18:58 GMT Subject: RFR: 8352075: Perf regression accessing fields [v15] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Sat, 31 May 2025 14:49:48 GMT, Coleen Phillimore wrote: >> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: >> >> More debug logs > > I can reproduce the crash when building slowdebug on linux-x64. @coleenp I fixed the coding style (I wish OpenJDK had a linter, or at least a checker... the asterisk placement is hard to get used to), improved docs and moved the jtreg test to runtime/FieldStream (I think that FieldLayout checks how are fields placed within an instance). ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2933845743 From lliu at openjdk.org Tue Jun 3 07:19:11 2025 From: lliu at openjdk.org (Liming Liu) Date: Tue, 3 Jun 2025 07:19:11 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU Message-ID: This PR is to enable the use of crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU. 
There is an option UseCryptoPmullForCRC32 that can enable crypto pmull, but directly enabling it on Ampere CPU will cause the following problems. 1. There will be regressions (-14% ~ -8%) on Ampere1 when the length is 64. When <= 128, both kernel_crc32_using_crc32 and kernel_crc32_using_crypto_pmull use the loop labeled as CRC_by32_loop, but their implements are a little different, and the loop in kernel_crc32_using_crc32 is better at hiding latency on Ampere1. So this PR takes the loop in kernel_crc32_using_crc32 to kernel_crc32_using_crypto_pmull, and does the same for CRC32C intrinsic. 2. The intrinsics only use crypto pmull when the length is higher than 383, while the loop in kernel_crc32_common_fold_using_crypto_pmull looks able to handle 256, and if it handles 256 on Ampere1, the improvements can be as high as 110% compared with kernel_crc32_using_crc32/kernel_crc32c_using_crc32c. However, there are regressions (~-6%) on Neoverse V1 when the length is 256. So this PR introduces a new option named CryptoPmullForCRC32LowLimit. It defaults to 256 since the code could handle 256, while it is set to 384 for V1/V2 to keep the old behavior on these platforms. 
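The length threshold described in point 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: `Crc32Kernel` and `select_crc32_kernel` are hypothetical names that do not exist in HotSpot, and the real selection happens inside the AArch64 macro assembler, not in a helper like this. The `low_limit` parameter plays the role of the proposed CryptoPmullForCRC32LowLimit.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical labels for the two code paths named in the description.
enum class Crc32Kernel {
  UsingCrc32Instructions,  // stands in for kernel_crc32_using_crc32
  UsingCryptoPmull         // stands in for the crypto pmull folding loop
};

// Hypothetical helper: take the pmull path only once the input is at least
// low_limit bytes long. With low_limit == 384 this matches the old rule
// ("length higher than 383"); with low_limit == 256 it matches the new default.
constexpr Crc32Kernel select_crc32_kernel(std::size_t len, std::size_t low_limit) {
  return len >= low_limit ? Crc32Kernel::UsingCryptoPmull
                          : Crc32Kernel::UsingCrc32Instructions;
}
```

Under these assumptions a 256-byte input would take the pmull path with the proposed default of 256, but stay on the crc32 instruction path with the limit of 384 that is kept for V1/V2.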
The performance regressions and improvements were measured with the following microbenchmarks: org.openjdk.bench.java.util.TestCRC32.testCRC32Update org.openjdk.bench.java.util.TestCRC32C.testCRC32CUpdate Ran the following JTReg tests on Ampere1 and did not find problems: test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32C.java ------------- Commit messages: - Use the utility functions - Introduce CryptoPmullForCRC32LowLimit and use pmull for crc32 on Ampere CPU Changes: https://git.openjdk.org/jdk/pull/25609/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25609&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358032 Stats: 28 lines in 3 files changed: 17 ins; 3 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/25609.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25609/head:pull/25609 PR: https://git.openjdk.org/jdk/pull/25609 From dholmes at openjdk.org Tue Jun 3 07:39:02 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 3 Jun 2025 07:39:02 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v31] In-Reply-To: <_4vRA_P9_dLG022vs8ZinaZmqC48drRAwdOSiDG9Wjk=.25880197-6c87-4faf-8259-12d6c0f10f2e@github.com> References: <_4vRA_P9_dLG022vs8ZinaZmqC48drRAwdOSiDG9Wjk=.25880197-6c87-4faf-8259-12d6c0f10f2e@github.com> Message-ID: On Tue, 3 Jun 2025 07:12:46 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... 
different chunk-sizes
>
> Johannes Bechberger has updated the pull request incrementally with three additional commits since the last revision:
>
> - Update src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp
>
> Co-authored-by: Aleksey Shipilëv
> - Update src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp
>
> Co-authored-by: Aleksey Shipilëv
> - Small fixes

> Regarding [#25302 (comment)](https://github.com/openjdk/jdk/pull/25302#discussion_r2119984783)
>
> ```
> raw_thread == nullptr
> ```
>
> This seems to happen rarely on (abrupt) shutdowns. I attached an hs_err file: [hs_err_pid1688961.log](https://github.com/user-attachments/files/20563594/hs_err_pid1688961.log)

That is interesting. The signal appears to be being handled on an unattached thread during shutdown, and there is no stack left to show any VM involvement. Possibly we need to block the signal as part of thread termination, before we clear the current thread. ?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2933916367

From jbechberger at openjdk.org Tue Jun 3 07:44:33 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Tue, 3 Jun 2025 07:44:33 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v32]
In-Reply-To:
References:
Message-ID:

> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
>
> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with
> - ... different heap sizes
> - ... different GCs
> - ... different samplers (the standard JFR and the new CPU Time Sampler and both)
> - ... different JFR recording durations
> - ...
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: - Remove includes and other lines - Fix is_excluded ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/c3dedefb..ab47f680 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=31 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=30-31 Stats: 25 lines in 17 files changed: 4 ins; 5 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From jbechberger at openjdk.org Tue Jun 3 07:44:34 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 07:44:34 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v31] In-Reply-To: <_4vRA_P9_dLG022vs8ZinaZmqC48drRAwdOSiDG9Wjk=.25880197-6c87-4faf-8259-12d6c0f10f2e@github.com> References: <_4vRA_P9_dLG022vs8ZinaZmqC48drRAwdOSiDG9Wjk=.25880197-6c87-4faf-8259-12d6c0f10f2e@github.com> Message-ID: On Tue, 3 Jun 2025 07:12:46 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... 
different chunk-sizes
>
> Johannes Bechberger has updated the pull request incrementally with three additional commits since the last revision:
>
> - Update src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp
>
> Co-authored-by: Aleksey Shipilëv
> - Update src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp
>
> Co-authored-by: Aleksey Shipilëv
> - Small fixes

We already do:

    void JfrCPUTimeThreadSampler::on_javathread_terminate(JavaThread* thread) {
      JfrThreadLocal* tl = thread->jfr_thread_local();
      assert(tl != nullptr, "invariant");
      timer_t* timer = tl->cpu_timer();
      if (timer == nullptr) {
        return; // no timer was created for this thread
      }
      timer_delete(*timer);

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2933945034

From mdoerr at openjdk.org Tue Jun 3 07:54:52 2025
From: mdoerr at openjdk.org (Martin Doerr)
Date: Tue, 3 Jun 2025 07:54:52 GMT
Subject: RFR: 8358310: ZGC: riscv, ppc ZPlatformAddressOffsetBits may return a too large value
In-Reply-To: <6j_zozeh-Vwu3tRHRlJ5h_mhcMFsNm_OMUinAosz8fU=.d51c8c95-aad1-4566-a23b-8da5b521aa90@github.com>
References: <6j_zozeh-Vwu3tRHRlJ5h_mhcMFsNm_OMUinAosz8fU=.d51c8c95-aad1-4566-a23b-8da5b521aa90@github.com>
Message-ID:

On Mon, 2 Jun 2025 08:55:02 GMT, Axel Boldt-Christmas wrote:

> The way that ZPlatformAddressOffsetBits is implemented on riscv and ppc may result in a return value of 45. This is larger than the max supported value of 44 (because of other internal data structures). This was fixed in [JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275) for aarch64.
>
> Before [JDK-8350441](https://bugs.openjdk.org/browse/JDK-8350441) the issue only manifested if one tried to select a heap larger than 16 TB (not supported), but after [JDK-8350441](https://bugs.openjdk.org/browse/JDK-8350441) we try to double the heap address space when running on a NUMA machine. So we may now encounter this bug for heaps larger than 8TB (which is supported).
> > While ZPlatformAddressOffsetBits needs an overhaul. (It was written for non-generational ZGC where we had the three color bits inside the address.) The proposal is that we solve this for ppc and riscv by doing the same thing we did for aarch64 in [JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275) Thanks for fixing it! ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25578#pullrequestreview-2891147685 From mdoerr at openjdk.org Tue Jun 3 08:02:00 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 3 Jun 2025 08:02:00 GMT Subject: RFR: 8357155: [asan] ZGC does not work (x86_64 and ppc64) [v2] In-Reply-To: References: Message-ID: <0808aEXLDKNUY6rsNCbjRjs_O0BaPLrCsX7q2zjpzus=.8ea987cf-fb41-47c5-9df3-840bc939f99a@github.com> On Mon, 2 Jun 2025 09:11:05 GMT, Matthias Baesken wrote: >> Many (all?) ZGC related jtreg tests do not work when the JDK is built with address sanitizer asan enabled (configure flag --enable-asan). >> This can be seen on SUSE Linux x86_64 and also on ppc64le , opt binaries were used. >> It has been suggested to do a workaround - 'But I think that simply adapting the zAddress_[...].cpp implementations to always select the largest heap base would go a long way for providing ASAN compatibility.' >> This seems to work nicely on x86_64 and ppc64le, however the zgc related tests still fail on Linux aarch64 (should I exclude this platform from my patch?) . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > remove aarch64 from the change, adjust ppc64 src/hotspot/cpu/ppc/gc/z/zAddress_ppc.cpp line 95: > 93: const size_t max_address_offset_bits = valid_max_address_offset_bits - 3; > 94: #ifdef ADDRESS_SANITIZER > 95: return MIN2(valid_max_address_offset_bits, (size_t)44); I think this PR is ok, but please add a comment like "The max supported value is 44 because of other internal data structures.". 
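
A minimal sketch of the clamp with such a comment, written as standalone C++ so it can be read outside HotSpot. Assumptions: `kMaxSupportedOffsetBits` and `clamp_address_offset_bits` are hypothetical names that are not part of zAddress_ppc.cpp, and `std::min` stands in for HotSpot's `MIN2`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical constant. The max supported value is 44 because of other
// internal data structures.
constexpr std::size_t kMaxSupportedOffsetBits = 44;

// Hypothetical helper: whatever number of address offset bits the platform
// probe reports, never hand more than the supported maximum to the caller.
constexpr std::size_t clamp_address_offset_bits(std::size_t probed_bits) {
  return std::min(probed_bits, kMaxSupportedOffsetBits);
}
```

With this shape a probe result of 45 would be reduced to 44, while smaller results would pass through unchanged.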
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25549#discussion_r2123048021 From aph at openjdk.org Tue Jun 3 08:37:51 2025 From: aph at openjdk.org (Andrew Haley) Date: Tue, 3 Jun 2025 08:37:51 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 07:14:03 GMT, Liming Liu wrote: > This PR is to enable the use of crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU. There is an option UseCryptoPmullForCRC32 that can enable crypto pmull, but directly enabling it on Ampere CPU will cause the following problems. > > 1. There will be regressions (-14% ~ -8%) on Ampere1 when the length is 64. When <= 128, both kernel_crc32_using_crc32 and kernel_crc32_using_crypto_pmull use the loop labeled as CRC_by32_loop, but their implements are a little different, and the loop in kernel_crc32_using_crc32 is better at hiding latency on Ampere1. So this PR takes the loop in kernel_crc32_using_crc32 to kernel_crc32_using_crypto_pmull, and does the same for CRC32C intrinsic. > > 2. The intrinsics only use crypto pmull when the length is higher than 383, while the loop in kernel_crc32_common_fold_using_crypto_pmull looks able to handle 256, and if it handles 256 on Ampere1, the improvements can be as high as 110% compared with kernel_crc32_using_crc32/kernel_crc32c_using_crc32c. However, there are regressions (~-6%) on Neoverse V1 when the length is 256. So this PR introduces a new option named CryptoPmullForCRC32LowLimit. It defaults to 256 since the code could handle 256, while it is set to 384 for V1/V2 to keep the old behavior on these platforms. 
> > The performance regressions and improvements were measured with the following microbenchmarks: > org.openjdk.bench.java.util.TestCRC32.testCRC32Update > org.openjdk.bench.java.util.TestCRC32C.testCRC32CUpdate > > Ran the following JTReg tests on Ampere1 and did not find problems: > test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java > test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32C.java src/hotspot/cpu/aarch64/globals_aarch64.hpp line 95: > 93: "Minimum size in bytes when Crypto PMULL will be used." \ > 94: "Value must be a multiple of 128.") \ > 95: range(256, max_jint) \ This shouldn't be a general product flag. Suggestion: product(intx, CryptoPmullForCRC32LowLimit, 256, DIAGNOSTIC, \ "Minimum size in bytes when Crypto PMULL will be used." \ "Value must be a multiple of 128.") \ range(256, max_jint) \ ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2123135028 From jbechberger at openjdk.org Tue Jun 3 08:42:56 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 08:42:56 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v33] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Fix non Linux builds ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/ab47f680..ef9f9cd1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=32 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=31-32 Stats: 4 lines in 1 file changed: 2 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From shade at openjdk.org Tue Jun 3 08:45:53 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 08:45:53 GMT Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum instead of "const char*" [v2] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 08:36:58 GMT, Aleksey Shipilev wrote: >> Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: >> >> Address PR feedback: modify emum to be scoped. > > src/hotspot/share/code/nmethod.hpp line 498: > >> 496: >> 497: >> 498: static const char* NMethodChangeReason_to_string(NMethodChangeReason reason) { > > Uh, use a switch: > > > switch(reason) { > case C1_deoptimize: return "C1 deoptimized"; > case C1_codepatch: return "C1 code patch"; > ... > default: > assert(false, "Unhandled reason"); > return "Unknown"; > } Also, names: `change_reason_to_string(ChangeReason reason)`. Now that enum is scoped to `nmethod`, there is no need for `NMethod` prefix. 
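
The suggestion above could look roughly like the following sketch. Assumptions: `nmethod_sketch` is a hypothetical stand-in for `nmethod`, only the two enumerators quoted in the review are shown, and the enum is written as an `enum class`, which requires qualifying the case labels (the PR may instead use a plain enum nested in the class).

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the nmethod class; not HotSpot code.
struct nmethod_sketch {
  // Nested in the class, so the enum needs no NMethod prefix.
  enum class ChangeReason {
    C1_deoptimize,
    C1_codepatch
    // ... remaining reasons elided, as in the review comment
  };

  // Map each reason to a human-readable string with a switch.
  static const char* change_reason_to_string(ChangeReason reason) {
    switch (reason) {
      case ChangeReason::C1_deoptimize: return "C1 deoptimized";
      case ChangeReason::C1_codepatch:  return "C1 code patch";
      default:
        // HotSpot code would also assert(false, "Unhandled reason") here;
        // this sketch only returns the fallback string.
        return "Unknown";
    }
  }
};
```

Callers would then write `nmethod_sketch::change_reason_to_string(nmethod_sketch::ChangeReason::C1_codepatch)` instead of passing a `const char*` around.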
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25338#discussion_r2123148319 From shade at openjdk.org Tue Jun 3 08:45:52 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 08:45:52 GMT Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum instead of "const char*" [v2] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 05:40:09 GMT, Cesar Soares Lucas wrote: >> Please review this refactor to transform the reasons for making an nmethod not entrant from `const char*` into enum values. >> >> Tested on Linux x64 with JTREG tier1-3 in fastdebug and release mode. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Address PR feedback: modify emum to be scoped. Looks conceptually fine. Cosmetics: src/hotspot/share/code/nmethod.hpp line 498: > 496: > 497: > 498: static const char* NMethodChangeReason_to_string(NMethodChangeReason reason) { Uh, use a switch: switch(reason) { case C1_deoptimize: return "C1 deoptimized"; case C1_codepatch: return "C1 code patch"; ... default: assert(false, "Unhandled reason"); return "Unknown"; } src/hotspot/share/jvmci/jvmciEnv.cpp line 1755: > 1753: > 1754: > 1755: void JVMCIEnv::invalidate_nmethod_mirror(JVMCIObject mirror, bool deoptimize, nmethod::NMethodChangeReason statusReason, JVMCI_TRAPS) { Suggestion: void JVMCIEnv::invalidate_nmethod_mirror(JVMCIObject mirror, bool deoptimize, nmethod::NMethodChangeReason change_reason, JVMCI_TRAPS) { src/hotspot/share/jvmci/jvmciEnv.hpp line 465: > 463: // If `deoptimize` is true, the nmethod is immediately deoptimized. > 464: // The HotSpotNmethod.address field is zero upon returning. 
> 465: void invalidate_nmethod_mirror(JVMCIObject mirror, bool deoptimze, nmethod::NMethodChangeReason statusReason, JVMCI_TRAPS); Suggestion: void invalidate_nmethod_mirror(JVMCIObject mirror, bool deoptimize, nmethod::NMethodChangeReason change_reason, JVMCI_TRAPS); ------------- PR Review: https://git.openjdk.org/jdk/pull/25338#pullrequestreview-2891309126 PR Review Comment: https://git.openjdk.org/jdk/pull/25338#discussion_r2123138594 PR Review Comment: https://git.openjdk.org/jdk/pull/25338#discussion_r2123153641 PR Review Comment: https://git.openjdk.org/jdk/pull/25338#discussion_r2123152309 From dbriemann at openjdk.org Tue Jun 3 08:54:53 2025 From: dbriemann at openjdk.org (David Briemann) Date: Tue, 3 Jun 2025 08:54:53 GMT Subject: RFR: 8358013: [PPC64] VSX has poor performance on Power8 [v5] In-Reply-To: References: <6lRLaDtZkFd5zdOobo1RnSODoZk3r7T-sgjfpcnUVwU=.ad525055-0f15-4866-a295-20e2183eaf7b@github.com> Message-ID: On Mon, 2 Jun 2025 14:14:29 GMT, Martin Doerr wrote: >> Power8 only has limited VSX instructions for the superword optimization and the Vector API and the performance is bad. Let's only use it on Power9 and newer by default. This change excludes the VSX registers from C2 register allocation for Power8. VSX instruction usage gets limited to a few places like intrinsics. >> >> Note: Power8 is an old processor and performance optimizations for it are no longer planned. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Beautify @requires statement. Marked as reviewed by dbriemann (Author). 
-------------

PR Review: https://git.openjdk.org/jdk/pull/25514#pullrequestreview-2891361980

From fyang at openjdk.org Tue Jun 3 08:56:51 2025
From: fyang at openjdk.org (Fei Yang)
Date: Tue, 3 Jun 2025 08:56:51 GMT
Subject: RFR: 8358310: ZGC: riscv, ppc ZPlatformAddressOffsetBits may return a too large value
In-Reply-To: <6j_zozeh-Vwu3tRHRlJ5h_mhcMFsNm_OMUinAosz8fU=.d51c8c95-aad1-4566-a23b-8da5b521aa90@github.com>
References: <6j_zozeh-Vwu3tRHRlJ5h_mhcMFsNm_OMUinAosz8fU=.d51c8c95-aad1-4566-a23b-8da5b521aa90@github.com>
Message-ID:

On Mon, 2 Jun 2025 08:55:02 GMT, Axel Boldt-Christmas wrote:

> The way that ZPlatformAddressOffsetBits is implemented on riscv and ppc may result in a return value of 45. This is larger than the max supported value of 44 (because of other internal data structures). This was fixed in [JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275) for aarch64.
>
> Before [JDK-8350441](https://bugs.openjdk.org/browse/JDK-8350441) the issue only manifested if one tried to select a heap larger than 16 TB (not supported), but after [JDK-8350441](https://bugs.openjdk.org/browse/JDK-8350441) we try to double the heap address space when running on a NUMA machine. So we may now encounter this bug for heaps larger than 8TB (which is supported).
>
> While ZPlatformAddressOffsetBits needs an overhaul. (It was written for non-generational ZGC where we had the three color bits inside the address.) The proposal is that we solve this for ppc and riscv by doing the same thing we did for aarch64 in [JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275)

Thanks. I tried some non-trivial benchmark workloads with ZGC on linux-riscv64, seems to work.

-------------

Marked as reviewed by fyang (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25578#pullrequestreview-2891368869 From fjiang at openjdk.org Tue Jun 3 09:04:55 2025 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 3 Jun 2025 09:04:55 GMT Subject: RFR: 8356159: RISC-V: Add Zabha [v11] In-Reply-To: References: Message-ID: On Fri, 30 May 2025 09:22:37 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> This adds the byte and halfword atomic memory operations (Zabha) - https://github.com/riscv/riscv-zabha. >> All amo-instructions, except load-reserve and store-conditional, can also be performed on natural aligned half-words and bytes. (i.e. the extension do not add lr.h/b or sc.h/b) This includes amocas if zacas extension is present. >> >> The majority of this patch is to support amocas.h/b. We are now starting to really feel the pain of all these extensions, as CAS:ing 16/8-bits can now be done in three different ways: >> - lr.w/sc.w 'narrow' CAS (no extension) >> - amocas.w 'narrow' CAS (Zacas) >> - amocas.h/b (Zacas + Zabha) >> >> There is no hwprobe support yet. >> >> Ran t1-3 with Zacas+Zabha and t1 without Zabha in qemu. >> >> Thanks, Robbin > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 20 additional commits since the last revision: > > - Set ins cost to 2xVOLA for cmpxchg > - Merge branch 'master' into 8356159 > - Merge branch 'master' into 8356159 > - ins cost fixes, print fixes > - Merge branch 'master' into 8356159 > - Reg limits fixed > - Merge branch 'master' into 8356159 > - Fixed reg selection > - More indention > - Indention > - ... and 10 more: https://git.openjdk.org/jdk/compare/f7e126de...b496c299 src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4151: > 4149: zext(prev, prev, 32); > 4150: break; > 4151: case int16: The call site of `atomic_cas` is only guaranteed by `UseZacas`. 
Do we need extra checking for `UseZabha` if the operand size is `int16` or `int8`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25252#discussion_r2123175989 From clanger at openjdk.org Tue Jun 3 09:27:06 2025 From: clanger at openjdk.org (Christoph Langer) Date: Tue, 3 Jun 2025 09:27:06 GMT Subject: RFR: 8358013: [PPC64] VSX has poor performance on Power8 [v5] In-Reply-To: References: <6lRLaDtZkFd5zdOobo1RnSODoZk3r7T-sgjfpcnUVwU=.ad525055-0f15-4866-a295-20e2183eaf7b@github.com> Message-ID: On Mon, 2 Jun 2025 14:14:29 GMT, Martin Doerr wrote: >> Power8 only has limited VSX instructions for the superword optimization and the Vector API and the performance is bad. Let's only use it on Power9 and newer by default. This change excludes the VSX registers from C2 register allocation for Power8. VSX instruction usage gets limited to a few places like intrinsics. >> >> Note: Power8 is an old processor and performance optimizations for it are no longer planned. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Beautify @requires statement. Marked as reviewed by clanger (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25514#pullrequestreview-2891469201 From mdoerr at openjdk.org Tue Jun 3 09:27:07 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 3 Jun 2025 09:27:07 GMT Subject: RFR: 8358013: [PPC64] VSX has poor performance on Power8 [v5] In-Reply-To: References: <6lRLaDtZkFd5zdOobo1RnSODoZk3r7T-sgjfpcnUVwU=.ad525055-0f15-4866-a295-20e2183eaf7b@github.com> Message-ID: On Mon, 2 Jun 2025 14:14:29 GMT, Martin Doerr wrote: >> Power8 only has limited VSX instructions for the superword optimization and the Vector API and the performance is bad. Let's only use it on Power9 and newer by default. This change excludes the VSX registers from C2 register allocation for Power8. VSX instruction usage gets limited to a few places like intrinsics. 
>> >> Note: Power8 is an old processor and performance optimizations for it are no longer planned. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Beautify @requires statement. Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25514#issuecomment-2934323304 From mdoerr at openjdk.org Tue Jun 3 09:27:07 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 3 Jun 2025 09:27:07 GMT Subject: Integrated: 8358013: [PPC64] VSX has poor performance on Power8 In-Reply-To: <6lRLaDtZkFd5zdOobo1RnSODoZk3r7T-sgjfpcnUVwU=.ad525055-0f15-4866-a295-20e2183eaf7b@github.com> References: <6lRLaDtZkFd5zdOobo1RnSODoZk3r7T-sgjfpcnUVwU=.ad525055-0f15-4866-a295-20e2183eaf7b@github.com> Message-ID: On Wed, 28 May 2025 22:23:57 GMT, Martin Doerr wrote: > Power8 only has limited VSX instructions for the superword optimization and the Vector API and the performance is bad. Let's only use it on Power9 and newer by default. This change excludes the VSX registers from C2 register allocation for Power8. VSX instruction usage gets limited to a few places like intrinsics. > > Note: Power8 is an old processor and performance optimizations for it are no longer planned. This pull request has now been integrated. Changeset: 457d9de8 Author: Martin Doerr URL: https://git.openjdk.org/jdk/commit/457d9de81d0f65455e3292fafea03f0e83184029 Stats: 14 lines in 4 files changed: 10 ins; 0 del; 4 mod 8358013: [PPC64] VSX has poor performance on Power8 Reviewed-by: dbriemann, clanger ------------- PR: https://git.openjdk.org/jdk/pull/25514 From rehn at openjdk.org Tue Jun 3 09:55:01 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 3 Jun 2025 09:55:01 GMT Subject: RFR: 8356159: RISC-V: Add Zabha [v11] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 08:53:17 GMT, Feilong Jiang wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 20 additional commits since the last revision: >> >> - Set ins cost to 2xVOLA for cmpxchg >> - Merge branch 'master' into 8356159 >> - Merge branch 'master' into 8356159 >> - ins cost fixes, print fixes >> - Merge branch 'master' into 8356159 >> - Reg limits fixed >> - Merge branch 'master' into 8356159 >> - Fixed reg selection >> - More indention >> - Indention >> - ... and 10 more: https://git.openjdk.org/jdk/compare/ab62c13b...b496c299 > > src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4151: > >> 4149: zext(prev, prev, 32); >> 4150: break; >> 4151: case int16: > > The call site of `atomic_cas` is only guaranteed by `UseZacas`. Do we need extra checking for `UseZabha` if the operand size is `int16` or `int8`? If nothing else the assembler always checks: void amo_base(Register Rd, Register Rs1, uint8_t Rs2, Aqrl memory_order = aqrl) { assert(width > AMO_WIDTH_HALFWORD || UseZabha, "Must be"); assert(funct5 != AMO_CAS || UseZacas, "Must be"); I'll have a look! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25252#discussion_r2123319547 From kbarrett at openjdk.org Tue Jun 3 10:03:51 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 3 Jun 2025 10:03:51 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v2] In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: <8iAAoW0GY4ICp4SYUr_vP7hlHdsfL9mYxVg2Tk3eeP4=.930f8b97-6e0e-4420-a268-96729b425ac6@github.com> On Mon, 2 Jun 2025 19:38:52 GMT, Daniel D. Daugherty wrote: > I do have a query about whether the mention of `nothrow` should be `noexcept`. "nothrow" (not an identifier, not code font) is a commonly used informal term, and is reflected in the names of type traits like `is_nothrow_constructible<>`. 
The Standard seems to consistently use "non-throwing exception specification" in text. Terminology might have been different if C++ had `noexcept` to start with. I don't have a strong preference here. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25574#issuecomment-2934461969 From kvn at openjdk.org Tue Jun 3 10:47:25 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Jun 2025 10:47:25 GMT Subject: RFR: 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder [v2] In-Reply-To: References: Message-ID: > There is difference between AdapterFingerPrint allocation size [compute_size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2227) which may not be aligned to HeapWord size and [size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2421) used for copying during AOT cache build which is aligned and can be bigger than allocation size. > > I added asserts to `AdapterFingerPrint` and `AdapterHandlerEntry` to make sure sizes are correct. Both are used in AOT cache build. > > I also moved `FreeHeap()` from `~AdapterFingerPrint()` to enforce the comment and simplify executed code. > > Thanks to @MBaesken for finding the issue and @iklam for pointing the cause. > > Testing tier1-3, xcomp, stress. Higher tiers are still running. 
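As a rough illustration of the size mismatch described above (a sketch only: `kWordSize`, `allocation_size`, `copy_size` and `overread` are hypothetical names, not the HotSpot functions, and an 8-byte word is assumed), an allocation sized without alignment can be shorter than a word-aligned copy of the same object:

```cpp
#include <cstddef>

// Hypothetical stand-ins for illustration; these are not the HotSpot definitions.
constexpr std::size_t kWordSize = 8;  // assumed HeapWord size in bytes

// Round `bytes` up to the next multiple of `alignment` (must be a power of two).
constexpr std::size_t align_up(std::size_t bytes, std::size_t alignment) {
  return (bytes + alignment - 1) & ~(alignment - 1);
}

// Size requested from the allocator: header plus payload, not aligned.
constexpr std::size_t allocation_size(std::size_t header, std::size_t payload) {
  return header + payload;
}

// Size a word-granular copy would touch: the same total rounded up to whole words.
constexpr std::size_t copy_size(std::size_t header, std::size_t payload) {
  return align_up(allocation_size(header, payload), kWordSize);
}

// Number of bytes such a copy reads past the end of the allocation.
constexpr std::size_t overread(std::size_t header, std::size_t payload) {
  return copy_size(header, payload) - allocation_size(header, payload);
}

static_assert(overread(8, 12) == 4, "20-byte object, 24-byte copy");
static_assert(overread(8, 16) == 0, "already word-aligned, no overread");
```

Any nonzero `overread` is the kind of access ASAN would flag; asserting that both sizes agree, as the description proposes, rules it out.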
Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Remove unused argument ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25604/files - new: https://git.openjdk.org/jdk/pull/25604/files/b03c5070..9b67ceab Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25604&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25604&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25604.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25604/head:pull/25604 PR: https://git.openjdk.org/jdk/pull/25604 From kbarrett at openjdk.org Tue Jun 3 11:01:06 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 3 Jun 2025 11:01:06 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v3] In-Reply-To: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: > Please review this change to permit the use of `noexcept` under certain > circumstances in HotSpot code. > > http://wg21.link/n3050 > > Testing: > > JDK-8316930 (HotSpot should use noexcept instead of throw()) showed what the > conversion would look like. It will need to be brought up to current mainline, > possibly with modifications. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 16-June-2025 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. 
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: more dholmes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25574/files - new: https://git.openjdk.org/jdk/pull/25574/files/e6decd1f..2bbfbeee Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25574&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25574&range=01-02 Stats: 3 lines in 2 files changed: 0 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25574.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25574/head:pull/25574 PR: https://git.openjdk.org/jdk/pull/25574 From kbarrett at openjdk.org Tue Jun 3 11:01:06 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 3 Jun 2025 11:01:06 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v2] In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Mon, 2 Jun 2025 21:07:34 GMT, David Holmes wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> dholmes review > > doc/hotspot-style.html line 1121: > >> 1119:
Only the argument-less form of noexcept exception >> 1120: specifications are permitted. noexcept exception >> 1121: specifications with arguments are forbidden.
> > I was suggesting dropping the second sentence as it is implied by the first. Oh, I see. Sure. Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2123457546 From jbechberger at openjdk.org Tue Jun 3 11:32:18 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 11:32:18 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v31] In-Reply-To: References: <_4vRA_P9_dLG022vs8ZinaZmqC48drRAwdOSiDG9Wjk=.25880197-6c87-4faf-8259-12d6c0f10f2e@github.com> Message-ID: On Tue, 3 Jun 2025 07:36:06 GMT, David Holmes wrote: >> Johannes Bechberger has updated the pull request incrementally with three additional commits since the last revision: >> >> - Update src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp >> >> Co-authored-by: Aleksey Shipilëv >> - Update src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp >> >> Co-authored-by: Aleksey Shipilëv >> - Small fixes > >> Regarding [#25302 (comment)](https://github.com/openjdk/jdk/pull/25302#discussion_r2119984783) >> >> ``` >> raw_thread == nullptr >> ``` >> >> This seems to happen rarely on (abrupt) shutdowns. I attached an hs_err file: [hs_err_pid1688961.log](https://github.com/user-attachments/files/20563594/hs_err_pid1688961.log) > > That is interesting. The signal appears to be being handled on an unattached thread during shutdown, and there is no stack left to show any VM involvement. Possibly we need to block the signal as part of thread termination, before we clear the current thread. ? Regarding the acquire-release semantics (cc @dholmes-ora): I currently use them because they work and are fast enough. Using weaker semantics would be a good optimization, but I would abstain from it for now due to time constraints.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2934793239 From jbechberger at openjdk.org Tue Jun 3 11:42:27 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 11:42:27 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v34] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: - Use store-release semantics - Log error when timer_create fails ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/ef9f9cd1..93b5a189 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=33 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=32-33 Stats: 10 lines in 3 files changed: 1 ins; 1 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From rehn at openjdk.org Tue Jun 3 11:52:43 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 3 Jun 2025 11:52:43 GMT Subject: RFR: 8356159: RISC-V: Add Zabha [v12] In-Reply-To: References: Message-ID: > Hi, please consider. > > This adds the byte and halfword atomic memory operations (Zabha) - https://github.com/riscv/riscv-zabha. > All amo-instructions, except load-reserve and store-conditional, can also be performed on natural aligned half-words and bytes. (i.e. 
the extension do not add lr.h/b or sc.h/b) This includes amocas if zacas extension is present. > > The majority of this patch is to support amocas.h/b. We are now starting to really feel the pain of all these extensions, as CAS:ing 16/8-bits can now be done in three different ways: > - lr.w/sc.w 'narrow' CAS (no extension) > - amocas.w 'narrow' CAS (Zacas) > - amocas.h/b (Zacas + Zabha) > > There is no hwprobe support yet. > > Ran t1-3 with Zacas+Zabha and t1 without Zabha in qemu. > > Thanks, Robbin Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 21 additional commits since the last revision: - Merge branch 'master' into 8356159 - Set ins cost to 2xVOLA for cmpxchg - Merge branch 'master' into 8356159 - Merge branch 'master' into 8356159 - ins cost fixes, print fixes - Merge branch 'master' into 8356159 - Reg limits fixed - Merge branch 'master' into 8356159 - Fixed reg selection - More indention - ... 
and 11 more: https://git.openjdk.org/jdk/compare/2fdd35a2...cc3b8ff7 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25252/files - new: https://git.openjdk.org/jdk/pull/25252/files/b496c299..cc3b8ff7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25252&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25252&range=10-11 Stats: 27302 lines in 350 files changed: 7274 ins; 12599 del; 7429 mod Patch: https://git.openjdk.org/jdk/pull/25252.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25252/head:pull/25252 PR: https://git.openjdk.org/jdk/pull/25252 From rehn at openjdk.org Tue Jun 3 11:52:44 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 3 Jun 2025 11:52:44 GMT Subject: RFR: 8356159: RISC-V: Add Zabha [v11] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 09:51:46 GMT, Robbin Ehn wrote: >> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4151: >> >>> 4149: zext(prev, prev, 32); >>> 4150: break; >>> 4151: case int16: >> >> The call site of `atomic_cas` is only guaranteed by `UseZacas`. Do we need extra checking for `UseZabha` if the operand size is `int16` or `int8`? > > If nothing else the assembler always checks: > > void amo_base(Register Rd, Register Rs1, uint8_t Rs2, Aqrl memory_order = aqrl) { > assert(width > AMO_WIDTH_HALFWORD || UseZabha, "Must be"); > assert(funct5 != AMO_CAS || UseZacas, "Must be"); > > > I'll have a look! When we call amocas with unknown size (i.e. not hardcoded int64/int32) we have this assert in the caller: `assert((UseZacas && UseZabha) || (size != int8 && size != int16), "unsupported operand size");` So it seems like we should be fine, no? 
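The three 8/16-bit CAS paths listed in the description, together with the caller-side assert quoted above, can be modelled roughly as follows. This is a sketch under assumptions: `OperandSize`, `NarrowCas`, `amocas_size_supported` and `select_narrow_cas` are hypothetical names for illustration and are not part of the patch or of HotSpot.

```cpp
// Hypothetical model of the narrow CAS paths; names are illustrative only.
enum class OperandSize { Int8, Int16, Int32, Int64 };
enum class NarrowCas { LrScWord, AmocasWord, AmocasNative };

constexpr bool is_narrow(OperandSize size) {
  return size == OperandSize::Int8 || size == OperandSize::Int16;
}

// Mirrors the quoted assert: a sub-word amocas needs both Zacas and Zabha,
// while other operand sizes are not restricted by this particular check.
constexpr bool amocas_size_supported(bool use_zacas, bool use_zabha, OperandSize size) {
  return (use_zacas && use_zabha) || !is_narrow(size);
}

// Pick a path for an 8/16-bit CAS from the available extensions.
constexpr NarrowCas select_narrow_cas(bool use_zacas, bool use_zabha) {
  if (use_zacas && use_zabha) return NarrowCas::AmocasNative;  // amocas.h/b
  if (use_zacas)              return NarrowCas::AmocasWord;    // amocas.w 'narrow' CAS
  return NarrowCas::LrScWord;                                   // lr.w/sc.w 'narrow' CAS
}
```

In this model Zabha without Zacas falls back to the lr.w/sc.w path, which matches the statement that amocas.h/b is only available when Zacas is present.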
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25252#discussion_r2123563891 From coleenp at openjdk.org Tue Jun 3 12:02:53 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 3 Jun 2025 12:02:53 GMT Subject: RFR: 8358339: Handle MethodCounters::_method backlinks after JDK-8355003 In-Reply-To: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> References: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> Message-ID: On Mon, 2 Jun 2025 18:41:42 GMT, Aleksey Shipilev wrote: > Found this when reading mainline-vs-premain webrev. [JDK-8355003](https://bugs.openjdk.org/browse/JDK-8355003) introduced a backlink to `Method*` in `MethodCounters`. I believe we need to handle that backlink at least in `CodeBuffer::finalize_oop_references()`. premain does this, while mainline does not. Also, amusingly, we have `MethodCounters::is_methodCounters`, but not the super-class `Metadata::is_methodCounters`. > > I pulled in the hunks that use `is_methodCounters()` and `MethodCounters::method()` from premain into this PR. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `runtime/cds` > - [x] Linux x86_64 server fastdebug, `tier1` > - [x] Linux x86_64 server fastdebug, `all` I don't think this is the right thing to do, since the Method* is already handled in finalize_oop_references since it's a backpointer. And MethodCounters shouldn't be inhertited from Metadata, they're inherited from MetaspaceObj in mainline. We want to avoid virtual function pointers in this type. 
------------- PR Review: https://git.openjdk.org/jdk/pull/25599#pullrequestreview-2892022311 PR Review: https://git.openjdk.org/jdk/pull/25599#pullrequestreview-2892026544 From jbechberger at openjdk.org Tue Jun 3 12:12:53 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 12:12:53 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v35] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: - Fix include order - Tiny refactoring ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/93b5a189..71611f1e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=34 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=33-34 Stats: 19 lines in 4 files changed: 8 ins; 10 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From fjiang at openjdk.org Tue Jun 3 12:14:55 2025 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 3 Jun 2025 12:14:55 GMT Subject: RFR: 8356159: RISC-V: Add Zabha [v11] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 11:46:27 GMT, Robbin Ehn wrote: >> If nothing else the assembler always checks: >> >> void amo_base(Register Rd, Register Rs1, uint8_t Rs2, Aqrl memory_order = aqrl) { >> assert(width > 
AMO_WIDTH_HALFWORD || UseZabha, "Must be"); >> assert(funct5 != AMO_CAS || UseZacas, "Must be"); >> >> >> I'll have a look! > > When we call amocas with unknown size (i.e. not hardcoded int64/int32) we have this assert in the caller: > `assert((UseZacas && UseZabha) || (size != int8 && size != int16), "unsupported operand size");` > > So it seems like we should be fine, no? Looks like `UseZabha` relies on `UseZacas`? The following code will call `atomic_cas` only when `UseZacas` is true even if the size is `int8` or `int16`. If that is true (maybe `(UseZacas && UseZabha)` already explained), then it makes sense. https://github.com/openjdk/jdk/blob/78a392aa3b0cda52cfacfa15250fa61010519424/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L4019-L4031 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25252#discussion_r2123625099 From aboldtch at openjdk.org Tue Jun 3 12:17:57 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 3 Jun 2025 12:17:57 GMT Subject: RFR: 8358310: ZGC: riscv, ppc ZPlatformAddressOffsetBits may return a too large value In-Reply-To: <6j_zozeh-Vwu3tRHRlJ5h_mhcMFsNm_OMUinAosz8fU=.d51c8c95-aad1-4566-a23b-8da5b521aa90@github.com> References: <6j_zozeh-Vwu3tRHRlJ5h_mhcMFsNm_OMUinAosz8fU=.d51c8c95-aad1-4566-a23b-8da5b521aa90@github.com> Message-ID: On Mon, 2 Jun 2025 08:55:02 GMT, Axel Boldt-Christmas wrote: > The way that ZPlatformAddressOffsetBits is implemented on riscv and ppc may result in a return value of 45. This is larger than the max supported value of 44 (because of other internal data structures). This was fixed in [JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275) for aarch64. > > Before [JDK-8350441](https://bugs.openjdk.org/browse/JDK-8350441) the issue on manifested if one tried to select a heap larger than 16 TB (not supported), but after [JDK-8350441](https://bugs.openjdk.org/browse/JDK-8350441) we try to double the heap address space when running on a NUMA machine. 
So we may now encounter this bug for heaps larger than 8TB (which is supported). > > While ZPlatformAddressOffsetBits needs an overhaul. (It was written for non-generational ZGC where we had the three color bits inside the address.) The proposal is that we solve this for ppc and riscv by doing the same thing we did for aarch64 in [JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275) Thanks for the reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25578#issuecomment-2934956974 From aboldtch at openjdk.org Tue Jun 3 12:17:58 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 3 Jun 2025 12:17:58 GMT Subject: Integrated: 8358310: ZGC: riscv, ppc ZPlatformAddressOffsetBits may return a too large value In-Reply-To: <6j_zozeh-Vwu3tRHRlJ5h_mhcMFsNm_OMUinAosz8fU=.d51c8c95-aad1-4566-a23b-8da5b521aa90@github.com> References: <6j_zozeh-Vwu3tRHRlJ5h_mhcMFsNm_OMUinAosz8fU=.d51c8c95-aad1-4566-a23b-8da5b521aa90@github.com> Message-ID: On Mon, 2 Jun 2025 08:55:02 GMT, Axel Boldt-Christmas wrote: > The way that ZPlatformAddressOffsetBits is implemented on riscv and ppc may result in a return value of 45. This is larger than the max supported value of 44 (because of other internal data structures). This was fixed in [JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275) for aarch64. > > Before [JDK-8350441](https://bugs.openjdk.org/browse/JDK-8350441) the issue on manifested if one tried to select a heap larger than 16 TB (not supported), but after [JDK-8350441](https://bugs.openjdk.org/browse/JDK-8350441) we try to double the heap address space when running on a NUMA machine. So we may now encounter this bug for heaps larger than 8TB (which is supported). > > While ZPlatformAddressOffsetBits needs an overhaul. (It was written for non-generational ZGC where we had the three color bits inside the address.) 
The proposal is that we solve this for ppc and riscv by doing the same thing we did for aarch64 in [JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275) This pull request has now been integrated. Changeset: 46183742 Author: Axel Boldt-Christmas URL: https://git.openjdk.org/jdk/commit/4618374269e8636c772d921ad0c2c2d9e5e3e643 Stats: 10 lines in 2 files changed: 4 ins; 0 del; 6 mod 8358310: ZGC: riscv, ppc ZPlatformAddressOffsetBits may return a too large value Reviewed-by: eosterlund, mdoerr, fyang ------------- PR: https://git.openjdk.org/jdk/pull/25578 From egahlin at openjdk.org Tue Jun 3 12:19:04 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Tue, 3 Jun 2025 12:19:04 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v10] In-Reply-To: References: Message-ID: <8ESOaNI_qHLzLquiZT7RZR43lit-o8_5rTky1nJFjH4=.a81b8882-1470-4f76-8c9a-cdc2a7b50070@github.com> On Mon, 26 May 2025 15:42:57 GMT, Erik Gahlin wrote: >>> This is added automatically. If I add "(Experimental)" to the title, then I get "X (Experimental) (Experimental)" >> >> Sweet. >> >>> I'm unsure how to implement this using the SQL version that is used for the views >> >> I will see if I can create an example with some other events that show the syntax, and then you can fill in the CPU-Time events. > >> I will see if I can create an example with some other events that show the syntax, and then you can fill in the CPU-Time events. 
> > I have a Mac, so I could not try it with an actual recording, but something like this: > > [application.cpu-time-statistics] > label = "CPU Time Samples Statistics" > form = "COLUMN 'Successful Samples', 'Failed Samples', 'Total Samples', 'Lost Samples' > SELECT COUNT(S.startTime), COUNT(F.startTime), Count(A.startTime), SUM(L.lostSamples) > FROM > CPUTimeSample AS S, > CPUTimeSample AS F, > CPUTimeSample AS A, > CPUTimeSampleLoss AS L > WHERE > S.failed = 'false' AND > F.failed = 'true'" > > > I removed biased, because I wonder If we should have such a field? There can be many types of biases, and the implementation may change in the future. > Hold on, shouldn't this really be "Lost"? @egahlin and @mgronlun need to chime in here. Lost might be better. I wonder if `` is needed, instead of thread = true? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2934959447 From jbechberger at openjdk.org Tue Jun 3 12:19:05 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 12:19:05 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v10] In-Reply-To: <8ESOaNI_qHLzLquiZT7RZR43lit-o8_5rTky1nJFjH4=.a81b8882-1470-4f76-8c9a-cdc2a7b50070@github.com> References: <8ESOaNI_qHLzLquiZT7RZR43lit-o8_5rTky1nJFjH4=.a81b8882-1470-4f76-8c9a-cdc2a7b50070@github.com> Message-ID: <1JqKzjCGoZ9N_ez_gMKOlR1lbWPte0LkQS3bSb81ua0=.3c4c006b-18c0-4444-a867-8c774899b5b9@github.com> On Tue, 3 Jun 2025 12:15:06 GMT, Erik Gahlin wrote: > I wonder if is needed, instead of thread = true? 
We had these discussions before on the old PR and then decided to end up with eventThread (as the other events do too). ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2934963523 From mgronlun at openjdk.org Tue Jun 3 12:19:06 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 3 Jun 2025 12:19:06 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v35] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 12:12:53 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: > > - Fix include order > - Tiny refactoring src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 67: > 65: } > 66: > 67: if (is_excluded(jt)) { I think this should move before the jt->is_exiting() check - excluded is much much more common than exiting...
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2123633391 From shade at openjdk.org Tue Jun 3 12:19:53 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 12:19:53 GMT Subject: RFR: 8358339: Handle MethodCounters::_method backlinks after JDK-8355003 In-Reply-To: References: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> Message-ID: On Tue, 3 Jun 2025 11:58:29 GMT, Coleen Phillimore wrote: > I don't think this is the right thing to do, since the Method* is already handled in finalize_oop_references since it's a backpointer. Sorry, I don't understand this comment. I think there is a symmetry between `MethodCounters` and `MethodData`. Now that `MethodCounters` has the backpointer to `Method*`, like `MethodData`, it should be handled like `MethodData` everywhere? > And MethodCounters shouldn't be inherited from Metadata, they're inherited from MetaspaceObj in mainline. We want to avoid virtual function pointers in this type. Are you, perhaps, looking at older mainline? Because in current mainline `MethodCounters` is inherited from `Metadata`: https://github.com/openjdk/jdk/blob/78a392aa3b0cda52cfacfa15250fa61010519424/src/hotspot/share/oops/methodCounters.hpp#L35 -- this was also part of [JDK-8355003](https://bugs.openjdk.org/browse/JDK-8355003). ------------- PR Comment: https://git.openjdk.org/jdk/pull/25599#issuecomment-2934965605 From mgronlun at openjdk.org Tue Jun 3 12:25:05 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 3 Jun 2025 12:25:05 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v35] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 12:12:53 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests).
This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: > > - Fix include order > - Tiny refactoring src/hotspot/share/jfr/support/jfrThreadLocal.hpp line 371: > 369: timer_t* cpu_timer() const; > 370: > 371: // The CPU time JFR lock has four different states: Only three different states now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2123651646 From jbechberger at openjdk.org Tue Jun 3 12:29:48 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 12:29:48 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v36] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: - CPUTimeSampleLoss -> CPUTimeSamplesLost - Move is_excluded forward ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/71611f1e..a419daba Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=35 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=34-35 Stats: 13 lines in 7 files changed: 0 ins; 1 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From azafari at openjdk.org Tue Jun 3 12:34:55 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 3 Jun 2025 12:34:55 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: <6HSruHtZNPOZJp4vNFnwMns6-_rP_MEHtnnvAP7S5QU=.e91023a2-089c-4541-86a5-ae8d4adeb99d@github.com> On Mon, 2 Jun 2025 08:07:38 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > TestBreakSignalThreadDump has issues with asan In ASAN built JDK, some gtests and some other JTREG tests in runtime/ErrorHandling also fail. 
Do we exclude these in another PR? or should they also be handled/excluded here? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2935018245 From jbechberger at openjdk.org Tue Jun 3 12:35:54 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 12:35:54 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v37] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Fix tiny mistake ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/a419daba..44c37d17 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=36 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=35-36 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From mgronlun at openjdk.org Tue Jun 3 12:35:55 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Tue, 3 Jun 2025 12:35:55 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v35] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 12:12:53 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). 
>> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: > > - Fix include order > - Tiny refactoring src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 235: > 233: > 234: void JfrCPUTimeThreadSampler::on_javathread_create(JavaThread* thread) { > 235: if (thread->is_Compiler_thread()) { is_hidden_from_external_view() + is_JfrRecorderThread() instead? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2123668189 From jbechberger at openjdk.org Tue Jun 3 12:39:47 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 12:39:47 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v38] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Restrict threads for which timers are created ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/44c37d17..83b55f58 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=37 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=36-37 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From mgronlun at openjdk.org Tue Jun 3 12:39:49 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 3 Jun 2025 12:39:49 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v35] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 12:29:39 GMT, Markus Grönlund wrote: >> Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: >> >> - Fix include order >> - Tiny refactoring > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 235: > >> 233: >> 234: void JfrCPUTimeThreadSampler::on_javathread_create(JavaThread* thread) { >> 235: if (thread->is_Compiler_thread()) { > > is_hidden_from_external_view() + is_JfrRecorderThread() instead? tl->is_excluded() is volatile and can change during runtime, so it's better to add a timer unconditionally there. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2123669984 From jbechberger at openjdk.org Tue Jun 3 12:39:49 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 12:39:49 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v35] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 12:30:18 GMT, Markus Gr?nlund wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 235: >> >>> 233: >>> 234: void JfrCPUTimeThreadSampler::on_javathread_create(JavaThread* thread) { >>> 235: if (thread->is_Compiler_thread()) { >> >> is_hidden_from_external_view() + is_JfrRecorderThread() instead? > > tl->is_excluded() is volatile and can change during runtime, so it's better to add a timer unconditionally there. why not just `is_excluded`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2123673323 From mgronlun at openjdk.org Tue Jun 3 12:44:09 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Tue, 3 Jun 2025 12:44:09 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v38] In-Reply-To: References: Message-ID: <23UojxlSZeRcCB38B1d2hDIcyXbgRnmbi8Vu3cfUmM4=.ba379d71-7bc7-4721-bda0-6d5f469a45f7@github.com> On Tue, 3 Jun 2025 12:39:47 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... 
different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Restrict threads for which timers are created src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 63: > 61: assert(raw_thread->is_Java_thread(), "invariant"); > 62: JavaThread* jt = JavaThread::cast(raw_thread); > 63: if (is_excluded(jt)) { and now: if (is_excluded(jt) || jt->is_exiting()) { return nullptr; } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2123692657 From jbechberger at openjdk.org Tue Jun 3 12:47:48 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 12:47:48 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v39] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Tiny refactoring ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/83b55f58..2b8c6db4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=38 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=37-38 Stats: 4 lines in 1 file changed: 0 ins; 3 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From mgronlun at openjdk.org Tue Jun 3 12:51:13 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Tue, 3 Jun 2025 12:51:13 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v38] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 12:39:47 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... 
different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Restrict threads for which timers are created src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 327: > 325: JfrThreadLocal* tl = jt->jfr_thread_local(); > 326: if (tl->wants_async_processing_of_cpu_time_jfr_requests()) { > 327: if (!jt->has_last_Java_frame() || jt->thread_state() != _thread_in_native || !tl->try_acquire_cpu_time_jfr_dequeue_lock()) { I recommend this order for higher probability: 1. jt->thread_state() != _thread_in_native 2. !tl->try_acquire_cpu_time_jfr_dequeue_lock() 3. !jt->has_last_Java_frame() You need to restructure of course, to get the unlocking correct. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2123711980 From coleenp at openjdk.org Tue Jun 3 12:55:05 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 3 Jun 2025 12:55:05 GMT Subject: RFR: 8352075: Perf regression accessing fields [v21] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Tue, 3 Jun 2025 07:16:47 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. 
On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: > > - Moved jtreg test > - Improved documentation > - Fix coding style (asterisk placement) Thanks for these coding style fixes. 
Some IDEs choose the other placement for the asterisk which makes it annoying. Thanks for moving the test. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2935098090 From jbechberger at openjdk.org Tue Jun 3 13:00:06 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 13:00:06 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v40] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Reorder condition ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/2b8c6db4..56ce2b05 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=39 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=38-39 Stats: 6 lines in 1 file changed: 4 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From mgronlun at openjdk.org Tue Jun 3 13:00:06 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Tue, 3 Jun 2025 13:00:06 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v38] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 12:39:47 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for 
JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Restrict threads for which timers are created src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 479: > 477: > 478: // Entry point for a thread that has been sampled in native code and has a pending JFR CPU time request. > 479: void JfrThreadSampling::process_cpu_time_request(JavaThread* jt, JfrThreadLocal* tl, Thread* current, bool lock) { Can you move this up to be co-located with "drain_enqueued_cpu_time_requests"? Thanks. src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.hpp line 40: > 38: public: > 39: static void process_sample_request(JavaThread* jt, bool has_cpu_time_sample_request); > 40: static void process_cpu_time_request(JavaThread* jt, JfrThreadLocal* tl, Thread* current, bool lock); Put this under private and add JfrCPUTimeThreadSampler as a friend (like above with JfrSamplerThread and "process_native_sample_requests"). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2123728400 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2123726144 From rehn at openjdk.org Tue Jun 3 13:17:56 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 3 Jun 2025 13:17:56 GMT Subject: RFR: 8356159: RISC-V: Add Zabha [v11] In-Reply-To: References: Message-ID: <_RXza4TCeXVWANzCikEnlud-YBc00BEBvCCH_OiHimg=.f5ebd65e-097c-44fd-b952-fa6d36a1e47f@github.com> On Tue, 3 Jun 2025 12:10:07 GMT, Feilong Jiang wrote: >> When we call amocas with unknown size (i.e. 
not hardcoded int64/int32) we have this assert in the caller: >> `assert((UseZacas && UseZabha) || (size != int8 && size != int16), "unsupported operand size");` >> >> So it seems like we should be fine, no? > > Looks like `UseZabha` relies on `UseZacas`? The following code will call `atomic_cas` only when `UseZacas` is true even if the size is `int8` or `int16`. If that is true (maybe `(UseZacas && UseZabha)` already explained), then it makes sense. > > https://github.com/openjdk/jdk/blob/78a392aa3b0cda52cfacfa15250fa61010519424/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L4019-L4031 You can't use that method for int8/int16 if you don't have UseZabha (and Zacas). You must call cmpxchg_narrow(). The reason why we have two different methods is the number of registers needed, i.e. cmpxchg_narrow requires scratch registers. So when you don't have UseZabha you should not be calling this method at all for int8/int16. There is an assert above that checks: `assert((UseZacas && UseZabha) || (size != int8 && size != int16), "unsupported operand size");` The relationship is: UseZacas=false and UseZabha=false => LR/SC and narrow LR/SC for sub-word size. UseZacas=true and UseZabha=false => amocas and narrow amocas for sub-word size. UseZacas=false and UseZabha=true => LR/SC and narrow LR/SC for sub-word size. UseZacas=true and UseZabha=true => amocas and amocas for sub-word size. There is no LR/SC for sub-word sizes, int8/int16, thus without Zacas they always use narrow LR/SC. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25252#discussion_r2123776682 From fjiang at openjdk.org Tue Jun 3 13:24:55 2025 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 3 Jun 2025 13:24:55 GMT Subject: RFR: 8356159: RISC-V: Add Zabha [v12] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 11:52:43 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> This adds the byte and halfword atomic memory operations (Zabha) - https://github.com/riscv/riscv-zabha. 
>> All amo-instructions, except load-reserve and store-conditional, can also be performed on natural aligned half-words and bytes. (i.e. the extension do not add lr.h/b or sc.h/b) This includes amocas if zacas extension is present. >> >> The majority of this patch is to support amocas.h/b. We are now starting to really feel the pain of all these extensions, as CAS:ing 16/8-bits can now be done in three different ways: >> - lr.w/sc.w 'narrow' CAS (no extension) >> - amocas.w 'narrow' CAS (Zacas) >> - amocas.h/b (Zacas + Zabha) >> >> There is no hwprobe support yet. >> >> Ran t1-3 with Zacas+Zabha and t1 without Zabha in qemu. >> >> Thanks, Robbin > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 21 additional commits since the last revision: > > - Merge branch 'master' into 8356159 > - Set ins cost to 2xVOLA for cmpxchg > - Merge branch 'master' into 8356159 > - Merge branch 'master' into 8356159 > - ins cost fixes, print fixes > - Merge branch 'master' into 8356159 > - Reg limits fixed > - Merge branch 'master' into 8356159 > - Fixed reg selection > - More indention > - ... and 11 more: https://git.openjdk.org/jdk/compare/1d76abc0...cc3b8ff7 Looks good! ------------- Marked as reviewed by fjiang (Committer). PR Review: https://git.openjdk.org/jdk/pull/25252#pullrequestreview-2892342660 From fjiang at openjdk.org Tue Jun 3 13:24:56 2025 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 3 Jun 2025 13:24:56 GMT Subject: RFR: 8356159: RISC-V: Add Zabha [v11] In-Reply-To: <_RXza4TCeXVWANzCikEnlud-YBc00BEBvCCH_OiHimg=.f5ebd65e-097c-44fd-b952-fa6d36a1e47f@github.com> References: <_RXza4TCeXVWANzCikEnlud-YBc00BEBvCCH_OiHimg=.f5ebd65e-097c-44fd-b952-fa6d36a1e47f@github.com> Message-ID: On Tue, 3 Jun 2025 13:14:26 GMT, Robbin Ehn wrote: >> Looks like `UseZabha` relies on `UseZacas`? 
The following code will call `atomic_cas` only when `UseZacas` is true even if the size is `int8` or `int16`. If that is true (maybe `(UseZacas && UseZabha)` already explained), then it makes sense. >> >> https://github.com/openjdk/jdk/blob/78a392aa3b0cda52cfacfa15250fa61010519424/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L4019-L4031 > > You can't use that method for int8/int16 if you don't have UseZabha (and Zacas). > You must call the cmpxchg_narrow(). > > The reason why we have two different method is the number of registers needed, i.e. cmpxchg_narrow requires scratch registers. > > So when you don't have UseZabha you should not be calling this method at all for int8/int16. > > There is an assert above that checks: > `assert((UseZacas && UseZabha) || (size != int8 && size != int16), "unsupported operand size");` > > The relationship is: > > UseZacas=false and UseZabha=false => LR/SC and narrow LR/SC for sub word size. > UseZacas=true and UseZabha=false => amocas and narrow amocas for sub word size. > UseZacas=false and UseZabha=true => LR/SC and narrow LR/SC for sub word size. > UseZacas=true and UseZabha=true => amocas and amocas for sub word size. > > > There is no LR/SC for sub word sizes, int8/int16, thus without Zacas they always use narrow LR/SC. Ah, I missed `(size != int8 && size != int16)` assertion. So it should be fine, thanks! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25252#discussion_r2123799385 From jbechberger at openjdk.org Tue Jun 3 13:33:29 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 13:33:29 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v41] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). 
This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Make process_cpu_time_request private and move up ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/56ce2b05..7561d512 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=40 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=39-40 Stats: 17 lines in 2 files changed: 9 ins; 8 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From asmehra at openjdk.org Tue Jun 3 13:57:51 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Tue, 3 Jun 2025 13:57:51 GMT Subject: RFR: 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder [v2] In-Reply-To: References: Message-ID: <2eSFIbD9m61pBTA64R6x5UyMn5eIjRJQZh48l3sh7yo=.f2b10311-77df-4007-b939-9eae802264b8@github.com> On Tue, 3 Jun 2025 10:47:25 GMT, Vladimir Kozlov wrote: >> There is difference between AdapterFingerPrint allocation size [compute_size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2227) which may not be aligned to HeapWord size and [size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2421) used for copying during AOT cache build which is aligned and can be bigger than allocation size. >> >> I added asserts to `AdapterFingerPrint` and `AdapterHandlerEntry` to make sure sizes are correct. Both are used in AOT cache build. 
>> >> I also moved `FreeHeap()` from `~AdapterFingerPrint()` to enforce the comment and simplify executed code. >> >> Thanks to @MBaesken for finding the issue and @iklam for pointing the cause. >> >> Testing tier1-3, xcomp, stress. Higher tiers are still running. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Remove unused argument Marked as reviewed by asmehra (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25604#pullrequestreview-2892508069 From jbechberger at openjdk.org Tue Jun 3 14:09:29 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Tue, 3 Jun 2025 14:09:29 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Rename autoadapt ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/7561d512..ae55610c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=41 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=40-41 Stats: 41 lines in 8 files changed: 0 ins; 0 del; 41 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From dchuyko at openjdk.org Tue Jun 3 14:09:35 2025 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Tue, 3 Jun 2025 14:09:35 GMT Subject: RFR: 8337666: AArch64: SHA3 GPR intrinsic [v6] In-Reply-To: References: Message-ID: > This is an implementation of SHA3 intrinsics for AArch64 that operates GPRs. It follows the Java implementation algorithm but eagerly uses available registers. For example, FP+R18 are used when it's allowed. On simpler cores like RPi3 or Surface Pro it is 23-53% faster than C2 compiled version; on Graviton 3 it is 8-14% faster than C2 compiled version (which is faster than the current intrinsic); on Apple Silicon it is faster than C2 compiled version but slower than the ARMv8.2-SHA intrinsic. Improvements on a particular CPU depend on the input length. 
For instance, for Graviton 2: > > > Benchmark (ops/ms) (digesterName) (length) G2 > MessageDigests.digest SHA3-256 64 28.28% > MessageDigests.digest SHA3-256 16384 53.58% > MessageDigests.digest SHA3-512 64 27.97% > MessageDigests.digest SHA3-512 16384 43.90% > MessageDigests.getAndDigest SHA3-256 64 26.18% > MessageDigests.getAndDigest SHA3-256 16384 52.82% > MessageDigests.getAndDigest SHA3-512 64 24.73% > MessageDigests.getAndDigest SHA3-512 16384 44.31% > > > (results for intermediate input lengths look like steps) > > On Graviton 4 there is still a noticeable difference between the proposed implementation and C2 generated code: > > > Benchmark (digesterName) (length) Pct > MessageDigests.digest SHA3-256 64 8.3% > MessageDigests.digest SHA3-256 16384 11% > MessageDigests.digest SHA3-512 64 8.4% > MessageDigests.digest SHA3-512 16384 11.5% > MessageDigests.getAndDigest SHA3-256 64 7.2% > MessageDigests.getAndDigest SHA3-256 16384 11% > MessageDigests.getAndDigest SHA3-512 64 7.3% > MessageDigests.getAndDigest SHA3-512 16384 11.6% > > > and the version that uses the extension is ~1.8x slower than C2 > > Existing intrinsic implementation is put under a flag `UseSIMDForSHA3Intrinsic` which is on by default where the intrinsic is enabled currently. > > Sanity tests were modified to cover new intrinsic variants (`-XX:-UseSIMDForSHA3Intrinsic -XX:+-PreserveFramePointer`) on aarch64 hw. Existing test cases where intrinsic is enabled are executed with `-XX:+IgnoreUnrecognizedVMOptions -XX:+UseSIMDForSHA3Intrinsic`, on platforms where the sha3 extension is missing they still are cut off by isSHA3IntrinsicAvailable() predicate. > > The original PR https://github.com/openjdk/jdk/pull/20422 has been auto-closed and the branch has been re-created on top of the new master. Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 10 commits:

 - Merge branch 'openjdk:master' into JDK-8337666
 - Update src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp

   Co-authored-by: Andrew Haley
 - Update src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp

   Co-authored-by: Andrew Haley
 - Merge branch 'openjdk:master' into JDK-8337666
 - Assert message
 - Copyright year
 - Review suggestions
 - Merge master
 - Delete empty line
 - SHA3 GPR intrinsic & tests

-------------

Changes: https://git.openjdk.org/jdk/pull/24260/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24260&range=05
  Stats: 749 lines in 6 files changed: 743 ins; 0 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/24260.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24260/head:pull/24260

PR: https://git.openjdk.org/jdk/pull/24260

From aph at openjdk.org  Tue Jun  3 14:24:59 2025
From: aph at openjdk.org (Andrew Haley)
Date: Tue, 3 Jun 2025 14:24:59 GMT
Subject: RFR: 8337666: AArch64: SHA3 GPR intrinsic [v4]
In-Reply-To:
References: <47P15HTCeTU93mVEKekG-smYjt5ebvSMJ8bgbG28vEI=.5f49753a-7ff5-4154-80e2-cd4fc996119f@github.com>
Message-ID:

On Sat, 31 May 2025 08:39:36 GMT, Andrew Haley wrote:

>> Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits:
>>
>>  - Merge branch 'openjdk:master' into JDK-8337666
>>  - Assert message
>>  - Copyright year
>>  - Review suggestions
>>  - Merge master
>>  - Delete empty line
>>  - SHA3 GPR intrinsic & tests
>
> src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 331:
>
>> 329:
>> 330:   inline void rol(Register Rd, Register Rn, unsigned imm) {
>> 331:     extr(Rd, Rn, Rn, ((64 - imm) & 63));
>
> Suggestion:
>
>     extr(Rd, Rn, Rn, (64 - imm));
>
> It's better to catch an out-of-range immediate value.

`rolw` too.
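
A minimal sketch of why dropping the mask helps, written as a plain C++ model rather than HotSpot assembler code. The helper names (`extr_same_source`, `rol_masked`, `rol_checked`) and the exception-based range check are assumptions made for illustration; the real `extr` in the macro assembler presumably rejects a bad immediate through its own assertion. The model treats EXTR with both sources equal as a rotate right by `lsb`, with `lsb` limited to 6 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical software model of AArch64 EXTR with Rn == Rm, which behaves
// as a 64-bit rotate right by `lsb`. The immediate is a 6-bit field, so
// anything above 63 is rejected here (assumption: modeled as an exception).
static uint64_t extr_same_source(uint64_t value, unsigned lsb) {
  if (lsb > 63) {
    throw std::out_of_range("extr: lsb does not fit in 6 bits");
  }
  if (lsb == 0) {
    return value;  // avoids an undefined shift by 64 below
  }
  return (value >> lsb) | (value << (64 - lsb));
}

// Masked variant: a bad rotate amount is silently folded into range.
static uint64_t rol_masked(uint64_t value, unsigned imm) {
  return extr_same_source(value, (64 - imm) & 63);
}

// Unmasked variant: a bad rotate amount reaches the range check.
static uint64_t rol_checked(uint64_t value, unsigned imm) {
  return extr_same_source(value, 64 - imm);
}
```

In this model, `rol_masked(x, 65)` quietly behaves like a rotate by 1, while `rol_checked(x, 65)` fails the range check. Note that the unmasked form also rejects a rotate amount of 0, since `64 - 0` does not fit in 6 bits.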
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24260#discussion_r2124008640

From mgronlun at openjdk.org  Tue Jun  3 16:25:07 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Tue, 3 Jun 2025 16:25:07 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v35]
In-Reply-To:
References:
Message-ID:

On Tue, 3 Jun 2025 12:31:53 GMT, Johannes Bechberger wrote:

>> tl->is_excluded() is volatile and can change during runtime, so it's better to add a timer unconditionally there.
>
> why not just `is_excluded`?

Because the result of tl->is_excluded() can flip between excluded and included many times during runtime. It's not a static property.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2124366551

From dchuyko at openjdk.org  Tue Jun  3 16:31:08 2025
From: dchuyko at openjdk.org (Dmitry Chuyko)
Date: Tue, 3 Jun 2025 16:31:08 GMT
Subject: RFR: 8337666: AArch64: SHA3 GPR intrinsic [v7]
In-Reply-To:
References:
Message-ID: <3hGFnUsyrN809lwWuqr7dyxfoCm0F2ILSB-yJV5Hfvo=.1a2048d4-d131-4354-a629-d75f206dda42@github.com>

> This is an implementation of SHA3 intrinsics for AArch64 that operates on GPRs. It follows the Java implementation algorithm but eagerly uses available registers. For example, FP+R18 are used when it's allowed. On simpler cores like RPi3 or Surface Pro it is 23-53% faster than C2 compiled version; on Graviton 3 it is 8-14% faster than C2 compiled version (which is faster than the current intrinsic); on Apple Silicon it is faster than C2 compiled version but slower than the ARMv8.2-SHA intrinsic. Improvements on a particular CPU depend on the input length.
> For instance, for Graviton 2:
>
>
> Benchmark (ops/ms)           (digesterName)  (length)      G2
> MessageDigests.digest              SHA3-256        64  28.28%
> MessageDigests.digest              SHA3-256     16384  53.58%
> MessageDigests.digest              SHA3-512        64  27.97%
> MessageDigests.digest              SHA3-512     16384  43.90%
> MessageDigests.getAndDigest        SHA3-256        64  26.18%
> MessageDigests.getAndDigest        SHA3-256     16384  52.82%
> MessageDigests.getAndDigest        SHA3-512        64  24.73%
> MessageDigests.getAndDigest        SHA3-512     16384  44.31%
>
>
> (results for intermediate input lengths look like steps)
>
> On Graviton 4 there is still a noticeable difference between the proposed implementation and C2 generated code:
>
>
> Benchmark                    (digesterName)  (length)     Pct
> MessageDigests.digest              SHA3-256        64    8.3%
> MessageDigests.digest              SHA3-256     16384     11%
> MessageDigests.digest              SHA3-512        64    8.4%
> MessageDigests.digest              SHA3-512     16384   11.5%
> MessageDigests.getAndDigest        SHA3-256        64    7.2%
> MessageDigests.getAndDigest        SHA3-256     16384     11%
> MessageDigests.getAndDigest        SHA3-512        64    7.3%
> MessageDigests.getAndDigest        SHA3-512     16384   11.6%
>
>
> and the version that uses the extension is ~1.8x slower than C2
>
> Existing intrinsic implementation is put under a flag `UseSIMDForSHA3Intrinsic` which is on by default where the intrinsic is enabled currently.
>
> Sanity tests were modified to cover new intrinsic variants (`-XX:-UseSIMDForSHA3Intrinsic -XX:+-PreserveFramePointer`) on aarch64 hw. Existing test cases where intrinsic is enabled are executed with `-XX:+IgnoreUnrecognizedVMOptions -XX:+UseSIMDForSHA3Intrinsic`, on platforms where the sha3 extension is missing they still are cut off by isSHA3IntrinsicAvailable() predicate.
>
> The original PR https://github.com/openjdk/jdk/pull/20422 has been auto-closed and the branch has been re-created on top of the new master.
Dmitry Chuyko has updated the pull request incrementally with one additional commit since the last revision: No imm masking in rolw ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24260/files - new: https://git.openjdk.org/jdk/pull/24260/files/cd24df67..d9cf5135 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24260&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24260&range=05-06 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24260.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24260/head:pull/24260 PR: https://git.openjdk.org/jdk/pull/24260 From lmesnik at openjdk.org Tue Jun 3 16:46:55 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 3 Jun 2025 16:46:55 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Mon, 2 Jun 2025 08:07:38 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > TestBreakSignalThreadDump has issues with asan Thank you for implementing exclusion this way. I'll approve PR once you address feedback about commenting. 
------------- PR Review: https://git.openjdk.org/jdk/pull/25575#pullrequestreview-2893319737 From cslucas at openjdk.org Tue Jun 3 17:06:56 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 3 Jun 2025 17:06:56 GMT Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum instead of "const char*" [v2] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 08:40:41 GMT, Aleksey Shipilev wrote: >> src/hotspot/share/code/nmethod.hpp line 498: >> >>> 496: >>> 497: >>> 498: static const char* NMethodChangeReason_to_string(NMethodChangeReason reason) { >> >> Uh, use a switch: >> >> >> switch(reason) { >> case C1_deoptimize: return "C1 deoptimized"; >> case C1_codepatch: return "C1 code patch"; >> ... >> default: >> assert(false, "Unhandled reason"); >> return "Unknown"; >> } > > Also, names: `change_reason_to_string(ChangeReason reason)`. Now that enum is scoped to `nmethod`, there is no need for `NMethod` prefix. Makes sense, thanks! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25338#discussion_r2124446618 From dchuyko at openjdk.org Tue Jun 3 17:09:57 2025 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Tue, 3 Jun 2025 17:09:57 GMT Subject: RFR: 8337666: AArch64: SHA3 GPR intrinsic [v4] In-Reply-To: References: <47P15HTCeTU93mVEKekG-smYjt5ebvSMJ8bgbG28vEI=.5f49753a-7ff5-4154-80e2-cd4fc996119f@github.com> Message-ID: On Tue, 3 Jun 2025 14:22:10 GMT, Andrew Haley wrote: >> src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 331: >> >>> 329: >>> 330: inline void rol(Register Rd, Register Rn, unsigned imm) { >>> 331: extr(Rd, Rn, Rn, ((64 - imm) & 63)); >> >> Suggestion: >> >> extr(Rd, Rn, Rn, (64 - imm)); >> >> It's better to catch an out-of-range immediate value. > > `rolw` too. Fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24260#discussion_r2124451203 From kvn at openjdk.org Tue Jun 3 17:18:00 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Jun 2025 17:18:00 GMT Subject: RFR: 8358339: Handle MethodCounters::_method backlinks after JDK-8355003 In-Reply-To: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> References: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> Message-ID: On Mon, 2 Jun 2025 18:41:42 GMT, Aleksey Shipilev wrote: > Found this when reading mainline-vs-premain webrev. [JDK-8355003](https://bugs.openjdk.org/browse/JDK-8355003) introduced a backlink to `Method*` in `MethodCounters`. I believe we need to handle that backlink at least in `CodeBuffer::finalize_oop_references()`. premain does this, while mainline does not. Also, amusingly, we have `MethodCounters::is_methodCounters`, but not the super-class `Metadata::is_methodCounters`. > > I pulled in the hunks that use `is_methodCounters()` and `MethodCounters::method()` from premain into this PR. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `runtime/cds` > - [x] Linux x86_64 server fastdebug, `tier1` > - [x] Linux x86_64 server fastdebug, `all` Good. This will be needed for AOT caching Level2 C1 compiled nmethods which have profiling: https://github.com/vnkozlov/jdk/commit/46595236a88a90908a7a54e4c6bb872d634be441 ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25599#pullrequestreview-2893406165 From shade at openjdk.org Tue Jun 3 17:49:41 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 17:49:41 GMT Subject: RFR: 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder [v2] In-Reply-To: References: Message-ID: <-wJp4hkfj1YBQ4C_UjhsqBE2UkrbOafMr_bs_-v7S-A=.9801fb39-6372-4803-bbfe-fa5c3fe9ad3f@github.com> On Tue, 3 Jun 2025 10:47:25 GMT, Vladimir Kozlov wrote: >> There is difference between AdapterFingerPrint allocation size [compute_size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2227) which may not be aligned to HeapWord size and [size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2421) used for copying during AOT cache build which is aligned and can be bigger than allocation size. >> >> I added asserts to `AdapterFingerPrint` and `AdapterHandlerEntry` to make sure sizes are correct. Both are used in AOT cache build. >> >> I also moved `FreeHeap()` from `~AdapterFingerPrint()` to enforce the comment and simplify executed code. >> >> Thanks to @MBaesken for finding the issue and @iklam for pointing the cause. >> >> Testing tier1-3, xcomp, stress. Higher tiers are still running. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Remove unused argument src/hotspot/share/runtime/sharedRuntime.cpp line 2227: > 2225: } > 2226: > 2227: static int compute_size(int total_args_passed) { OK, but if the source of discrepancy is between two places computing stuff separately (inconsistently), do you want to make the computations mechanically the same? 
Something like:

    static int compute_size_in_words(int total_args_passed) {
      return (int)heap_word_size(sizeof(AdapterFingerPrint) + (length(total_args_passed) * sizeof(int)));
    }

    static int compute_size_in_bytes(int total_args_passed) {
      return compute_size_in_words(total_args_passed) * BytesPerWord;
    }

Then use `compute_size_in_words()` in the other place: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2421

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25604#discussion_r2124505187

From cslucas at openjdk.org  Tue Jun  3 17:52:04 2025
From: cslucas at openjdk.org (Cesar Soares Lucas)
Date: Tue, 3 Jun 2025 17:52:04 GMT
Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum instead of "const char*" [v3]
In-Reply-To:
References:
Message-ID:

> Please review this refactor to transform the reasons for making an nmethod not entrant from `const char*` into enum values.
>
> Tested on Linux x64 with JTREG tier1-3 in fastdebug and release mode.

Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: Address PR feedback: more refactoring / renamings ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25338/files - new: https://git.openjdk.org/jdk/pull/25338/files/b3bb4365..fa77be5c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25338&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25338&range=01-02 Stats: 102 lines in 15 files changed: 11 ins; 31 del; 60 mod Patch: https://git.openjdk.org/jdk/pull/25338.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25338/head:pull/25338 PR: https://git.openjdk.org/jdk/pull/25338 From shade at openjdk.org Tue Jun 3 17:52:05 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 17:52:05 GMT Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum instead of "const char*" [v3] In-Reply-To: References: Message-ID: <195pScH-Kh1H1JvhkwC9xLM_joDJPccyMve95BwIlzk=.22181903-b4af-4ec5-8d57-688a6ee51832@github.com> On Tue, 3 Jun 2025 17:44:30 GMT, Cesar Soares Lucas wrote: >> Please review this refactor to transform the reasons for making an nmethod not entrant from `const char*` into enum values. >> >> Tested on Linux x64 with JTREG tier1-3 in fastdebug and release mode. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Address PR feedback: more refactoring / renamings Almost there, more cosmetics. src/hotspot/share/code/nmethod.hpp line 500: > 498: static const char* change_reason_to_string(ChangeReason change_reason) { > 499: switch (change_reason) { > 500: case ChangeReason::C1_codepatch: return "C1 code patch"; Indenting: should be two spaces everywhere. Also, I think this kind of indenting forces us to re-align the switch for the largest enum label. Let's just break them. Plus, any multi-line blocks should be braced. 
So, in total:

    switch (change_reason) {
      case ChangeReason::C1_codepatch:
        return "C1 code patch";
      ...
      default: {
        assert(false, "Unhandled reason");
        return "Unknown";
      }
    }

src/hotspot/share/code/nmethod.hpp line 691:

> 689:   // another thread performed the transition.
> 690:   bool  make_not_entrant(ChangeReason change_reason);
> 691:   bool  make_not_used() { return make_not_entrant(ChangeReason::not_used); }

Suggestion:

    bool make_not_used() { return make_not_entrant(ChangeReason::not_used); }

-------------

PR Review: https://git.openjdk.org/jdk/pull/25338#pullrequestreview-2893478002
PR Review Comment: https://git.openjdk.org/jdk/pull/25338#discussion_r2124511959
PR Review Comment: https://git.openjdk.org/jdk/pull/25338#discussion_r2124514702

From kvn at openjdk.org  Tue Jun  3 17:58:48 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Tue, 3 Jun 2025 17:58:48 GMT
Subject: RFR: 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder [v2]
In-Reply-To: <-wJp4hkfj1YBQ4C_UjhsqBE2UkrbOafMr_bs_-v7S-A=.9801fb39-6372-4803-bbfe-fa5c3fe9ad3f@github.com>
References: <-wJp4hkfj1YBQ4C_UjhsqBE2UkrbOafMr_bs_-v7S-A=.9801fb39-6372-4803-bbfe-fa5c3fe9ad3f@github.com>
Message-ID: <-yXLmyKeGt7ajGu_p3QgKPD2fD-2uSd7c8Hz-MHINMA=.504c9dc9-23ae-447d-8f26-74fc5693f30a@github.com>

On Tue, 3 Jun 2025 17:37:49 GMT, Aleksey Shipilev wrote:

>> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Remove unused argument
>
> src/hotspot/share/runtime/sharedRuntime.cpp line 2227:
>
>> 2225:   }
>> 2226:
>> 2227:   static int compute_size(int total_args_passed) {
>
> OK, but if the source of discrepancy is between two places computing stuff separately (inconsistently), do you want to make the computations mechanically the same?
> > Something like: > > > static int compute_size_in_words(int total_args_passed) { > return (int)heap_word_size(sizeof(AdapterFingerPrint) + (length(total_args_passed) * sizeof(int))); > } > > static int compute_size_in_bytes(int total_args_passed) { > return compute_size_in_words(total_args_passed) * BytesPerWord; > } > > > Then use `compute_size_in_words()` in the other place: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2421 Yes, I can do that. But I will pass _length which is different from total_args_passed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25604#discussion_r2124538592 From iveresov at openjdk.org Tue Jun 3 18:07:25 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Tue, 3 Jun 2025 18:07:25 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder Message-ID: Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. ------------- Commit messages: - Cleanup with KlassTrainingData constructor Changes: https://git.openjdk.org/jdk/pull/25623/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25623&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358003 Stats: 17 lines in 1 file changed: 0 ins; 13 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25623.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25623/head:pull/25623 PR: https://git.openjdk.org/jdk/pull/25623 From shade at openjdk.org Tue Jun 3 18:07:26 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 18:07:26 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder In-Reply-To: References: Message-ID: <0BmSTgFVR4bDzT_UBDHac675eWlfGA6XmIIs_QO-pUY=.a0f73638-cc74-4441-a03a-6db66bd12ea0@github.com> On Tue, 3 Jun 2025 17:36:13 GMT, Igor Veresov wrote: > Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. 
Makes sense. I was dumbfounded what was "previous handle", when we are in constructor. I suspected it was something about placement-new code somewhere. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25623#pullrequestreview-2893523002 From iveresov at openjdk.org Tue Jun 3 18:07:26 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Tue, 3 Jun 2025 18:07:26 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 17:36:13 GMT, Igor Veresov wrote: > Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. mach5 testing in progress, will report back once it's done. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25623#issuecomment-2936442642 From kvn at openjdk.org Tue Jun 3 18:17:29 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Jun 2025 18:17:29 GMT Subject: RFR: 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder [v3] In-Reply-To: References: Message-ID: > There is difference between AdapterFingerPrint allocation size [compute_size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2227) which may not be aligned to HeapWord size and [size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2421) used for copying during AOT cache build which is aligned and can be bigger than allocation size. > > I added asserts to `AdapterFingerPrint` and `AdapterHandlerEntry` to make sure sizes are correct. Both are used in AOT cache build. > > I also moved `FreeHeap()` from `~AdapterFingerPrint()` to enforce the comment and simplify executed code. > > Thanks to @MBaesken for finding the issue and @iklam for pointing the cause. > > Testing tier1-3, xcomp, stress. Higher tiers are still running. 
Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Use one compute_size_in_words() method ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25604/files - new: https://git.openjdk.org/jdk/pull/25604/files/9b67ceab..862e7826 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25604&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25604&range=01-02 Stats: 10 lines in 1 file changed: 2 ins; 1 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/25604.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25604/head:pull/25604 PR: https://git.openjdk.org/jdk/pull/25604 From kvn at openjdk.org Tue Jun 3 18:17:29 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Jun 2025 18:17:29 GMT Subject: RFR: 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder [v2] In-Reply-To: <-yXLmyKeGt7ajGu_p3QgKPD2fD-2uSd7c8Hz-MHINMA=.504c9dc9-23ae-447d-8f26-74fc5693f30a@github.com> References: <-wJp4hkfj1YBQ4C_UjhsqBE2UkrbOafMr_bs_-v7S-A=.9801fb39-6372-4803-bbfe-fa5c3fe9ad3f@github.com> <-yXLmyKeGt7ajGu_p3QgKPD2fD-2uSd7c8Hz-MHINMA=.504c9dc9-23ae-447d-8f26-74fc5693f30a@github.com> Message-ID: <6LAxvKdv19aANIrZN-_6AP4NEeTxtxLfgCe7HEWHmc8=.505510c7-f16e-471a-a0fe-e71c76e6b77b@github.com> On Tue, 3 Jun 2025 17:56:29 GMT, Vladimir Kozlov wrote: >> src/hotspot/share/runtime/sharedRuntime.cpp line 2227: >> >>> 2225: } >>> 2226: >>> 2227: static int compute_size(int total_args_passed) { >> >> OK, but if the source of discrepancy is between two places computing stuff separately (inconsistently), do you want to make the computations mechanically the same? 
>> >> Something like: >> >> >> static int compute_size_in_words(int total_args_passed) { >> return (int)heap_word_size(sizeof(AdapterFingerPrint) + (length(total_args_passed) * sizeof(int))); >> } >> >> static int compute_size_in_bytes(int total_args_passed) { >> return compute_size_in_words(total_args_passed) * BytesPerWord; >> } >> >> >> Then use `compute_size_in_words()` in the other place: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2421 > > Yes, I can do that. But I will pass _length which is different from total_args_passed. Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25604#discussion_r2124572304 From shade at openjdk.org Tue Jun 3 18:29:18 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 18:29:18 GMT Subject: RFR: 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder [v3] In-Reply-To: References: Message-ID: <6X52iTWdyHqJ9izTAPWWudbuW8Qo3LjkPTkJVa20eeY=.e0af77f5-edac-4601-bc38-97d05a8cf96b@github.com> On Tue, 3 Jun 2025 18:17:29 GMT, Vladimir Kozlov wrote: >> There is difference between AdapterFingerPrint allocation size [compute_size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2227) which may not be aligned to HeapWord size and [size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2421) used for copying during AOT cache build which is aligned and can be bigger than allocation size. >> >> I added asserts to `AdapterFingerPrint` and `AdapterHandlerEntry` to make sure sizes are correct. Both are used in AOT cache build. >> >> I also moved `FreeHeap()` from `~AdapterFingerPrint()` to enforce the comment and simplify executed code. >> >> Thanks to @MBaesken for finding the issue and @iklam for pointing the cause. >> >> Testing tier1-3, xcomp, stress. Higher tiers are still running. 
> > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Use one compute_size_in_words() method Looks okay. Asserts get a bit tautological, but it is pleasantly paranoid for my taste. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25604#pullrequestreview-2893631876 From amenkov at openjdk.org Tue Jun 3 18:49:28 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 3 Jun 2025 18:49:28 GMT Subject: Integrated: 8357650: ThreadSnapshot to take snapshot of thread for thread dumps In-Reply-To: References: Message-ID: On Sat, 24 May 2025 00:17:26 GMT, Alex Menkov wrote: > This is first (hotspot) part of the update for `HotSpotDiagnosticMXBean.dumpThreads` and `jcmd Thread.dump_to_file` to include lock information in thread dumps (JDK-8356870). > The update has been split into parts to simplify reviewing. > The fix contains an implementation of `jdk.internal.vm.ThreadSnapshot` class to gather required information about a thread. > Second (dependent) part includes changes in `HotSpotDiagnosticMXBean.dumpThreads`/`jcmd Thread.dump_to_file`, spec updates and tests for the functionality. > > Testing: new `HotSpotDiagnosticMXBean.dumpThreads`/`jcmd Thread.dump_to_file` functionality was tested in loom repo; > sanity tier1 (this fix only) This pull request has now been integrated. 
Changeset: 406f1bc5 Author: Alex Menkov URL: https://git.openjdk.org/jdk/commit/406f1bc5b94408778063b885cdac807fd1501e44 Stats: 716 lines in 10 files changed: 712 ins; 0 del; 4 mod 8357650: ThreadSnapshot to take snapshot of thread for thread dumps Co-authored-by: Alan Bateman Co-authored-by: Alex Menkov Reviewed-by: sspitsyn, kevinw ------------- PR: https://git.openjdk.org/jdk/pull/25425 From cslucas at openjdk.org Tue Jun 3 18:52:34 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 3 Jun 2025 18:52:34 GMT Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum instead of "const char*" [v4] In-Reply-To: References: Message-ID: > Please review this refactor to transform the reasons for making an nmethod not entrant from `const char*` into enum values. > > Tested on Linux x64 with JTREG tier1-3 in fastdebug and release mode. Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: Fix spacing, fix build. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25338/files - new: https://git.openjdk.org/jdk/pull/25338/files/fa77be5c..dc3aa2c1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25338&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25338&range=02-03 Stats: 50 lines in 2 files changed: 23 ins; 0 del; 27 mod Patch: https://git.openjdk.org/jdk/pull/25338.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25338/head:pull/25338 PR: https://git.openjdk.org/jdk/pull/25338 From cslucas at openjdk.org Tue Jun 3 18:52:35 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 3 Jun 2025 18:52:35 GMT Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum instead of "const char*" [v3] In-Reply-To: <195pScH-Kh1H1JvhkwC9xLM_joDJPccyMve95BwIlzk=.22181903-b4af-4ec5-8d57-688a6ee51832@github.com> References: <195pScH-Kh1H1JvhkwC9xLM_joDJPccyMve95BwIlzk=.22181903-b4af-4ec5-8d57-688a6ee51832@github.com> Message-ID: On Tue, 3 Jun 2025 17:41:35 GMT, Aleksey Shipilev wrote: >> Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: >> >> Address PR feedback: more refactoring / renamings > > src/hotspot/share/code/nmethod.hpp line 500: > >> 498: static const char* change_reason_to_string(ChangeReason change_reason) { >> 499: switch (change_reason) { >> 500: case ChangeReason::C1_codepatch: return "C1 code patch"; > > Indenting: should be two spaces everywhere. Also, I think this kind of indenting forces us to re-align the switch for the largest enum label. Let's just break them. Plus, any multi-line blocks should be braced. So, in total: > > > switch (change_reason) { > case ChangeReason::C1_codepatch: > return "C1 code patch"; > ... > default: { > assert(false, "Unhandled reason"); > return "Unknown"; > } > } Done, thanks. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25338#discussion_r2124655548 From shade at openjdk.org Tue Jun 3 18:55:19 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 18:55:19 GMT Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum instead of "const char*" [v4] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 18:52:34 GMT, Cesar Soares Lucas wrote: >> Please review this refactor to transform the reasons for making an nmethod not entrant from `const char*` into enum values. >> >> Tested on Linux x64 with JTREG tier1-3 in fastdebug and release mode. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Fix spacing, fix build. src/hotspot/share/code/nmethod.cpp line 1971: > 1969: if (xtty != nullptr) { > 1970: ttyLocker ttyl; // keep the following output all in one block > 1971: xtty->begin_elem("make_not_entrant thread='%zu' change_reason='%s'", Wait, let's not change the actual key here. This is part of XML logging, AFAICS, so this might break some tools. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25338#discussion_r2124657484 From kvn at openjdk.org Tue Jun 3 18:59:16 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Jun 2025 18:59:16 GMT Subject: RFR: 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder [v3] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 18:17:29 GMT, Vladimir Kozlov wrote: >> There is difference between AdapterFingerPrint allocation size [compute_size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2227) which may not be aligned to HeapWord size and [size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2421) used for copying during AOT cache build which is aligned and can be bigger than allocation size. 
>> >> I added asserts to `AdapterFingerPrint` and `AdapterHandlerEntry` to make sure sizes are correct. Both are used in AOT cache build. >> >> I also moved `FreeHeap()` from `~AdapterFingerPrint()` to enforce the comment and simplify executed code. >> >> Thanks to @MBaesken for finding the issue and @iklam for pointing the cause. >> >> Testing tier1-3, xcomp, stress. Higher tiers are still running. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Use one compute_size_in_words() method Thank you, Aleksey ------------- PR Comment: https://git.openjdk.org/jdk/pull/25604#issuecomment-2936747055 From kvn at openjdk.org Tue Jun 3 19:33:17 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Jun 2025 19:33:17 GMT Subject: RFR: 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder [v3] In-Reply-To: References: Message-ID: <0lJeZV-WWYPigLaDj2bmwub-s9WzPwyEKgm2PfDatXA=.5267dfd2-08b9-40c1-8602-f153bd18b6b8@github.com> On Tue, 3 Jun 2025 18:17:29 GMT, Vladimir Kozlov wrote: >> There is difference between AdapterFingerPrint allocation size [compute_size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2227) which may not be aligned to HeapWord size and [size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2421) used for copying during AOT cache build which is aligned and can be bigger than allocation size. >> >> I added asserts to `AdapterFingerPrint` and `AdapterHandlerEntry` to make sure sizes are correct. Both are used in AOT cache build. >> >> I also moved `FreeHeap()` from `~AdapterFingerPrint()` to enforce the comment and simplify executed code. >> >> Thanks to @MBaesken for finding the issue and @iklam for pointing the cause. >> >> Testing tier1-3, xcomp, stress. Higher tiers are still running. 
> > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Use one compute_size_in_words() method Waiting confirmation from @MBaesken . ------------- PR Comment: https://git.openjdk.org/jdk/pull/25604#issuecomment-2936868129 From cslucas at openjdk.org Tue Jun 3 19:33:57 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 3 Jun 2025 19:33:57 GMT Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum instead of "const char*" [v5] In-Reply-To: References: Message-ID: > Please review this refactor to transform the reasons for making an nmethod not entrant from `const char*` into enum values. > > Tested on Linux x64 with JTREG tier1-3 in fastdebug and release mode. Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: Revert change to attribute of make_not_entrant element ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25338/files - new: https://git.openjdk.org/jdk/pull/25338/files/dc3aa2c1..6af59591 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25338&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25338&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25338.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25338/head:pull/25338 PR: https://git.openjdk.org/jdk/pull/25338 From cslucas at openjdk.org Tue Jun 3 19:33:57 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 3 Jun 2025 19:33:57 GMT Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum instead of "const char*" [v4] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 18:47:43 GMT, Aleksey Shipilev wrote: >> Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix spacing, fix build. 
> > src/hotspot/share/code/nmethod.cpp line 1971: > >> 1969: if (xtty != nullptr) { >> 1970: ttyLocker ttyl; // keep the following output all in one block >> 1971: xtty->begin_elem("make_not_entrant thread='%zu' change_reason='%s'", > > Wait, let's not change the actual key here. This is part of XML logging, AFAICS, so this might break some tools. Sure, I'll revert that. I thought it would be "fine" to change the key here since it was added not "long ago.." ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25338#discussion_r2124736640 From vlivanov at openjdk.org Tue Jun 3 19:57:15 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Tue, 3 Jun 2025 19:57:15 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 17:36:13 GMT, Igor Veresov wrote: > Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. Looks good. ------------- Marked as reviewed by vlivanov (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25623#pullrequestreview-2893913507 From shade at openjdk.org Tue Jun 3 20:01:29 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 20:01:29 GMT Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum instead of "const char*" [v5] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 19:33:57 GMT, Cesar Soares Lucas wrote: >> Please review this refactor to transform the reasons for making an nmethod not entrant from `const char*` into enum values. >> >> Tested on Linux x64 with JTREG tier1-3 in fastdebug and release mode. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Revert change to attribute of make_not_entrant element Looks good to me. Compiler folks might want to ack as well. ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25338#pullrequestreview-2893925727 From mgronlun at openjdk.org Tue Jun 3 21:00:36 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 3 Jun 2025 21:00:36 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 14:09:29 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Rename autoadapt src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.hpp line 31: > 29: > 30: class JavaThread; > 31: class NonJavaThread; NonJavaThread fwd not needed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2124913821 From iveresov at openjdk.org Tue Jun 3 21:11:26 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Tue, 3 Jun 2025 21:11:26 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 17:36:13 GMT, Igor Veresov wrote: > Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. It seems like we don't need these release_stores either since the constructor is always run under a lock. I'll run some testing.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25623#issuecomment-2937211273 From mgronlun at openjdk.org Tue Jun 3 21:11:32 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 3 Jun 2025 21:11:32 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 14:09:29 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Rename autoadapt src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 79: > 77: > 78: bool JfrCPUTimeTraceQueue::enqueue(JfrCPUTimeSampleRequest& request) { > 79: assert(JavaThread::current()->jfr_thread_local()->is_cpu_time_jfr_enqueue_locked(), "invariant"); What is preventing another thread from enqueuing a request here? We only know it holds a thread-local lock? Let's make it explicit at this site that the current queue corresponds to the thread-local queue for the current thread.
+ assert(&JavaThread::current()->jfr_thread_local()->cpu_time_jfr_queue() == this, "invariant"); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2124934207 From mgronlun at openjdk.org Tue Jun 3 21:18:34 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 3 Jun 2025 21:18:34 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 14:09:29 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Rename autoadapt src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 96: > 94: } > 95: > 96: volatile u4 _lost_samples_sum = 0; static volatile ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2124941941 From mgronlun at openjdk.org Tue Jun 3 21:45:29 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 3 Jun 2025 21:45:29 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: References: Message-ID: <1RLKF0E-I7CjQRNUqb7k0mEIvoSCO010FUaKnmLVPSI=.4c6876fb-ac9c-47d2-8379-ccafdbdbaabe@github.com> On Tue, 3 Jun 2025 14:09:29 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
>> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Rename autoadapt src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 520: > 518: // the sampling period might be too low for the current Linux configuration > 519: // so samples might be skipped and we have to compute the actual period > 520: int64_t period = get_sampling_period() * (info->si_overrun + 1); Does this calculation have to be done on every signal, by every thread? It seems like something that could be precalculated when the period is set? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2124980054 From duke at openjdk.org Tue Jun 3 21:52:22 2025 From: duke at openjdk.org (duke) Date: Tue, 3 Jun 2025 21:52:22 GMT Subject: Withdrawn: 8344116: C2: remove slice parameter from LoadNode::make In-Reply-To: References: Message-ID: On Wed, 26 Mar 2025 15:18:25 GMT, Zihao Lin wrote: > This patch remove slice parameter from LoadNode::make > > Mention in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 > > Hi team, I am new, I'd appreciate any guidance. Thank a lot! This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/jdk/pull/24258 From mgronlun at openjdk.org Tue Jun 3 22:01:34 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 3 Jun 2025 22:01:34 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 14:09:29 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Rename autoadapt src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 360: > 358: tl->set_do_async_processing_of_cpu_time_jfr_requests(false); > 359: if (lock) { > 360: tl->acquire_cpu_time_jfr_dequeue_lock(); This is your synchronization point on return from native code, which is effectively a spinlock. This can cause problems when a large number of threads are being processed by the "do_async_processing" request call. We should fix this as a bug after integration (use a proper Monitor as a synchronization point).
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2124997140 From mgronlun at openjdk.org Tue Jun 3 22:04:27 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 3 Jun 2025 22:04:27 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 14:09:29 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Rename autoadapt src/hotspot/share/jfr/support/jfrThreadLocal.cpp line 143: > 141: JavaThread *const jt = JavaThread::cast(t); > 142: send_java_thread_start_event(jt); > 143: JfrCPUTimeThreadSampling::on_javathread_create(jt); Move before send_java_thread...to have that captured by the timer? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125004892 From iveresov at openjdk.org Tue Jun 3 22:13:01 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Tue, 3 Jun 2025 22:13:01 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder [v2] In-Reply-To: References: Message-ID: > Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore.
Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: No need for release_store() ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25623/files - new: https://git.openjdk.org/jdk/pull/25623/files/f9b133fe..85a71619 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25623&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25623&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25623.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25623/head:pull/25623 PR: https://git.openjdk.org/jdk/pull/25623 From coleenp at openjdk.org Tue Jun 3 22:40:17 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 3 Jun 2025 22:40:17 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder [v2] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 22:13:01 GMT, Igor Veresov wrote: >> Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. > > Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: > > No need for release_store() src/hotspot/share/oops/trainingData.cpp line 437: > 435: assert(klass != nullptr, ""); > 436: Handle hm(JavaThread::current(), klass->java_mirror()); > 437: jobject hmj = JNIHandles::make_global(hm); Why don't you use OopStorage for this? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25623#discussion_r2125056459 From iveresov at openjdk.org Tue Jun 3 22:48:14 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Tue, 3 Jun 2025 22:48:14 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder [v2] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 22:13:01 GMT, Igor Veresov wrote: >> Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. 
> > Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: > > No need for release_store() Testing is ok ------------- PR Comment: https://git.openjdk.org/jdk/pull/25623#issuecomment-2937546623 From iveresov at openjdk.org Tue Jun 3 22:55:16 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Tue, 3 Jun 2025 22:55:16 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder [v2] In-Reply-To: References: Message-ID: <4ne8DsOBEMC2jSdOBI4l_33Jrs0CXHEKpdrLlBB-2uM=.52428bbb-6abc-4c33-85e7-6aa424c8b4f7@github.com> On Tue, 3 Jun 2025 22:37:54 GMT, Coleen Phillimore wrote: >> Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: >> >> No need for release_store() > > src/hotspot/share/oops/trainingData.cpp line 437: > >> 435: assert(klass != nullptr, ""); >> 436: Handle hm(JavaThread::current(), klass->java_mirror()); >> 437: jobject hmj = JNIHandles::make_global(hm); > > Why don't you use OopStorage for this? Are there any advantages? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25623#discussion_r2125071202 From iklam at openjdk.org Tue Jun 3 23:35:21 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 3 Jun 2025 23:35:21 GMT Subject: RFR: 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder [v3] In-Reply-To: References: Message-ID: <1tLFEk_8m434FPnlObCMhoxgxuYc12pSkfZOivpE--0=.b80134c8-aa3f-4c02-827e-f24ed208d08c@github.com> On Tue, 3 Jun 2025 18:17:29 GMT, Vladimir Kozlov wrote: >> There is difference between AdapterFingerPrint allocation size [compute_size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2227) which may not be aligned to HeapWord size and [size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2421) used for copying during AOT cache build which is aligned and can be bigger than allocation size. >> >> I added asserts to `AdapterFingerPrint` and `AdapterHandlerEntry` to make sure sizes are correct. Both are used in AOT cache build. >> >> I also moved `FreeHeap()` from `~AdapterFingerPrint()` to enforce the comment and simplify executed code. >> >> Thanks to @MBaesken for finding the issue and @iklam for pointing the cause. >> >> Testing tier1-3, xcomp, stress. Higher tiers are still running. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Use one compute_size_in_words() method Marked as reviewed by iklam (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25604#pullrequestreview-2894383374 From kvn at openjdk.org Wed Jun 4 00:03:16 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 4 Jun 2025 00:03:16 GMT Subject: RFR: 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder [v3] In-Reply-To: <1tLFEk_8m434FPnlObCMhoxgxuYc12pSkfZOivpE--0=.b80134c8-aa3f-4c02-827e-f24ed208d08c@github.com> References: <1tLFEk_8m434FPnlObCMhoxgxuYc12pSkfZOivpE--0=.b80134c8-aa3f-4c02-827e-f24ed208d08c@github.com> Message-ID: On Tue, 3 Jun 2025 23:32:58 GMT, Ioi Lam wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Use one compute_size_in_words() method > > Marked as reviewed by iklam (Reviewer). Thank you, @iklam ------------- PR Comment: https://git.openjdk.org/jdk/pull/25604#issuecomment-2937775069 From iveresov at openjdk.org Wed Jun 4 00:53:21 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 4 Jun 2025 00:53:21 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder [v3] In-Reply-To: References: Message-ID: > Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. 
Igor Veresov has updated the pull request incrementally with two additional commits since the last revision: - More changes - Use dedicated OopStorage ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25623/files - new: https://git.openjdk.org/jdk/pull/25623/files/85a71619..f8a9b4a3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25623&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25623&range=01-02 Stats: 32 lines in 4 files changed: 20 ins; 8 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25623.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25623/head:pull/25623 PR: https://git.openjdk.org/jdk/pull/25623 From iveresov at openjdk.org Wed Jun 4 00:56:17 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 4 Jun 2025 00:56:17 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder [v2] In-Reply-To: <4ne8DsOBEMC2jSdOBI4l_33Jrs0CXHEKpdrLlBB-2uM=.52428bbb-6abc-4c33-85e7-6aa424c8b4f7@github.com> References: <4ne8DsOBEMC2jSdOBI4l_33Jrs0CXHEKpdrLlBB-2uM=.52428bbb-6abc-4c33-85e7-6aa424c8b4f7@github.com> Message-ID: On Tue, 3 Jun 2025 22:52:26 GMT, Igor Veresov wrote: >> src/hotspot/share/oops/trainingData.cpp line 437: >> >>> 435: assert(klass != nullptr, ""); >>> 436: Handle hm(JavaThread::current(), klass->java_mirror()); >>> 437: jobject hmj = JNIHandles::make_global(hm); >> >> Why don't you use OopStorage for this? > > Are there any advantages? Ok, transitioned to OopStorage. Please take a look and check that it is correct. I'll be back when the testing is done.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25623#discussion_r2125208972 From kvn at openjdk.org Wed Jun 4 02:17:29 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 4 Jun 2025 02:17:29 GMT Subject: Integrated: 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 02:01:02 GMT, Vladimir Kozlov wrote: > There is difference between AdapterFingerPrint allocation size [compute_size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2227) which may not be aligned to HeapWord size and [size](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/sharedRuntime.cpp#L2421) used for copying during AOT cache build which is aligned and can be bigger than allocation size. > > I added asserts to `AdapterFingerPrint` and `AdapterHandlerEntry` to make sure sizes are correct. Both are used in AOT cache build. > > I also moved `FreeHeap()` from `~AdapterFingerPrint()` to enforce the comment and simplify executed code. > > Thanks to @MBaesken for finding the issue and @iklam for pointing the cause. > > Testing tier1-3, xcomp, stress. Higher tiers are still running. This pull request has now been integrated. 
Changeset: ebd85288 Author: Vladimir Kozlov URL: https://git.openjdk.org/jdk/commit/ebd85288ce309b7dc7ff8b36558dd9f2a2300209 Stats: 15 lines in 2 files changed: 5 ins; 1 del; 9 mod 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder Reviewed-by: shade, iklam, asmehra ------------- PR: https://git.openjdk.org/jdk/pull/25604 From apangin at openjdk.org Wed Jun 4 03:15:36 2025 From: apangin at openjdk.org (Andrei Pangin) Date: Wed, 4 Jun 2025 03:15:36 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: References: Message-ID: <3ALqkSc9a0HKOJrA6CW61v725SxX8FcLmasC8Wm4y24=.9d40a91c-a77a-442b-926a-e5785032c415@github.com> On Tue, 3 Jun 2025 14:09:29 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Rename autoadapt src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 51: > 49: static bool is_excluded(JavaThread* jt) { > 50: return jt->is_hidden_from_external_view() || > 51: jt->jfr_thread_local()->is_excluded() || These restrictions cause a large blind spot in observability. There is no technical limitation for recording cpu samples for internal threads too, even without a Java stack trace. Consider removing this restriction, although not in this PR. 
src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 135: > 133: while ((new_lost_samples = Atomic::cmpxchg(&_lost_samples, lost_samples, 0)) != lost_samples) { > 134: lost_samples = new_lost_samples; > 135: } Why not `Atomic::xchg`? src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 161: > 159: return 0; > 160: } > 161: return os::active_processor_count() * 1000000000.0 / rate; If sampling period is configured as an absolute number in milliseconds, this value must be passed as is. Double conversion via `Runtime.availableProcessors()` / `active_processor_count()` is unobvious and error-prone. First, because of asymmetry: e.g. `Runtime.availableProcessors()` may be redefined by an agent so that its value is not aligned with `active_processor_count()`. Second, because number of available processors may change at runtime, e.g., by adjusting cgroup quotas. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 198: > 196: virtual void post_run(); > 197: public: > 198: virtual const char* name() const { return "JFR CPU Time Thread Sampler"; } Thread name is too long and does not sound right. Logically, it is not "Thread Sampler", but rather "Sampler Thread", which also aligns with the existing "JFR Sampler Thread". But I'd simplify it to `JFR CPU Time Sampler` or maybe `JFR CPU Sampler Thread`. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 202: > 200: void run(); > 201: void on_javathread_create(JavaThread* thread); > 202: bool create_timer_for_thread(JavaThread* thread, timer_t &timerid); Should it be `private`? src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 252: > 250: timer_delete(*timer); > 251: tl->unset_cpu_timer(); > 252: tl->deallocate_cpu_time_jfr_queue(); Either this line is not needed or there is a possible resource leak: if `create_timer_for_thread` fails, queue is allocated but not deallocated. 
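To illustrate the `Atomic::xchg` question raised above for line 135, here is a minimal sketch in standard C++ rather than HotSpot's `Atomic` API. The counter `lost_samples` and the function names `drain_with_cas` and `drain_with_exchange` are hypothetical stand-ins, not names from the PR. The sketch only shows that a read-and-reset done with a compare-and-swap retry loop can usually be expressed as one atomic exchange; whether that applies unchanged to the code under review is an assumption.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the per-queue lost-sample counter.
std::atomic<int32_t> lost_samples{0};

// Read-and-reset using a compare-and-swap retry loop, similar in shape
// to the loop quoted in the review comment.
int32_t drain_with_cas() {
  int32_t observed = lost_samples.load();
  // On failure compare_exchange_weak stores the current value into
  // `observed`, so the next iteration retries with fresh data.
  while (!lost_samples.compare_exchange_weak(observed, 0)) {
  }
  return observed;
}

// The same read-and-reset expressed as a single atomic exchange.
int32_t drain_with_exchange() {
  return lost_samples.exchange(0);
}
```

Both functions return the previous count and leave the counter at zero; the exchange variant has no retry loop.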
src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 281: > 279: stop_timer(); > 280: Atomic::store(&_stop_signals, true); > 281: while (Atomic::load_acquire(&_active_signal_handlers) > 0) { There can be a race when `handle_timer_signal` has already passed `_stop_signals` check but has not yet incremented `_active_signal_handlers`. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 308: > 306: > 307: if (Atomic::load_acquire(&_is_async_processing_of_cpu_time_jfr_requests_triggered)) { > 308: Atomic::release_store(&_is_async_processing_of_cpu_time_jfr_requests_triggered, false); acquire/release seem to be used for no good reason. Also, this could be a single `cmpxchg`. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 326: > 324: if (jt->thread_state() != _thread_in_native || !tl->try_acquire_cpu_time_jfr_dequeue_lock()) { > 325: tl->set_do_async_processing_of_cpu_time_jfr_requests(false); > 326: continue; // thread doesn't have a last Java frame or queue is already being processed This comment may sound confusing, since `has_last_Java_frame` is checked separately below. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 472: > 470: > 471: void handle_timer_signal(int signo, siginfo_t* info, void* context) { > 472: assert(_instance != nullptr, "invariant"); There can be an arbitrary delay in async signal delivery. It's unlikely, but not impossible for `_instance` to be deleted by the time signal handler is called. There should be a better way to synchronize with JFR shutdown. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 477: > 475: > 476: > 477: void JfrCPUTimeThreadSampling::handle_timer_signal(siginfo_t* info, void* context) { It may be a good idea to validate `info->si_code` in order to protect from things like `kill -SIGPROF` after profiling has stopped. 
For a similar reason, `_sampler->_stop_signals` should default to `true` whenever profiler is not running. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 509: > 507: JfrCPUTimeTraceQueue& queue = tl->cpu_time_jfr_queue(); > 508: if (!check_state(jt)) { > 509: queue.increment_lost_samples(); nit: wrong indent src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 576: > 574: } > 575: if (timer_create(clock, &sev, &t) < 0) { > 576: log_error(jfr)("Failed to register the signal handler for thread sampling: %s", os::strerror(os::get_last_error())); If an application has many threads and current RLIMIT_SIGPENDING is low, logs will be flooded with this error message. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 606: > 604: void JfrCPUTimeThreadSampler::init_timers() { > 605: // install sig handler for sig > 606: PosixSignals::install_generic_signal_handler(SIG, (void*)::handle_timer_signal); SIGPROF is also used by external profilers. Need to check if SIGPROF handler is already installed and warn user. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.hpp line 58: > 56: volatile u4 _head; > 57: > 58: volatile s4 _lost_samples; Why signed int? Can it be negative? src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.hpp line 128: > 126: static void send_lost_event(const JfrTicks& time, traceid tid, s4 lost_samples); > 127: > 128: // Trigger sampling while a thread is not in a safepoint, from a seperate thread typo: separate src/hotspot/share/jfr/support/jfrThreadLocal.cpp line 558: > 556: void JfrThreadLocal::set_cpu_timer(timer_t* timer) { > 557: if (_cpu_timer == nullptr) { > 558: _cpu_timer = JfrCHeapObj::new_array(1); `timer_t` is a primitive type, at most one machine word. Why extra indirection and allocation? 
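As a rough illustration of the `info->si_code` validation suggested above for `handle_timer_signal`, the following sketch checks whether a signal was generated by a POSIX timer. The helper name `is_posix_timer_signal` is hypothetical and does not come from the PR; the sketch assumes the sampler's signals originate from `timer_create` timers, for which POSIX specifies `si_code == SI_TIMER`, while a signal sent with `kill` carries `SI_USER`.

```cpp
#include <signal.h>

// Hypothetical helper: returns true only when the siginfo describes a
// signal generated by the expiration of a POSIX timer (SI_TIMER).
// A signal sent via kill(2) has si_code == SI_USER and is rejected.
bool is_posix_timer_signal(const siginfo_t* info) {
  return info != nullptr && info->si_code == SI_TIMER;
}
```

A handler could return early when this check fails, so that a stray SIGPROF from another process or tool is not treated as a sample request.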
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2124528320 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2124503100 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125157311 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125128723 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125130332 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125190998 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125203700 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125342289 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125249171 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125230422 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125241099 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125320255 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125411074 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125430231 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2124507884 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125333563 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125183931 From lliu at openjdk.org Wed Jun 4 03:30:11 2025 From: lliu at openjdk.org (Liming Liu) Date: Wed, 4 Jun 2025 03:30:11 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v2] In-Reply-To: References: Message-ID: > This PR is to enable the use of crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU. There is an option UseCryptoPmullForCRC32 that can enable crypto pmull, but directly enabling it on Ampere CPU will cause the following problems. > > 1. There will be regressions (-14% ~ -8%) on Ampere1 when the length is 64. 
When <= 128, both kernel_crc32_using_crc32 and kernel_crc32_using_crypto_pmull use the loop labeled as CRC_by32_loop, but their implements are a little different, and the loop in kernel_crc32_using_crc32 is better at hiding latency on Ampere1. So this PR takes the loop in kernel_crc32_using_crc32 to kernel_crc32_using_crypto_pmull, and does the same for CRC32C intrinsic. > > 2. The intrinsics only use crypto pmull when the length is higher than 383, while the loop in kernel_crc32_common_fold_using_crypto_pmull looks able to handle 256, and if it handles 256 on Ampere1, the improvements can be as high as 110% compared with kernel_crc32_using_crc32/kernel_crc32c_using_crc32c. However, there are regressions (~-6%) on Neoverse V1 when the length is 256. So this PR introduces a new option named CryptoPmullForCRC32LowLimit. It defaults to 256 since the code could handle 256, while it is set to 384 for V1/V2 to keep the old behavior on these platforms. > > The performance regressions and improvements were measured with the following microbenchmarks: > org.openjdk.bench.java.util.TestCRC32.testCRC32Update > org.openjdk.bench.java.util.TestCRC32C.testCRC32CUpdate > > Ran the following JTReg tests on Ampere1 and did not find problems: > test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java > test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32C.java Liming Liu has updated the pull request incrementally with one additional commit since the last revision: Make it be a diagnostic flag ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25609/files - new: https://git.openjdk.org/jdk/pull/25609/files/8aa96578..db926eb0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25609&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25609&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25609.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25609/head:pull/25609 PR: 
https://git.openjdk.org/jdk/pull/25609 From dholmes at openjdk.org Wed Jun 4 04:53:27 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 4 Jun 2025 04:53:27 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v10] In-Reply-To: <1JqKzjCGoZ9N_ez_gMKOlR1lbWPte0LkQS3bSb81ua0=.3c4c006b-18c0-4444-a867-8c774899b5b9@github.com> References: <8ESOaNI_qHLzLquiZT7RZR43lit-o8_5rTky1nJFjH4=.a81b8882-1470-4f76-8c9a-cdc2a7b50070@github.com> <1JqKzjCGoZ9N_ez_gMKOlR1lbWPte0LkQS3bSb81ua0=.3c4c006b-18c0-4444-a867-8c774899b5b9@github.com> Message-ID: On Tue, 3 Jun 2025 12:16:32 GMT, Johannes Bechberger wrote: >>> Hold on, shouldn't this really be "Lost"? @egahlin and @mgronlun need to chime in here. >> >> Lost might be better. >> >> I wonder if `` is needed, instead of thread = true? > >> I wonder if is needed, instead of thread = true? > > We had these discussions before on the old PR and then decided to end up with eventThread (as the other events do to), @parttimenerd I would really like to see some kind of design description for this which explains what the threading model is, how the signals are used, and how all the pieces interact. Thanks ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2938512151 From mbaesken at openjdk.org Wed Jun 4 05:21:17 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 4 Jun 2025 05:21:17 GMT Subject: RFR: 8357155: [asan] ZGC does not work (x86_64 and ppc64) [v2] In-Reply-To: References: <_p5h0MfOc1LQ2g30xDHYJf9v_B2QbJmJ0El0vc_u6zM=.6461af5c-c4cd-442c-a16e-c9578484f10c@github.com> Message-ID: On Tue, 3 Jun 2025 05:57:46 GMT, Axel Boldt-Christmas wrote: >> [JDK-8358310](https://bugs.openjdk.org/browse/JDK-8358310) / #25578 is open right now as a quick fix for returning a too large value without cleaning up the implementation. 
(As a fix for 25) >> >> This was noted back in https://github.com/openjdk/jdk/pull/18941#issuecomment-2079316745 ([JDK-8330275](https://bugs.openjdk.org/browse/JDK-8330275)), but I think fixing this fell through the cracks. >> >> I currently have a rewrite in the works which overhauls the heap base selection, which I plan to get into 26. In that patch all the non-generational legacy is removed. So we no longer probe based on the assumption that we need 3 extra high order bits. > > But I will make sure to create an issue for this overhaul, so it does not get lost. Thanks for this ! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25549#discussion_r2125665378 From mbaesken at openjdk.org Wed Jun 4 05:27:06 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 4 Jun 2025 05:27:06 GMT Subject: RFR: 8357155: [asan] ZGC does not work (x86_64 and ppc64) [v3] In-Reply-To: References: Message-ID: > Many (all?) ZGC related jtreg tests do not work when the JDK is built with address sanitizer asan enabled (configure flag --enable-asan). > This can be seen on SUSE Linux x86_64 and also on ppc64le , opt binaries were used. > It has been suggested to do a workaround - 'But I think that simply adapting the zAddress_[...].cpp implementations to always select the largest heap base would go a long way for providing ASAN compatibility.' > This seems to work nicely on x86_64 and ppc64le, however the zgc related tests still fail on Linux aarch64 (should I exclude this platform from my patch?) . 
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Add comment requested by mdoerr ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25549/files - new: https://git.openjdk.org/jdk/pull/25549/files/82a11f9b..85da86e1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25549&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25549&range=01-02 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25549.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25549/head:pull/25549 PR: https://git.openjdk.org/jdk/pull/25549 From mbaesken at openjdk.org Wed Jun 4 05:27:06 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 4 Jun 2025 05:27:06 GMT Subject: RFR: 8357155: [asan] ZGC does not work (x86_64 and ppc64) [v2] In-Reply-To: <0808aEXLDKNUY6rsNCbjRjs_O0BaPLrCsX7q2zjpzus=.8ea987cf-fb41-47c5-9df3-840bc939f99a@github.com> References: <0808aEXLDKNUY6rsNCbjRjs_O0BaPLrCsX7q2zjpzus=.8ea987cf-fb41-47c5-9df3-840bc939f99a@github.com> Message-ID: On Tue, 3 Jun 2025 07:58:26 GMT, Martin Doerr wrote: > I think this PR is ok, but please add a comment like "The max supported value is 44 because of other internal data structures.". Sure, I added the comment. Are you fine with the PR now ? 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25549#discussion_r2125673231

From jbechberger at openjdk.org Wed Jun 4 05:32:29 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Wed, 4 Jun 2025 05:32:29 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42]
In-Reply-To: <3ALqkSc9a0HKOJrA6CW61v725SxX8FcLmasC8Wm4y24=.9d40a91c-a77a-442b-926a-e5785032c415@github.com>
References: <3ALqkSc9a0HKOJrA6CW61v725SxX8FcLmasC8Wm4y24=.9d40a91c-a77a-442b-926a-e5785032c415@github.com>
Message-ID:

On Wed, 4 Jun 2025 00:13:07 GMT, Andrei Pangin wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Rename autoadapt
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 161:
>
>> 159: return 0;
>> 160: }
>> 161: return os::active_processor_count() * 1000000000.0 / rate;
>
> If sampling period is configured as an absolute number in milliseconds, this value must be passed as is.
> Double conversion via `Runtime.availableProcessors()` / `active_processor_count()` is unobvious and error-prone. First, because of asymmetry: e.g. `Runtime.availableProcessors()` may be redefined by an agent so that its value is not aligned with `active_processor_count()`. Second, because number of available processors may change at runtime, e.g., by adjusting cgroup quotas.

Is this something for a later PR?

> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 281:
>
>> 279: stop_timer();
>> 280: Atomic::store(&_stop_signals, true);
>> 281: while (Atomic::load_acquire(&_active_signal_handlers) > 0) {
>
> There can be a race when `handle_timer_signal` has already passed `_stop_signals` check but has not yet incremented `_active_signal_handlers`.

Any idea on how to fix it?
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 472: > >> 470: >> 471: void handle_timer_signal(int signo, siginfo_t* info, void* context) { >> 472: assert(_instance != nullptr, "invariant"); > > There can be an arbitrary delay in async signal delivery. > It's unlikely, but not impossible for `_instance` to be deleted by the time signal handler is called. There should be a better way to synchronize with JFR shutdown. Any ideas? Or is it something for a later PR? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125678084 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125680345 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125681876 From jbechberger at openjdk.org Wed Jun 4 06:02:32 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 06:02:32 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v10] In-Reply-To: References: <8ESOaNI_qHLzLquiZT7RZR43lit-o8_5rTky1nJFjH4=.a81b8882-1470-4f76-8c9a-cdc2a7b50070@github.com> <1JqKzjCGoZ9N_ez_gMKOlR1lbWPte0LkQS3bSb81ua0=.3c4c006b-18c0-4444-a867-8c774899b5b9@github.com> Message-ID: <81dLp_39MhU-TuDD3EDt7iTX8HyEDZfj6nvCPwE5Ol4=.7d564cda-b7ae-4623-9705-704770b2b118@github.com> On Wed, 4 Jun 2025 04:50:56 GMT, David Holmes wrote: >>> I wonder if is needed, instead of thread = true? >> >> We had these discussions before on the old PR and then decided to end up with eventThread (as the other events do to), > > @parttimenerd I would really like to see some kind of design description for this which explains what the threading model is, how the signals are used, and how all the pieces interact. 
Thanks

@dholmes-ora Here is a first attempt at a design description:

The design consists of four main parts:

- setup code: This sets up the signal handlers for every new thread and deletes them afterwards.
- the per-thread signal handlers: They first check that the current thread is valid, increment the count of currently active handlers, and check that they shouldn't stop (because the profiler is disabled). They then acquire the thread-local enqueue lock for the current thread's request queue and push the sampling request in (see https://openjdk.org/jeps/518 + the current period). They then trigger (arm) a safepoint. If the current thread is in native, they trigger (set a flag for) the asynchronous stack walking. This prevents long periods in native code from overflowing the request queue. Finally, the enqueue lock is released.
- the safepoint handler: In the safepoint handler, we check whether the thread-local queue is non-empty. If so, we acquire a dequeue lock and process all entries of the queue, thereby creating JFR events. We also untrigger the async-stack-walking request for the thread. We then release the lock.
- the sampler thread: Its task is to regularly update the timers if needed (configuration changes) and to walk the thread list to find any thread that wants to be asynchronously stack-walked. For each of these threads, the dequeue lock is acquired (skipping if already set to enqueue) and the queue is processed as at the safepoint. Then the lock is released.

On shutdown: Whenever the sampler is shut down, we first set the `_stop_signals` flag to prevent new signal handlers from entering the request creation code (and thereby accessing data structures that we already deallocated), then we disable the timers for all threads, and then we wait until no signal handler is engaged anymore.

It is important to note that there is only one thread-local lock used, but it has three states:

- enqueue
- dequeue
- unlocked

This prevents these phases from overlapping.
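The single thread-local lock with three states described above could be sketched roughly as follows. This is only a minimal illustration in standard C++ and not the code in the PR: the names `QueueLockState` and `ThreeStateLock` are hypothetical, and `std::atomic` stands in for HotSpot's `Atomic` helpers. The point is that the enqueue side (signal handler) and the dequeue side (safepoint handler or sampler thread) both start from "unlocked", so the two phases cannot overlap, and neither side ever blocks.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical names for illustration; the PR may use different ones.
enum class QueueLockState { Unlocked, Enqueue, Dequeue };

class ThreeStateLock {
  std::atomic<QueueLockState> _state{QueueLockState::Unlocked};

  // Try to move Unlocked -> target. Never blocks: returns false if the
  // lock is currently held in either of the other two states.
  bool try_acquire(QueueLockState target) {
    QueueLockState expected = QueueLockState::Unlocked;
    return _state.compare_exchange_strong(expected, target,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed);
  }

 public:
  // Signal-handler side. A handler must not wait, so the caller decides
  // what to do when this fails (assumption: e.g. count the sample as lost).
  bool try_lock_enqueue() { return try_acquire(QueueLockState::Enqueue); }

  // Safepoint-handler or sampler-thread side; the sampler thread can
  // simply skip a thread whose lock is currently in the enqueue state.
  bool try_lock_dequeue() { return try_acquire(QueueLockState::Dequeue); }

  // Release from either held state back to Unlocked.
  void unlock() {
    _state.store(QueueLockState::Unlocked, std::memory_order_release);
  }

  QueueLockState state() const {
    return _state.load(std::memory_order_acquire);
  }
};
```

A compare-and-swap on one word is used here because it needs no waiting inside a signal handler; whether the real implementation spins, skips, or records a lost sample on contention is not stated in the description above.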
------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2938677600 From jbechberger at openjdk.org Wed Jun 4 06:10:30 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 06:10:30 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: <1RLKF0E-I7CjQRNUqb7k0mEIvoSCO010FUaKnmLVPSI=.4c6876fb-ac9c-47d2-8379-ccafdbdbaabe@github.com> References: <1RLKF0E-I7CjQRNUqb7k0mEIvoSCO010FUaKnmLVPSI=.4c6876fb-ac9c-47d2-8379-ccafdbdbaabe@github.com> Message-ID: On Tue, 3 Jun 2025 21:42:48 GMT, Markus Gr?nlund wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename autoadapt > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 520: > >> 518: // the sampling period might be too low for the current Linux configuration >> 519: // so samples might be skipped and we have to compute the actual period >> 520: int64_t period = get_sampling_period() * (info->si_overrun + 1); > > Does this calculation have to be done on every signal, by every thread? It seems like something that could be precalculated when the period is set? This might change dynamically, so probably no. Only caching would work, but this is a small optimization for later. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125725230 From mbaesken at openjdk.org Wed Jun 4 06:23:17 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 4 Jun 2025 06:23:17 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Mon, 2 Jun 2025 08:07:38 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . 
>> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > TestBreakSignalThreadDump has issues with asan gc/arguments/TestUseCompressedOopsFlagsWithUlimit.java --------------------------------------------------------------- stderr: [==46460==ERROR: AddressSanitizer failed to allocate 0xdfff0001000 (15392894357504) bytes at address 2008fff7000 (errno: 12) ==46460==ReserveShadowMemoryRange failed while trying to map 0xdfff0001000 bytes. Perhaps you're using ulimit -v ulimit clashes with the memory requirements of ASAN runtime/Thread/TestBreakSignalThreadDump.java --------------------------------------------------------------- stderr: [==18432==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD. loading of the jsig lib does currently not work well with ASAN lib runtime/XCheckJniJsig/XCheckJSig.java --------------------------------------------------------------- stderr: [==71228==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD. loading of the jsig lib does currently not work well with ASAN lib runtime/cds/appcds/aotCode/AOTCodeCompressedOopsTest.java --------------------------------------------------------------- reports ==35621==ERROR: AddressSanitizer: heap-buffer-overflow on address ... 
this will be fixed hopefully so we could maybe remove the !asan tagging serviceability/dcmd/vm/SystemDumpMapTest.java --------------------------------------------------------------- Missing patterns in dump: 0x\\p{XDigit}+-0x\\p{XDigit}+ +\\d+ +[rwsxp-]+ +\\d+ +\\d+ +(4K|8K|16K|64K|2M|16M|64M) +com.*\[heap\] test SystemDumpMapTest.jmx(): failure [410ms] ASAN changes the memory map dump slightly, but the test has rather strict requirements serviceability/dcmd/vm/SystemMapTest.java --------------------------------------------------------------- test SystemMapTest.jmx(): failure [381ms] java.lang.RuntimeException: '0x\\p{XDigit}+-0x\\p{XDigit}+ +\\d+ +[rwsxp-]+ +\\d+ +\\d+ +(4K|8K|16K|64K|2M|16M|64M) +com.*\[heap\]' missing from stdout/stderr ASAN changes the memory map dump slightly, but the test has rather strict requirements serviceability/sa/ClhsdbCDSCore.java --------------------------------------------------------------- Output and diagnostic info for process 45808 was saved into 'pid-45808-output.log' crashOutputString = [[0.028s][error][cds] An error has occurred while processing the shared archive file. Run with -Xlog:aot,cds for details. [0.029s][error][cds] Mismatched values for property jdk.module.addexports: java.base/jdk.internal.misc=ALL-UNNAMED specified during runtime but not during dump time [0.029s][error][cds] Disabling optimized module handling # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x000014d4d60ef8d2, pid=45808, tid=46654 # # JRE version: OpenJDK Runtime Environment (25.0.0.1) (build 25.0.0.1-internal-adhoc.myuser.jdk) # Java VM: OpenJDK 64-Bit Server VM (25.0.0.1-internal-adhoc.myuser.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) # Problematic frame: # V [libjvm.so+0x3d6b8d2] Unsafe_PutInt+0x592 # No core dump will be written. Core dumps have been disabled. 
To enable core dumping, try "ulimit -c unlimited" before starting Java again java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Output doesn't contain the location of core file.: expected true, was false at ClhsdbCDSCore.main(ClhsdbCDSCore.java:171) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) at java.base/java.lang.reflect.Method.invoke(Method.java:565) at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:335) at java.base/java.lang.Thread.run(Thread.java:1474) Caused by: java.lang.RuntimeException: Output doesn't contain the location of core file.: expected true, was false Seems no core was written, maybe ASAN is to blame or my test environment ? serviceability/sa/ClhsdbFindPC.java --------------------------------------------------------------- java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Output doesn't contain the location of core file.: expected true, was false at ClhsdbFindPC.testFindPC(ClhsdbFindPC.java:317) at ClhsdbFindPC.main(ClhsdbFindPC.java:339) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) at java.base/java.lang.reflect.Method.invoke(Method.java:565) at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138) at java.base/java.lang.Thread.run(Thread.java:1474) Caused by: java.lang.RuntimeException: Output doesn't contain the location of core file.: expected true, was false Looks similar to ClhsdbCDSCore issue Turns out cds/appcds/aotCode/AOTCodeCompressedOopsTest.java was a real bug, so I guess we should remove it from this exclusion. Are you fine with the short explanations given, if yes I would add them as comment to the tests . 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2938728030 PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2938732660 From mbaesken at openjdk.org Wed Jun 4 06:28:17 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 4 Jun 2025 06:28:17 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: <6HSruHtZNPOZJp4vNFnwMns6-_rP_MEHtnnvAP7S5QU=.e91023a2-089c-4541-86a5-ae8d4adeb99d@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> <6HSruHtZNPOZJp4vNFnwMns6-_rP_MEHtnnvAP7S5QU=.e91023a2-089c-4541-86a5-ae8d4adeb99d@github.com> Message-ID: On Tue, 3 Jun 2025 12:31:51 GMT, Afshin Zafari wrote: > In ASAN built JDK, some gtests and some other JTREG tests in runtime/ErrorHandling also fail. Do we exclude these in another PR? or should they also be handled/excluded here? The 'some' word in the PR's title is not strict, IMO. Yes it is not strict ; I did mostly tests with ASAN on Linux x86_64 and Linux ppc64le so far . On x86_64 I saw a few more tests have issues with ASAN, but the intention of this PR was to just include the ones where it was clear to me what happens and where a 'fix' is not likely (and mostly also the ones I saw failing across the 2 OS/CPU platforms I mentioned). Maybe that's why we should better remove the exclusion of AOTCodeCompressedOopsTest , because this is not some kind of incompatibility of ASAN with special test requirements, but a real memory issue. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2938748695

From jbechberger at openjdk.org Wed Jun 4 06:34:29 2025
From: jbechberger at openjdk.org (Johannes Bechberger)
Date: Wed, 4 Jun 2025 06:34:29 GMT
Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42]
In-Reply-To:
References: <3ALqkSc9a0HKOJrA6CW61v725SxX8FcLmasC8Wm4y24=.9d40a91c-a77a-442b-926a-e5785032c415@github.com>
Message-ID:

On Wed, 4 Jun 2025 05:28:21 GMT, Johannes Bechberger wrote:

>> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 281:
>>
>>> 279: stop_timer();
>>> 280: Atomic::store(&_stop_signals, true);
>>> 281: while (Atomic::load_acquire(&_active_signal_handlers) > 0) {
>>
>> There can be a race when `handle_timer_signal` has already passed `_stop_signals` check but has not yet incremented `_active_signal_handlers`.
>
> Any idea on how to fix it?

I added another `_static_stop_signals` field which should prevent this.

>> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 472:
>>
>>> 470:
>>> 471: void handle_timer_signal(int signo, siginfo_t* info, void* context) {
>>> 472: assert(_instance != nullptr, "invariant");
>>
>> There can be an arbitrary delay in async signal delivery.
>> It's unlikely, but not impossible for `_instance` to be deleted by the time signal handler is called. There should be a better way to synchronize with JFR shutdown.
>
> Any ideas? Or is it something for a later PR?

I added another `_static_stop_signals` field which should prevent this.
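For reference, the window described in the quoted comment (a handler has passed the stop check but has not yet incremented the active-handler count) can in general be closed by reversing the two steps in the handler. The sketch below shows that general pattern only; it is not necessarily how the `_static_stop_signals` change works. All names (`stop_signals`, `active_handlers`, `samples_taken`, `handler_body`, `stop_sampling`) are hypothetical, `std::atomic` with its default sequentially consistent ordering stands in for HotSpot's `Atomic`, and the flag and counter are static so the handler never has to dereference an instance pointer.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for the sampler's stop flag and handler counter.
static std::atomic<bool> stop_signals{false};
static std::atomic<int>  active_handlers{0};
static std::atomic<int>  samples_taken{0};  // placeholder for "work done"

// What a timer signal handler would do. Returns true if it took a sample.
bool handler_body() {
  // 1. Announce the handler FIRST ...
  active_handlers.fetch_add(1);
  // 2. ... and only then look at the stop flag.
  bool proceed = !stop_signals.load();
  if (proceed) {
    // Placeholder for creating and enqueueing a sample request.
    samples_taken.fetch_add(1);
  }
  // 3. Leave; after this point no sampler state is touched anymore.
  active_handlers.fetch_sub(1);
  return proceed;
}

// What the shutdown path would do before freeing sampler data structures.
void stop_sampling() {
  stop_signals.store(true);
  // Any handler that read stop_signals == false has already incremented
  // the counter, so waiting for zero covers every handler still working.
  while (active_handlers.load() > 0) {
    // Spin; real code would yield or sleep briefly here.
  }
}
```

With sequentially consistent operations, it cannot happen that a handler reads the flag as `false` while the shutdown path reads the counter as zero for that same handler: one of the two must observe the other's write.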
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125756115 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125755428 From jbechberger at openjdk.org Wed Jun 4 06:34:30 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 06:34:30 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: <3ALqkSc9a0HKOJrA6CW61v725SxX8FcLmasC8Wm4y24=.9d40a91c-a77a-442b-926a-e5785032c415@github.com> References: <3ALqkSc9a0HKOJrA6CW61v725SxX8FcLmasC8Wm4y24=.9d40a91c-a77a-442b-926a-e5785032c415@github.com> Message-ID: On Wed, 4 Jun 2025 00:28:46 GMT, Andrei Pangin wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename autoadapt > > src/hotspot/share/jfr/support/jfrThreadLocal.cpp line 558: > >> 556: void JfrThreadLocal::set_cpu_timer(timer_t* timer) { >> 557: if (_cpu_timer == nullptr) { >> 558: _cpu_timer = JfrCHeapObj::new_array(1); > > `timer_t` is a primitive type, at most one machine word. Why extra indirection and allocation? @mgronlun wanted this indirection to move it abstract from implementation details ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125758074 From jbechberger at openjdk.org Wed Jun 4 07:00:51 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 07:00:51 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v43] In-Reply-To: References: Message-ID: <-BGoOClpsfsd4Q8Wq-H57L3tIvoaLGauYtRBEDPO-_w=.97e25e4f-879d-45f6-bd00-ad53e2463a8d@github.com> > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... 
different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Improve ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/ae55610c..55c30aef Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=42 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=41-42 Stats: 87 lines in 6 files changed: 26 ins; 10 del; 51 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From kbarrett at openjdk.org Wed Jun 4 07:12:42 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 4 Jun 2025 07:12:42 GMT Subject: RFR: 8352565: Add native method implementation of Reference.get() [v8] In-Reply-To: References: Message-ID: > Please review this change which adds a native method providing the > implementation of Reference::get. Referece::get is an intrinsic candidate, so > this native method implementation is only used when the intrinsic is not. > > Currently there is intrinsic support by the interpreter, C1, C2, and graal, > which are always used. With this change we can later remove all the > per-platform interpreter intrinsic implementations, and might also remove the > C1 intrinsic implementation. > > Testing: > (1) mach5 tier1-6 normal (so using all the existing intrinsics). > (2) mach5 tier1-6 with interpreter and C1 Reference::get intrinsics disabled. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 13 additional commits since the last revision: - Merge branch 'master' into native-reference-get - make private native Reference.get0 the intrinsic - Merge branch 'master' into native-reference-get - Merge branch 'master' into native-reference-get - use new waitForRefProc, some tidying - Merge branch 'master' into native-reference-get - remove timeout by using waitForReferenceProcessing - make ill-timed gc in non-concurrent case less likely - fix test package use - add package decl to test - ... and 3 more: https://git.openjdk.org/jdk/compare/9578d341...98056a8b ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24315/files - new: https://git.openjdk.org/jdk/pull/24315/files/4387e2fe..98056a8b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24315&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24315&range=06-07 Stats: 49978 lines in 811 files changed: 26005 ins; 15101 del; 8872 mod Patch: https://git.openjdk.org/jdk/pull/24315.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24315/head:pull/24315 PR: https://git.openjdk.org/jdk/pull/24315 From kbarrett at openjdk.org Wed Jun 4 07:12:42 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 4 Jun 2025 07:12:42 GMT Subject: RFR: 8352565: Add native method implementation of Reference.get() [v6] In-Reply-To: References: <5D6vakt8Q41_YF90LaGoxI0tECxo3hm_fiMCuXrpf-w=.363ecf9a-9421-482d-a101-a7ec1efd8b8e@github.com> <_99Geoayi09Ey7YT7qWw4pjMqbVUNxfKpFBwwI_EbHg=.e81158ae-813c-4015-94d6-4404eb756394@github.com> Message-ID: On Fri, 30 May 2025 19:30:50 GMT, Vladimir Ivanov wrote: >> Much of the point of this change is to let us later remove the interpreter/c1 >> intrinsics for this function. I think what you are saying is that might be >> tricky if `get()` is the intrinsic. So maybe I should just go ahead now with >> making the native `get0()` be the intrinsic. I'll take a look at it and see >> how widespread the renaming changes are. 
>> >> If `get0()` is the intrinsic, then I think that referenced snippet from the >> Compile ctor can go away? Rather than being changed to refer to the get0 >> intrinsic. > >> If get0() is the intrinsic, then I think that referenced snippet from the > Compile ctor can go away? > > Yes. OK, I've moved the intrinsification to get0. It adds a fair number of files, but the changes are mostly trivial renaming of "get" to "get0". ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24315#discussion_r2125849565 From mbaesken at openjdk.org Wed Jun 4 07:23:21 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 4 Jun 2025 07:23:21 GMT Subject: RFR: 8358289: [asan] runtime/cds/appcds/aotCode/AOTCodeFlags.java reports heap-buffer-overflow in ArchiveBuilder [v3] In-Reply-To: <0lJeZV-WWYPigLaDj2bmwub-s9WzPwyEKgm2PfDatXA=.5267dfd2-08b9-40c1-8602-f153bd18b6b8@github.com> References: <0lJeZV-WWYPigLaDj2bmwub-s9WzPwyEKgm2PfDatXA=.5267dfd2-08b9-40c1-8602-f153bd18b6b8@github.com> Message-ID: On Tue, 3 Jun 2025 19:30:45 GMT, Vladimir Kozlov wrote: > Waiting confirmation from @MBaesken . The issue is fixed now! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25604#issuecomment-2938902641 From mgronlun at openjdk.org Wed Jun 4 08:17:31 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 4 Jun 2025 08:17:31 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: References: <3ALqkSc9a0HKOJrA6CW61v725SxX8FcLmasC8Wm4y24=.9d40a91c-a77a-442b-926a-e5785032c415@github.com> Message-ID: On Wed, 4 Jun 2025 06:29:59 GMT, Johannes Bechberger wrote: >> Any ideas? Or is it something for a later PR? > > I added another `_static_stop_signals` field which should prevent this. The _instance is only ever deleted in case a JFR startup attempt fails as part of JfrRecorder::create(). The sampler must have a rate and become enrolled to serve clients (by installing timers). 
The rate is set post JfrRecorder::create() using the setting system, which implies that _instance != nullptr should be invariant.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125976370

From epeter at openjdk.org Wed Jun 4 08:24:23 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Wed, 4 Jun 2025 08:24:23 GMT
Subject: RFR: 8337666: AArch64: SHA3 GPR intrinsic [v2]
In-Reply-To:
References: <4gjCTX5GeYnhLOggsT2koqaeM1DdlJnwcQdSiR-3cZk=.beb2eccc-ac6d-48bb-a828-e58383799ea5@github.com>
Message-ID:

On Fri, 30 May 2025 18:24:22 GMT, Dmitry Chuyko wrote:

>> Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits:
>>
>> - Copyright year
>> - Review suggestions
>> - Merge master
>> - Delete empty line
>> - SHA3 GPR intrinsic & tests
>
> GPR rol, rax and rax1 pseudo instructions were added in MacroAssembler.
>
> Main loop and "bcax"/Chi parts were extracted as functions.
>
> Main loop counter was put in fp register with fp decrement and fcmp (this variant does have a positive impact).
>
> Updated results from Graviton machines (Linux, intrinsic vs C2):
>
> Benchmark              (digesterName)  (length)  Pct
> G2
> MessageDigests.digest  SHA3-256        64        +20.8%
> MessageDigests.digest  SHA3-256        16384     +27.2%
> G3
> MessageDigests.digest  SHA3-256        64        +12.8%
> MessageDigests.digest  SHA3-256        16384     +15.7%
> G4
> MessageDigests.digest  SHA3-256        64        +9.7%
> MessageDigests.digest  SHA3-256        16384     +13.2%

@dchuyko Thanks for working on this! I have quickly scanned the code, and it looks reasonable, though I am not an intrinsics specialist. I'll now run some internal testing, feel free to ping me again in 24h.
------------- PR Comment: https://git.openjdk.org/jdk/pull/24260#issuecomment-2939079640 From epeter at openjdk.org Wed Jun 4 08:24:24 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 08:24:24 GMT Subject: RFR: 8337666: AArch64: SHA3 GPR intrinsic [v7] In-Reply-To: <3hGFnUsyrN809lwWuqr7dyxfoCm0F2ILSB-yJV5Hfvo=.1a2048d4-d131-4354-a629-d75f206dda42@github.com> References: <3hGFnUsyrN809lwWuqr7dyxfoCm0F2ILSB-yJV5Hfvo=.1a2048d4-d131-4354-a629-d75f206dda42@github.com> Message-ID: On Tue, 3 Jun 2025 16:31:08 GMT, Dmitry Chuyko wrote: >> This is an implementation of SHA3 intrinsics for AArch64 that operates GPRs. It follows the Java implementation algorithm but eagerly uses available registers. For example, FP+R18 are used when it's allowed. On simpler cores like RPi3 or Surface Pro it is 23-53% faster than C2 compiled version; on Graviton 3 it is 8-14% faster than C2 compiled version (which is faster than the current intrinsic); on Apple Silicon it is faster than C2 compiled version but slower than the ARMv8.2-SHA intrinsic. Improvements on a particular CPU depend on the input length. 
For instance, for Graviton 2: >> >> >> Benchmark (ops/ms) (digesterName) (length) G2 >> MessageDigests.digest SHA3-256 64 28.28% >> MessageDigests.digest SHA3-256 16384 53.58% >> MessageDigests.digest SHA3-512 64 27.97% >> MessageDigests.digest SHA3-512 16384 43.90% >> MessageDigests.getAndDigest SHA3-256 64 26.18% >> MessageDigests.getAndDigest SHA3-256 16384 52.82% >> MessageDigests.getAndDigest SHA3-512 64 24.73% >> MessageDigests.getAndDigest SHA3-512 16384 44.31% >> >> >> (results for intermediate input lengths look like steps) >> >> On Graviton 4 there is still a noticeable difference between the proposed implementation and C2 generated code: >> >> >> Benchmark (digesterName) (length) Pct >> MessageDigests.digest SHA3-256 64 8.3% >> MessageDigests.digest SHA3-256 16384 11% >> MessageDigests.digest SHA3-512 64 8.4% >> MessageDigests.digest SHA3-512 16384 11.5% >> MessageDigests.getAndDigest SHA3-256 64 7.2% >> MessageDigests.getAndDigest SHA3-256 16384 11% >> MessageDigests.getAndDigest SHA3-512 64 7.3% >> MessageDigests.getAndDigest SHA3-512 16384 11.6% >> >> >> and the version that uses the extension is ~1.8x slower than C2 >> >> Existing intrinsic implementation is put under a flag `UseSIMDForSHA3Intrinsic` which is on by default where the intrinsic is enabled currently. >> >> Sanity tests were modified to cover new intrinsic variants (`-XX:-UseSIMDForSHA3Intrinsic -XX:+-PreserveFramePointer`) on aarch64 hw. Existing test cases where intrinsic is enabled are executed with `-XX:+IgnoreUnrecognizedVMOptions -XX:+UseSIMDForSHA3Intrinsic`, on platforms where the sha3 extension is missing they still are cut off by isSHA3IntrinsicAvailable() predicate. >> >> The original PR https://github.com/openjdk/jdk/pull/20422 has been auto-closed and the branch has... 
> > Dmitry Chuyko has updated the pull request incrementally with one additional commit since the last revision: > > No imm masking in rolw A nit: can you please fix the alignment issue in the PR description's benchmark results? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24260#issuecomment-2939083282 From jbechberger at openjdk.org Wed Jun 4 08:21:33 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 08:21:33 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: References: <3ALqkSc9a0HKOJrA6CW61v725SxX8FcLmasC8Wm4y24=.9d40a91c-a77a-442b-926a-e5785032c415@github.com> Message-ID: On Wed, 4 Jun 2025 08:14:11 GMT, Markus Gr?nlund wrote: >> I added another `_static_stop_signals` field which should prevent this. > > The _instance is only ever deleted in case a JFR startup attempt fails as part of JfrRecorder::create(). The sampler must have a rate and become enrolled to serve clients (by installing timers). The rate is set post JfrRecorder::create() using the setting system, which implies that _instance != nullptr should be invariant. Yes, you're right. I'll update the code and combine `_active_signal_handlers` and `_stop_signals` in one, so that a CAS loop prevents `_active_signal_handlers` from being incremented when `_stop_signals` is true. This should solve the other data race. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2125985412 From epeter at openjdk.org Wed Jun 4 08:33:20 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 08:33:20 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v2] In-Reply-To: References: Message-ID: <5BGf1eIVeMQIaLXIoOvcuQlBiaPeWojv8HAnfuOiW_E=.8c39a6ac-c0f2-40fe-bef3-be0a6bd71c07@github.com> On Wed, 4 Jun 2025 03:30:11 GMT, Liming Liu wrote: >> This PR is to enable the use of crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU. 
There is an option UseCryptoPmullForCRC32 that can enable crypto pmull, but directly enabling it on Ampere CPU will cause the following problems. >> >> 1. There will be regressions (-14% ~ -8%) on Ampere1 when the length is 64. When <= 128, both kernel_crc32_using_crc32 and kernel_crc32_using_crypto_pmull use the loop labeled as CRC_by32_loop, but their implementations are a little different, and the loop in kernel_crc32_using_crc32 is better at hiding latency on Ampere1. So this PR takes the loop in kernel_crc32_using_crc32 to kernel_crc32_using_crypto_pmull, and does the same for CRC32C intrinsic. >> >> 2. The intrinsics only use crypto pmull when the length is higher than 383, while the loop in kernel_crc32_common_fold_using_crypto_pmull looks able to handle 256, and if it handles 256 on Ampere1, the improvements can be as high as 110% compared with kernel_crc32_using_crc32/kernel_crc32c_using_crc32c. However, there are regressions (~-6%) on Neoverse V1 when the length is 256. So this PR introduces a new option named CryptoPmullForCRC32LowLimit. It defaults to 256 since the code could handle 256, while it is set to 384 for V1/V2 to keep the old behavior on these platforms. >> >> The performance regressions and improvements were measured with the following microbenchmarks: >> org.openjdk.bench.java.util.TestCRC32.testCRC32Update >> org.openjdk.bench.java.util.TestCRC32C.testCRC32CUpdate >> >> Ran the following JTReg tests on Ampere1 and did not find problems: >> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java >> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32C.java > > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Make it be a diagnostic flag @limingliu-ampere Thanks for working on this! Generally looks reasonable to me as a non-expert in crypto intrinsics. But we definitely need an expert to approve this in the end. I have a few comments below.
Also: it would be nice to have a sanity test where you use that new flag. It could also be an additional run in an existing test (that's probably even better). You may want to run it with a few different values, including non-multiple of `128` just to sanity check the alignment correction as well. I don't know how much runtime that would add, so that should be checked before going too crazy. Having different values for the flag helps us to simulate the behavior of other hardware for example, and that can be quite useful in general. What do you think? src/hotspot/cpu/aarch64/globals_aarch64.hpp line 95: > 93: "Minimum size in bytes when Crypto PMULL will be used." \ > 94: "Value must be a multiple of 128.") \ > 95: range(256, max_jint) \ Is it sane to have negative values? If not, use `uintx`... or maybe even just `uint`? src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 4335: > 4333: assert_different_registers(crc, buf, len, tmp0, tmp1, tmp2); > 4334: > 4335: subs(tmp0, len, CryptoPmullForCRC32LowLimit); Would it make sense to have another alignment sanity check here? It would be both helpful to make sure nobody later breaks your assumption, and could also be helpful for the reader to see the `128` alignment immediately. ------------- Changes requested by epeter (Reviewer). 
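Editorial aside on the two review comments above: the quoted flag description says the value must be a multiple of 128 and at least 256, and the reviewer suggests an unsigned type plus an alignment sanity check. A minimal sketch of such a check follows, assuming exactly those two constraints. The names `kPmullBlockBytes`, `kPmullMinLowLimit` and `is_valid_low_limit` are hypothetical; this is not HotSpot's flag-constraint mechanism and not code from the PR.

```cpp
#include <cstdint>

// Hypothetical constants mirroring the quoted flag description:
// "Value must be a multiple of 128" with a lower bound of 256.
constexpr uint32_t kPmullBlockBytes  = 128;
constexpr uint32_t kPmullMinLowLimit = 256;

// An unsigned parameter makes negative values unrepresentable,
// which is the point of the reviewer's `uint` suggestion.
constexpr bool is_valid_low_limit(uint32_t v) {
  return v >= kPmullMinLowLimit && v % kPmullBlockBytes == 0;
}

// The default (256) and the V1/V2 setting (384) mentioned in the
// PR description both satisfy the check; 300 does not.
static_assert(is_valid_low_limit(256), "default must be accepted");
static_assert(is_valid_low_limit(384), "V1/V2 value must be accepted");
static_assert(!is_valid_low_limit(300), "non-multiple of 128 must be rejected");
```

In the assembler code itself, the equivalent would presumably be an assert placed next to the `subs(tmp0, len, CryptoPmullForCRC32LowLimit)` line, but the thread does not show what was eventually added.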
PR Review: https://git.openjdk.org/jdk/pull/25609#pullrequestreview-2895780805 PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2125999298 PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2126003055 From mgronlun at openjdk.org Wed Jun 4 08:43:37 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Wed, 4 Jun 2025 08:43:37 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: <3ALqkSc9a0HKOJrA6CW61v725SxX8FcLmasC8Wm4y24=.9d40a91c-a77a-442b-926a-e5785032c415@github.com> References: <3ALqkSc9a0HKOJrA6CW61v725SxX8FcLmasC8Wm4y24=.9d40a91c-a77a-442b-926a-e5785032c415@github.com> Message-ID: On Wed, 4 Jun 2025 03:07:52 GMT, Andrei Pangin wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename autoadapt > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 606: > >> 604: void JfrCPUTimeThreadSampler::init_timers() { >> 605: // install sig handler for sig >> 606: PosixSignals::install_generic_signal_handler(SIG, (void*)::handle_timer_signal); > > SIGPROF is also used by external profilers. Need to check if SIGPROF handler is already installed and warn user. This is *very* important to have a robust failure mechanism when existing handlers are already installed. Why? JFR can be turned on dynamically from the outside, at any time, during runtime. A lot of agents could have installed their handlers by then. Please describe how you intend to handle the case where someone starts JFR late during runtime and the signal handler cannot be installed. > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.hpp line 128: > >> 126: static void send_lost_event(const JfrTicks& time, traceid tid, s4 lost_samples); >> 127: >> 128: // Trigger sampling while a thread is not in a safepoint, from a seperate thread > > typo: separate And again, it's not "sampling" that is triggered.
It is async processing of the queue holding existing samples. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126029865 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126025355 From aph at openjdk.org Wed Jun 4 08:45:21 2025 From: aph at openjdk.org (Andrew Haley) Date: Wed, 4 Jun 2025 08:45:21 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v2] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 03:30:11 GMT, Liming Liu wrote: >> This PR is to enable the use of crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU. There is an option UseCryptoPmullForCRC32 that can enable crypto pmull, but directly enabling it on Ampere CPU will cause the following problems. >> >> 1. There will be regressions (-14% ~ -8%) on Ampere1 when the length is 64. When <= 128, both kernel_crc32_using_crc32 and kernel_crc32_using_crypto_pmull use the loop labeled as CRC_by32_loop, but their implements are a little different, and the loop in kernel_crc32_using_crc32 is better at hiding latency on Ampere1. So this PR takes the loop in kernel_crc32_using_crc32 to kernel_crc32_using_crypto_pmull, and does the same for CRC32C intrinsic. >> >> 2. The intrinsics only use crypto pmull when the length is higher than 383, while the loop in kernel_crc32_common_fold_using_crypto_pmull looks able to handle 256, and if it handles 256 on Ampere1, the improvements can be as high as 110% compared with kernel_crc32_using_crc32/kernel_crc32c_using_crc32c. However, there are regressions (~-6%) on Neoverse V1 when the length is 256. So this PR introduces a new option named CryptoPmullForCRC32LowLimit. It defaults to 256 since the code could handle 256, while it is set to 384 for V1/V2 to keep the old behavior on these platforms. 
>> >> The performance regressions and improvements were measured with the following microbenchmarks: >> org.openjdk.bench.java.util.TestCRC32.testCRC32Update >> org.openjdk.bench.java.util.TestCRC32C.testCRC32CUpdate >> >> Ran the following JTReg tests on Ampere1 and did not find problems: >> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java >> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32C.java > > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Make it be a diagnostic flag src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 4355: > 4353: add(buf, buf, 32); > 4354: crc32x(crc, crc, tmp2); > 4355: subs(len, len, 32); What is the point of these changes? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2126035063 From iveresov at openjdk.org Wed Jun 4 08:46:11 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 4 Jun 2025 08:46:11 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder [v4] In-Reply-To: References: Message-ID: > Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. 
Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: More changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25623/files - new: https://git.openjdk.org/jdk/pull/25623/files/f8a9b4a3..5a7b128f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25623&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25623&range=02-03 Stats: 2 lines in 2 files changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25623.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25623/head:pull/25623 PR: https://git.openjdk.org/jdk/pull/25623 From iveresov at openjdk.org Wed Jun 4 08:46:13 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 4 Jun 2025 08:46:13 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder [v3] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 00:53:21 GMT, Igor Veresov wrote: >> Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. > > Igor Veresov has updated the pull request incrementally with two additional commits since the last revision: > > - More changes > - Use dedicated OopStorage Ok, testing was clean. Please take another look. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25623#issuecomment-2939150820 From aph at openjdk.org Wed Jun 4 08:50:17 2025 From: aph at openjdk.org (Andrew Haley) Date: Wed, 4 Jun 2025 08:50:17 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v2] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 08:42:58 GMT, Andrew Haley wrote: >> Liming Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Make it be a diagnostic flag > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 4355: > >> 4353: add(buf, buf, 32); >> 4354: crc32x(crc, crc, tmp2); >> 4355: subs(len, len, 32); > > What is the point of these changes? To be more precise: converting these adjustments to post-increment operations isn't obviously an improvement on AArch64 generally. How does it help? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2126044000 From mdoerr at openjdk.org Wed Jun 4 08:53:17 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 4 Jun 2025 08:53:17 GMT Subject: RFR: 8357155: [asan] ZGC does not work (x86_64 and ppc64) [v3] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 05:27:06 GMT, Matthias Baesken wrote: >> Many (all?) ZGC related jtreg tests do not work when the JDK is built with address sanitizer asan enabled (configure flag --enable-asan). >> This can be seen on SUSE Linux x86_64 and also on ppc64le , opt binaries were used. >> It has been suggested to do a workaround - 'But I think that simply adapting the zAddress_[...].cpp implementations to always select the largest heap base would go a long way for providing ASAN compatibility.' >> This seems to work nicely on x86_64 and ppc64le, however the zgc related tests still fail on Linux aarch64 (should I exclude this platform from my patch?) . 
> > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Add comment requested by mdoerr Thanks! LGTM. ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25549#pullrequestreview-2895870539 From shade at openjdk.org Wed Jun 4 09:01:24 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Jun 2025 09:01:24 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder [v4] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 08:46:11 GMT, Igor Veresov wrote: >> Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. > > Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: > > More changes Right off the bat, before I look at the rest of it: I don't think there is a need to introduce another OopStorage class just for these handles. We already see it would probably require touchups in other code that enumerates OopStorages. So instead, use `VM Global` one? I.e. do: handle = OopHandle(Universe::vm_global(), obj); Also I cannot spot where we clean these. Note that for `OopHandle`-s, you have to explicitly call `.release`, likely in `KlassTrainingData` destructor. ------------- PR Review: https://git.openjdk.org/jdk/pull/25623#pullrequestreview-2895896625 From shade at openjdk.org Wed Jun 4 09:05:16 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Jun 2025 09:05:16 GMT Subject: RFR: 8358339: Handle MethodCounters::_method backlinks after JDK-8355003 In-Reply-To: References: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> Message-ID: On Tue, 3 Jun 2025 12:00:03 GMT, Coleen Phillimore wrote: >> Found this when reading mainline-vs-premain webrev. 
[JDK-8355003](https://bugs.openjdk.org/browse/JDK-8355003) introduced a backlink to `Method*` in `MethodCounters`. I believe we need to handle that backlink at least in `CodeBuffer::finalize_oop_references()`. premain does this, while mainline does not. Also, amusingly, we have `MethodCounters::is_methodCounters`, but not the super-class `Metadata::is_methodCounters`. >> >> I pulled in the hunks that use `is_methodCounters()` and `MethodCounters::method()` from premain into this PR. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `runtime/cds` >> - [x] Linux x86_64 server fastdebug, `tier1` >> - [x] Linux x86_64 server fastdebug, `all` > > And MethodCounters shouldn't be inherited from Metadata; they're inherited from MetaspaceObj in mainline. We want to avoid virtual function pointers in this type. Before I proceed anywhere with this, I need to understand what @coleenp saw in all this :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/25599#issuecomment-2939215148 From mbaesken at openjdk.org Wed Jun 4 09:09:22 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 4 Jun 2025 09:09:22 GMT Subject: Integrated: 8357155: [asan] ZGC does not work (x86_64 and ppc64) In-Reply-To: References: Message-ID: On Fri, 30 May 2025 12:18:46 GMT, Matthias Baesken wrote: > Many (all?) ZGC related jtreg tests do not work when the JDK is built with address sanitizer asan enabled (configure flag --enable-asan). > This can be seen on SUSE Linux x86_64 and also on ppc64le , opt binaries were used. > It has been suggested to do a workaround - 'But I think that simply adapting the zAddress_[...].cpp implementations to always select the largest heap base would go a long way for providing ASAN compatibility.' > This seems to work nicely on x86_64 and ppc64le, however the zgc related tests still fail on Linux aarch64 (should I exclude this platform from my patch?) . This pull request has now been integrated.
Changeset: cd16b689 Author: Matthias Baesken URL: https://git.openjdk.org/jdk/commit/cd16b6896222a623dc99b9e63bb917a9d2980e88 Stats: 9 lines in 2 files changed: 9 ins; 0 del; 0 mod 8357155: [asan] ZGC does not work (x86_64 and ppc64) Co-authored-by: Axel Boldt-Christmas Reviewed-by: mdoerr, aboldtch ------------- PR: https://git.openjdk.org/jdk/pull/25549 From mbaesken at openjdk.org Wed Jun 4 09:09:21 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 4 Jun 2025 09:09:21 GMT Subject: RFR: 8357155: [asan] ZGC does not work (x86_64 and ppc64) [v3] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 05:27:06 GMT, Matthias Baesken wrote: >> Many (all?) ZGC related jtreg tests do not work when the JDK is built with address sanitizer asan enabled (configure flag --enable-asan). >> This can be seen on SUSE Linux x86_64 and also on ppc64le , opt binaries were used. >> It has been suggested to do a workaround - 'But I think that simply adapting the zAddress_[...].cpp implementations to always select the largest heap base would go a long way for providing ASAN compatibility.' >> This seems to work nicely on x86_64 and ppc64le, however the zgc related tests still fail on Linux aarch64 (should I exclude this platform from my patch?) . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Add comment requested by mdoerr Hi Axel and Martin, thanks for the reviews ! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25549#issuecomment-2939228221 From ayang at openjdk.org Wed Jun 4 09:11:04 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 4 Jun 2025 09:11:04 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v10] In-Reply-To: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: > This patch refines Parallel's sizing strategy to improve overall memory management and performance. > > The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. > > `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. > > GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. > > ## Performance evaluation > > - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). 
> - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). > - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. > > PS: I have opportunistically set the obsolete/expired version to 25/26 for now. I will update them accordingly before merging. > > Test: tier1-8 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 14 commits: - revert-aliases - Merge branch 'master' into pgc-size-policy - merge - merge-fix - merge - Merge branch 'master' into pgc-size-policy - Merge branch 'master' into pgc-size-policy - review - Merge branch 'master' into pgc-size-policy - review - ... 
and 4 more: https://git.openjdk.org/jdk/compare/ab235000...72645267 ------------- Changes: https://git.openjdk.org/jdk/pull/25000/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25000&range=09 Stats: 4373 lines in 31 files changed: 522 ins; 3452 del; 399 mod Patch: https://git.openjdk.org/jdk/pull/25000.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25000/head:pull/25000 PR: https://git.openjdk.org/jdk/pull/25000 From kevinw at openjdk.org Wed Jun 4 09:31:19 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 4 Jun 2025 09:31:19 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v10] In-Reply-To: References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: On Wed, 4 Jun 2025 09:11:04 GMT, Albert Mingkun Yang wrote: >> This patch refines Parallel's sizing strategy to improve overall memory management and performance. >> >> The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. >> >> `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. >> >> GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. 
>> >> ## Performance evaluation >> >> - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). >> - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). >> - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. >> >> PS: I have opportunistically set the obsolete/expired version to 25/26 for now. I will update them accordingly before merging. >> >> Test: tier1-8 > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 14 commits: > > - revert-aliases > - Merge branch 'master' into pgc-size-policy > - merge > - merge-fix > - merge > - Merge branch 'master' into pgc-size-policy > - Merge branch 'master' into pgc-size-policy > - review > - Merge branch 'master' into pgc-size-policy > - review > - ... and 4 more: https://git.openjdk.org/jdk/compare/ab235000...72645267 Thanks for the aliasmap update, looks good. I think alias sun.gc.policy.boundaryMoved is removed here as it's already redundant, the rest all match with the counter being removed in the change. There is a case for removing those old e.g. 1.4.1 aliases separately, in a future change. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25000#issuecomment-2939297086 From iveresov at openjdk.org Wed Jun 4 09:52:17 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 4 Jun 2025 09:52:17 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder [v4] In-Reply-To: References: Message-ID: <3yChSmr2Gswo4p31Bms_biStRj97VCLkCPLtqIGFdb4=.ddb84aff-523b-4836-b15c-1a16f3bee733@github.com> On Wed, 4 Jun 2025 08:46:11 GMT, Igor Veresov wrote: >> Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. > > Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: > > More changes We don't need to release them. KTDs are never destroyed. They just die with the process. As for OopStorage, @coleenp wants it. It gives a bit of an advantage in that we can remove the handle field from KTD (since, again, we don't ever need to free them). ------------- PR Comment: https://git.openjdk.org/jdk/pull/25623#issuecomment-2939363346 From iveresov at openjdk.org Wed Jun 4 09:59:15 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 4 Jun 2025 09:59:15 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder [v4] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 08:46:11 GMT, Igor Veresov wrote: >> Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. > > Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: > > More changes I kind of need to push it today before the fork. Let's try to keep changes to this minimal. I'm also fine reverting back to before @coleenp suggested OopStorage. And we can address the remaining concerns later.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25623#issuecomment-2939386000 From shade at openjdk.org Wed Jun 4 09:59:16 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Jun 2025 09:59:16 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder [v4] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 09:56:20 GMT, Igor Veresov wrote: > I kind of need to push it today before the fork. Let's try making changes to this minimal. I'm also fine reverting back to before @coleenp suggested OopStorage. And we can address the remaining concerns later. Yeah, let's do OopStorage rewrite as the followup. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25623#issuecomment-2939387077 From iveresov at openjdk.org Wed Jun 4 10:14:31 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 4 Jun 2025 10:14:31 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder [v5] In-Reply-To: References: Message-ID: <1QNHiJnC7fE8K_KX5gK9VP-OsKNPAqyJfxFjOyfJyP4=.40f70cdb-11d9-4eec-9b25-ec2714fad601@github.com> > Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. 
Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: Undo OopStorage changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25623/files - new: https://git.openjdk.org/jdk/pull/25623/files/5a7b128f..a5693d69 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25623&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25623&range=03-04 Stats: 33 lines in 5 files changed: 8 ins; 21 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25623.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25623/head:pull/25623 PR: https://git.openjdk.org/jdk/pull/25623 From iveresov at openjdk.org Wed Jun 4 10:14:32 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 4 Jun 2025 10:14:32 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder [v4] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 08:46:11 GMT, Igor Veresov wrote: >> Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. > > Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: > > More changes Ok, I reverted this to before the OopStorage changes. And filed https://bugs.openjdk.org/browse/JDK-8358580 to rethink it later. @coleenp are you ok with that? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25623#issuecomment-2939433502 From jbechberger at openjdk.org Wed Jun 4 11:13:16 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 11:13:16 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v44] In-Reply-To: References: Message-ID: <35LXUV5UP0dcnU2ImfP7ny2SyPmJBTYhRT6JbADqWA4=.22d4360e-639c-4e65-86a3-62aad45a2606@github.com> > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
> > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: - Add error message on signal handler install failure - Fix signal handler synchronization ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/55c30aef..4a258e96 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=43 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=42-43 Stats: 71 lines in 2 files changed: 44 ins; 19 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From jbechberger at openjdk.org Wed Jun 4 11:13:17 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 11:13:17 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: References: <3ALqkSc9a0HKOJrA6CW61v725SxX8FcLmasC8Wm4y24=.9d40a91c-a77a-442b-926a-e5785032c415@github.com> Message-ID: On Wed, 4 Jun 2025 08:40:34 GMT, Markus Grönlund wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 606: >> >>> 604: void JfrCPUTimeThreadSampler::init_timers() { >>> 605: // install sig handler for sig >>> 606: PosixSignals::install_generic_signal_handler(SIG, (void*)::handle_timer_signal); >> >> SIGPROF is also used by external profilers. Need to check if SIGPROF handler is already installed and warn user.
> > This is *very* important to have a robust failure mechanism when existing handlers are already installed. Why? JFR can be turned on dynamically from the outside, at any time, during runtime. A lot of agents could have installed their handlers by then. > > Please describe how you intend to handle the case where someone starts JFR late during runtime and the signal handler cannot be installed. I added a log_error to tell the user >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.hpp line 128: >> >>> 126: static void send_lost_event(const JfrTicks& time, traceid tid, s4 lost_samples); >>> 127: >>> 128: // Trigger sampling while a thread is not in a safepoint, from a seperate thread >> >> typo: separate > > And again, its not "sampling" that is triggered. It is async processing of the queue holding existing samples. I removed the comment, as the method name itself is pretty self-explanatory. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126330036 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126332382 From jbechberger at openjdk.org Wed Jun 4 11:18:52 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 11:18:52 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 127 commits: - Merge branch 'master' into parttimenerd_cooperative_cpu_time_sampler - Add error message on signal handler install failure - Fix signal handler synchronization - Improve - Rename autoadapt - Make process_cpu_time_request private and move up - Reorder condition - Tiny refactoring - Restrict threads for which timers are created - Fix tiny mistake - ... and 117 more: https://git.openjdk.org/jdk/compare/7838321b...4fd4b673 ------------- Changes: https://git.openjdk.org/jdk/pull/25302/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=44 Stats: 2308 lines in 39 files changed: 2164 ins; 128 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From jbechberger at openjdk.org Wed Jun 4 11:28:51 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 11:28:51 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v46] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Improve error message ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/4fd4b673..8fe07614 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=45 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=44-45 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From apangin at openjdk.org Wed Jun 4 11:28:51 2025 From: apangin at openjdk.org (Andrei Pangin) Date: Wed, 4 Jun 2025 11:28:51 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: References: <3ALqkSc9a0HKOJrA6CW61v725SxX8FcLmasC8Wm4y24=.9d40a91c-a77a-442b-926a-e5785032c415@github.com> Message-ID: On Wed, 4 Jun 2025 05:26:42 GMT, Johannes Bechberger wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 161: >> >>> 159: return 0; >>> 160: } >>> 161: return os::active_processor_count() * 1000000000.0 / rate; >> >> If sampling period is configured as an absolute number in milliseconds, this value must be passed as is. >> Double conversion via `Runtime.availableProcessors()` / `active_processor_count()` is unobvious and error-prone. First, because of asymmetry: e.g. `Runtime.availableProcessors()` may be redefined by an agent so that its value is not aligned with `active_processor_count()`. Second, because number of available processors may change at runtime, e.g., by adjusting cgroup quotas. > > Is this something for a later PR? I'm OK with fixing this separately. 
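The conversion being questioned above can be sketched in isolation. This is only an illustration under stated assumptions, not the HotSpot code: `period_ns_from_rate` and `period_ns_from_millis` are hypothetical helper names, the rate is assumed to be samples per second for the whole process, and the processor count is passed in explicitly instead of being read from `os::active_processor_count()`.

```cpp
#include <cstdint>

// Hypothetical helper (not HotSpot API): derive a per-thread timer period in
// nanoseconds from a process-wide sampling rate, mirroring the quoted
// expression `active_processor_count * 1000000000.0 / rate`.
int64_t period_ns_from_rate(double rate_per_second, int processor_count) {
  if (rate_per_second <= 0.0) {
    return 0;  // assumption: a non-positive rate means "no sampling"
  }
  return static_cast<int64_t>(processor_count * 1000000000.0 / rate_per_second);
}

// Hypothetical helper: an absolute period configured in milliseconds is passed
// through unchanged (apart from the unit conversion), so it does not depend on
// the processor count at all.
int64_t period_ns_from_millis(int64_t period_ms) {
  return period_ms * 1000000;
}
```

With these helpers, a rate-based setting changes whenever the processor count changes (for example after a cgroup quota adjustment), while a period-based setting stays fixed, which is the asymmetry the comment points at.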
>> src/hotspot/share/jfr/support/jfrThreadLocal.cpp line 558: >> >>> 556: void JfrThreadLocal::set_cpu_timer(timer_t* timer) { >>> 557: if (_cpu_timer == nullptr) { >>> 558: _cpu_timer = JfrCHeapObj::new_array(1); >> >> `timer_t` is a primitive type, at most one machine word. Why extra indirection and allocation? > > @mgronlun wanted this indirection to move it abstract from implementation details I don't see how it is an abstraction when the pointer still has concrete `timer_t` type. All POSIX timer functions accept `timer_t` rather than `timer_t*`. This is not a big issue, though, just a minor inefficiency. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126360330 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126304082 From apangin at openjdk.org Wed Jun 4 11:28:52 2025 From: apangin at openjdk.org (Andrei Pangin) Date: Wed, 4 Jun 2025 11:28:52 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v43] In-Reply-To: <-BGoOClpsfsd4Q8Wq-H57L3tIvoaLGauYtRBEDPO-_w=.97e25e4f-879d-45f6-bd00-ad53e2463a8d@github.com> References: <-BGoOClpsfsd4Q8Wq-H57L3tIvoaLGauYtRBEDPO-_w=.97e25e4f-879d-45f6-bd00-ad53e2463a8d@github.com> Message-ID: On Wed, 4 Jun 2025 07:00:51 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... 
different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Improve src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 238: > 236: tl->cpu_time_jfr_queue().resize_for_period(_current_sampling_period_ns / 1000000); > 237: timer_t timerid; > 238: if (create_timer_for_thread(thread, timerid)) { Timer creation failure is not an impossible situation, we should somehow let user know that not all threads are being profiled but without flooding in logs. One warning per profiling session may be a good compromise. You can verify failure condition by setting low `ulimit -i`. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 492: > 490: assert(_sampler != nullptr, "invariant"); > 491: if (info->si_signo != SIGPROF) { > 492: // not the signal we are interested in No, I meant checking `si_code`. `si_signo` will always be the right one. And this check should come first, before any assertions. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126338737 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126316533 From apangin at openjdk.org Wed Jun 4 11:28:54 2025 From: apangin at openjdk.org (Andrei Pangin) Date: Wed, 4 Jun 2025 11:28:54 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 11:18:52 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... 
different chunk-sizes > > Johannes Bechberger has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 127 commits: > > - Merge branch 'master' into parttimenerd_cooperative_cpu_time_sampler > - Add error message on signal handler install failure > - Fix signal handler synchronization > - Improve > - Rename autoadapt > - Make process_cpu_time_request private and move up > - Reorder condition > - Tiny refactoring > - Restrict threads for which timers are created > - Fix tiny mistake > - ... and 117 more: https://git.openjdk.org/jdk/compare/7838321b...4fd4b673 src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 611: > 609: // increment the count of active signal handlers > 610: u4 old_value = Atomic::fetch_then_add(&_active_signal_handlers, (u4)1, memory_order_acq_rel); > 611: if ((old_value & STOP_SIGNAL_BIT) != 0) { Combining stop signal with a counter is nice, you can then use `Atomic::cmpxchg` to avoid incrementing counter when the stop bit is set. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126354062 From mgronlun at openjdk.org Wed Jun 4 11:28:54 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 4 Jun 2025 11:28:54 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 11:18:52 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... 
different chunk-sizes > > Johannes Bechberger has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 127 commits: > > - Merge branch 'master' into parttimenerd_cooperative_cpu_time_sampler > - Add error message on signal handler install failure > - Fix signal handler synchronization > - Improve > - Rename autoadapt > - Make process_cpu_time_request private and move up > - Reorder condition > - Tiny refactoring > - Restrict threads for which timers are created > - Fix tiny mistake > - ... and 117 more: https://git.openjdk.org/jdk/compare/7838321b...4fd4b673 src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 647: > 645: // install sig handler for sig > 646: if ((s8)PosixSignals::install_generic_signal_handler(SIG, (void*)::handle_timer_signal) == -1) { > 647: log_error(jfr)("Failed to install signal handler for CPU thread sampling, possibly because another profiler is active: %s", os::strerror(os::get_last_error())); That we are using a signal handler to provide the user with CPU time information is an implementation detail. It's good to provide an error message, but I think it should reflect back on something the user is expecting. Perhaps add a line that says something along the lines of "CPUTimeSample events will not be recorded." ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126357345 From jbechberger at openjdk.org Wed Jun 4 11:28:54 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 11:28:54 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: Message-ID: <1vlVzhpxsamhKyaKX4ixcG-JZj4Qxgc0Au3mEnjs_So=.91d8a5a1-7b1b-46c6-991c-a5c61c77e39e@github.com> On Wed, 4 Jun 2025 11:23:57 GMT, Markus Grönlund wrote: >> Johannes Bechberger has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 127 commits: >> >> - Merge branch 'master' into parttimenerd_cooperative_cpu_time_sampler >> - Add error message on signal handler install failure >> - Fix signal handler synchronization >> - Improve >> - Rename autoadapt >> - Make process_cpu_time_request private and move up >> - Reorder condition >> - Tiny refactoring >> - Restrict threads for which timers are created >> - Fix tiny mistake >> - ... and 117 more: https://git.openjdk.org/jdk/compare/7838321b...4fd4b673 > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 647: > >> 645: // install sig handler for sig >> 646: if ((s8)PosixSignals::install_generic_signal_handler(SIG, (void*)::handle_timer_signal) == -1) { >> 647: log_error(jfr)("Failed to install signal handler for CPU thread sampling, possibly because another profiler is active: %s", os::strerror(os::get_last_error())); > > That we are using a signal handler to provide the user with CPU time information is an implementation detail. Its good to provide an error message, but I think it should reflect back on something the user is expecting. > > Perhaps add a line that says something along the lines of "CPUTimeSample events will not be recorded." Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126360801 From coleenp at openjdk.org Wed Jun 4 11:36:19 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 4 Jun 2025 11:36:19 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder [v5] In-Reply-To: <1QNHiJnC7fE8K_KX5gK9VP-OsKNPAqyJfxFjOyfJyP4=.40f70cdb-11d9-4eec-9b25-ec2714fad601@github.com> References: <1QNHiJnC7fE8K_KX5gK9VP-OsKNPAqyJfxFjOyfJyP4=.40f70cdb-11d9-4eec-9b25-ec2714fad601@github.com> Message-ID: On Wed, 4 Jun 2025 10:14:31 GMT, Igor Veresov wrote: >> Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. 
> > Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: > > Undo OopStorage changes I'm fine with this and the follow-up issue. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25623#pullrequestreview-2896371220 From coleenp at openjdk.org Wed Jun 4 11:36:21 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 4 Jun 2025 11:36:21 GMT Subject: RFR: 8358003: KlassTrainingData initializer reads garbage holder [v2] In-Reply-To: References: <4ne8DsOBEMC2jSdOBI4l_33Jrs0CXHEKpdrLlBB-2uM=.52428bbb-6abc-4c33-85e7-6aa424c8b4f7@github.com> Message-ID: On Wed, 4 Jun 2025 00:53:56 GMT, Igor Veresov wrote: >> Are there any advantages? > > Ok, transitioned to OopStorage. Please take a look to see if it is correct. I'll be back when the testing is done. The advantage of OopStorage is that JNI handles aren't trusted, because they come from outside JNI calls, so they go through some safefetch code, whereas OopStorage handles are trusted and so presumably faster.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25623#discussion_r2126372510 From jbechberger at openjdk.org Wed Jun 4 11:37:34 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 11:37:34 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v43] In-Reply-To: References: <-BGoOClpsfsd4Q8Wq-H57L3tIvoaLGauYtRBEDPO-_w=.97e25e4f-879d-45f6-bd00-ad53e2463a8d@github.com> Message-ID: <3tt6-HjldmHpRIKHFqWHaLcC7FRZLwiCZl__j7Ht7Gw=.a14d3aa8-4780-49e2-b9ec-d24a828a1948@github.com> On Wed, 4 Jun 2025 11:13:30 GMT, Andrei Pangin wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Improve > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 238: > >> 236: tl->cpu_time_jfr_queue().resize_for_period(_current_sampling_period_ns / 1000000); >> 237: timer_t timerid; >> 238: if (create_timer_for_thread(thread, timerid)) { > > Timer creation failure is not an impossible situation, we should somehow let user know that not all threads are being profiled but without flooding in logs. One warning per profiling session may be a good compromise. > You can verify failure condition by setting low `ulimit -i`. I added a "Failed to create timer for a thread" warning ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126373721 From jbechberger at openjdk.org Wed Jun 4 11:37:37 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 11:37:37 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 11:21:57 GMT, Andrei Pangin wrote: >> Johannes Bechberger has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 127 commits: >> >> - Merge branch 'master' into parttimenerd_cooperative_cpu_time_sampler >> - Add error message on signal handler install failure >> - Fix signal handler synchronization >> - Improve >> - Rename autoadapt >> - Make process_cpu_time_request private and move up >> - Reorder condition >> - Tiny refactoring >> - Restrict threads for which timers are created >> - Fix tiny mistake >> - ... and 117 more: https://git.openjdk.org/jdk/compare/7838321b...4fd4b673 > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 611: > >> 609: // increment the count of active signal handlers >> 610: u4 old_value = Atomic::fetch_then_add(&_active_signal_handlers, (u4)1, memory_order_acq_rel); >> 611: if ((old_value & STOP_SIGNAL_BIT) != 0) { > > Combining stop signal with a counter is nice, you can then use `Atomic::cmpxchg` to avoid incrementing counter when the stop bit is set. I don't see how `Atomic::cmpxchg` would make the code easier. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126375482 From dtabata at openjdk.org Wed Jun 4 11:43:31 2025 From: dtabata at openjdk.org (Daishi Tabata) Date: Wed, 4 Jun 2025 11:43:31 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v46] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 11:28:51 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... 
different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Improve error message https://github.com/openjdk/jdk/pull/25302/commits/a419dabab213e78a2ff7f3c62cd4af72a0fdabed Since the implementation has changed from Loss to Lost, the JEP document needs to be changed back to the original, Lost. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2939698219 From jbechberger at openjdk.org Wed Jun 4 11:43:32 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 11:43:32 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v45] In-Reply-To: References: Message-ID: <6e2kCVaBWwL4UY_zXxuwRDYQKksbEo_uaRH7P8gBDJU=.f52500af-d413-4b2e-bc19-32d2248aa48e@github.com> On Wed, 4 Jun 2025 11:34:44 GMT, Johannes Bechberger wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 611: >> >>> 609: // increment the count of active signal handlers >>> 610: u4 old_value = Atomic::fetch_then_add(&_active_signal_handlers, (u4)1, memory_order_acq_rel); >>> 611: if ((old_value & STOP_SIGNAL_BIT) != 0) { >> >> Combining stop signal with a counter is nice, you can then use `Atomic::cmpxchg` to avoid incrementing counter when the stop bit is set. > > I don't see how `Atomic::cmpxchg` would make the code easier. With my current code, I avoid having a loop, and in the fast path, I only have one atomic instruction. 
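The trade-off discussed in this thread can be sketched outside HotSpot. The following is a minimal illustration using `std::atomic` rather than HotSpot's `Atomic` API; `kStopBit`, `enter_fetch_add`, `enter_cas` and `leave` are hypothetical names, and undoing the increment in the first variant is an assumption, since the quoted hunk does not show what the PR does once the stop bit is seen.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical layout: the top bit is the stop flag, the lower bits count
// the handlers that are currently active.
constexpr uint32_t kStopBit = 1u << 31;

// Variant A: unconditional fetch-and-add. The fast path is a single atomic
// instruction and has no loop; if the stop bit was already set, the increment
// is rolled back (assumed behaviour, for illustration only).
bool enter_fetch_add(std::atomic<uint32_t>& state) {
  uint32_t old_value = state.fetch_add(1, std::memory_order_acq_rel);
  if ((old_value & kStopBit) != 0) {
    state.fetch_sub(1, std::memory_order_acq_rel);
    return false;
  }
  return true;
}

// Variant B: compare-and-exchange loop. The counter is never incremented once
// the stop bit is set, at the cost of a retry loop under contention.
bool enter_cas(std::atomic<uint32_t>& state) {
  uint32_t current = state.load(std::memory_order_acquire);
  while ((current & kStopBit) == 0) {
    // On failure, `current` is refreshed with the latest value and rechecked.
    if (state.compare_exchange_weak(current, current + 1,
                                    std::memory_order_acq_rel,
                                    std::memory_order_acquire)) {
      return true;
    }
  }
  return false;
}

// Leaving is the same in both variants: drop one from the active count.
void leave(std::atomic<uint32_t>& state) {
  state.fetch_sub(1, std::memory_order_acq_rel);
}
```

Variant A matches the "no loop, one atomic instruction in the fast path" argument; variant B matches the suggestion to avoid touching the counter once stopping has begun. Which one is preferable depends on how the stopping side waits for the count to drain, which this sketch does not model.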
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126383576 From coleenp at openjdk.org Wed Jun 4 11:46:15 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 4 Jun 2025 11:46:15 GMT Subject: RFR: 8358339: Handle MethodCounters::_method backlinks after JDK-8355003 In-Reply-To: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> References: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> Message-ID: <3vcIHlpqLmTiXCOB3vLT-zxygE07PKsh-ni9dcPMENM=.73ee4382-b2a4-450c-920f-a1ce3d4ff87b@github.com> On Mon, 2 Jun 2025 18:41:42 GMT, Aleksey Shipilev wrote: > Found this when reading mainline-vs-premain webrev. [JDK-8355003](https://bugs.openjdk.org/browse/JDK-8355003) introduced a backlink to `Method*` in `MethodCounters`. I believe we need to handle that backlink at least in `CodeBuffer::finalize_oop_references()`. premain does this, while mainline does not. Also, amusingly, we have `MethodCounters::is_methodCounters`, but not the super-class `Metadata::is_methodCounters`. > > I pulled in the hunks that use `is_methodCounters()` and `MethodCounters::method()` from premain into this PR. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `runtime/cds` > - [x] Linux x86_64 server fastdebug, `tier1` > - [x] Linux x86_64 server fastdebug, `all` My repo was two weeks old so I didn't see this change to give MethodCounters a vptr, and don't know why. At worst the backpointer to Method* in MethodCounters is redundant with the Method* that you're creating the oop_references for, but it shouldn't create two oops. ie, md == ((MethodCounter*)m)->method(); But maybe that's not the case here. ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25599#pullrequestreview-2896407247 From jbechberger at openjdk.org Wed Jun 4 11:49:33 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 11:49:33 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v46] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 11:28:51 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Improve error message > [a419dab](https://github.com/openjdk/jdk/commit/a419dabab213e78a2ff7f3c62cd4af72a0fdabed) > Since the implementation has changed from Loss to Lost, the JEP document needs to be changed back to the original, Lost. Good catch, I updated the JEP. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2939723399 From jbechberger at openjdk.org Wed Jun 4 12:05:50 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 12:05:50 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v47] In-Reply-To: References: Message-ID: <7XHQamQvo__d4VCHVNQQqwNEmPLoKh8wtpES1a3ZRDg=.2bc3d95c-c00d-4487-90e2-2341a8da9173@github.com> > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). 
This profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Improve ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/8fe07614..fe53990d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=46 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=45-46 Stats: 11 lines in 1 file changed: 8 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From mgronlun at openjdk.org Wed Jun 4 12:05:50 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Wed, 4 Jun 2025 12:05:50 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v46] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 11:28:51 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Improve error message I am approving this PR for the following reasons: 1.
We have reached a state that is "good enough" - I no longer see any fundamental design issues that cannot be handled by follow-up bug fixes. 2. There are still many vague aspects included with this PR, as many have already pointed out, mostly related to the memory model and thread interactions - all those can, and should, be clarified, explained and exacted post-integration. 3. The feature as a whole is experimental and turned off by default. 4. Today is the penultimate day before the JDK 25 cutoff. To give the feature a fair chance of making JDK 25, it needs approval now. Thanks a lot, Johannes and all involved, for your hard work getting this feature ready. Many thanks Markus ------------- Marked as reviewed by mgronlun (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25302#pullrequestreview-2896467191 From mbaesken at openjdk.org Wed Jun 4 12:08:18 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 4 Jun 2025 12:08:18 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> <6HSruHtZNPOZJp4vNFnwMns6-_rP_MEHtnnvAP7S5QU=.e91023a2-089c-4541-86a5-ae8d4adeb99d@github.com> Message-ID: On Wed, 4 Jun 2025 06:26:02 GMT, Matthias Baesken wrote: > In ASAN built JDK, some gtests and some other JTREG tests in runtime/ErrorHandling also fail. By the way, I did not check ALL gtests, but the HS `:tier1` gtests work for me now on Linux x86_64. But make sure the very recent change https://bugs.openjdk.org/browse/JDK-8357155 8357155: [asan] ZGC does not work is included.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2939781272 From rehn at openjdk.org Wed Jun 4 12:08:25 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 4 Jun 2025 12:08:25 GMT Subject: RFR: 8356159: RISC-V: Add Zabha [v12] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 13:22:04 GMT, Feilong Jiang wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 21 additional commits since the last revision: >> >> - Merge branch 'master' into 8356159 >> - Set ins cost to 2xVOLA for cmpxchg >> - Merge branch 'master' into 8356159 >> - Merge branch 'master' into 8356159 >> - ins cost fixes, print fixes >> - Merge branch 'master' into 8356159 >> - Reg limits fixed >> - Merge branch 'master' into 8356159 >> - Fixed reg selection >> - More indention >> - ... and 11 more: https://git.openjdk.org/jdk/compare/66a7f51f...cc3b8ff7 > > Looks good! Thanks @feilongjiang @RealFYang ------------- PR Comment: https://git.openjdk.org/jdk/pull/25252#issuecomment-2939782354 From jbechberger at openjdk.org Wed Jun 4 12:10:17 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 12:10:17 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v48] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Fix timer creation warning ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/fe53990d..8d545e74 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=47 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=46-47 Stats: 4 lines in 1 file changed: 0 ins; 1 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From mbaesken at openjdk.org Wed Jun 4 12:18:20 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 4 Jun 2025 12:18:20 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Mon, 2 Jun 2025 08:07:38 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . 
> > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > TestBreakSignalThreadDump has issues with asan The test AOTCodeCompressedOopsTest.java has the memory error mentioned above fixed now with recent changes , but shows another issue runtime/cds/appcds/aotCode/AOTCodeCompressedOopsTest.java --------------------------------------------------------------- java.lang.RuntimeException: Pattern "narrow_oop_base = 0x(\\d+), narrow_oop_shift = (\\d)" not found in the output at AOTCodeCompressedOopsTest$Tester.checkExecution(AOTCodeCompressedOopsTest.java:184) at jdk.test.lib.cds.CDSAppTester.executeAndCheck(CDSAppTester.java:221) at jdk.test.lib.cds.CDSAppTester.productionRun(CDSAppTester.java:427) at jdk.test.lib.cds.CDSAppTester.productionRun(CDSAppTester.java:392) at AOTCodeCompressedOopsTest.main(AOTCodeCompressedOopsTest.java:58) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) at java.base/java.lang.reflect.Method.invoke(Method.java:565) at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:335) at java.base/java.lang.Thread.run(Thread.java:1474) Maybe we should ask an AOT expert about this, not sure what that means. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2939805054 From jbechberger at openjdk.org Wed Jun 4 12:23:13 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 12:23:13 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v49] In-Reply-To: References: Message-ID: <4aLbfK7e6pncU0QwXORueBxt8WEOz5KYO1pKnpjFOC0=.cf78fb29-a30f-4084-bee7-76c1e6e81f31@github.com> > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). 
This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Fix build ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/8d545e74..fbaf1da6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=48 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=47-48 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From apangin at openjdk.org Wed Jun 4 12:38:21 2025 From: apangin at openjdk.org (Andrei Pangin) Date: Wed, 4 Jun 2025 12:38:21 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v48] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 12:10:17 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... 
different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Fix timer creation warning src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 491: > 489: > 490: void JfrCPUTimeThreadSampling::handle_timer_signal(siginfo_t* info, void* context) { > 491: if (info->si_code != SIGPROF) { The correct check is `if (info->si_code != SI_TIMER)` src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 652: > 650: bool JfrCPUSamplerThread::init_timers() { > 651: // install sig handler for sig > 652: if ((s8)PosixSignals::install_generic_signal_handler(SIG, (void*)::handle_timer_signal) == -1) { Comparing return value to `(void*)-1` would be cleaner. But the main problem is that it only checks for `sigaction` failure (which normally never happens), however, we should also check if there was a custom signal handler set _before_ installing our own handler, i.e. old handler is not SIG_IGN or SIG_DFL or `handle_timer_signal`. 
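A minimal sketch of the two checks suggested above, not the PR's actual code: compare `si_code` against `SI_TIMER` (a signal number such as `SIGPROF` is not a meaningful `si_code` value), and query the current disposition with `sigaction(sig, nullptr, &old)` before installing a handler. The names `on_timer_signal`, `is_timer_expiry` and `has_foreign_handler` are hypothetical, and a POSIX system is assumed.

```cpp
#include <signal.h>

// Hypothetical stand-in for the sampler's own SA_SIGINFO handler.
void on_timer_signal(int, siginfo_t*, void*) {}

// True only when the signal was generated by a POSIX timer expiration.
bool is_timer_expiry(const siginfo_t* info) {
  return info->si_code == SI_TIMER;
}

// True when some handler other than SIG_DFL, SIG_IGN or `ours` is
// currently installed for `sig`. Passing nullptr as the new action
// makes sigaction() a pure query.
bool has_foreign_handler(int sig, void (*ours)(int, siginfo_t*, void*)) {
  struct sigaction old;
  if (sigaction(sig, nullptr, &old) != 0) {
    return true;  // cannot tell, so be conservative
  }
  if (old.sa_flags & SA_SIGINFO) {
    return old.sa_sigaction != ours;
  }
  return old.sa_handler != SIG_DFL && old.sa_handler != SIG_IGN;
}
```

A caller would typically refuse to start the sampler, or at least log a warning, when `has_foreign_handler(SIGPROF, on_timer_signal)` returns true.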
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126447823 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126488937 From jbechberger at openjdk.org Wed Jun 4 12:47:36 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 12:47:36 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v48] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 12:33:45 GMT, Andrei Pangin wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix timer creation warning > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 652: > >> 650: bool JfrCPUSamplerThread::init_timers() { >> 651: // install sig handler for sig >> 652: if ((s8)PosixSignals::install_generic_signal_handler(SIG, (void*)::handle_timer_signal) == -1) { > > Comparing return value to `(void*)-1` would be cleaner. > But the main problem is that it only checks for `sigaction` failure (which normally never happens), however, we should also check if there was a custom signal handler set _before_ installing our own handler, i.e. old handler is not SIG_IGN or SIG_DFL or `handle_timer_signal`. Using `sigaction(SIG, NULL, &sa)` ? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126510947 From rehn at openjdk.org Wed Jun 4 12:50:15 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 4 Jun 2025 12:50:15 GMT Subject: Integrated: 8356159: RISC-V: Add Zabha In-Reply-To: References: Message-ID: On Thu, 15 May 2025 14:08:48 GMT, Robbin Ehn wrote: > Hi, please consider. > > This adds the byte and halfword atomic memory operations (Zabha) - https://github.com/riscv/riscv-zabha. > All amo-instructions, except load-reserve and store-conditional, can also be performed on natural aligned half-words and bytes. (i.e. 
the extension do not add lr.h/b or sc.h/b) This includes amocas if zacas extension is present. > > The majority of this patch is to support amocas.h/b. We are now starting to really feel the pain of all these extensions, as CAS:ing 16/8-bits can now be done in three different ways: > - lr.w/sc.w 'narrow' CAS (no extension) > - amocas.w 'narrow' CAS (Zacas) > - amocas.h/b (Zacas + Zabha) > > There is no hwprobe support yet. > > Ran t1-3 with Zacas+Zabha and t1 without Zabha in qemu. > > Thanks, Robbin This pull request has now been integrated. Changeset: dc961609 Author: Robbin Ehn URL: https://git.openjdk.org/jdk/commit/dc961609f84a38164d10852cb92c005c3eb077e4 Stats: 824 lines in 6 files changed: 563 ins; 64 del; 197 mod 8356159: RISC-V: Add Zabha Reviewed-by: fyang, fjiang ------------- PR: https://git.openjdk.org/jdk/pull/25252 From rvansa at openjdk.org Wed Jun 4 12:50:35 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Wed, 4 Jun 2025 12:50:35 GMT Subject: RFR: 8352075: Perf regression accessing fields [v15] In-Reply-To: <5gclUhzEQCai7QGUBDA16OcIrQcmesMGR1pJd2Hbgbw=.79a0d71a-a246-4a84-9794-43f7ef738b09@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <5gclUhzEQCai7QGUBDA16OcIrQcmesMGR1pJd2Hbgbw=.79a0d71a-a246-4a84-9794-43f7ef738b09@github.com> Message-ID: On Fri, 30 May 2025 21:14:53 GMT, John R Rose wrote: >> Radim Vansa has refreshed the contents of this pull request, and previous commits have been removed. Incremental views are not available. > >> I like the idea of mapping each element in the table as raw bits, though handling of access to the end of the array would be a bit inconvenient (or we would have to allocate a few extra bytes). > > The code snippet I shared above shows a better way: You load a full 8 (or 4) bytes where the END (not the START) of the word lines up with the LAST (not FIRST) byte. Then you will never run past the end of the array! 
So, fine, but what about the start of the array? Well, it's inside an `Array` object, which has a length header, which is guaranteed to be safe to load (under a cast or bytewise or whatever). Problem solved. The only thing to avoid is to load an 8-byte word when the packed word size is 1..5 bytes; then you load a 4-byte word. You can load both components at once, and then use a configurable shift (from one machine word) to separate them. This is why I say it saves a half-byte on average. > > These tweaky ideas have three effects: They probably make the code a little simpler (or at least no worse), they reduce the number of memory operations to query a packed array, and they probably use fewer ALU instructions overall. They are certainly worth considering for the general-purpose "searchable packed array" I am envisioning; they are optional for this particular bug, viewed in isolation. > >> I've changed the algorithm to use unsigned integers; in fact I find a bit annoying that most of the indices used throughout the related code are signed. > > Yes, it annoys me also. It's playing with fire (or walking the firepit). > >> I've also added a test generating class with a different number of fields, though running it through the full range of fields (0-65535, though in practice the upper bound is rather 26k) would be excessive; even now it takes more than a minute on my machine. Also, I realize that varying the number of fields does not result in full coverage of possible stream sizes; per-field records have probably rather uniform lengths. > > Yeah, a gtest on the binary search would cover most of those issues, faster and cleaner. Then loading many gigantic classfiles will be unnecessary. Just a few classfiles at several scales, probably, and thorough gtest-level unit testing, gets a better result in less time. 
As I said above, I'm willing to put off some of the refactoring, given that it should cover other, prior occurrences of binary search (so it's got a larger scope than this bug). > >> @rose00 OK, so I have refactored out the PackedTable that now h... @rose00 Hi, would you be OK with the current implementation? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2939915008 From jbechberger at openjdk.org Wed Jun 4 12:56:22 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 12:56:22 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: Message-ID: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Fix build ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/fbaf1da6..e4558a6e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=49 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=48-49 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From jbechberger at openjdk.org Wed Jun 4 13:07:15 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 13:07:15 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v48] In-Reply-To: References: Message-ID: <-JmnqMbD8vGy_dVeVUv59WrjuCavWV3F3w9HMTxhAQM=.2c079574-0729-4c70-af86-946a3204f7b6@github.com> On Wed, 4 Jun 2025 12:44:53 GMT, Johannes Bechberger wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 652: >> >>> 650: bool JfrCPUSamplerThread::init_timers() { >>> 651: // install sig handler for sig >>> 652: if ((s8)PosixSignals::install_generic_signal_handler(SIG, (void*)::handle_timer_signal) == -1) { >> >> Comparing return value to `(void*)-1` would be cleaner. >> But the main problem is that it only checks for `sigaction` failure (which normally never happens), however, we should also check if there was a custom signal handler set _before_ installing our own handler, i.e. old handler is not SIG_IGN or SIG_DFL or `handle_timer_signal`. > > Using `sigaction(SIG, NULL, &sa)` ? I'm currently implementing the check against SIG_IGN and SIG_DFL, as `handle_timer_signal` should never occur. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126551278 From iwalulya at openjdk.org Wed Jun 4 13:51:57 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Wed, 4 Jun 2025 13:51:57 GMT Subject: RFR: 8358294: Remove unnecessary GenAlignment In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 08:36:08 GMT, Albert Mingkun Yang wrote: > Simple replacement of `GenAlignment` with `SpaceAlignment`, because they always have the same value. Removing the former to reduce complexity. > > Test: tier1-3 LGTM! ------------- Marked as reviewed by iwalulya (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25577#pullrequestreview-2896822028 From mdoerr at openjdk.org Wed Jun 4 14:01:16 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 4 Jun 2025 14:01:16 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 12:56:22 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Fix build I've looked over it and couldn't spot any critical issue. I think it's good enough for an experimental feature if we do further cleanups and improvements later. 
What I'd like to see as a follow-up is a review of the usage of `Atomic` functions. I've never seen so many of them in such a density. ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25302#pullrequestreview-2896853345 From iveresov at openjdk.org Wed Jun 4 14:11:00 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 4 Jun 2025 14:11:00 GMT Subject: Integrated: 8358003: KlassTrainingData initializer reads garbage holder In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 17:36:13 GMT, Igor Veresov wrote: > Simplify KlassTrainingData constructor. The lines in question come from the old pre-CDS world. They are not needed anymore. This pull request has now been integrated. Changeset: ae1892fb Author: Igor Veresov URL: https://git.openjdk.org/jdk/commit/ae1892fb0fb6b7646f9ca60067d6945ccea7f888 Stats: 18 lines in 1 file changed: 0 ins; 13 del; 5 mod 8358003: KlassTrainingData initializer reads garbage holder Reviewed-by: coleenp, shade, vlivanov ------------- PR: https://git.openjdk.org/jdk/pull/25623 From shade at openjdk.org Wed Jun 4 14:12:51 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Jun 2025 14:12:51 GMT Subject: RFR: 8358339: Handle MethodCounters::_method backlinks after JDK-8355003 In-Reply-To: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> References: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> Message-ID: On Mon, 2 Jun 2025 18:41:42 GMT, Aleksey Shipilev wrote: > Found this when reading mainline-vs-premain webrev. [JDK-8355003](https://bugs.openjdk.org/browse/JDK-8355003) introduced a backlink to `Method*` in `MethodCounters`. I believe we need to handle that backlink at least in `CodeBuffer::finalize_oop_references()`. premain does this, while mainline does not. Also, amusingly, we have `MethodCounters::is_methodCounters`, but not the super-class `Metadata::is_methodCounters`. 
> > I pulled in the hunks that use `is_methodCounters()` and `MethodCounters::method()` from premain into this PR. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `runtime/cds` > - [x] Linux x86_64 server fastdebug, `tier1` > - [x] Linux x86_64 server fastdebug, `all` OK, phew. I thought I am not seeing some huge gap here. Thanks! I think we are ready to integrate this. Just checking if @veresov is also okay with it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25599#issuecomment-2940192413 From pchilanomate at openjdk.org Wed Jun 4 14:15:53 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 4 Jun 2025 14:15:53 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 12:56:22 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Fix build I took a look at this. I only found one issue that needs fixing before integration and then a few comments. Thanks. 
src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 59: > 57: Thread* raw_thread = Thread::current_or_null_safe(); > 58: if (raw_thread == nullptr) { > 59: // probably while shutting down Do you remember which test fail because of this? It would be interesting to know, since I don't see how it could be null here. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 330: > 328: void JfrCPUSamplerThread::stackwalk_threads_in_native() { > 329: ResourceMark rm; > 330: MutexLocker tlock(Threads_lock); What exactly are we guarding against by holding the `Threads_lock`? Seems `ThreadsListHandle` should be enough. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 430: > 428: void JfrCPUTimeThreadSampling::create_sampler(double rate, bool auto_adapt) { > 429: assert(_sampler == nullptr, "invariant"); > 430: _sampler = new JfrCPUSamplerThread(rate, auto_adapt); If we start a recording on an already running process we have a race here where a new thread can create and set its timer before we call init_timers() where the signal handler is installed. In that case the program will terminate with message "Profiling timer expired" (default action for SIGPROF). It can be easily reproduced by adding a delay here and starting a recording on a simple test that just creates new threads. We need to add some extra check in create_timer_for_thread() or install the signal handler earlier. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 584: > 582: sev.sigev_notify = SIGEV_THREAD_ID; > 583: sev.sigev_signo = SIG; > 584: sev.sigev_value.sival_ptr = &t; Why setting the address of `t` which is a local variable here?
------------- PR Review: https://git.openjdk.org/jdk/pull/25302#pullrequestreview-2896813404 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126661443 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126670024 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126651264 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126666505 From pchilanomate at openjdk.org Wed Jun 4 14:15:54 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 4 Jun 2025 14:15:54 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 15:13:54 GMT, Markus Grönlund wrote: >> So I don't read the ` tl->has_cpu_time_jfr_requests()` twice on the hot-path > > Ok, for now. We should try to come up with a better split. If there are sample requests then we are already in the slow path so loading again `has_cpu_time_jfr_requests()` later won't change anything. My suggestion would be to avoid passing this boolean around. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126692802 From jbechberger at openjdk.org Wed Jun 4 14:15:48 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 14:15:48 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v51] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Check if signal handler is already installed ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/e4558a6e..762da321 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=50 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=49-50 Stats: 14 lines in 3 files changed: 12 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From jbechberger at openjdk.org Wed Jun 4 14:15:55 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 14:15:55 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v48] In-Reply-To: <-JmnqMbD8vGy_dVeVUv59WrjuCavWV3F3w9HMTxhAQM=.2c079574-0729-4c70-af86-946a3204f7b6@github.com> References: <-JmnqMbD8vGy_dVeVUv59WrjuCavWV3F3w9HMTxhAQM=.2c079574-0729-4c70-af86-946a3204f7b6@github.com> Message-ID: On Wed, 4 Jun 2025 13:04:28 GMT, Johannes Bechberger wrote: >> Using `sigaction(SIG, NULL, &sa)` ? > > I'm currently implementing the check against SIG_IGN and SIG_DFL, as `handle_timer_signal` should never occur. I implemented what you want. This prevents confusion of users. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126701277 From mbaesken at openjdk.org Wed Jun 4 14:17:52 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 4 Jun 2025 14:17:52 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: <6lt1R3T4xH4FQGEDE7GzuThv4MaGmGynK-KG5o17-Q8=.164c5a02-d5f4-4791-988a-5e29fa6463bc@github.com> On Mon, 2 Jun 2025 08:07:38 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > TestBreakSignalThreadDump has issues with asan Had a look at HS tier3 tests too and the whole runtime/signal/TestSig* tests fail with asan like this e.g. runtime/signal/TestSigalrm.java stdout: []; stderr: [==3863397==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD. So maybe we should mark them too ? Seems to be the same kind of issue as in the HS tier1 jsig related tests. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2940211436 From jbechberger at openjdk.org Wed Jun 4 14:21:07 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 14:21:07 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 13:51:31 GMT, Patricio Chilano Mateo wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix build > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 59: > >> 57: Thread* raw_thread = Thread::current_or_null_safe(); >> 58: if (raw_thread == nullptr) { >> 59: // probably while shutting down > > Do you remember which test fail because of this? It would be interesting to know, since I don't see how it could be null here. I just ran the renaissance benchmark with https://github.com/parttimenerd/basic-profiler-tests for a few hours.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126728665 From jbechberger at openjdk.org Wed Jun 4 14:27:16 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 14:27:16 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 13:55:06 GMT, Patricio Chilano Mateo wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix build > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 330: > >> 328: void JfrCPUSamplerThread::stackwalk_threads_in_native() { >> 329: ResourceMark rm; >> 330: MutexLocker tlock(Threads_lock); > > What exactly are we guarding against by holding the `Threads_lock`? Seems `ThreadsListHandle` should be enough. You're right. > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 584: > >> 582: sev.sigev_notify = SIGEV_THREAD_ID; >> 583: sev.sigev_signo = SIG; >> 584: sev.sigev_value.sival_ptr = &t; > > Why setting the address of `t` which is a local variable here? Because this is how the API works. You store the location where the timer should be stored. See https://www.man7.org/linux/man-pages/man2/timer_create.2.html for more information. 
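As a reading aid for this exchange, a minimal sketch of how `timer_create` treats its arguments, as far as the timer_create(2) page describes them: the new timer's ID is written through the third argument, while `sigev_value.sival_ptr` is an opaque payload that is handed back in `si_value` when the signal is delivered. This is not the PR's code. It uses `SIGEV_SIGNAL` and `SIGUSR1` instead of the PR's `SIGEV_THREAD_ID` and `SIG` to stay portable, and `on_expiry`, `fire_once_with_payload` and the `g_*` globals are hypothetical names. Older glibc versions may need `-lrt` to link.

```cpp
#include <signal.h>
#include <time.h>
#include <atomic>

static std::atomic<bool>  g_fired{false};
static std::atomic<int>   g_code{0};
static std::atomic<void*> g_payload{nullptr};

// Records what the kernel delivered; only lock-free atomics are touched.
static void on_expiry(int, siginfo_t* info, void*) {
  g_code.store(info->si_code);
  g_payload.store(info->si_value.sival_ptr);
  g_fired.store(true);
}

// Arms a one-shot 1 ms timer and reports whether the handler saw
// si_code == SI_TIMER together with exactly the payload passed in.
bool fire_once_with_payload(void* payload) {
  struct sigaction sa{};
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_SIGINFO;
  sa.sa_sigaction = on_expiry;
  if (sigaction(SIGUSR1, &sa, nullptr) != 0) return false;

  struct sigevent sev{};
  sev.sigev_notify = SIGEV_SIGNAL;
  sev.sigev_signo = SIGUSR1;
  sev.sigev_value.sival_ptr = payload;  // opaque value, returned in si_value

  timer_t timer_id;                     // the timer ID is stored here
  if (timer_create(CLOCK_MONOTONIC, &sev, &timer_id) != 0) return false;

  struct itimerspec its{};
  its.it_value.tv_nsec = 1000000;       // expire once after 1 ms
  if (timer_settime(timer_id, 0, &its, nullptr) != 0) {
    timer_delete(timer_id);
    return false;
  }

  // Poll for roughly two seconds at most; nanosleep may return early
  // when the signal arrives, which is fine here.
  struct timespec nap{0, 1000000};
  for (int i = 0; i < 2000 && !g_fired.load(); ++i) {
    nanosleep(&nap, nullptr);
  }
  timer_delete(timer_id);
  return g_fired.load() && g_code.load() == SI_TIMER &&
         g_payload.load() == payload;
}
```

Since the payload is only handed back and not interpreted by the kernel, pointing it at a local variable matters only if the handler dereferences it after that variable has gone out of scope.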
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126741491 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126735519 From jbechberger at openjdk.org Wed Jun 4 14:31:08 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 14:31:08 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v29] In-Reply-To: References: Message-ID: <8K2UKh2c90_9WYibMs0LoKZr5GNFAGZkINV11POsvac=.8150ba38-e33e-4780-bb08-2783640b6a3a@github.com> On Wed, 4 Jun 2025 14:04:51 GMT, Patricio Chilano Mateo wrote: >> Ok, for now. We should try to come up with a better split. > > If there are sample requests then we are already in the slow path so loading again `has_cpu_time_jfr_requests()` later won't change anything. My suggestion would be to avoid passing this boolean around. I removed it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126751794 From pchilanomate at openjdk.org Wed Jun 4 14:45:21 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 4 Jun 2025 14:45:21 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 14:21:31 GMT, Johannes Bechberger wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 584: >> >>> 582: sev.sigev_notify = SIGEV_THREAD_ID; >>> 583: sev.sigev_signo = SIG; >>> 584: sev.sigev_value.sival_ptr = &t; >> >> Why setting the address of `t` which is a local variable here? > > Because this is how the API works. You store the location where the timer should be stored. > > See https://www.man7.org/linux/man-pages/man2/timer_create.2.html for more information. Sorry where does it say that? I think you are looking at the example in that page which makes use of sival_ptr in the signal handler. 
In that example reading from that timer address in the handler is valid but for us we would be accessing invalid memory. Plus we are not really reading it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126787035 From jbechberger at openjdk.org Wed Jun 4 14:45:15 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 14:45:15 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 13:47:01 GMT, Patricio Chilano Mateo wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix build > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 430: > >> 428: void JfrCPUTimeThreadSampling::create_sampler(double rate, bool auto_adapt) { >> 429: assert(_sampler == nullptr, "invariant"); >> 430: _sampler = new JfrCPUSamplerThread(rate, auto_adapt); > > If we start a recording on an already running process we have a race here where a new thread can create and set its timer before we call init_timers() where the signal handler is installed. In that case the program will terminate with message "Profiling timer expired" (default action for SIGPROF). It can be easily reproduced by adding a delay here and starting a recording on a simple test that just creates new threads. We need to add some extra check in create_timer_for_thread() or install the signal handler earlier. I created a `has_timer` flag that is checked by `create_timer_for_thread()` before creating timers and set by `init_timers`. Is this what you envisioned?
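The ordering discussed above can be sketched in standard C++ roughly as follows. This is only an illustration under assumptions: `std::atomic` stands in for HotSpot's `Atomic`, and `g_handler_installed`, `install_handler_then_publish`, and `try_create_timer` are hypothetical names, not code from the PR. The idea is that a thread may create a SIGPROF timer only after it observes that the handler has been installed.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the `has_timer` flag mentioned above.
static std::atomic<bool> g_handler_installed{false};
// Bookkeeping for the demo only; touched from a single thread here.
static int g_timers_created = 0;

// Plays the role of init_timers(): install the handler first, publish second.
void install_handler_then_publish() {
  // Real code would call sigaction(SIGPROF, ...) here (omitted in this sketch).
  g_handler_installed.store(true, std::memory_order_release);
}

// Plays the role of create_timer_for_thread(): refuse to create a timer
// until the handler is known to be installed.
bool try_create_timer() {
  if (!g_handler_installed.load(std::memory_order_acquire)) {
    return false;  // too early: a timer now could fire with no handler in place
  }
  ++g_timers_created;  // real code would call timer_create(...) here
  return true;
}

// Small single-threaded walk-through of the intended ordering.
void demo_handler_ordering() {
  const bool before = try_create_timer();
  assert(!before);
  install_handler_then_publish();
  const bool after = try_create_timer();
  assert(after);
  assert(g_timers_created == 1);
}
```

A thread that is turned away this way still needs a timer later. The sketch assumes that whatever sets the flag also walks the already existing threads, which the name `init_timers` suggests but this thread does not show.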
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126783367 From jbechberger at openjdk.org Wed Jun 4 14:49:09 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 14:49:09 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 14:41:24 GMT, Patricio Chilano Mateo wrote: >> Because this is how the API works. You store the location where the timer should be stored. >> >> See https://www.man7.org/linux/man-pages/man2/timer_create.2.html for more information. > > Sorry where does it say that? I think you are looking at the example in that page which makes use of sival_ptr in the signal handler. In that example reading from that timer address in the handler is valid but for us we would be accessing invalid memory. Plus we are not really reading it. Why would we be accessing invalid memory? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126798263 From jbechberger at openjdk.org Wed Jun 4 14:59:33 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 14:59:33 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v52] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Small changes Suggested by @pchilano ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/762da321..20b8db28 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=51 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=50-51 Stats: 21 lines in 4 files changed: 6 ins; 4 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From jbechberger at openjdk.org Wed Jun 4 14:59:33 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 14:59:33 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 14:46:27 GMT, Johannes Bechberger wrote: >> Sorry where does it say that? I think you are looking at the example in that page which makes use of sival_ptr in the signal handler. In that example reading from that timer address in the handler is valid but for us we would be accessing invalid memory. Plus we are not really reading it. > > Why would we be accessing invalid memory? Which differences to the example code are you seeing? sev.sigev_notify = SIGEV_SIGNAL; sev.sigev_signo = SIG; sev.sigev_value.sival_ptr = &timerid; if (timer_create(CLOCKID, &sev, &timerid) == -1) errExit("timer_create"); printf("timer ID is %#jx\n", (uintmax_t) timerid); /* Start the timer. 
*/ freq_nanosecs = atoll(argv[2]); its.it_value.tv_sec = freq_nanosecs / 1000000000; its.it_value.tv_nsec = freq_nanosecs % 1000000000; its.it_interval.tv_sec = its.it_value.tv_sec; its.it_interval.tv_nsec = its.it_value.tv_nsec; Is similar to: ((int*)&sev.sigev_notify)[1] = thread->osthread()->thread_id(); clockid_t clock; int err = pthread_getcpuclockid(thread->osthread()->pthread_id(), &clock); if (err != 0) { log_error(jfr)("Failed to get clock for thread sampling: %s", os::strerror(err)); return false; } if (timer_create(clock, &sev, &t) < 0) { return false; } int64_t period = get_sampling_period(); if (period != 0) { set_timer_time(t, period); } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126810984 From pchilanomate at openjdk.org Wed Jun 4 14:59:33 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 4 Jun 2025 14:59:33 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 14:52:04 GMT, Johannes Bechberger wrote: >> Why would we be accessing invalid memory? > > Which differences to the example code are you seeing? > > > sev.sigev_notify = SIGEV_SIGNAL; > sev.sigev_signo = SIG; > sev.sigev_value.sival_ptr = &timerid; > if (timer_create(CLOCKID, &sev, &timerid) == -1) > errExit("timer_create"); > > printf("timer ID is %#jx\n", (uintmax_t) timerid); > > /* Start the timer. 
*/ > > freq_nanosecs = atoll(argv[2]); > its.it_value.tv_sec = freq_nanosecs / 1000000000; > its.it_value.tv_nsec = freq_nanosecs % 1000000000; > its.it_interval.tv_sec = its.it_value.tv_sec; > its.it_interval.tv_nsec = its.it_value.tv_nsec; > > > > Is similar to: > > > ((int*)&sev.sigev_notify)[1] = thread->osthread()->thread_id(); > clockid_t clock; > int err = pthread_getcpuclockid(thread->osthread()->pthread_id(), &clock); > if (err != 0) { > log_error(jfr)("Failed to get clock for thread sampling: %s", os::strerror(err)); > return false; > } > if (timer_create(clock, &sev, &t) < 0) { > return false; > } > int64_t period = get_sampling_period(); > if (period != 0) { > set_timer_time(t, period); > } The `sigev_value` member is used to pass data that you can read in the signal handler. The address of `t` won't be valid anymore once you return from this function. In that example the address of `timerid ` is still valid. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126817165 From jbechberger at openjdk.org Wed Jun 4 14:59:33 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 14:59:33 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 14:54:44 GMT, Patricio Chilano Mateo wrote: >> Which differences to the example code are you seeing? >> >> >> sev.sigev_notify = SIGEV_SIGNAL; >> sev.sigev_signo = SIG; >> sev.sigev_value.sival_ptr = &timerid; >> if (timer_create(CLOCKID, &sev, &timerid) == -1) >> errExit("timer_create"); >> >> printf("timer ID is %#jx\n", (uintmax_t) timerid); >> >> /* Start the timer. 
*/ >> >> freq_nanosecs = atoll(argv[2]); >> its.it_value.tv_sec = freq_nanosecs / 1000000000; >> its.it_value.tv_nsec = freq_nanosecs % 1000000000; >> its.it_interval.tv_sec = its.it_value.tv_sec; >> its.it_interval.tv_nsec = its.it_value.tv_nsec; >> >> >> >> Is similar to: >> >> >> ((int*)&sev.sigev_notify)[1] = thread->osthread()->thread_id(); >> clockid_t clock; >> int err = pthread_getcpuclockid(thread->osthread()->pthread_id(), &clock); >> if (err != 0) { >> log_error(jfr)("Failed to get clock for thread sampling: %s", os::strerror(err)); >> return false; >> } >> if (timer_create(clock, &sev, &t) < 0) { >> return false; >> } >> int64_t period = get_sampling_period(); >> if (period != 0) { >> set_timer_time(t, period); >> } > > The `sigev_value` member is used to pass data that you can read in the signal handler. The address of `t` won't be valid anymore once you return from this function. In that example the address of `timerid ` is still valid. Why is this a problem? We don't leak `&t` outside of `create_timer_for_thread`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126820560 From jbechberger at openjdk.org Wed Jun 4 15:06:16 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 15:06:16 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v53] In-Reply-To: References: Message-ID: <40QHcJ9vY8NkXNVIhF-84WqMoSTRwxGsQfENcTJNEHI=.d1753def-3f74-4646-a574-952612dddaec@github.com> > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Improve create_timer_for_thread ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/20b8db28..45f915b4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=52 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=51-52 Stats: 5 lines in 1 file changed: 0 ins; 2 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From jbechberger at openjdk.org Wed Jun 4 15:06:16 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 15:06:16 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 14:56:10 GMT, Johannes Bechberger wrote: >> The `sigev_value` member is used to pass data that you can read in the signal handler. The address of `t` won't be valid anymore once you return from this function. In that example the address of `timerid ` is still valid. > > Why is this a problem? We don't leak `&t` outside of `create_timer_for_thread`. But I can start using the passed-in parameter `timerid` directly, which should make the code less confusing. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126829360 From jbechberger at openjdk.org Wed Jun 4 15:19:09 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 15:19:09 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 15:00:07 GMT, Johannes Bechberger wrote: >> Why is this a problem? We don't leak `&t` outside of `create_timer_for_thread`. > > But I can start using the passed-in parameter `timerid` directly, which should make the code less confusing. I pushed the modifications. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126863235 From jbechberger at openjdk.org Wed Jun 4 15:32:25 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 15:32:25 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v54] In-Reply-To: References: Message-ID: <8m6_JMIlbofV7pazQz-ZmfzLDOmVt-bXpmOMtQQFhpg=.121c7602-81c0-4a8a-81ea-dc753bd4ca87@github.com> > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Improve create_timer_for_thread ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/45f915b4..8f237898 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=53 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=52-53 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From jbechberger at openjdk.org Wed Jun 4 15:32:26 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 15:32:26 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 15:23:55 GMT, Andrei Pangin wrote: >> I pushed the modifications. > > Simply set `sival_ptr` to `NULL`, the value is never used. You're right, thanks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126886264 From apangin at openjdk.org Wed Jun 4 15:32:26 2025 From: apangin at openjdk.org (Andrei Pangin) Date: Wed, 4 Jun 2025 15:32:26 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 15:16:16 GMT, Johannes Bechberger wrote: >> But I can start using the passed-in parameter `timerid` directly, which should make the code less confusing. > > I pushed the modifications. Simply set `sival_ptr` to `NULL`, the value is never used. 
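A minimal sketch of that suggestion, assuming Linux with a reasonably recent glibc. `create_cpu_timer` is a hypothetical helper, not the PR's `create_timer_for_thread`: to stay portable it uses `SIGEV_SIGNAL` and the calling thread's CPU clock rather than the PR's `SIGEV_THREAD_ID` and `pthread_getcpuclockid`. The part that matters here is that `sival_ptr` carries no pointer to a local variable, so nothing can dangle once the function returns.

```cpp
#include <cassert>
#include <signal.h>
#include <time.h>

// Hypothetical helper: creates (but does not arm) a timer on the calling
// thread's CPU-time clock that would deliver SIGPROF on expiry.
bool create_cpu_timer(timer_t* out) {
  struct sigevent sev{};                // zero-initialize every field
  sev.sigev_notify = SIGEV_SIGNAL;      // the PR uses SIGEV_THREAD_ID instead
  sev.sigev_signo = SIGPROF;
  sev.sigev_value.sival_ptr = nullptr;  // the handler never reads this value
  return timer_create(CLOCK_THREAD_CPUTIME_ID, &sev, out) == 0;
}

// Usage: the timer is never armed with timer_settime(), so no signal is
// delivered; it is created and deleted again right away.
bool create_and_delete_timer() {
  timer_t id;
  if (!create_cpu_timer(&id)) {
    return false;
  }
  return timer_delete(id) == 0;
}
```

On older glibc versions `timer_create` lives in librt, so linking may need `-lrt`.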
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126880196 From shade at openjdk.org Wed Jun 4 15:32:26 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Jun 2025 15:32:26 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 12:56:22 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Fix build Some more cosmetics/nits. src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 264: > 262: } > 263: timer_delete(*timer); > 264: tl->unset_cpu_timer(); Should this deletion be right in `unset_cpu_timer`? src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 729: > 727: #else > 728: > 729: static bool _showed_warning = false; `_displayed_warning`? Actually, I think you can move this straight into `warn()` body. src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 250: > 248: } > 249: > 250: Unnecessary? src/hotspot/share/jfr/support/jfrThreadLocal.cpp line 582: > 580: } > 581: > 582: bool JfrThreadLocal::acquire_cpu_time_jfr_enqueue_lock() { This sounds like `try_acquire_cpu_time_jfr_enqueue_lock`, emphasis on `try_`. 
It does not actually guarantee to lock. src/hotspot/share/jfr/support/jfrThreadLocal.cpp line 586: > 584: } > 585: > 586: bool JfrThreadLocal::try_acquire_cpu_time_jfr_dequeue_lock() { ...and this one is not `try_`, but the actual "blocking" acquire. ------------- PR Review: https://git.openjdk.org/jdk/pull/25302#pullrequestreview-2896958686 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126882197 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126868308 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126736956 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126831195 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126844713 From jbechberger at openjdk.org Wed Jun 4 15:32:26 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 15:32:26 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 14:22:05 GMT, Aleksey Shipilev wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix build > > src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 250: > >> 248: } >> 249: >> 250: > > Unnecessary? 
Yes ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126890693 From jbechberger at openjdk.org Wed Jun 4 15:40:06 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 15:40:06 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 15:07:32 GMT, Aleksey Shipilev wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix build > > src/hotspot/share/jfr/support/jfrThreadLocal.cpp line 586: > >> 584: } >> 585: >> 586: bool JfrThreadLocal::try_acquire_cpu_time_jfr_dequeue_lock() { > > ...and this one is not `try_`, but the actual "blocking" acquire. But it fails if the lock state is already on `DEQUEUE`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2126907622 From jbechberger at openjdk.org Wed Jun 4 15:48:31 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 15:48:31 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v55] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Improve ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/8f237898..3a486817 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=54 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=53-54 Stats: 11 lines in 4 files changed: 2 ins; 4 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From mgronlun at openjdk.org Wed Jun 4 17:09:08 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 4 Jun 2025 17:09:08 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 14:24:04 GMT, Johannes Bechberger wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 330: >> >>> 328: void JfrCPUSamplerThread::stackwalk_threads_in_native() { >>> 329: ResourceMark rm; >>> 330: MutexLocker tlock(Threads_lock); >> >> What exactly are we guarding against by holding the `Threads_lock`? Seems `ThreadsListHandle` should be enough. > > You're right. The Threads_lock is required to prevent JFR from sampling through an ongoing safepoint, touching oops, which is not supported by most GCs as well as JFR evolving its global epoch (happens during safepoint) while both threads are outside the safepoint protocol. 
Can be optimized (later), see for example: https://github.com/openjdk/jdk/pull/25602 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127062451 From jbechberger at openjdk.org Wed Jun 4 17:22:37 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 17:22:37 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v56] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Readd lock ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/3a486817..53d3ed4a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=55 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=54-55 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From jbechberger at openjdk.org Wed Jun 4 17:22:37 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 17:22:37 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: <-YJusSw4Vn4Dx_is7mRjOT6clvj3Uh5F5tcLsPIwUAk=.fda1546c-22b0-4a57-900e-2e752841b617@github.com> 
On Wed, 4 Jun 2025 17:04:15 GMT, Markus Grönlund wrote: >> You're right. > > The Threads_lock is required to prevent JFR from sampling through an ongoing safepoint, touching oops, which is not supported by most GCs as well as JFR evolving its global epoch (happens during safepoint) while both threads are outside the safepoint protocol. Can be optimized (later), see for example: https://github.com/openjdk/jdk/pull/25602 I readded the lock. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127082776 From shade at openjdk.org Wed Jun 4 17:27:08 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Jun 2025 17:27:08 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v55] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 15:48:31 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Improve I admit I am leaning pretty hard on JFR folks expertise here. I agree this code passes the bar for experimental feature: it does not seem to affect non-JFR paths, does not seem to interact with JFR in obviously incorrect manner, and in itself looks more or less sensible. Please file the follow-ups to figure out the memory ordering story in `JfrCPUTimeTraceQueue`.
src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 640: > 638: > 639: VM_CPUTimeSamplerThreadInitializer(JfrCPUSamplerThread* sampler) : _sampler(sampler) { > 640: } Suggestion: private: JfrCPUSamplerThread* const _sampler; public: VM_CPUTimeSamplerThreadInitializer(JfrCPUSamplerThread* sampler) : _sampler(sampler) {} src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 671: > 669: > 670: VM_CPUTimeSamplerThreadTerminator(JfrCPUSamplerThread* sampler) : _sampler(sampler) { > 671: } Suggestion: private: JfrCPUSamplerThread* const _sampler; public: VM_CPUTimeSamplerThreadTerminator(JfrCPUSamplerThread* sampler) : _sampler(sampler) {} src/hotspot/share/runtime/vmOperation.hpp line 119: > 117: template(RendezvousGCThreads) \ > 118: template(CPUTimeSamplerThreadInitializer) \ > 119: template(CPUTimeSamplerThreadTerminator) \ I think these better be prefixed with `JFR`. E.g.: `JFRInitializeCPUTimeSampler` / `JFRTerminateCPUTimeSampler`? ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25302#pullrequestreview-2897511962 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127078325 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127079549 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127090498 From liach at openjdk.org Wed Jun 4 17:43:03 2025 From: liach at openjdk.org (Chen Liang) Date: Wed, 4 Jun 2025 17:43:03 GMT Subject: RFR: 8352565: Add native method implementation of Reference.get() [v8] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 07:12:42 GMT, Kim Barrett wrote: >> Please review this change which adds a native method providing the >> implementation of Reference::get. Referece::get is an intrinsic candidate, so >> this native method implementation is only used when the intrinsic is not. >> >> Currently there is intrinsic support by the interpreter, C1, C2, and graal, >> which are always used. 
With this change we can later remove all the >> per-platform interpreter intrinsic implementations, and might also remove the >> C1 intrinsic implementation. >> >> Testing: >> (1) mach5 tier1-6 normal (so using all the existing intrinsics). >> (2) mach5 tier1-6 with interpreter and C1 Reference::get intrinsics disabled. > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 13 additional commits since the last revision: > > - Merge branch 'master' into native-reference-get > - make private native Reference.get0 the intrinsic > - Merge branch 'master' into native-reference-get > - Merge branch 'master' into native-reference-get > - use new waitForRefProc, some tidying > - Merge branch 'master' into native-reference-get > - remove timeout by using waitForReferenceProcessing > - make ill-timed gc in non-concurrent case less likely > - fix test package use > - add package decl to test > - ... and 3 more: https://git.openjdk.org/jdk/compare/8d780d2e...98056a8b src/hotspot/share/classfile/vmSymbols.hpp line 425: > 423: template(pc_name, "pc") \ > 424: template(cs_name, "cs") \ > 425: template(get_name, "get") \ A bit surprised that this is not used elsewhere - but as long as this passes compilation I guess this is fine. 
src/hotspot/share/interpreter/abstractInterpreter.hpp line 86: > 84: java_lang_math_fmaF, // implementation of java.lang.Math.fma (x, y, z) > 85: java_lang_math_fmaD, // implementation of java.lang.Math.fma (x, y, z) > 86: java_lang_ref_reference_get0, // implementation of java.lang.ref.Reference.get() Suggestion: java_lang_ref_reference_get0, // implementation of java.lang.ref.Reference.get() src/hotspot/share/opto/compile.cpp line 786: > 784: initial_gvn()->set_type_bottom(s); > 785: verify_start(s); > 786: if (method()->intrinsic_id() == vmIntrinsics::_Reference_get) { Should we remove this now or as part of the redundant intrinsic cleanup for interpreter and c1? I see the interpreter is now kept intact. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24315#discussion_r2127108041 PR Review Comment: https://git.openjdk.org/jdk/pull/24315#discussion_r2127114784 PR Review Comment: https://git.openjdk.org/jdk/pull/24315#discussion_r2127117794 From jbechberger at openjdk.org Wed Jun 4 17:49:27 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 17:49:27 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v57] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with three additional commits since the last revision: - Renaming of VM ops - Update src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp Co-authored-by: Aleksey Shipilëv - Update src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp Co-authored-by: Aleksey Shipilëv ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/53d3ed4a..f3ef7908 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=56 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=55-56 Stats: 16 lines in 2 files changed: 0 ins; 2 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From jbechberger at openjdk.org Wed Jun 4 17:49:27 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 17:49:27 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v55] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 17:24:07 GMT, Aleksey Shipilev wrote: > Please file the follow-ups to figure out the memory ordering story in `JfrCPUTimeTraceQueue`. I created a preliminary issue at https://bugs.openjdk.org/browse/JDK-8358616 > src/hotspot/share/runtime/vmOperation.hpp line 119: > >> 117: template(RendezvousGCThreads) \ >> 118: template(CPUTimeSamplerThreadInitializer) \ >> 119: template(CPUTimeSamplerThreadTerminator) \ > > I think these better be prefixed with `JFR`. E.g.: `JFRInitializeCPUTimeSampler` / `JFRTerminateCPUTimeSampler`?
Good point ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2940863550 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127113247 From mgronlun at openjdk.org Wed Jun 4 17:49:28 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 4 Jun 2025 17:49:28 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v56] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 17:22:37 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Readd lock src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 300: > 298: if (!Atomic::cmpxchg(&_disenrolled, false, true)) { > 299: log_trace(jfr)("Disenrolling CPU thread sampler"); > 300: if (Atomic::fetch_then_and(&_signal_handler_installed, false)) { fetch_then_and with false? Must be simpler way to express this? Like Atomic::load(&_signal_handler_installed)? src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 608: > 606: void JfrCPUSamplerThread::stop_signal_handlers() { > 607: // set the stop signal bit > 608: Atomic::or_then_fetch(&_active_signal_handlers, STOP_SIGNAL_BIT, memory_order_acq_rel); Whatever was fetched is gone with the wind... 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127117357 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127119596 From jbechberger at openjdk.org Wed Jun 4 17:49:28 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 17:49:28 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v56] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 17:39:38 GMT, Markus Grönlund wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Readd lock > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 300: > >> 298: if (!Atomic::cmpxchg(&_disenrolled, false, true)) { >> 299: log_trace(jfr)("Disenrolling CPU thread sampler"); >> 300: if (Atomic::fetch_then_and(&_signal_handler_installed, false)) { > > fetch_then_and with false? Must be simpler way to express this? Like Atomic::load(&_signal_handler_installed)? I also want to set it. Wanted to do a simple ::xchg, but this doesn't exist for booleans. > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 608: > >> 606: void JfrCPUSamplerThread::stop_signal_handlers() { >> 607: // set the stop signal bit >> 608: Atomic::or_then_fetch(&_active_signal_handlers, STOP_SIGNAL_BIT, memory_order_acq_rel); > > Whatever was fetched is gone with the wind... 
I didn't find an Atomic::or method ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127120548 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127122509 From mgronlun at openjdk.org Wed Jun 4 17:49:28 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 4 Jun 2025 17:49:28 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v56] In-Reply-To: References: Message-ID: <-1_gEsqoLf4kh7TUZbIXh0n4PwcU7_IuaFMedlvMaAM=.27f3a6c4-79f6-4535-912a-9d2e69587a41@github.com> On Wed, 4 Jun 2025 17:41:46 GMT, Johannes Bechberger wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 300: >> >>> 298: if (!Atomic::cmpxchg(&_disenrolled, false, true)) { >>> 299: log_trace(jfr)("Disenrolling CPU thread sampler"); >>> 300: if (Atomic::fetch_then_and(&_signal_handler_installed, false)) { >> >> fetch_then_and with false? Must be simpler way to express this? Like Atomic::load(&_signal_handler_installed)? > > I also want to set it. Wanted to do a simple ::xchg, but this doesn't exist for booleans. 1 & 0 -> 0, are you setting _signal_handler_installed to false? Why? 
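To make the `1 & 0 -> 0` remark concrete, here is a minimal sketch in standard C++ (`std::atomic`), not HotSpot's `Atomic` class. The function names are invented for illustration, and an `unsigned` flag is assumed because `std::atomic<bool>` has no `fetch_and`. It only shows that AND-ing with zero always stores zero and returns the previous value, so it behaves like an exchange with zero, while a plain load leaves the flag untouched.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for the flag operations discussed above.
// AND with 0 yields 0 for any input, so this clears the flag and
// returns whatever value it held before.
unsigned clear_with_fetch_and(std::atomic<unsigned>& flag) {
  return flag.fetch_and(0u);
}

// The same effect, stated directly: store 0, return the old value.
unsigned clear_with_exchange(std::atomic<unsigned>& flag) {
  return flag.exchange(0u);
}

// A plain load only observes the flag and does not modify it.
unsigned read_only(const std::atomic<unsigned>& flag) {
  return flag.load();
}

// Small self-check that exercises the three variants.
void demo() {
  std::atomic<unsigned> a{1u}, b{1u}, c{1u};
  assert(clear_with_fetch_and(a) == 1u && a.load() == 0u);
  assert(clear_with_exchange(b) == 1u && b.load() == 0u);
  assert(read_only(c) == 1u && c.load() == 1u);
}
```

If only the current value is needed, as the later reply in this thread concludes, the load variant is the one that expresses the intent.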
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127125344 From iklam at openjdk.org Wed Jun 4 17:51:55 2025 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 4 Jun 2025 17:51:55 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Wed, 4 Jun 2025 12:13:45 GMT, Matthias Baesken wrote: > The test AOTCodeCompressedOopsTest.java has the memory error mentioned above fixed now with recent changes , but shows another issue > > ``` > runtime/cds/appcds/aotCode/AOTCodeCompressedOopsTest.java > --------------------------------------------------------------- > java.lang.RuntimeException: Pattern "narrow_oop_base = 0x(\\d+), narrow_oop_shift = (\\d)" not found in the output > at AOTCodeCompressedOopsTest$Tester.checkExecution(AOTCodeCompressedOopsTest.java:184) > at jdk.test.lib.cds.CDSAppTester.executeAndCheck(CDSAppTester.java:221) > at jdk.test.lib.cds.CDSAppTester.productionRun(CDSAppTester.java:427) > at jdk.test.lib.cds.CDSAppTester.productionRun(CDSAppTester.java:392) > at AOTCodeCompressedOopsTest.main(AOTCodeCompressedOopsTest.java:58) > at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) > at java.base/java.lang.reflect.Method.invoke(Method.java:565) > at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:335) > at java.base/java.lang.Thread.run(Thread.java:1474) > ``` > > Maybe we should ask an AOT expert about this, not sure what that means. Please file a bug against this test and we will look into it. I haven't used asan before. Is it as simple as adding `--enable-asan` when running `configure`? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2940883167 From jbechberger at openjdk.org Wed Jun 4 17:54:50 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 17:54:50 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v58] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Improve disenroll ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/f3ef7908..db093a28 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=57 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=56-57 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From jbechberger at openjdk.org Wed Jun 4 17:54:51 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 17:54:51 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v56] In-Reply-To: <-1_gEsqoLf4kh7TUZbIXh0n4PwcU7_IuaFMedlvMaAM=.27f3a6c4-79f6-4535-912a-9d2e69587a41@github.com> References: <-1_gEsqoLf4kh7TUZbIXh0n4PwcU7_IuaFMedlvMaAM=.27f3a6c4-79f6-4535-912a-9d2e69587a41@github.com> Message-ID: On Wed, 4 Jun 2025 17:44:45 GMT, Markus Gr?nlund wrote: >> I also want to set it. 
Wanted to do a simple ::xchg, but this doesn't exist for booleans. > > 1 & 0 -> 0, are you setting _signal_handler_installed to false? Why? You're right, it's not required. I removed it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127135816 From vlivanov at openjdk.org Wed Jun 4 18:08:57 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Wed, 4 Jun 2025 18:08:57 GMT Subject: RFR: 8352565: Add native method implementation of Reference.get() [v8] In-Reply-To: References: Message-ID: <3W47upHu1No4yeE-HHeJobnmONh-za_namTyrrIWWao=.32752508-95c8-4c1e-b261-3e435cdad79f@github.com> On Wed, 4 Jun 2025 07:12:42 GMT, Kim Barrett wrote: >> Please review this change which adds a native method providing the >> implementation of Reference::get. Referece::get is an intrinsic candidate, so >> this native method implementation is only used when the intrinsic is not. >> >> Currently there is intrinsic support by the interpreter, C1, C2, and graal, >> which are always used. With this change we can later remove all the >> per-platform interpreter intrinsic implementations, and might also remove the >> C1 intrinsic implementation. >> >> Testing: >> (1) mach5 tier1-6 normal (so using all the existing intrinsics). >> (2) mach5 tier1-6 with interpreter and C1 Reference::get intrinsics disabled. > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 13 additional commits since the last revision: > > - Merge branch 'master' into native-reference-get > - make private native Reference.get0 the intrinsic > - Merge branch 'master' into native-reference-get > - Merge branch 'master' into native-reference-get > - use new waitForRefProc, some tidying > - Merge branch 'master' into native-reference-get > - remove timeout by using waitForReferenceProcessing > - make ill-timed gc in non-concurrent case less likely > - fix test package use > - add package decl to test > - ... and 3 more: https://git.openjdk.org/jdk/compare/3c35a7ee...98056a8b Looks good. ------------- Marked as reviewed by vlivanov (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24315#pullrequestreview-2897649218 From mgronlun at openjdk.org Wed Jun 4 18:41:14 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 4 Jun 2025 18:41:14 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: References: <3ALqkSc9a0HKOJrA6CW61v725SxX8FcLmasC8Wm4y24=.9d40a91c-a77a-442b-926a-e5785032c415@github.com> Message-ID: On Wed, 4 Jun 2025 11:25:44 GMT, Andrei Pangin wrote: >> Is this something for a later PR? > > I'm OK with fixing this separately. Please ensure that you file a follow-up issue on this matter. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127213771 From jbechberger at openjdk.org Wed Jun 4 18:57:07 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 18:57:07 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: References: <3ALqkSc9a0HKOJrA6CW61v725SxX8FcLmasC8Wm4y24=.9d40a91c-a77a-442b-926a-e5785032c415@github.com> Message-ID: <0PjEBxgqHesjNBoyKSYCW6wGcS2GEqujEz79IeWl_LA=.04c24d78-dfb4-4917-bc50-59420548b854@github.com> On Wed, 4 Jun 2025 18:38:42 GMT, Markus Gr?nlund wrote: >> I'm OK with fixing this separately. 
> > Please ensure that you file a follow-up issue on this matter. Here it is https://bugs.openjdk.org/browse/JDK-8358619 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127234787 From pchilanomate at openjdk.org Wed Jun 4 18:57:08 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 4 Jun 2025 18:57:08 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: <-YJusSw4Vn4Dx_is7mRjOT6clvj3Uh5F5tcLsPIwUAk=.fda1546c-22b0-4a57-900e-2e752841b617@github.com> References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> <-YJusSw4Vn4Dx_is7mRjOT6clvj3Uh5F5tcLsPIwUAk=.fda1546c-22b0-4a57-900e-2e752841b617@github.com> Message-ID: On Wed, 4 Jun 2025 17:17:31 GMT, Johannes Bechberger wrote: >> The Threads_lock is required to prevent JFR from sampling through an ongoing safepoint, touching oops, which is not supported by most GCs as well as JFR evolving its global epoch (happens during safepoint) while both threads are outside the safepoint protocol. Can be optimized (later), see for example: https://github.com/openjdk/jdk/pull/25602 > > I readded the lock. Thanks, a comment about that would be nice. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127239230 From pchilanomate at openjdk.org Wed Jun 4 18:57:09 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 4 Jun 2025 18:57:09 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v50] In-Reply-To: References: <4_cKaFGWs_Wf0mcRY-lbaEn5i_DJfUoqpaNPhF8E_pw=.b82280fe-ed5b-42f0-85af-6dd15d297ba0@github.com> Message-ID: On Wed, 4 Jun 2025 14:39:47 GMT, Johannes Bechberger wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 430: >> >>> 428: void JfrCPUTimeThreadSampling::create_sampler(double rate, bool auto_adapt) { >>> 429: assert(_sampler == nullptr, "invariant"); >>> 430: _sampler = new JfrCPUSamplerThread(rate, auto_adapt); >> >> If we start a recording on an already running process we have a race here where a new thread can create and set its timer before we call init_timers() where the signal handler is installed. In that case the program will terminate with message "Profiling timer expired" (default action for SIGPROF). It can be easily reproduced by adding a delay here and starting a recording on a simple test that just creates new threads. We need to add some extra check in create_timer_for_thread() or install the signal handler earlier. > > I created a `has_timer` flag that is checked by `create_timer_for_thread()` before creating timers and set by `init_timers`. Is this what you envisioned? Thanks, I verified that fixes the issue. 
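The ordering problem described above can be sketched as a small model. This is a hypothetical illustration in standard C++, not the code in the PR: the class `SamplerModel`, its members, and the decision to simply skip timer creation are all assumptions. The point it shows is that a per-thread timer must not be armed until the SIGPROF handler is known to be installed, which a flag published with release/acquire ordering can guarantee.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical model of the "check a flag before creating timers" fix.
class SamplerModel {
  std::atomic<bool> _handler_installed{false};  // assumed flag name
  std::atomic<int> _timers_created{0};

 public:
  // Called when a new thread starts. Returns true if a timer was armed.
  bool create_timer_for_thread() {
    if (!_handler_installed.load(std::memory_order_acquire)) {
      // Too early: a timer firing now would hit the default SIGPROF
      // action and terminate the process.
      return false;
    }
    _timers_created.fetch_add(1, std::memory_order_relaxed);
    return true;
  }

  // Called once when sampling is enabled.
  void init_timers() {
    // A real implementation would install the signal handler here first,
    // then publish the flag, then create timers for existing threads.
    _handler_installed.store(true, std::memory_order_release);
  }

  int timers_created() const { return _timers_created.load(); }
};
```

In this model a thread that starts before `init_timers()` gets no timer; a real implementation would have to pick such threads up later, for example when it walks the existing threads during initialization.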
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127238263 From mgronlun at openjdk.org Wed Jun 4 19:08:40 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 4 Jun 2025 19:08:40 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v58] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 17:54:50 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Improve disenroll src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 396: > 394: } > 395: > 396: class SampleMonitor : public StackObj { This is a merge error. SampleMonitor is no longer used and can be deleted. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127252688 From jbechberger at openjdk.org Wed Jun 4 19:08:40 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 19:08:40 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v59] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... 
different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: - Fixed merge error - Add comment regarding Threads_lock ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/db093a28..b9def278 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=58 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=57-58 Stats: 19 lines in 2 files changed: 1 ins; 18 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From apangin at openjdk.org Wed Jun 4 19:15:09 2025 From: apangin at openjdk.org (Andrei Pangin) Date: Wed, 4 Jun 2025 19:15:09 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v58] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 17:54:50 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... 
different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Improve disenroll src/hotspot/os/posix/signals_posix.cpp line 1511: > 1509: struct sigaction oact; > 1510: if (sigaction(sig, (struct sigaction*)nullptr, &oact) == -1) { > 1511: return nullptr; // signal not installed A comment is misleading: sigaction does not fail if a handler for the signal is not installed (i.e. the handler is SIG_IGN). src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 656: > 654: if ((prev_handler != SIG_DFL && prev_handler != SIG_IGN && prev_handler != (void*)::handle_timer_signal) || > 655: PosixSignals::install_generic_signal_handler(SIG, (void*)::handle_timer_signal) == (void*)-1) { > 656: log_error(jfr)("CPUTimeSample events will not be recorded: %p", prev_handler); A message with some random hex address may look cryptic. Maybe make it a bit more user-friendly? E.g. Conflicting SIGPROF handler found: %p. 
CPUTimeSample events will not be recorded src/hotspot/share/jfr/periodic/sampling/jfrSampleRequest.cpp line 333: > 331: } > 332: } > 333: request._sample_ticks = JfrTicks::now(); For accurate correlation with other events, timestamp of a sample should be taken as early as possible, preferably in the beginning of `JfrCPUSamplerThread::handle_timer_signal` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127222795 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127266822 PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127236706 From mgronlun at openjdk.org Wed Jun 4 19:15:10 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 4 Jun 2025 19:15:10 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v59] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 19:08:40 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision: > > - Fixed merge error > - Add comment regarding Threads_lock src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 633: > 631: > 632: void JfrCPUSamplerThread::allow_signal_handlers() { > 633: Atomic::release_store(&_active_signal_handlers, (u4)0); "allow_signal_handlers" is not the best name for this routine - it suggests that by setting _active_signal_handlers = 0, something is enabled - like an inverted Semaphore. 
But "signal handlers" are allowed for all W of this value, so the 0 has no bearing on enablement. Just call it "initialize_active_signal_handler_counter" or whatever. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127267710 From mgronlun at openjdk.org Wed Jun 4 19:18:08 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Wed, 4 Jun 2025 19:18:08 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v58] In-Reply-To: References: Message-ID: <3MKLMJ4-eLHkm1qroOSGU2Kvi1pIYkyeZ9unRVbhHAk=.afeb3c4a-4501-4f70-82b6-f293948bc229@github.com> On Wed, 4 Jun 2025 19:11:16 GMT, Andrei Pangin wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Improve disenroll > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 656: > >> 654: if ((prev_handler != SIG_DFL && prev_handler != SIG_IGN && prev_handler != (void*)::handle_timer_signal) || >> 655: PosixSignals::install_generic_signal_handler(SIG, (void*)::handle_timer_signal) == (void*)-1) { >> 656: log_error(jfr)("CPUTimeSample events will not be recorded: %p", prev_handler); > > A message with some random hex address may look cryptic. > Maybe make it a bit more user-friendly? E.g. > > Conflicting SIGPROF handler found: %p. CPUTimeSample events will not be recorded Thanks Andrei. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127272864 From jbechberger at openjdk.org Wed Jun 4 19:24:11 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 19:24:11 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v58] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 18:44:23 GMT, Andrei Pangin wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Improve disenroll > > src/hotspot/os/posix/signals_posix.cpp line 1511: > >> 1509: struct sigaction oact; >> 1510: if (sigaction(sig, (struct sigaction*)nullptr, &oact) == -1) { >> 1511: return nullptr; // signal not installed > > A comment is misleading: sigaction does not fail if a handler for the signal is not installed (i.e. the handler is SIG_IGN). Good catch. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127281710 From jbechberger at openjdk.org Wed Jun 4 19:33:08 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 19:33:08 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v58] In-Reply-To: <3MKLMJ4-eLHkm1qroOSGU2Kvi1pIYkyeZ9unRVbhHAk=.afeb3c4a-4501-4f70-82b6-f293948bc229@github.com> References: <3MKLMJ4-eLHkm1qroOSGU2Kvi1pIYkyeZ9unRVbhHAk=.afeb3c4a-4501-4f70-82b6-f293948bc229@github.com> Message-ID: On Wed, 4 Jun 2025 19:15:19 GMT, Markus Grönlund wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 656: >> >>> 654: if ((prev_handler != SIG_DFL && prev_handler != SIG_IGN && prev_handler != (void*)::handle_timer_signal) || >>> 655: PosixSignals::install_generic_signal_handler(SIG, (void*)::handle_timer_signal) == (void*)-1) { >>> 656: log_error(jfr)("CPUTimeSample events will not be recorded: %p", prev_handler); >> >> A message with some random hex address may look cryptic. >> Maybe make it a bit more user-friendly? E.g. 
>> >> Conflicting SIGPROF handler found: %p. CPUTimeSample events will not be recorded > > Thanks Andrei. Fixed it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127292777 From jbechberger at openjdk.org Wed Jun 4 19:33:09 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 19:33:09 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v58] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 18:53:35 GMT, Andrei Pangin wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Improve disenroll > > src/hotspot/share/jfr/periodic/sampling/jfrSampleRequest.cpp line 333: > >> 331: } >> 332: } >> 333: request._sample_ticks = JfrTicks::now(); > > For accurate correlation with other events, timestamp of a sample should be taken as early as possible, preferably in the beginning of `JfrCPUSamplerThread::handle_timer_signal` I moved it up: void JfrCPUSamplerThread::handle_timer_signal(siginfo_t* info, void* context) { JfrTicks now = JfrTicks::now(); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127294191 From jbechberger at openjdk.org Wed Jun 4 19:38:57 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 19:38:57 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v60] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Improve ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/b9def278..ab2ac459 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=59 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=58-59 Stats: 22 lines in 4 files changed: 5 ins; 1 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From mgronlun at openjdk.org Wed Jun 4 19:38:58 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Wed, 4 Jun 2025 19:38:58 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v42] In-Reply-To: References: Message-ID: <0T2FCxivdxhNus6A9_fGcXDNFDLachuVRGgJbixpU_0=.d51a569d-6f68-4dfa-a0cb-ca71c85f9286@github.com> On Tue, 3 Jun 2025 21:56:41 GMT, Markus Grönlund wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename autoadapt > > src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 360: > >> 358: tl->set_do_async_processing_of_cpu_time_jfr_requests(false); >> 359: if (lock) { >> 360: tl->acquire_cpu_time_jfr_dequeue_lock(); > > This is your synchronization point on return from native code, which is effectively a spinlock. This can cause problems when a large number of threads are being processed by the "do_async_processing" request call. > > We should fix this as a bug after integration (use a proper Monitor as a synchronization point). 
Filed: https://bugs.openjdk.org/browse/JDK-8358621 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127300632 From mgronlun at openjdk.org Wed Jun 4 19:50:08 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Wed, 4 Jun 2025 19:50:08 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v60] In-Reply-To: References: Message-ID: <399lO47PUBfDUPjh-Ve1NztHstmkEF24xiJVQe3FRZI=.693af08b-3df9-4ec5-a589-a1c269bb7837@github.com> On Wed, 4 Jun 2025 19:38:57 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Improve src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 634: > 632: > 633: void JfrCPUSamplerThread::allow_signal_handlers() { > 634: Atomic::release_store(&_initialize_active_signal_handler_counter, (u4)0); I think we might be getting a bit tired. I meant the function name, not the variable name, which now looks completely bonkers :-) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127323120 From jbechberger at openjdk.org Wed Jun 4 19:55:55 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 19:55:55 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v61] In-Reply-To: References: Message-ID: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). 
> > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Renaming ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/ab2ac459..b1689bdf Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=60 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=59-60 Stats: 11 lines in 1 file changed: 0 ins; 0 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302 From jbechberger at openjdk.org Wed Jun 4 19:55:56 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 19:55:56 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v60] In-Reply-To: <399lO47PUBfDUPjh-Ve1NztHstmkEF24xiJVQe3FRZI=.693af08b-3df9-4ec5-a589-a1c269bb7837@github.com> References: <399lO47PUBfDUPjh-Ve1NztHstmkEF24xiJVQe3FRZI=.693af08b-3df9-4ec5-a589-a1c269bb7837@github.com> Message-ID: On Wed, 4 Jun 2025 19:47:17 GMT, Markus Grönlund wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Improve > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 634: > >> 632: >> 633: void JfrCPUSamplerThread::allow_signal_handlers() { >> 634: Atomic::release_store(&_initialize_active_signal_handler_counter, (u4)0); > > I think we might be getting a bit tired. 
I meant the function name, not the variable name, which now looks completely bonkers :-) Well, I just misunderstood you :D ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127329186 From cjplummer at openjdk.org Wed Jun 4 20:12:54 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 4 Jun 2025 20:12:54 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Mon, 2 Jun 2025 08:07:38 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > TestBreakSignalThreadDump has issues with asan serviceability/sa/ClhsdbCDSCore.java explicitly says it did not create a core file: `# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again` Is there a similar message for serviceability/sa/ClhsdbFindPC.java? What about all the other SA core file tests? 
Here's a list of all the SA core file tests: serviceability/sa/ClhsdbCDSCore.java serviceability/sa/ClhsdbFindPC.java#xcomp-core serviceability/sa/ClhsdbFindPC.java#no-xcomp-core serviceability/sa/ClhsdbPmap.java#core serviceability/sa/ClhsdbPstack.java#core serviceability/sa/TestJmapCore.java serviceability/sa/TestJmapCoreMetaspace.java ------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2941326294 From mdoerr at openjdk.org Wed Jun 4 20:48:07 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 4 Jun 2025 20:48:07 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v61] In-Reply-To: References: Message-ID: <8jQ9fsDS_OZA6WYBEy5Rf9WPfZaZJQB1zWB53iHt5i8=.6ebbb1cc-0c6f-4699-88cc-f93e8d9b5a82@github.com> On Wed, 4 Jun 2025 19:55:55 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Renaming Marked as reviewed by mdoerr (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25302#pullrequestreview-2898056841 From apangin at openjdk.org Wed Jun 4 21:22:07 2025 From: apangin at openjdk.org (Andrei Pangin) Date: Wed, 4 Jun 2025 21:22:07 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v61] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 19:55:55 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). 
>> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Renaming Marked as reviewed by apangin (Author). ------------- PR Review: https://git.openjdk.org/jdk/pull/25302#pullrequestreview-2898137302 From iveresov at openjdk.org Wed Jun 4 21:26:54 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 4 Jun 2025 21:26:54 GMT Subject: RFR: 8358339: Handle MethodCounters::_method backlinks after JDK-8355003 In-Reply-To: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> References: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> Message-ID: On Mon, 2 Jun 2025 18:41:42 GMT, Aleksey Shipilev wrote: > Found this when reading mainline-vs-premain webrev. [JDK-8355003](https://bugs.openjdk.org/browse/JDK-8355003) introduced a backlink to `Method*` in `MethodCounters`. I believe we need to handle that backlink at least in `CodeBuffer::finalize_oop_references()`. premain does this, while mainline does not. Also, amusingly, we have `MethodCounters::is_methodCounters`, but not the super-class `Metadata::is_methodCounters`. > > I pulled in the hunks that use `is_methodCounters()` and `MethodCounters::method()` from premain into this PR. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `runtime/cds` > - [x] Linux x86_64 server fastdebug, `tier1` > - [x] Linux x86_64 server fastdebug, `all` Sure why not, we'll eventually need it. ------------- Marked as reviewed by iveresov (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25599#pullrequestreview-2898147796 From shade at openjdk.org Wed Jun 4 21:34:54 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Jun 2025 21:34:54 GMT Subject: RFR: 8358339: Handle MethodCounters::_method backlinks after JDK-8355003 In-Reply-To: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> References: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> Message-ID: <_ZKbtodgRybiZ7NdyASnaBvSO9pZSUoQIhb2iFBjZ7M=.ac6f9bc1-c3e7-49ab-94d5-07f3e2093f3a@github.com> On Mon, 2 Jun 2025 18:41:42 GMT, Aleksey Shipilev wrote: > Found this when reading mainline-vs-premain webrev. [JDK-8355003](https://bugs.openjdk.org/browse/JDK-8355003) introduced a backlink to `Method*` in `MethodCounters`. I believe we need to handle that backlink at least in `CodeBuffer::finalize_oop_references()`. premain does this, while mainline does not. Also, amusingly, we have `MethodCounters::is_methodCounters`, but not the super-class `Metadata::is_methodCounters`. > > I pulled in the hunks that use `is_methodCounters()` and `MethodCounters::method()` from premain into this PR. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `runtime/cds` > - [x] Linux x86_64 server fastdebug, `tier1` > - [x] Linux x86_64 server fastdebug, `all` Cool, here goes. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25599#issuecomment-2941555963 From shade at openjdk.org Wed Jun 4 21:34:55 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Jun 2025 21:34:55 GMT Subject: Integrated: 8358339: Handle MethodCounters::_method backlinks after JDK-8355003 In-Reply-To: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> References: <0FmJVTYdAq7vmsOi4pi9NRKHm5MfmIrotPucldzsZj4=.b1335bca-f439-4c18-aa1c-6c69548d095d@github.com> Message-ID: On Mon, 2 Jun 2025 18:41:42 GMT, Aleksey Shipilev wrote: > Found this when reading mainline-vs-premain webrev. [JDK-8355003](https://bugs.openjdk.org/browse/JDK-8355003) introduced a backlink to `Method*` in `MethodCounters`. I believe we need to handle that backlink at least in `CodeBuffer::finalize_oop_references()`. premain does this, while mainline does not. Also, amusingly, we have `MethodCounters::is_methodCounters`, but not the super-class `Metadata::is_methodCounters`. > > I pulled in the hunks that use `is_methodCounters()` and `MethodCounters::method()` from premain into this PR. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `runtime/cds` > - [x] Linux x86_64 server fastdebug, `tier1` > - [x] Linux x86_64 server fastdebug, `all` This pull request has now been integrated. 
Changeset: 3cf3e4bb Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/3cf3e4bbec26a84d77cb7a3125a60ba1e1e4ee97 Stats: 10 lines in 3 files changed: 10 ins; 0 del; 0 mod 8358339: Handle MethodCounters::_method backlinks after JDK-8355003 Reviewed-by: coleenp, kvn, iveresov ------------- PR: https://git.openjdk.org/jdk/pull/25599 From dholmes at openjdk.org Wed Jun 4 21:46:53 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 4 Jun 2025 21:46:53 GMT Subject: RFR: 8357962: JFR Cooperative Sampling reveals inconsistent interpreter frames as part of JVMTI PopFrame [v2] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 11:06:31 GMT, Markus Grönlund wrote: >> Greetings, >> >> Please see the JIRA issue for a detailed description. >> >> Fix only applies to platforms that issue a save_bcp() as part of InterpreterMacroAssembler::unlock_object(). >> >> Testing: jdk_jfr, JVMTI PopFrame tests >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: > > more precise comment Thanks for clarifying the problem for me in JBS. Fix looks good. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25571#pullrequestreview-2898185925 From mgronlun at openjdk.org Wed Jun 4 21:47:08 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Wed, 4 Jun 2025 21:47:08 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v61] In-Reply-To: References: Message-ID: <-H5MPWW5BYrLTBLZIGu8PRcWUw9LxZoLpapnZaDy5ts=.295d136b-d764-4a73-b334-4fb51295a580@github.com> On Wed, 4 Jun 2025 19:55:55 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ...
different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Renaming I am waiting for the basic CI pre-tests to come back green before I approve for integration. No sweat, the cutoff is in CST :) and branching usually happens later in the day. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2941623120 From mgronlun at openjdk.org Wed Jun 4 22:02:19 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 4 Jun 2025 22:02:19 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v61] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 19:55:55 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Renaming Ship it! ------------- Marked as reviewed by mgronlun (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25302#pullrequestreview-2898222436 From pchilanomate at openjdk.org Wed Jun 4 22:02:19 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 4 Jun 2025 22:02:19 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v61] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 19:55:55 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Renaming Marked as reviewed by pchilanomate (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25302#pullrequestreview-2898222664 From jbechberger at openjdk.org Wed Jun 4 22:02:19 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 22:02:19 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v61] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 19:55:55 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... 
different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Renaming I know, but I want to get over it today(ish). ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2941671138 From mgronlun at openjdk.org Wed Jun 4 22:02:19 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 4 Jun 2025 22:02:19 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v61] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 21:56:04 GMT, Johannes Bechberger wrote: > I know, but I want to get over it today(ish). Go for it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2941686873 From mgronlun at openjdk.org Wed Jun 4 22:13:14 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 4 Jun 2025 22:13:14 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v61] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 22:06:51 GMT, Johannes Bechberger wrote: > So I'm just waiting for Skara? You can issue the integrate command. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2941709310 From mgronlun at openjdk.org Wed Jun 4 22:13:14 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 4 Jun 2025 22:13:14 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v61] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 19:55:55 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... 
different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Renaming Well done - now take some well-deserved rest. I'll monitor the first stages of integration. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2941716415 From jbechberger at openjdk.org Wed Jun 4 22:13:14 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 22:13:14 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v61] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 19:55:55 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Renaming So I'm just waiting for Skara? Thanks, good night. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2941707891 PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2941716783 From jbechberger at openjdk.org Wed Jun 4 22:13:17 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 4 Jun 2025 22:13:17 GMT Subject: Integrated: 8342818: Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: On Mon, 19 May 2025 13:02:20 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). 
> > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes This pull request has now been integrated. Changeset: 5b27e9c2 Author: Johannes Bechberger URL: https://git.openjdk.org/jdk/commit/5b27e9c2df8b386b38b0553d941469cd8aa65c28 Stats: 2306 lines in 41 files changed: 2166 ins; 128 del; 12 mod 8342818: Implement JEP 509: JFR CPU-Time Profiling Reviewed-by: mgronlun, mdoerr, pchilanomate, apangin, shade ------------- PR: https://git.openjdk.org/jdk/pull/25302 From mgronlun at openjdk.org Wed Jun 4 22:47:15 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 4 Jun 2025 22:47:15 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v61] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 19:55:55 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Renaming Hope you have not got to sleep yet: jdk/jfr/jcmd/TestJcmdStartStopDefault.java fails on Linux. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2941820906 From mgronlun at openjdk.org Wed Jun 4 23:47:28 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Wed, 4 Jun 2025 23:47:28 GMT Subject: RFR: 8358628: [BACKOUT] 8342818: Implement JEP 509: JFR CPU-Time Profiling Message-ID: This reverts commit 5b27e9c2df8b386b38b0553d941469cd8aa65c28. There are too many strange issues appearing, memory allocation corruptions, and also some strange activation of the feature that is not yet understood. We should back it out so as not to block other integrations while we troubleshoot. Thanks Markus ------------- Commit messages: - Revert "8342818: Implement JEP 509: JFR CPU-Time Profiling" Changes: https://git.openjdk.org/jdk/pull/25649/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25649&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358628 Stats: 2306 lines in 41 files changed: 128 ins; 2166 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/25649.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25649/head:pull/25649 PR: https://git.openjdk.org/jdk/pull/25649 From pchilanomate at openjdk.org Wed Jun 4 23:52:51 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 4 Jun 2025 23:52:51 GMT Subject: RFR: 8358628: [BACKOUT] 8342818: Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 23:38:07 GMT, Markus Grönlund wrote: > This reverts commit 5b27e9c2df8b386b38b0553d941469cd8aa65c28. > > There are too many strange issues appearing, memory allocation corruptions, and also some strange activation of the feature that is not yet understood. > > We should back it out so as not to block other integrations while we troubleshoot. > > Thanks > Markus Marked as reviewed by pchilanomate (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/25649#pullrequestreview-2898371339 From dholmes at openjdk.org Wed Jun 4 23:57:56 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 4 Jun 2025 23:57:56 GMT Subject: RFR: 8358628: [BACKOUT] 8342818: Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: <62oqwYVSGqRO_SeQhJ7Ylhgo9Y_M6fbTDVLYSkb8cK4=.95eba6bd-99ae-41ad-86e3-51cfcf16b0cd@github.com> On Wed, 4 Jun 2025 23:38:07 GMT, Markus Grönlund wrote: > This reverts commit 5b27e9c2df8b386b38b0553d941469cd8aa65c28. > > There are too many strange issues appearing, memory allocation corruptions, and also some strange activation of the feature that is not yet understood. > > We should back it out so as not to block other integrations while we troubleshoot. > > Thanks > Markus Marked as reviewed by dholmes (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25649#pullrequestreview-2898378246 From mgronlun at openjdk.org Wed Jun 4 23:57:56 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Wed, 4 Jun 2025 23:57:56 GMT Subject: Integrated: 8358628: [BACKOUT] 8342818: Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 23:38:07 GMT, Markus Grönlund wrote: > This reverts commit 5b27e9c2df8b386b38b0553d941469cd8aa65c28. > > There are too many strange issues appearing, memory allocation corruptions, and also some strange activation of the feature that is not yet understood. > > We should back it out so as not to block other integrations while we troubleshoot. > > Thanks > Markus This pull request has now been integrated.
Changeset: 9186cc73 Author: Markus Grönlund URL: https://git.openjdk.org/jdk/commit/9186cc7310c0cca2fca776031280f08d84e43b74 Stats: 2306 lines in 41 files changed: 128 ins; 2166 del; 12 mod 8358628: [BACKOUT] 8342818: Implement JEP 509: JFR CPU-Time Profiling Reviewed-by: pchilanomate, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/25649 From kvn at openjdk.org Thu Jun 5 00:25:09 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 5 Jun 2025 00:25:09 GMT Subject: RFR: 8358632: [asan] reports heap-buffer-overflow in AOTCodeCache::copy_bytes Message-ID: AOTCodeCache::copy_bytes() tries to optimize by copying byte buffer using HeapWords (64bits) by rounding up the size which may access memory after buffer. We should use memcpy() instead. I also fixed output match pattern in test because oop base address is hexadecimal value. I fixed it in leyden/premain branch and forgot to port into mainline. During testing the fix I hit this issue. Testing tier1-3,xcomp,stress ------------- Commit messages: - 8358632: [asan] reports heap-buffer-overflow in AOTCodeCache::copy_bytes Changes: https://git.openjdk.org/jdk/pull/25651/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25651&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358632 Stats: 13 lines in 2 files changed: 0 ins; 9 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25651.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25651/head:pull/25651 PR: https://git.openjdk.org/jdk/pull/25651 From vlivanov at openjdk.org Thu Jun 5 00:33:50 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Thu, 5 Jun 2025 00:33:50 GMT Subject: RFR: 8358632: [asan] reports heap-buffer-overflow in AOTCodeCache::copy_bytes In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 00:18:50 GMT, Vladimir Kozlov wrote: > AOTCodeCache::copy_bytes() tries to optimize by copying byte buffer using HeapWords (64bits) by rounding up the size which may access memory after buffer.
> > We should use memcpy() instead. > > I also fixed output match pattern in test because oop base address is hexadecimal value. I fixed in in leyden/premain branch and forgot to port into mainline. During testing the fix I hit this issue. > > Testing tier1-3,xcomp,stress Looks good. ------------- Marked as reviewed by vlivanov (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25651#pullrequestreview-2898562570 From kvn at openjdk.org Thu Jun 5 01:07:51 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 5 Jun 2025 01:07:51 GMT Subject: RFR: 8358632: [asan] reports heap-buffer-overflow in AOTCodeCache::copy_bytes In-Reply-To: References: Message-ID: <6I8UUZbavaUJ4-YytgSd9WcOBB5dfI_d7pFn-mbgCc4=.c6b66979-42f3-4b77-9a59-970c7250b04a@github.com> On Thu, 5 Jun 2025 00:18:50 GMT, Vladimir Kozlov wrote: > AOTCodeCache::copy_bytes() tries to optimize by copying byte buffer using HeapWords (64bits) by rounding up the size which may access memory after buffer. > > We should use memcpy() instead. > > I also fixed output match pattern in test because oop base address is hexadecimal value. I fixed in in leyden/premain branch and forgot to port into mainline. During testing the fix I hit this issue. > > Testing tier1-3,xcomp,stress Thank you, Vladimir ------------- PR Comment: https://git.openjdk.org/jdk/pull/25651#issuecomment-2942364593 From dholmes at openjdk.org Thu Jun 5 01:12:14 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 5 Jun 2025 01:12:14 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v61] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 19:55:55 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... 
different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Renaming src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 88: > 86: return false; > 87: } > 88: } while (Atomic::cmpxchg(&_head, elementIndex, elementIndex + 1) != elementIndex); Why do we need atomic operations if we hold the enqueue lock? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127755502 From iveresov at openjdk.org Thu Jun 5 01:24:48 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Thu, 5 Jun 2025 01:24:48 GMT Subject: RFR: 8358632: [asan] reports heap-buffer-overflow in AOTCodeCache::copy_bytes In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 00:18:50 GMT, Vladimir Kozlov wrote: > AOTCodeCache::copy_bytes() tries to optimize by copying byte buffer using HeapWords (64bits) by rounding up the size which may access memory after buffer. > > We should use memcpy() instead. > > I also fixed output match pattern in test because oop base address is hexadecimal value. I fixed it in leyden/premain branch and forgot to port into mainline. During testing the fix I hit this issue. > > Testing tier1-3,xcomp,stress Marked as reviewed by iveresov (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/25651#pullrequestreview-2898617070 From kvn at openjdk.org Thu Jun 5 01:57:48 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 5 Jun 2025 01:57:48 GMT Subject: RFR: 8358632: [asan] reports heap-buffer-overflow in AOTCodeCache::copy_bytes In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 00:18:50 GMT, Vladimir Kozlov wrote: > AOTCodeCache::copy_bytes() tries to optimize by copying byte buffer using HeapWords (64bits) by rounding up the size which may access memory after buffer. > > We should use memcpy() instead. > > I also fixed output match pattern in test because oop base address is hexadecimal value. I fixed in in leyden/premain branch and forgot to port into mainline. During testing the fix I hit this issue. > > Testing tier1-3,xcomp,stress Thank you, Igor. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25651#issuecomment-2942487833 From syan at openjdk.org Thu Jun 5 02:54:49 2025 From: syan at openjdk.org (SendaoYan) Date: Thu, 5 Jun 2025 02:54:49 GMT Subject: RFR: 8358004: Delete applications/scimark/Scimark.java test In-Reply-To: <_o6ne7B1-A7nSOva7Zns8IMv7rw0Ve4xTwqiSajNzcA=.755977c7-4376-415c-9f76-5acae9f12b5a@github.com> References: <_o6ne7B1-A7nSOva7Zns8IMv7rw0Ve4xTwqiSajNzcA=.755977c7-4376-415c-9f76-5acae9f12b5a@github.com> Message-ID: On Tue, 20 May 2025 02:55:44 GMT, Leonid Mesnik wrote: > Test > scimark has a bug, described in the https://bugs.openjdk.org/browse/JDK-8315797 > that causes test failure. > > The Scimark is not maintained. The main goal of test was to provide example of Artifact-based test with 3rd party binary. There are a couple of other tests using Artifactory. So this test is completely useless now. > I am removing it just to avoid spending time for anyone who can run test and observe this failure. 
Does this PR remove the file `test/hotspot/jtreg/applications/scimark/Scimark.java` or remove the directory `test/hotspot/jtreg/applications/scimark` ------------- PR Comment: https://git.openjdk.org/jdk/pull/25316#issuecomment-2942579847 From dzhang at openjdk.org Thu Jun 5 03:12:47 2025 From: dzhang at openjdk.org (Dingli Zhang) Date: Thu, 5 Jun 2025 03:12:47 GMT Subject: RFR: 8358634: RISC-V: Fix several broken documentation web-links Message-ID: Hi all, Please take a look and review this PR, thanks! Several RISC-V related documentation web-links are broken. This PR updates the web-links. ------------- Commit messages: - 8358634: RISC-V: Fix several broken documentation web-links Changes: https://git.openjdk.org/jdk/pull/25652/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25652&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358634 Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25652.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25652/head:pull/25652 PR: https://git.openjdk.org/jdk/pull/25652 From syan at openjdk.org Thu Jun 5 03:17:54 2025 From: syan at openjdk.org (SendaoYan) Date: Thu, 5 Jun 2025 03:17:54 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Mon, 2 Jun 2025 08:07:38 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . 
> > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > TestBreakSignalThreadDump has issues with asan Changes requested by syan (Committer). src/hotspot/share/prims/whitebox.cpp line 1100: > 1098: #endif > 1099: > 1100: bool WhiteBox::is_asan_enabled() { Do we need `bool WhiteBox::is_lsan_enable()` to check whether LeakSanitizer ('--enable-lsan') is enabled? test/hotspot/jtreg/TEST.ROOT line 94: > 92: vm.compiler2.enabled \ > 93: vm.musl \ > 94: vm.asan \ Do we need `vm.lsan`? ------------- PR Review: https://git.openjdk.org/jdk/pull/25575#pullrequestreview-2898739689 PR Review Comment: https://git.openjdk.org/jdk/pull/25575#discussion_r2127856663 PR Review Comment: https://git.openjdk.org/jdk/pull/25575#discussion_r2127857129 From fyang at openjdk.org Thu Jun 5 03:26:52 2025 From: fyang at openjdk.org (Fei Yang) Date: Thu, 5 Jun 2025 03:26:52 GMT Subject: RFR: 8358634: RISC-V: Fix several broken documentation web-links In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 03:08:14 GMT, Dingli Zhang wrote: > Hi all, > Please take a look and review this PR, thanks! > > Several RISC-V related documentation web-links are broken. > This PR updates the web-links. Looks good and trivial. Thanks! ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25652#pullrequestreview-2898751102 From kvn at openjdk.org Thu Jun 5 03:28:57 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 5 Jun 2025 03:28:57 GMT Subject: Integrated: 8358632: [asan] reports heap-buffer-overflow in AOTCodeCache::copy_bytes In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 00:18:50 GMT, Vladimir Kozlov wrote: > AOTCodeCache::copy_bytes() tries to optimize by copying byte buffer using HeapWords (64bits) by rounding up the size which may access memory after buffer. > > We should use memcpy() instead.
> > I also fixed output match pattern in test because oop base address is hexadecimal value. I fixed in in leyden/premain branch and forgot to port into mainline. During testing the fix I hit this issue. > > Testing tier1-3,xcomp,stress This pull request has now been integrated. Changeset: 849655a1 Author: Vladimir Kozlov URL: https://git.openjdk.org/jdk/commit/849655a145a40b056a751528cebc78a11481514c Stats: 13 lines in 2 files changed: 0 ins; 9 del; 4 mod 8358632: [asan] reports heap-buffer-overflow in AOTCodeCache::copy_bytes Reviewed-by: vlivanov, iveresov ------------- PR: https://git.openjdk.org/jdk/pull/25651 From kbarrett at openjdk.org Thu Jun 5 03:45:50 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 5 Jun 2025 03:45:50 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Mon, 2 Jun 2025 08:43:51 GMT, Julian Waters wrote: > If it's easier I can bring the original change to noexcept Pull Request back from the dead and remove the merge mistakes that leaked in from my other branch, which shouldn't really be that difficult to do. PR 15910 seems a mess; let's just leave that alone. My inclination is to *not* treat the update as a simple sed replacement. That's easy to author, but not so easy to (not rubberstamp) review. I'd like to actually look at the various operators, since I've already found one recent one that wasn't done properly. And doing that while maintaining adequate review focus means breaking it up into more bite-sized chunks, at least for me. > Not sure which code is potentially marked throw() wrongly though. 
new (and quickly fixed): https://bugs.openjdk.org/browse/JDK-8358283 Inconsistent failure mode for MetaspaceObj::operator new(size_t, MemTag) old (recently self-assigned) https://bugs.openjdk.org/browse/JDK-8342639 Global operator new in adlc has wrong exception spec There might have been some others noticed and probably fixed during 8305590. And I don't know whether there are any other recent-ish additions. > Alternatively, we could just keep throw() alongside noexcept for code that already uses it, to avoid code churn. They do mean the same thing in C++17, after all In other words, modify this style guide change to allow (or even prefer) `throw()` rather than `noexcept`? I don't think we want to be using both. > (I was going to mention that there are papers for static exception specifications that propose reintroducing throw() back into C++ last I remembered, but realized that this likely doesn't mean much for us now, so this point can be ignored) Yeah, I really don't care about something speculative for inclusion in a not even existing yet Standard. I'm just trying to prepare us for C++17 at this point. :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/25574#issuecomment-2942639056 From jbechberger at openjdk.org Thu Jun 5 05:05:22 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Thu, 5 Jun 2025 05:05:22 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v61] In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 01:09:01 GMT, David Holmes wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Renaming > > src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 88: > >> 86: return false; >> 87: } >> 88: } while (Atomic::cmpxchg(&_head, elementIndex, elementIndex + 1) != elementIndex); > > Why do we need atomic operations if we hold the enqueue lock. ?? Valid Point, I was probably over cautious. 
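For context on the exchange above: a compare-and-swap loop is only needed when several threads may claim a slot at the same time; once a lock already serializes the enqueuers, a simple read followed by a write of the head is enough. The following is a minimal sketch under stated assumptions, not the code from the PR. It uses standard C++ `std::atomic` and `std::mutex` in place of HotSpot's `Atomic::cmpxchg` and its lock, and `BoundedQueue`, `try_claim_lock_free` and `try_claim_locked` are hypothetical names introduced only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical fixed-capacity slot allocator; NOT the JFR queue from the PR.
class BoundedQueue {
 public:
  explicit BoundedQueue(std::size_t capacity) : _capacity(capacity), _head(0) {
    assert(capacity > 0);
  }

  // Lock-free variant: several threads may race for a slot, so the claim
  // is done with a compare-and-swap loop.
  // Returns true and stores the claimed index, or false if the queue is full.
  bool try_claim_lock_free(std::size_t* index) {
    std::size_t current = _head.load(std::memory_order_relaxed);
    do {
      if (current >= _capacity) {
        return false;
      }
      // On failure, compare_exchange_weak reloads 'current' with the
      // latest head value and the loop re-checks the capacity.
    } while (!_head.compare_exchange_weak(current, current + 1));
    *index = current;
    return true;
  }

  // Locked variant: the mutex already serializes all enqueuers, so a
  // load followed by a store replaces the CAS loop. This is only valid
  // if every writer goes through the lock.
  bool try_claim_locked(std::size_t* index) {
    std::lock_guard<std::mutex> guard(_lock);
    std::size_t current = _head.load(std::memory_order_relaxed);
    if (current >= _capacity) {
      return false;
    }
    _head.store(current + 1, std::memory_order_relaxed);
    *index = current;
    return true;
  }

 private:
  const std::size_t _capacity;
  std::atomic<std::size_t> _head;
  std::mutex _lock;
};
```

Whether the head still has to be an atomic variable in the locked variant depends on whether readers access it without taking the lock; the sketch keeps it atomic to stay on the safe side.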
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2127957148 From jwaters at openjdk.org Thu Jun 5 05:05:51 2025 From: jwaters at openjdk.org (Julian Waters) Date: Thu, 5 Jun 2025 05:05:51 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Thu, 5 Jun 2025 03:42:46 GMT, Kim Barrett wrote: > PR 15910 seems a mess; let's just leave that alone. I understand, I'll leave it be in that case. > In other words, modify this style guide change to allow (or even prefer) `throw()` rather than `noexcept`? Not really allow or prefer throw() to noexcept in the style guide, more just leave the existing uses of throw() alone once this style guide change goes in. Kind of like how alignas and C++11 attributes are now permitted within HotSpot, but the old usages of __attribute__ or __declspec, and ATTRIBUTE_ALIGNED weren't all changed to use [[]] attributes or alignas because that would be rather troublesome. Then again, there are less throw() currently in the code than there are attributes or ATTRIBUTE_ALIGNED, so maybe this isn't as big of an issue as I think it is. > I'm just trying to prepare us for C++17 at this point. :) C++17 has been long overdue by this point in my eyes, hopefully you'll be more successful than I was at switching the JDK to it :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/25574#issuecomment-2942757329 From jbechberger at openjdk.org Thu Jun 5 05:13:15 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Thu, 5 Jun 2025 05:13:15 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v61] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 19:55:55 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). 
>> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> - ... different heap sizes >> - ... different GCs >> - ... different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Renaming I'm on it, darn. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2942770368 From lliu at openjdk.org Thu Jun 5 05:18:52 2025 From: lliu at openjdk.org (Liming Liu) Date: Thu, 5 Jun 2025 05:18:52 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v2] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 08:47:13 GMT, Andrew Haley wrote: >> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 4355: >> >>> 4353: add(buf, buf, 32); >>> 4354: crc32x(crc, crc, tmp2); >>> 4355: subs(len, len, 32); >> >> What is the point of these changes? > > To be more precise: converting these adjustments to post-increment operations isn't obviously an improvement on AArch64 generally. How does it help? According to perf, post-increment ops help to reduce the access to TLB on Ampere1 in this case. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2127971345 From iklam at openjdk.org Thu Jun 5 05:20:58 2025 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 5 Jun 2025 05:20:58 GMT Subject: RFR: 8352075: Perf regression accessing fields [v21] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Tue, 3 Jun 2025 07:16:47 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms …
53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: > > - Moved jtreg test > - Improved documentation > - Fix coding style (asterisk placement) I have written a POC that shows that the table must be sorted again when dumping a dynamic CDS archive. See https://github.com/iklam/jdk/commit/dcd53ebaeab7b38be02aa5b896ce9e449a45418f Explanations are [here](https://github.com/iklam/jdk/commit/dcd53ebaeab7b38be02aa5b896ce9e449a45418f#diff-fd7608607ecf305bb3535b500bff5d53ec216d2da25e3bad1a1d699f56b09283R199) I will create an RFE for the JDK mainline that adds built-in debugging support for the `(oldSym > newSym_orig)` condition as described in the POC. Please wait for that before integrating this PR. I can help you write the code for re-sorting the tables.
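The lookup scheme described in the quoted PR text (fields sorted by name, plus a sparse table with one entry for every 16th field) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the HotSpot implementation: the real field info is a variable-length-encoded stream and the table stores stream offsets, whereas this sketch uses a plain vector and stores positions. `FieldInfo`, `build_sparse_table`, `find_field` and `kStride` are hypothetical names, and ordering by plain string comparison is an assumption; if the real ordering key can change when an archive is dumped, the table has to be rebuilt, which is the re-sorting concern raised above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of the field info stream.
struct FieldInfo {
  std::string name;
  int offset;  // illustrative payload only
};

// One table entry per 16 fields, as in the PR description.
constexpr std::size_t kStride = 16;

// Records positions 16, 32, ... of the name-sorted field list.
// (The real table stores stream offsets; positions are a simplification.)
std::vector<std::size_t> build_sparse_table(const std::vector<FieldInfo>& sorted_fields) {
  assert(std::is_sorted(sorted_fields.begin(), sorted_fields.end(),
                        [](const FieldInfo& a, const FieldInfo& b) { return a.name < b.name; }));
  std::vector<std::size_t> table;
  for (std::size_t pos = kStride; pos < sorted_fields.size(); pos += kStride) {
    table.push_back(pos);
  }
  return table;
}

// Returns the index of the field called 'name', or -1 if there is none.
int find_field(const std::vector<FieldInfo>& sorted_fields,
               const std::vector<std::size_t>& table,
               const std::string& name) {
  // Step 1: walk the sparse table and remember the last recorded
  // position whose field name is not greater than the target.
  std::size_t start = 0;
  for (std::size_t pos : table) {
    if (sorted_fields[pos].name > name) {
      break;
    }
    start = pos;
  }
  // Step 2: scan field by field from there, at most kStride entries.
  std::size_t end = std::min(start + kStride, sorted_fields.size());
  for (std::size_t i = start; i < end; i++) {
    if (sorted_fields[i].name == name) {
      return static_cast<int>(i);
    }
    if (sorted_fields[i].name > name) {
      break;  // the list is sorted, so the name cannot appear later
    }
  }
  return -1;
}
```

With 40 fields the table holds two entries (positions 16 and 32), so a lookup touches at most 2 table entries plus 16 fields instead of up to 40 fields.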
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2942786527 From lliu at openjdk.org Thu Jun 5 05:52:22 2025 From: lliu at openjdk.org (Liming Liu) Date: Thu, 5 Jun 2025 05:52:22 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v3] In-Reply-To: References: Message-ID: > This PR is to enable the use of crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU. There is an option UseCryptoPmullForCRC32 that can enable crypto pmull, but directly enabling it on Ampere CPU will cause the following problems. > > 1. There will be regressions (-14% ~ -8%) on Ampere1 when the length is 64. When the length is <= 128, both kernel_crc32_using_crc32 and kernel_crc32_using_crypto_pmull use the loop labeled as CRC_by32_loop, but their implementations are a little different, and the loop in kernel_crc32_using_crc32 is better at hiding latency on Ampere1. So this PR copies the loop from kernel_crc32_using_crc32 into kernel_crc32_using_crypto_pmull, and does the same for the CRC32C intrinsic. > > 2. The intrinsics only use crypto pmull when the length is higher than 383, while the loop in kernel_crc32_common_fold_using_crypto_pmull looks able to handle 256, and if it handles 256 on Ampere1, the improvements can be as high as 110% compared with kernel_crc32_using_crc32/kernel_crc32c_using_crc32c. However, there are regressions (~-6%) on Neoverse V1 when the length is 256. So this PR introduces a new option named CryptoPmullForCRC32LowLimit. It defaults to 256 since the code could handle 256, while it is set to 384 for V1/V2 to keep the old behavior on these platforms. 
> > The performance regressions and improvements were measured with the following microbenchmarks: > org.openjdk.bench.java.util.TestCRC32.testCRC32Update > org.openjdk.bench.java.util.TestCRC32C.testCRC32CUpdate > > Ran the following JTReg tests on Ampere1 and did not find problems: > test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java > test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32C.java Liming Liu has updated the pull request incrementally with one additional commit since the last revision: Use uint for the option and assert it >= 256 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25609/files - new: https://git.openjdk.org/jdk/pull/25609/files/db926eb0..9b2bae68 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25609&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25609&range=01-02 Stats: 4 lines in 2 files changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25609.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25609/head:pull/25609 PR: https://git.openjdk.org/jdk/pull/25609 From lliu at openjdk.org Thu Jun 5 05:52:23 2025 From: lliu at openjdk.org (Liming Liu) Date: Thu, 5 Jun 2025 05:52:23 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v2] In-Reply-To: <5BGf1eIVeMQIaLXIoOvcuQlBiaPeWojv8HAnfuOiW_E=.8c39a6ac-c0f2-40fe-bef3-be0a6bd71c07@github.com> References: <5BGf1eIVeMQIaLXIoOvcuQlBiaPeWojv8HAnfuOiW_E=.8c39a6ac-c0f2-40fe-bef3-be0a6bd71c07@github.com> Message-ID: On Wed, 4 Jun 2025 08:27:10 GMT, Emanuel Peter wrote: >> Liming Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Make it be a diagnostic flag > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 4335: > >> 4333: assert_different_registers(crc, buf, len, tmp0, tmp1, tmp2); >> 4334: >> 4335: subs(tmp0, len, CryptoPmullForCRC32LowLimit); > > Would it make sense to have another alignment sanity check here? 
It would be both helpful to make sure nobody later breaks your assumption, and could also be helpful for the reader to see the `128` alignment immediately. I think the alignment does not affect correctness here, but it should be >= 256. So I added the corresponding assertion above. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2128003307 From mbaesken at openjdk.org Thu Jun 5 05:52:51 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 5 Jun 2025 05:52:51 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Mon, 2 Jun 2025 08:07:38 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > TestBreakSignalThreadDump has issues with asan > Do we need vm.lsan I worked a bit with ubsan and asan in recent months (and with MSAN https://clang.llvm.org/docs/MemorySanitizer.html too but that is not yet supported in OpenJDK). Regarding LSAN I haven't looked into it. Maybe we can add this later if needed; I would restrict this PR to asan/ubsan.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2942852876 From mbaesken at openjdk.org Thu Jun 5 06:11:51 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 5 Jun 2025 06:11:51 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Wed, 4 Jun 2025 17:48:58 GMT, Ioi Lam wrote: >Please file a bug against this test and we will look into it. Thanks, I created https://bugs.openjdk.org/browse/JDK-8358654 . >I haven't used asan before. Is it as simple as adding --enable-asan when running configure? On my Ubuntu and RHEL Linux with gcc yes , it is that simple (besides installing the asan package with OS package manager). On SUSE Linux it was a little tricky because of some strange issues with the network stack, but it works too there (see https://bugs.openjdk.org/browse/JDK-8356970 ). So it depends a little on the distro but mostly it works fine. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2942895056 From amitkumar at openjdk.org Thu Jun 5 06:15:36 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Thu, 5 Jun 2025 06:15:36 GMT Subject: RFR: 8358653: [s390] Clean up comments regarding frame manager Message-ID: Basic comment cleanup; replaces "frame manager" by "template interpreter". 
------------- Commit messages: - cleanup Changes: https://git.openjdk.org/jdk/pull/25653/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25653&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358653 Stats: 15 lines in 6 files changed: 0 ins; 0 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/25653.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25653/head:pull/25653 PR: https://git.openjdk.org/jdk/pull/25653 From jbechberger at openjdk.org Thu Jun 5 06:17:09 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Thu, 5 Jun 2025 06:17:09 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling Message-ID: This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with - ... different heap sizes - ... different GCs - ... different samplers (the standard JFR and the new CPU Time Sampler and both) - ... different JFR recording durations - ... different chunk-sizes ------------- Commit messages: - Last minute bug fixes - Renaming - Improve - Fixed merge error - Add comment regarding Threads_lock - Improve disenroll - Renaming of VM ops - Update src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp - Update src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp - Readd lock - ... 
and 137 more: https://git.openjdk.org/jdk/compare/7838321b...706afe47 Changes: https://git.openjdk.org/jdk/pull/25654/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25654&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8342818 Stats: 2319 lines in 41 files changed: 2179 ins; 128 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/25654.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25654/head:pull/25654 PR: https://git.openjdk.org/jdk/pull/25654 From jbechberger at openjdk.org Thu Jun 5 06:20:55 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Thu, 5 Jun 2025 06:20:55 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: <3n33jyQz0KFm_pWmetgPb9oy3qN6tmleF71Xh6SKCLo=.7bd375e9-6e88-4f6a-aa5c-6f11752d4fdb@github.com> On Thu, 5 Jun 2025 06:10:09 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes This PR contains only the bug fixes for [JDK-8358628](https://bugs.openjdk.org/browse/JDK-8358628) compared to https://github.com/openjdk/jdk/pull/25302. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25654#issuecomment-2942911239 From jbechberger at openjdk.org Thu Jun 5 06:32:52 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Thu, 5 Jun 2025 06:32:52 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 06:10:09 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes src/jdk.jfr/share/classes/jdk/jfr/internal/PlatformEventType.java line 203: > 201: this.cpuRate = rate; > 202: if (isEnabled()) { > 203: JVM.setCPUThrottle(rate.rate(), rate.autoAdapt()); but we need to set the throttle somewhere? Else changes are not propagated? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25654#discussion_r2128053153 From jbechberger at openjdk.org Thu Jun 5 06:32:52 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Thu, 5 Jun 2025 06:32:52 GMT Subject: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: <2GSkwBas8IEqyzGVYkytg-yP2JKp2CvMNtNOH_jbnos=.dd69c21e-caa7-48a6-9832-d6d4f97ba558@github.com> On Thu, 5 Jun 2025 06:29:15 GMT, Johannes Bechberger wrote: >> This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). >> >> Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with >> >> - ... different heap sizes >> - ... different GCs >> - ... 
different samplers (the standard JFR and the new CPU Time Sampler and both) >> - ... different JFR recording durations >> - ... different chunk-sizes > > src/jdk.jfr/share/classes/jdk/jfr/internal/PlatformEventType.java line 203: > >> 201: this.cpuRate = rate; >> 202: if (isEnabled()) { >> 203: JVM.setCPUThrottle(rate.rate(), rate.autoAdapt()); > > but we need to set the throttle somewhere? Else changes are not propagated? I checked and it seems to work as expected ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25654#discussion_r2128055000 From mgronlun at openjdk.org Thu Jun 5 06:43:52 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Thu, 5 Jun 2025 06:43:52 GMT Subject: RFR: 8358666: Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: <-tISK3RWA_OvFK-j5QFJ2tM4rGXNKQgPYrSbmJCHZfI=.1be77f23-aa95-4ca6-ae0e-3a33d94991da@github.com> On Thu, 5 Jun 2025 06:10:09 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes That call was what caused the issues to begin with, so I'm slightly nervous about it. Perhaps it's correct, but again.... No, you can't reopen it. It's already integrated.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25654#issuecomment-2942944604 PR Comment: https://git.openjdk.org/jdk/pull/25654#issuecomment-2942945444 PR Comment: https://git.openjdk.org/jdk/pull/25654#issuecomment-2942951223 From hgreule at openjdk.org Thu Jun 5 06:43:52 2025 From: hgreule at openjdk.org (Hannes Greule) Date: Thu, 5 Jun 2025 06:43:52 GMT Subject: RFR: 8358666: Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 06:10:09 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes I think you need a [REDO issue](https://openjdk.org/guide/#how-to-work-with-jbs-when-a-change-is-backed-out). ------------- PR Comment: https://git.openjdk.org/jdk/pull/25654#issuecomment-2942947798 From jbechberger at openjdk.org Thu Jun 5 06:43:52 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Thu, 5 Jun 2025 06:43:52 GMT Subject: RFR: 8358666: Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 06:10:09 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes I tested it and it works as expected. This is similar to what the normal JFR profiler does. We have two scenarios: 1. Not enabled: then setting the throttle only sets the value in the event, and the enable call picks it up. 2. Enabled: We push it directly into the CPU Time Sampler. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25654#issuecomment-2942948123 From jbechberger at openjdk.org Thu Jun 5 06:43:53 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Thu, 5 Jun 2025 06:43:53 GMT Subject: RFR: 8358666: Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 06:37:33 GMT, Hannes Greule wrote: > I think you need a [REDO issue](https://openjdk.org/guide/#how-to-work-with-jbs-when-a-change-is-backed-out). I reopened it; isn't this enough? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25654#issuecomment-2942949098 From mgronlun at openjdk.org Thu Jun 5 06:43:54 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Thu, 5 Jun 2025 06:43:54 GMT Subject: RFR: 8358666: Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: <2GSkwBas8IEqyzGVYkytg-yP2JKp2CvMNtNOH_jbnos=.dd69c21e-caa7-48a6-9832-d6d4f97ba558@github.com> References: <2GSkwBas8IEqyzGVYkytg-yP2JKp2CvMNtNOH_jbnos=.dd69c21e-caa7-48a6-9832-d6d4f97ba558@github.com> Message-ID: On Thu, 5 Jun 2025 06:30:39 GMT, Johannes Bechberger wrote: >> src/jdk.jfr/share/classes/jdk/jfr/internal/PlatformEventType.java line 203: >> >>> 201: this.cpuRate = rate; >>> 202: if (isEnabled()) { >>> 203: JVM.setCPUThrottle(rate.rate(), rate.autoAdapt()); >> >> but we need to set the throttle somewhere? Else changes are not propagated? > > I checked and it seems to work as expected @egahlin is the expert on the settings system. Please guide here.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25654#discussion_r2128069800 From jbechberger at openjdk.org Thu Jun 5 06:46:52 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Thu, 5 Jun 2025 06:46:52 GMT Subject: RFR: 8358666: Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: <1RKF1agm66fFl9uB-6UekuK_CkD8Y9Bn9U-jpOukWDQ=.9e34f31d-6c39-495b-a0d6-dde07df85c31@github.com> On Thu, 5 Jun 2025 06:10:09 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes I created a new issue and linked it ------------- PR Comment: https://git.openjdk.org/jdk/pull/25654#issuecomment-2942960048 From mbaesken at openjdk.org Thu Jun 5 06:51:51 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 5 Jun 2025 06:51:51 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Thu, 5 Jun 2025 03:14:22 GMT, SendaoYan wrote: >> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: >> >> TestBreakSignalThreadDump has issues with asan > > test/hotspot/jtreg/TEST.ROOT line 94: > >> 92: vm.compiler2.enabled \ >> 93: vm.musl \ >> 94: vm.asan \ > > Do we need `vm.lsan` Maybe later in a separate PR . 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25575#discussion_r2128081241 From mbaesken at openjdk.org Thu Jun 5 06:58:33 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 5 Jun 2025 06:58:33 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v3] In-Reply-To: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: > There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . > Those fail when the address sanitizer is configured ( --enable-asan ). > The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. > Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . > While at it, also same is also added for ubsan . 
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: AOTCodeCompressedOopsTest will be handled separately ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25575/files - new: https://git.openjdk.org/jdk/pull/25575/files/aa796c8a..8b9e3dde Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25575.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25575/head:pull/25575 PR: https://git.openjdk.org/jdk/pull/25575 From mbaesken at openjdk.org Thu Jun 5 07:00:54 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 5 Jun 2025 07:00:54 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Mon, 2 Jun 2025 08:07:38 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > TestBreakSignalThreadDump has issues with asan I removed the added requires from AOTCodeCompressedOopsTest because we will look into this separately in https://bugs.openjdk.org/browse/JDK-8358654 . 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2942989553 From mbaesken at openjdk.org Thu Jun 5 07:06:51 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 5 Jun 2025 07:06:51 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Wed, 4 Jun 2025 20:09:53 GMT, Chris Plummer wrote: > serviceability/sa/ClhsdbCDSCore.java explicitly says it did not create a core file The test passes with asan enabled on Linux ppc64le, so the failure seen on Linux x86_64 was probably more machine/environment related than ASAN related; should I remove the asan requires from this test? Regarding the other serviceability/sa tests you mentioned, it seems they work too on Linux ppc64le, so they **_can_** work with ASAN; no requires needed for now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2943002454 From dzhang at openjdk.org Thu Jun 5 07:06:51 2025 From: dzhang at openjdk.org (Dingli Zhang) Date: Thu, 5 Jun 2025 07:06:51 GMT Subject: RFR: 8358634: RISC-V: Fix several broken documentation web-links In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 03:23:50 GMT, Fei Yang wrote: >> Hi all, >> Please take a look and review this PR, thanks! >> >> Several RISC-V related documentation web-links are broken. >> This PR updates the web-links. > > Looks good and trivial. Thanks! @RealFYang Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25652#issuecomment-2943001575 From duke at openjdk.org Thu Jun 5 07:06:52 2025 From: duke at openjdk.org (duke) Date: Thu, 5 Jun 2025 07:06:52 GMT Subject: RFR: 8358634: RISC-V: Fix several broken documentation web-links In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 03:08:14 GMT, Dingli Zhang wrote: > Hi all, > Please take a look and review this PR, thanks!
> > Several RISC-V related documentation web-links are broken. > This PR updates the web-links. @DingliZhang Your change (at version bc6c9bd6bf22860b12cf865a282c85d736cd5d76) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25652#issuecomment-2943003076 From egahlin at openjdk.org Thu Jun 5 07:14:55 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Thu, 5 Jun 2025 07:14:55 GMT Subject: RFR: 8358666: [Redo] Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: <6wUfrfcgUKIFCLynr4UNyNOC53YjUiiOX1ClG36KIb0=.83c09b06-3efc-4ed7-be94-b544f05a88b8@github.com> On Thu, 5 Jun 2025 06:10:09 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes The settings change looks reasonable. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25654#issuecomment-2943023044 From lliu at openjdk.org Thu Jun 5 07:15:34 2025 From: lliu at openjdk.org (Liming Liu) Date: Thu, 5 Jun 2025 07:15:34 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v4] In-Reply-To: References: Message-ID: <32uuLeizjdx7p5TeOzMvoyj0Smmra-DV4qhPZy7z-bE=.78485dd1-cc6f-4fbf-88a3-a4f78c164b0c@github.com> > This PR is to enable the use of crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU. There is an option UseCryptoPmullForCRC32 that can enable crypto pmull, but directly enabling it on Ampere CPU will cause the following problems. > > 1. There will be regressions (-14% ~ -8%) on Ampere1 when the length is 64. 
When the length is <= 128, both kernel_crc32_using_crc32 and kernel_crc32_using_crypto_pmull use the loop labeled as CRC_by32_loop, but their implementations differ slightly, and the loop in kernel_crc32_using_crc32 is better at hiding latency on Ampere1. So this PR copies the loop from kernel_crc32_using_crc32 into kernel_crc32_using_crypto_pmull, and does the same for the CRC32C intrinsic. > > 2. The intrinsics only use crypto pmull when the length is higher than 383, while the loop in kernel_crc32_common_fold_using_crypto_pmull appears able to handle 256, and if it handles 256 on Ampere1, the improvements can be as high as 110% compared with kernel_crc32_using_crc32/kernel_crc32c_using_crc32c. However, there are regressions (~-6%) on Neoverse V1 when the length is 256. So this PR introduces a new option named CryptoPmullForCRC32LowLimit. It defaults to 256 since the code can handle 256, while it is set to 384 for V1/V2 to keep the old behavior on these platforms. > > The performance regressions and improvements were measured with the following microbenchmarks: > org.openjdk.bench.java.util.TestCRC32.testCRC32Update > org.openjdk.bench.java.util.TestCRC32C.testCRC32CUpdate > > Ran the following JTReg tests on Ampere1 and did not find problems: > test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java > test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32C.java Liming Liu has updated the pull request incrementally with one additional commit since the last revision: Add the message for the assertions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25609/files - new: https://git.openjdk.org/jdk/pull/25609/files/9b2bae68..df9f920a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25609&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25609&range=02-03 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25609.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25609/head:pull/25609 PR:
https://git.openjdk.org/jdk/pull/25609 From dzhang at openjdk.org Thu Jun 5 07:38:02 2025 From: dzhang at openjdk.org (Dingli Zhang) Date: Thu, 5 Jun 2025 07:38:02 GMT Subject: Integrated: 8358634: RISC-V: Fix several broken documentation web-links In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 03:08:14 GMT, Dingli Zhang wrote: > Hi all, > Please take a look and review this PR, thanks! > > Several RISC-V related documentation web-links are broken. > This PR updates the web-links. This pull request has now been integrated. Changeset: 48b97ac0 Author: Dingli Zhang Committer: Fei Yang URL: https://git.openjdk.org/jdk/commit/48b97ac0e006362528423ffd657b2ea3afa46a6e Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod 8358634: RISC-V: Fix several broken documentation web-links Reviewed-by: fyang ------------- PR: https://git.openjdk.org/jdk/pull/25652 From dholmes at openjdk.org Thu Jun 5 07:52:25 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 5 Jun 2025 07:52:25 GMT Subject: RFR: 8346914: UB issue in scalbnA Message-ID: This fix addresses a problem with signed integer overflow in the C fdlibm scalbnA function. Testing this code is extremely difficult. First, the only time this code will get executed is if intrinsics have been disabled by `-XX:-InlineIntrinsics`. Second, finding the math routines and the arguments thereto which actually reach this function is also difficult. I have found only 3 tests that hit the `scalbnA` function at the point where the potential overflow occurs, but beyond that I cannot determine what arguments will cause the different code paths to be taken. Consequently the only testing I could do here was to make a copy of the original `scalbnA` function and then place a check in the callers that the old and new code produced the same result. Again, how much coverage this actually gave is not known. That test code still remains in the PR as the initial commit.
Due to the testing problem, this change relies on detailed code inspection and analysis, so here are the changes and the reasoning for them: // Convert to unsigned to avoid signed integer overflow [1] unsigned u_k = ((unsigned) k) + n; [2] if (u_k > 0x7fe && u_k <= 0x7fffffff) return hugeX*copysignA(hugeX,x); /* overflow */ [3] if (u_k > 0 && u_k <= 0x7fe) { /* normal result */ [4] set_high(&x, (hx&0x800fffff)|((k+n)<<20)); return x; } [5] if (u_k <= (unsigned)-54) { if (n > 50000) /* in case integer overflow in n+k */ return hugeX*copysignA(hugeX,x); /*overflow*/ else return tiny*copysignA(tiny,x); /*underflow*/ } [6] k = u_k + 54; /* subnormal result */ set_high(&x, (hx&0x800fffff)|(k<<20)); return x*twom54; [1] We use an unsigned variable, `u_k`, for the potentially overflowing addition [2] We check the value of `u_k`, adjusting the bounds to emulate a signed-int range [3] Again we check `u_k` and adjust the range [4] We know `k+n` is in range so we use that directly. I didn't use `u_k` here because I didn't want to have to reason about whether the use of an unsigned type would change anything in the expression [5] We check if `u_k` is logically less than what -54 would be [6] We bring `u_k` back into positive range by adding 54 and then store safely into `k` Thanks.
------------- Commit messages: - Merge branch 'master' into 8346914-scalbnA - remove test code - Fix plus testing changes Changes: https://git.openjdk.org/jdk/pull/25656/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25656&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8346914 Stats: 11 lines in 1 file changed: 4 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/25656.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25656/head:pull/25656 PR: https://git.openjdk.org/jdk/pull/25656 From mgronlun at openjdk.org Thu Jun 5 07:56:59 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Thu, 5 Jun 2025 07:56:59 GMT Subject: RFR: 8358666: [REDO] Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 06:10:09 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes I updated the JIRA issue to capitalized [REDO] which seems to be the more common process. Can you please update the PR title accordingly? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25654#issuecomment-2943135661 From jbechberger at openjdk.org Thu Jun 5 08:00:53 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Thu, 5 Jun 2025 08:00:53 GMT Subject: RFR: 8358666: [REDO] Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: <09OE_6qHZFFOBpK_Sk_T3rpoTvqcSqtdHEglNPZ6obM=.d6d8db66-1cb0-4a7c-b964-de349f97069e@github.com> On Thu, 5 Jun 2025 06:10:09 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes Skara did it automatically ------------- PR Comment: https://git.openjdk.org/jdk/pull/25654#issuecomment-2943146024 From mgronlun at openjdk.org Thu Jun 5 08:12:59 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Thu, 5 Jun 2025 08:12:59 GMT Subject: RFR: 8358666: [REDO] Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 06:10:09 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... 
different chunk-sizes I see that this PR includes my fixes for the issues I found during this night of debugging, as detailed in the backout issue: https://bugs.openjdk.org/browse/JDK-8358628 I am approving for that reason, else all that work would have been in vain. ------------- Marked as reviewed by mgronlun (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25654#pullrequestreview-2899291411 From mdoerr at openjdk.org Thu Jun 5 08:20:57 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 5 Jun 2025 08:20:57 GMT Subject: RFR: 8358666: [REDO] Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 06:10:09 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This profiles the [Renaissance](https://renaissance.dev/) benchmark with > > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes As mentioned in https://github.com/openjdk/jdk/pull/25302, I think this is good enough for an experimental feature assuming the tests are fine this time :-) ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25654#pullrequestreview-2899318226 From jbechberger at openjdk.org Thu Jun 5 08:20:58 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Thu, 5 Jun 2025 08:20:58 GMT Subject: RFR: 8358666: [REDO] Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 06:10:09 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
> > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes I'm waiting for the last test to finish (just in case) ------------- PR Comment: https://git.openjdk.org/jdk/pull/25654#issuecomment-2943196010 From qamai at openjdk.org Thu Jun 5 08:20:59 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 5 Jun 2025 08:20:59 GMT Subject: RFR: 8358666: [REDO] Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 06:10:09 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes The JEP is targeted for JDK-25, if you are planning to integrate this today then I think it is too rushed. There is no need to rush this into JDK-25 instead of deferring it to JDK-26. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25654#issuecomment-2943201530 From mgronlun at openjdk.org Thu Jun 5 08:20:59 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Thu, 5 Jun 2025 08:20:59 GMT Subject: RFR: 8358666: [REDO] Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: <8YyAjD5ShFj3EQV0epnb9MJf7MYngNUfUsYidkFPIO0=.db319f98-79a7-4e85-965b-f9e67a7519c4@github.com> On Thu, 5 Jun 2025 06:10:09 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes You need two reviewers! sigh... ------------- PR Comment: https://git.openjdk.org/jdk/pull/25654#issuecomment-2943206267 From jbechberger at openjdk.org Thu Jun 5 08:24:05 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Thu, 5 Jun 2025 08:24:05 GMT Subject: Integrated: 8358666: [REDO] Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 06:10:09 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with > > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes This pull request has now been integrated. 
Changeset: ace70a6d Author: Johannes Bechberger URL: https://git.openjdk.org/jdk/commit/ace70a6d6aca619da34b2f9cac2586cc88cefb5a Stats: 2319 lines in 41 files changed: 2179 ins; 128 del; 12 mod 8358666: [REDO] Implement JEP 509: JFR CPU-Time Profiling Reviewed-by: mgronlun ------------- PR: https://git.openjdk.org/jdk/pull/25654 From aboldtch at openjdk.org Thu Jun 5 08:35:51 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 5 Jun 2025 08:35:51 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 07:48:03 GMT, David Holmes wrote: > This fixes address a problem with signed integer overflow in the C fdlibm scalbnA function. > > Testing this code is extremely difficult. First, the only time this code will get executed is if intrinsics have been disabled by `-XX:-InlineIntrinsics`. Second, finding the math routines and the arguments thereto which actually reach this function is also difficult. I have found 3 tests only that hit the `scalbnA` function at the point where the potential overflow occurs, but beyond that I cannot determine what arguments will cause the different code paths to be taken. Consequently the only testing I could do here was to make a copy of the original `scalbnA` function and then place a check in the callers that the old and new code produced the same result. Again how much coverage this actually gave is not known. That test code still remains in the PR as the initial commit. 
> > Due to the testing problem this test relies on detailed code inspection and analysis, so here are the changes and the reasoning for them: > > // Convert to unsigned to avoid signed integer overflow > [1] unsigned u_k = ((unsigned) k) + n; > > [2] if (u_k > 0x7fe && u_k <= 0x7fffffff) return hugeX*copysignA(hugeX,x); /* overflow */ > [3] if (u_k > 0 && u_k <= 0x7fe) { /* normal result */ > [4] set_high(&x, (hx&0x800fffff)|((k+n)<<20)); > return x; > } > > [5] if (u_k <= (unsigned)-54) { > if (n > 50000) /* in case integer overflow in n+k */ > return hugeX*copysignA(hugeX,x); /*overflow*/ > else return tiny*copysignA(tiny,x); /*underflow*/ > } > [6] k = u_k + 54; /* subnormal result */ > set_high(&x, (hx&0x800fffff)|(k<<20)); > return x*twom54; > > > [1] We use an unsigned variable, `u_k`, for the potentially overflowing addition > > [2] We check the value of `u_k` adjusting the bounds to emulate a signed-int range > > [3] Again we check `u_k` and adjust the range > > [4] We know `k+n` is in range so we use that directly. I didn't use `u_k` here because I didn't want to have to reason about whether the use of an unsigned type would change anything in the expression > > [5] We check if `u_k` is logically less than what -54 would be > > [6] We bring `u_k` back into positive range by adding 54 and then store safely into `k` > > Thanks. src/hotspot/share/runtime/sharedRuntimeMath.hpp line 122: > 120: set_high(&x, (hx&0x800fffff)|((k+n)<<20)); > 121: return x; > 122: } Curious if this could be? It looks like this is what the ifs are doing. And it is at least easier for me to see that this is the same as the code before. Suggestion: if ((int)u_k > 0) { if (u_k > 0x7fe) { return hugeX*copysignA(hugeX,x); /* overflow */ } set_high(&x, (hx&0x800fffff)|((k+n)<<20)); return x; } src/hotspot/share/runtime/sharedRuntimeMath.hpp line 124: > 122: } > 123: > 124: if (u_k <= (unsigned)-54) { Could this just be? Or is this less clear / easier to miss? 
Suggestion: if (u_k <= -54u) { ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2128267650 PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2128268545 From kbarrett at openjdk.org Thu Jun 5 08:42:38 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 5 Jun 2025 08:42:38 GMT Subject: RFR: 8352565: Add native method implementation of Reference.get() [v9] In-Reply-To: References: Message-ID: > Please review this change which adds a native method providing the > implementation of Reference::get. Reference::get is an intrinsic candidate, so > this native method implementation is only used when the intrinsic is not. > > Currently there is intrinsic support by the interpreter, C1, C2, and Graal, > which are always used. With this change we can later remove all the > per-platform interpreter intrinsic implementations, and might also remove the > C1 intrinsic implementation. > > Testing: > (1) mach5 tier1-6 normal (so using all the existing intrinsics). > (2) mach5 tier1-6 with interpreter and C1 Reference::get intrinsics disabled.
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: fix comment alignment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24315/files - new: https://git.openjdk.org/jdk/pull/24315/files/98056a8b..edd4dec2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24315&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24315&range=07-08 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24315.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24315/head:pull/24315 PR: https://git.openjdk.org/jdk/pull/24315 From dchuyko at openjdk.org Thu Jun 5 09:47:00 2025 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Thu, 5 Jun 2025 09:47:00 GMT Subject: RFR: 8337666: AArch64: SHA3 GPR intrinsic [v2] In-Reply-To: References: <4gjCTX5GeYnhLOggsT2koqaeM1DdlJnwcQdSiR-3cZk=.beb2eccc-ac6d-48bb-a828-e58383799ea5@github.com> Message-ID: <5cRAuyV4iP9KMGzMSCF1xZGgXUbbzHcxiYAjfKYZIoM=.d76f9a7a-7f9d-4507-a55b-e5805bec9aae@github.com> On Fri, 30 May 2025 18:24:22 GMT, Dmitry Chuyko wrote: >> Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: >> >> - Copyright year >> - Review suggestions >> - Merge master >> - Delete empty line >> - SHA3 GPR intrinsic & tests > > GPR rol, rax and rax1 pseudo instructions were added in MacroAssembler. > > Main loop and "bcax"/Chi parts were extracted as functions. > > Main loop counter was put in fp register with fp decrement and fcmp (this variant does have a positive impact). 
> > Updated results from Graviton machines (Linux, intrinsic vs C2): > > Benchmark (digesterName) (length) Pct > G2 > MessageDigests.digest SHA3-256 64 +20.8% > MessageDigests.digest SHA3-256 16384 +27.2% > G3 > MessageDigests.digest SHA3-256 64 +12.8% > MessageDigests.digest SHA3-256 16384 +15.7% > G4 > MessageDigests.digest SHA3-256 64 +9.7% > MessageDigests.digest SHA3-256 16384 +13.2% > @dchuyko Thanks for working on this! I have quickly scanned the code, and it looks reasonable, though I am not an intrinsics specialist. I'll now run some internal testing, feel free to ping me again in 24h. @eme64 thanks, are the results ok? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24260#issuecomment-2943486394 From mgronlun at openjdk.org Thu Jun 5 10:14:52 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Thu, 5 Jun 2025 10:14:52 GMT Subject: RFR: 8357962: JFR Cooperative Sampling reveals inconsistent interpreter frames as part of JVMTI PopFrame [v2] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 06:25:09 GMT, Erik Österlund wrote: >> Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: >> >> more precise comment > > Looks good. Thanks @fisk and @dholmes-ora for your reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25571#issuecomment-2943580712 From mgronlun at openjdk.org Thu Jun 5 10:18:00 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Thu, 5 Jun 2025 10:18:00 GMT Subject: Integrated: 8357962: JFR Cooperative Sampling reveals inconsistent interpreter frames as part of JVMTI PopFrame In-Reply-To: References: Message-ID: On Sun, 1 Jun 2025 20:33:50 GMT, Markus Grönlund wrote: > Greetings, > > Please see the JIRA issue for a detailed description. > > Fix only applies to platforms that issue a save_bcp() as part of InterpreterMacroAssembler::unlock_object().
> > Testing: jdk_jfr, JVMTI PopFrame tests > > Thanks > Markus This pull request has now been integrated. Changeset: d450e341 Author: Markus Grönlund URL: https://git.openjdk.org/jdk/commit/d450e341c7af910b618f3dd3e1f77e2e37702c5f Stats: 3 lines in 3 files changed: 3 ins; 0 del; 0 mod 8357962: JFR Cooperative Sampling reveals inconsistent interpreter frames as part of JVMTI PopFrame Reviewed-by: dholmes, eosterlund ------------- PR: https://git.openjdk.org/jdk/pull/25571 From mhaessig at openjdk.org Thu Jun 5 10:43:53 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Thu, 5 Jun 2025 10:43:53 GMT Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum instead of "const char*" [v5] In-Reply-To: References: Message-ID: <5JATdbvS9tsVNkUXeQX2NIwXr7-gU2wFOuaqmQfX_ZU=.3520a881-7c97-4f81-8f44-f20ba0517b90@github.com> On Tue, 3 Jun 2025 19:33:57 GMT, Cesar Soares Lucas wrote: >> Please review this refactor to transform the reasons for making an nmethod not entrant from `const char*` into enum values. >> >> Tested on Linux x64 with JTREG tier1-3 in fastdebug and release mode. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Revert change to attribute of make_not_entrant element Thank you for addressing the comments. It looks good now. I ran some additional testing that passed. ------------- Marked as reviewed by mhaessig (Author).
PR Review: https://git.openjdk.org/jdk/pull/25338#pullrequestreview-2899791829 From fbredberg at openjdk.org Thu Jun 5 10:46:51 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 5 Jun 2025 10:46:51 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: <6iTb9oggPOb1xvGPSKmDir1kBRXx4bmQfj74rXymA-w=.2e3dd383-ed46-4d29-bc8d-57cdfb1e3e8c@github.com> On Thu, 5 Jun 2025 08:31:40 GMT, Axel Boldt-Christmas wrote: >> This fixes address a problem with signed integer overflow in the C fdlibm scalbnA function. >> >> Testing this code is extremely difficult. First, the only time this code will get executed is if intrinsics have been disabled by `-XX:-InlineIntrinsics`. Second, finding the math routines and the arguments thereto which actually reach this function is also difficult. I have found 3 tests only that hit the `scalbnA` function at the point where the potential overflow occurs, but beyond that I cannot determine what arguments will cause the different code paths to be taken. Consequently the only testing I could do here was to make a copy of the original `scalbnA` function and then place a check in the callers that the old and new code produced the same result. Again how much coverage this actually gave is not known. That test code still remains in the PR as the initial commit. 
>> >> Due to the testing problem this test relies on detailed code inspection and analysis, so here are the changes and the reasoning for them: >> >> // Convert to unsigned to avoid signed integer overflow >> [1] unsigned u_k = ((unsigned) k) + n; >> >> [2] if (u_k > 0x7fe && u_k <= 0x7fffffff) return hugeX*copysignA(hugeX,x); /* overflow */ >> [3] if (u_k > 0 && u_k <= 0x7fe) { /* normal result */ >> [4] set_high(&x, (hx&0x800fffff)|((k+n)<<20)); >> return x; >> } >> >> [5] if (u_k <= (unsigned)-54) { >> if (n > 50000) /* in case integer overflow in n+k */ >> return hugeX*copysignA(hugeX,x); /*overflow*/ >> else return tiny*copysignA(tiny,x); /*underflow*/ >> } >> [6] k = u_k + 54; /* subnormal result */ >> set_high(&x, (hx&0x800fffff)|(k<<20)); >> return x*twom54; >> >> >> [1] We use an unsigned variable, `u_k`, for the potentially overflowing addition >> >> [2] We check the value of `u_k` adjusting the bounds to emulate a signed-int range >> >> [3] Again we check `u_k` and adjust the range >> >> [4] We know `k+n` is in range so we use that directly. I didn't use `u_k` here because I didn't want to have to reason about whether the use of an unsigned type would change anything in the expression >> >> [5] We check if `u_k` is logically less than what -54 would be >> >> [6] We bring `u_k` back into positive range by adding 54 and then store safely into `k` >> >> Thanks. > > src/hotspot/share/runtime/sharedRuntimeMath.hpp line 122: > >> 120: set_high(&x, (hx&0x800fffff)|((k+n)<<20)); >> 121: return x; >> 122: } > > Curious if this could be? It looks like this is what the ifs are doing. And it is at least easier for me to see that this is the same as the code before. > Suggestion: > > if ((int)u_k > 0) { > if (u_k > 0x7fe) { > return hugeX*copysignA(hugeX,x); /* overflow */ > } > set_high(&x, (hx&0x800fffff)|((k+n)<<20)); > return x; > } I like this change. It's easier to follow, at least for me. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2128540598 From fbredberg at openjdk.org Thu Jun 5 10:54:51 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 5 Jun 2025 10:54:51 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 08:32:10 GMT, Axel Boldt-Christmas wrote: >> This fixes address a problem with signed integer overflow in the C fdlibm scalbnA function. >> >> Testing this code is extremely difficult. First, the only time this code will get executed is if intrinsics have been disabled by `-XX:-InlineIntrinsics`. Second, finding the math routines and the arguments thereto which actually reach this function is also difficult. I have found 3 tests only that hit the `scalbnA` function at the point where the potential overflow occurs, but beyond that I cannot determine what arguments will cause the different code paths to be taken. Consequently the only testing I could do here was to make a copy of the original `scalbnA` function and then place a check in the callers that the old and new code produced the same result. Again how much coverage this actually gave is not known. That test code still remains in the PR as the initial commit. 
>> >> Due to the testing problem this test relies on detailed code inspection and analysis, so here are the changes and the reasoning for them: >> >> // Convert to unsigned to avoid signed integer overflow >> [1] unsigned u_k = ((unsigned) k) + n; >> >> [2] if (u_k > 0x7fe && u_k <= 0x7fffffff) return hugeX*copysignA(hugeX,x); /* overflow */ >> [3] if (u_k > 0 && u_k <= 0x7fe) { /* normal result */ >> [4] set_high(&x, (hx&0x800fffff)|((k+n)<<20)); >> return x; >> } >> >> [5] if (u_k <= (unsigned)-54) { >> if (n > 50000) /* in case integer overflow in n+k */ >> return hugeX*copysignA(hugeX,x); /*overflow*/ >> else return tiny*copysignA(tiny,x); /*underflow*/ >> } >> [6] k = u_k + 54; /* subnormal result */ >> set_high(&x, (hx&0x800fffff)|(k<<20)); >> return x*twom54; >> >> >> [1] We use an unsigned variable, `u_k`, for the potentially overflowing addition >> >> [2] We check the value of `u_k` adjusting the bounds to emulate a signed-int range >> >> [3] Again we check `u_k` and adjust the range >> >> [4] We know `k+n` is in range so we use that directly. I didn't use `u_k` here because I didn't want to have to reason about whether the use of an unsigned type would change anything in the expression >> >> [5] We check if `u_k` is logically less than what -54 would be >> >> [6] We bring `u_k` back into positive range by adding 54 and then store safely into `k` >> >> Thanks. > > src/hotspot/share/runtime/sharedRuntimeMath.hpp line 124: > >> 122: } >> 123: >> 124: if (u_k <= (unsigned)-54) { > > Could this just be? Or is this less clear / easier to miss? > Suggestion: > > if (u_k <= -54u) { For me it's less clear and far easier to miss. I'll vote for `if (u_k <= (unsigned)-54) {`. Reading `if (u_k <= -54u) {` just cooks my brain. 
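Step [1] of the quoted description does the addition in unsigned arithmetic, which wraps modulo 2^N instead of overflowing. A minimal sketch of that idea follows; the helper name `wrapping_add` is hypothetical and not part of the PR, and the examples assume a 32-bit `int`.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: both operands are converted to unsigned before the
// addition, so the sum wraps modulo 2^N and no signed overflow can occur.
// Writing `unsigned u = k + n;` instead would still add as int first.
inline unsigned wrapping_add(int k, int n) {
  return (unsigned)k + (unsigned)n;
}
```

For example, `wrapping_add(INT_MAX, 1)` yields `(unsigned)INT_MAX + 1u`, whereas `INT_MAX + 1` evaluated as `int` would be undefined behavior.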
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2128553634 From dchuyko at openjdk.org Thu Jun 5 12:11:40 2025 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Thu, 5 Jun 2025 12:11:40 GMT Subject: RFR: 8337666: AArch64: SHA3 GPR intrinsic [v8] In-Reply-To: References: Message-ID: > This is an implementation of SHA3 intrinsics for AArch64 that operates GPRs. It follows the Java implementation algorithm but eagerly uses available registers. For example, FP+R18 are used when it's allowed. On simpler cores like RPi3 or Surface Pro it is 23-53% faster than C2 compiled version; on Graviton 3 it is 8-14% faster than C2 compiled version (which is faster than the current intrinsic); on Apple Silicon it is faster than C2 compiled version but slower than the ARMv8.2-SHA intrinsic. Improvements on a particular CPU depend on the input length. For instance, for Graviton 2: > > > G2 > Benchmark (digesterName) (length) Pct > MessageDigests.digest SHA3-256 64 28.28% > MessageDigests.digest SHA3-256 16384 53.58% > MessageDigests.digest SHA3-512 64 27.97% > MessageDigests.digest SHA3-512 16384 43.90% > MessageDigests.getAndDigest SHA3-256 64 26.18% > MessageDigests.getAndDigest SHA3-256 16384 52.82% > MessageDigests.getAndDigest SHA3-512 64 24.73% > MessageDigests.getAndDigest SHA3-512 16384 44.31% > > > (results for intermediate input lengths look like steps) > > On Graviton 4 there is still a noticeable difference between the proposed implementation and C2 generated code: > > > G4 > Benchmark (digesterName) (length) Pct > MessageDigests.digest SHA3-256 64 8.3% > MessageDigests.digest SHA3-256 16384 11% > MessageDigests.digest SHA3-512 64 8.4% > MessageDigests.digest SHA3-512 16384 11.5% > MessageDigests.getAndDigest SHA3-256 64 7.2% > MessageDigests.getAndDigest SHA3-256 16384 11% > MessageDigests.getAndDigest SHA3-512 64 7.3% > MessageDigests.getAndDigest SHA3-512 16384 11.6% > > > and the version that uses the extension is ~1.8x slower than 
C2 > > Existing intrinsic implementation is put under a flag `UseSIMDForSHA3Intrinsic` which is on by default where the intrinsic is enabled currently. > > Sanity tests were modified to cover new intrinsic variants (`-XX:-UseSIMDForSHA3Intrinsic -XX:+-PreserveFramePointer`) on aarch64 hw. Existing test cases where intrinsic is enabled are executed with `-XX:+IgnoreUnrecognizedVMOptions -XX:+UseSIMDForSHA3Intrinsic`, on platforms where the sha3 extension is missing they still are cut off by isSHA3IntrinsicAvailable() predicate.... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: - Merge branch 'openjdk:master' into JDK-8337666 - No imm masking in rolw - Merge branch 'openjdk:master' into JDK-8337666 - Update src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp Co-authored-by: Andrew Haley - Update src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp Co-authored-by: Andrew Haley - Merge branch 'openjdk:master' into JDK-8337666 - Assert message - Copyright year - Review suggestions - Merge master - ... and 2 more: https://git.openjdk.org/jdk/compare/782bbca4...37bda3c2 ------------- Changes: https://git.openjdk.org/jdk/pull/24260/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24260&range=07 Stats: 749 lines in 6 files changed: 743 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/24260.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24260/head:pull/24260 PR: https://git.openjdk.org/jdk/pull/24260 From aph at openjdk.org Thu Jun 5 12:28:54 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 5 Jun 2025 12:28:54 GMT Subject: RFR: 8337666: AArch64: SHA3 GPR intrinsic [v8] In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 12:11:40 GMT, Dmitry Chuyko wrote: >> This is an implementation of SHA3 intrinsics for AArch64 that operates GPRs. It follows the Java implementation algorithm but eagerly uses available registers. 
For example, FP+R18 are used when it's allowed. On simpler cores like RPi3 or Surface Pro it is 23-53% faster than C2 compiled version; on Graviton 3 it is 8-14% faster than C2 compiled version (which is faster than the current intrinsic); on Apple Silicon it is faster than C2 compiled version but slower than the ARMv8.2-SHA intrinsic. Improvements on a particular CPU depend on the input length. For instance, for Graviton 2: >> >> >> G2 >> Benchmark (digesterName) (length) Pct >> MessageDigests.digest SHA3-256 64 28.28% >> MessageDigests.digest SHA3-256 16384 53.58% >> MessageDigests.digest SHA3-512 64 27.97% >> MessageDigests.digest SHA3-512 16384 43.90% >> MessageDigests.getAndDigest SHA3-256 64 26.18% >> MessageDigests.getAndDigest SHA3-256 16384 52.82% >> MessageDigests.getAndDigest SHA3-512 64 24.73% >> MessageDigests.getAndDigest SHA3-512 16384 44.31% >> >> >> (results for intermediate input lengths look like steps) >> >> On Graviton 4 there is still a noticeable difference between the proposed implementation and C2 generated code: >> >> >> G4 >> Benchmark (digesterName) (length) Pct >> MessageDigests.digest SHA3-256 64 8.3% >> MessageDigests.digest SHA3-256 16384 11% >> MessageDigests.digest SHA3-512 64 8.4% >> MessageDigests.digest SHA3-512 16384 11.5% >> MessageDigests.getAndDigest SHA3-256 64 7.2% >> MessageDigests.getAndDigest SHA3-256 16384 11% >> MessageDigests.getAndDigest SHA3-512 64 7.3% >> MessageDigests.getAndDigest SHA3-512 16384 11.6% >> >> >> and the version that uses the extension is ~1.8x slower than C2 >> >> Existing intrinsic implementation is put under a flag `UseSIMDForSHA3Intrinsic` which is on by default where the intrinsic is enabled currently. >> >> Sanity tests were modified to cover new intrinsic variants (`-XX:-UseSIMDForSHA3Intrinsic -XX:+-PreserveFramePointer`) on aarch64 hw. 
Existing test cases where intrinsic is enabled are executed with `-XX:+IgnoreUnrecognizedVMOptions -XX:+UseSIMDForSHA3Intrinsic`, on platforms where the sha3 extension ... > > Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: > > - Merge branch 'openjdk:master' into JDK-8337666 > - No imm masking in rolw > - Merge branch 'openjdk:master' into JDK-8337666 > - Update src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp > > Co-authored-by: Andrew Haley > - Update src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp > > Co-authored-by: Andrew Haley > - Merge branch 'openjdk:master' into JDK-8337666 > - Assert message > - Copyright year > - Review suggestions > - Merge master > - ... and 2 more: https://git.openjdk.org/jdk/compare/782bbca4...37bda3c2 OK, thanks. ------------- Marked as reviewed by aph (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24260#pullrequestreview-2900117365 From aph at openjdk.org Thu Jun 5 12:34:49 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 5 Jun 2025 12:34:49 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 07:48:03 GMT, David Holmes wrote: > Second, finding the math routines and the arguments thereto which actually reach this function is also difficult. I have found 3 tests only that hit the `scalbnA` function at the point where the potential overflow occurs, but beyond that I cannot determine what arguments will cause the different code paths to be taken. Consequently the only testing I could do here was to make a copy of the original `scalbnA` function and then place a check in the callers that the old and new code produced the same result. Again how much coverage this actually gave is not known. That test code still remains in the PR as the initial commit. I'm somewhat perplexed by this. 
Should I conclude that you have never succeeded in triggering the UB, and that you do not know if UB ever occurs? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25656#issuecomment-2944064370 From dholmes at openjdk.org Thu Jun 5 12:54:50 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 5 Jun 2025 12:54:50 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: <1ZsPg9Rm0zaXuh7qbT-l2aDbhj-ekJk7u5SIKHq-Mgk=.acb8187b-9653-47da-8978-14a7bb064a7b@github.com> On Thu, 5 Jun 2025 12:32:39 GMT, Andrew Haley wrote: > I'm somewhat perplexed by this. Should I conclude that you have never succeeded in triggering the UB, and that you do not know if UB ever occurs? Correct. The potential for UB was spotted by code inspection as Kim reported in the JBS issue. I was unable to determine what argument to what mathematical function (tan/sin/cos) would trigger the overflow paths. If there are any math gurus out there that would know, please speak up. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25656#issuecomment-2944162992 From dholmes at openjdk.org Thu Jun 5 13:04:54 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 5 Jun 2025 13:04:54 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: <6iTb9oggPOb1xvGPSKmDir1kBRXx4bmQfj74rXymA-w=.2e3dd383-ed46-4d29-bc8d-57cdfb1e3e8c@github.com> References: <6iTb9oggPOb1xvGPSKmDir1kBRXx4bmQfj74rXymA-w=.2e3dd383-ed46-4d29-bc8d-57cdfb1e3e8c@github.com> Message-ID: <6lZvtF9H-3j_YpE9nYULVbO2KoedCqmgSHTIs6jy6iQ=.9b84c38e-bdef-4662-9bb3-65bff7659082@github.com> On Thu, 5 Jun 2025 10:44:22 GMT, Fredrik Bredberg wrote: >> src/hotspot/share/runtime/sharedRuntimeMath.hpp line 122: >> >>> 120: set_high(&x, (hx&0x800fffff)|((k+n)<<20)); >>> 121: return x; >>> 122: } >> >> Curious if this could be? It looks like this is what the ifs are doing. And it is at least easier for me to see that this is the same as the code before. 
>> Suggestion: >> >> if ((int)u_k > 0) { >> if (u_k > 0x7fe) { >> return hugeX*copysignA(hugeX,x); /* overflow */ >> } >> set_high(&x, (hx&0x800fffff)|((k+n)<<20)); >> return x; >> } > > I like this change. It's easier to follow, at least for me. I think that is equivalent, but I did not want to disrupt the general control flow of the algorithm so that it still closely resembles the original fdlibm code. I also initially misread your suggestion and now I am wondering if I can actually simplify it to a simple two-line change: unsigned u_k = k + n; // avoid UB signed integer overflow k = (int) u_k; // safely assign to k Does that bypass any UB? >> src/hotspot/share/runtime/sharedRuntimeMath.hpp line 124: >> >>> 122: } >>> 123: >>> 124: if (u_k <= (unsigned)-54) { >> >> Could this just be? Or is this less clear / easier to miss? >> Suggestion: >> >> if (u_k <= -54u) { > > For me it's less clear and far easier to miss. > I'll vote for `if (u_k <= (unsigned)-54) {`. > Reading `if (u_k <= -54u) {` just cooks my brain. I'm surprised that is even valid TBH. It strikes me as a numerical oxymoron. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2128800502 PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2128802270 From jsjolen at openjdk.org Thu Jun 5 14:05:50 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 5 Jun 2025 14:05:50 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: <6lZvtF9H-3j_YpE9nYULVbO2KoedCqmgSHTIs6jy6iQ=.9b84c38e-bdef-4662-9bb3-65bff7659082@github.com> References: <6iTb9oggPOb1xvGPSKmDir1kBRXx4bmQfj74rXymA-w=.2e3dd383-ed46-4d29-bc8d-57cdfb1e3e8c@github.com> <6lZvtF9H-3j_YpE9nYULVbO2KoedCqmgSHTIs6jy6iQ=.9b84c38e-bdef-4662-9bb3-65bff7659082@github.com> Message-ID: On Thu, 5 Jun 2025 13:00:48 GMT, David Holmes wrote: >> I like this change. It's easier to follow, at least for me.
> > I think that is equivalent, but I did not want to disrupt the general control flow of the algorithm so that it still closely resembles the original fdlibm code. > > I also initially misread your suggestion and now I am wondering if I can actually simplify it to a simple two-line change: > > unsigned u_k = k + n; // avoid UB signed integer overflow > k = (int) u_k; // safely assign to k > > Does that bypass any UB? Why would it be OK to do `(k+n)<<20` but UB to do `k = k+n`? The addition may still overflow, which would be UB. >> For me it's less clear and far easier to miss. >> I'll vote for `if (u_k <= (unsigned)-54) {`. >> Reading `if (u_k <= -54u) {` just cooks my brain. > > I'm surprised that is even valid TBH. It strikes me as a numerical oxymoron. >"If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). -- end note ]" So this is OK. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2128940099 PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2128921557 From dchuyko at openjdk.org Thu Jun 5 14:31:00 2025 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Thu, 5 Jun 2025 14:31:00 GMT Subject: Integrated: 8337666: AArch64: SHA3 GPR intrinsic In-Reply-To: References: Message-ID: On Wed, 26 Mar 2025 15:55:59 GMT, Dmitry Chuyko wrote: > This is an implementation of SHA3 intrinsics for AArch64 that operates on GPRs. It follows the Java implementation algorithm but eagerly uses available registers. For example, FP+R18 are used when it's allowed.
On simpler cores like RPi3 or Surface Pro it is 23-53% faster than C2 compiled version; on Graviton 3 it is 8-14% faster than C2 compiled version (which is faster than the current intrinsic); on Apple Silicon it is faster than C2 compiled version but slower than the ARMv8.2-SHA intrinsic. Improvements on a particular CPU depend on the input length. For instance, for Graviton 2: > > > G2 > Benchmark (digesterName) (length) Pct > MessageDigests.digest SHA3-256 64 28.28% > MessageDigests.digest SHA3-256 16384 53.58% > MessageDigests.digest SHA3-512 64 27.97% > MessageDigests.digest SHA3-512 16384 43.90% > MessageDigests.getAndDigest SHA3-256 64 26.18% > MessageDigests.getAndDigest SHA3-256 16384 52.82% > MessageDigests.getAndDigest SHA3-512 64 24.73% > MessageDigests.getAndDigest SHA3-512 16384 44.31% > > > (results for intermediate input lengths look like steps) > > On Graviton 4 there is still a noticeable difference between the proposed implementation and C2 generated code: > > > G4 > Benchmark (digesterName) (length) Pct > MessageDigests.digest SHA3-256 64 8.3% > MessageDigests.digest SHA3-256 16384 11% > MessageDigests.digest SHA3-512 64 8.4% > MessageDigests.digest SHA3-512 16384 11.5% > MessageDigests.getAndDigest SHA3-256 64 7.2% > MessageDigests.getAndDigest SHA3-256 16384 11% > MessageDigests.getAndDigest SHA3-512 64 7.3% > MessageDigests.getAndDigest SHA3-512 16384 11.6% > > > and the version that uses the extension is ~1.8x slower than C2 > > Existing intrinsic implementation is put under a flag `UseSIMDForSHA3Intrinsic` which is on by default where the intrinsic is enabled currently. > > Sanity tests were modified to cover new intrinsic variants (`-XX:-UseSIMDForSHA3Intrinsic -XX:+-PreserveFramePointer`) on aarch64 hw. 
Existing test cases where intrinsic is enabled are executed with `-XX:+IgnoreUnrecognizedVMOptions -XX:+UseSIMDForSHA3Intrinsic`, on platforms where the sha3 extension is missing they still are cut off by isSHA3IntrinsicAvailable() predicate.... This pull request has now been integrated. Changeset: 23f1d4f9 Author: Dmitry Chuyko URL: https://git.openjdk.org/jdk/commit/23f1d4f9a993033596ff17751c877f2bb3f792ed Stats: 749 lines in 6 files changed: 743 ins; 0 del; 6 mod 8337666: AArch64: SHA3 GPR intrinsic Reviewed-by: aph ------------- PR: https://git.openjdk.org/jdk/pull/24260 From mdoerr at openjdk.org Thu Jun 5 14:33:51 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 5 Jun 2025 14:33:51 GMT Subject: RFR: 8358653: [s390] Clean up comments regarding frame manager In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 06:10:00 GMT, Amit Kumar wrote: > Basic comment cleanup; replaces "frame manager" by "template interpreter". Looks good and trivial. ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25653#pullrequestreview-2900594327 From jsjolen at openjdk.org Thu Jun 5 14:36:50 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Thu, 5 Jun 2025 14:36:50 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: <6lZvtF9H-3j_YpE9nYULVbO2KoedCqmgSHTIs6jy6iQ=.9b84c38e-bdef-4662-9bb3-65bff7659082@github.com> References: <6iTb9oggPOb1xvGPSKmDir1kBRXx4bmQfj74rXymA-w=.2e3dd383-ed46-4d29-bc8d-57cdfb1e3e8c@github.com> <6lZvtF9H-3j_YpE9nYULVbO2KoedCqmgSHTIs6jy6iQ=.9b84c38e-bdef-4662-9bb3-65bff7659082@github.com> Message-ID: On Thu, 5 Jun 2025 13:00:48 GMT, David Holmes wrote: >> I like this change. It's easier to follow, at least for me. > > I think that is equivalent, but I did not want to disrupt the general control flow of the algorithm so that it still closely resembles the original fdlibm code. 
> > I also initially misread your suggestion and now I am wondering if I can actually simplify it to a simple two line change: > > unsigned u_k = k + n; // avoid UB signed integer overflow > k = (int) u_k; // safely assign to k > > does that bypass any UB? @dholmes-ora , if `k = (int) u_k;` does not say to the compiler that it can assume that `0 <= u_k < 2**31 - 1`, then this seems like a good changeset. Frankly, I would not trust myself in this matter, I find this very finicky. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2129013530 From cjplummer at openjdk.org Thu Jun 5 15:16:54 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 5 Jun 2025 15:16:54 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Thu, 5 Jun 2025 07:04:08 GMT, Matthias Baesken wrote: > > serviceability/sa/ClhsdbCDSCore.java explicitly says it did not create a core file > > The test passes with asan enabled on Linux ppc64le , so probably the failure seen on Linux x86_64 was more machine/environment related than ASAN related ; should I better remove the asan requires from this test ? I just think it is odd that this test failed to produce a core file, but the other core file tests passed. It's also unclear to me why serviceability/sa/ClhsdbFindPC.java failed. You didn't include enough of the .jtr file. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2944945645 From fbredberg at openjdk.org Thu Jun 5 15:26:49 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 5 Jun 2025 15:26:49 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: <6iTb9oggPOb1xvGPSKmDir1kBRXx4bmQfj74rXymA-w=.2e3dd383-ed46-4d29-bc8d-57cdfb1e3e8c@github.com> <6lZvtF9H-3j_YpE9nYULVbO2KoedCqmgSHTIs6jy6iQ=.9b84c38e-bdef-4662-9bb3-65bff7659082@github.com> Message-ID: On Thu, 5 Jun 2025 13:54:49 GMT, Johan Sjölen wrote: >> I'm surprised that is even valid TBH. It strikes me as a numerical oxymoron. > >>"If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). -- end note ]" > > So this is OK. The hard part for me is that when I see `-54u`, I need to make sure its result type is `unsigned`, which is not immediately obvious to me. Reading what the standard says about the built-in unary minus operator I find: _"The type of the result is the type of the promoted type of expression."_ Which means that the result type of `-54u` really is `unsigned`. But I do agree with David, it looks like a numerical oxymoron.
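The two points above (the result type of unary minus on an unsigned operand, and the modular conversion to unsigned) can be checked at compile time. The following is a small sketch rather than code from the PR; the helper `at_most_minus_54` is a hypothetical name that only mirrors the shape of the comparison under review.

```cpp
#include <cassert>
#include <climits>
#include <type_traits>

// Unary minus on an unsigned operand keeps the (promoted) operand type.
static_assert(std::is_same<decltype(-54u), unsigned>::value,
              "-54u has type unsigned");

// Both spellings denote 2^N - 54, where N is the width of unsigned.
static_assert(-54u == (unsigned)-54, "both spellings give the same value");
static_assert(-54u == UINT_MAX - 53u, "the value is 2^N - 54");

// Hypothetical helper with the same shape as the check in the PR.
constexpr bool at_most_minus_54(unsigned u_k) { return u_k <= -54u; }
```

So the choice between `-54u` and `(unsigned)-54` is purely one of readability; some compilers may warn about unary minus applied to an unsigned operand, which could be another reason to prefer the cast.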
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2129118383 From cslucas at openjdk.org Thu Jun 5 16:36:00 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 5 Jun 2025 16:36:00 GMT Subject: RFR: 8357396: Refactor nmethod::make_not_entrant to use Enum instead of "const char*" [v5] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 19:58:32 GMT, Aleksey Shipilev wrote: >> Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert change to attribute of make_not_entrant element > > Looks good to me. Compiler folks might want to ack as well. Thank you @shipilev , @mhaessig for reviewing/testing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25338#issuecomment-2945182555 From cslucas at openjdk.org Thu Jun 5 16:48:56 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 5 Jun 2025 16:48:56 GMT Subject: Integrated: 8357396: Refactor nmethod::make_not_entrant to use Enum instead of "const char*" In-Reply-To: References: Message-ID: On Tue, 20 May 2025 22:08:18 GMT, Cesar Soares Lucas wrote: > Please review this refactor to transform the reasons for making an nmethod not entrant from `const char*` into enum values. > > Tested on Linux x64 with JTREG tier1-3 in fastdebug and release mode. This pull request has now been integrated. 
Changeset: 62fde687 Author: Cesar Soares Lucas URL: https://git.openjdk.org/jdk/commit/62fde687088ce72ef33b94e73babf4bfe1395c17 Stats: 115 lines in 15 files changed: 80 ins; 4 del; 31 mod 8357396: Refactor nmethod::make_not_entrant to use Enum instead of "const char*" Reviewed-by: mhaessig, shade ------------- PR: https://git.openjdk.org/jdk/pull/25338 From mdoerr at openjdk.org Thu Jun 5 17:26:17 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 5 Jun 2025 17:26:17 GMT Subject: RFR: 8334247: [PPC64] Consider trap based nmethod entry barriers [v2] In-Reply-To: References: Message-ID: > We can shrink nmethod entry barriers to 4 instructions (from 8) using conditional trap instructions. Some benchmarks seem to show very small improvements. At least the code size reduction is an advantage. Martin Doerr has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: - Merge remote-tracking branch 'origin' into 8334247_PPC64_trap_based_nmethod_entry_barrier - 8334247: [PPC64] Consider trap based nmethod entry barriers ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24135/files - new: https://git.openjdk.org/jdk/pull/24135/files/67442d4b..dfc6aa5d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24135&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24135&range=00-01 Stats: 552727 lines in 7278 files changed: 228189 ins; 286910 del; 37628 mod Patch: https://git.openjdk.org/jdk/pull/24135.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24135/head:pull/24135 PR: https://git.openjdk.org/jdk/pull/24135 From vlivanov at openjdk.org Thu Jun 5 18:35:57 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Thu, 5 Jun 2025 18:35:57 GMT Subject: RFR: 8352565: Add native method implementation of Reference.get() [v9] In-Reply-To: References: Message-ID: 
On Thu, 5 Jun 2025 08:42:38 GMT, Kim Barrett wrote: >> Please review this change which adds a native method providing the >> implementation of Reference::get. Referece::get is an intrinsic candidate, so >> this native method implementation is only used when the intrinsic is not. >> >> Currently there is intrinsic support by the interpreter, C1, C2, and graal, >> which are always used. With this change we can later remove all the >> per-platform interpreter intrinsic implementations, and might also remove the >> C1 intrinsic implementation. >> >> Testing: >> (1) mach5 tier1-6 normal (so using all the existing intrinsics). >> (2) mach5 tier1-6 with interpreter and C1 Reference::get intrinsics disabled. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > fix comment alignment Marked as reviewed by vlivanov (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24315#pullrequestreview-2901636995 From coleenp at openjdk.org Thu Jun 5 20:40:57 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 5 Jun 2025 20:40:57 GMT Subject: RFR: 8352075: Perf regression accessing fields [v21] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Tue, 3 Jun 2025 07:16:47 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... 
we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)!
This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: > > - Moved jtreg test > - Improved documentation > - Fix coding style (asterisk placement) With a lot of reading, this looks reasonable, but I still have many questions. I'm also testing this with tier1-7. src/hotspot/share/oops/fieldInfo.cpp line 137: > 135: int position; > 136: } field_pos_t; > 137: field_pos_t* positions = nullptr; This is unused. src/hotspot/share/oops/fieldInfo.cpp line 164: > 162: r.read_field_counts(&java_fields, &injected_fields); > 163: assert(java_fields >= 0, "must be"); > 164: if (java_fields == 0 || fis->length() == 0 || static_cast(java_fields) < BinarySearchThreshold) { I don't know why you only sort Java fields and ignore the injected fields. JavaClasses::compute_offsets calls find_local_field, so might not find an injected field, I assume in the java.lang.Class (mirror). Should this sorted cache exclude classes with injected fields? ie if injected_fields > 0? If you exclude classes with injected fields, you could remove the javaClasses code (and maybe not have to re-sort any fields during dynamic dumping (?)) src/hotspot/share/oops/fieldInfo.cpp line 173: > 171: PackedTableBuilder builder(fis->length() - 1, java_fields - 1); > 172: > 173: Array* table = MetadataFactory::new_array(loader_data, java_fields * builder.element_bytes(), CHECK_NULL); Since this isn't used until you fill the table, can you move this down to just before the 'fill' call? src/hotspot/share/oops/fieldInfo.cpp line 285: > 283: FieldInfo fi; > 284: reader.read_field_info(fi); > 285: if (fi.field_flags().is_injected()) { I thought that above, you only process java fields and not the injected fields? 
src/hotspot/share/oops/fieldInfo.hpp line 225: > 223: // Gadget for decoding and reading the stream of field records. > 224: class FieldInfoReader { > 225: UNSIGNED5::Reader _r; I don't like this name _r but it's not your change. src/hotspot/share/oops/fieldInfo.hpp line 238: > 236: > 237: private: > 238: uint32_t next_uint() { return _r.next_uint(); } Why did you make this change and have the callers expose _r ? ------------- PR Review: https://git.openjdk.org/jdk/pull/24847#pullrequestreview-2901503494 PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2946136847 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2129643750 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2129856223 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2129865720 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2129890059 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2130217134 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2130238145 From coleenp at openjdk.org Thu Jun 5 20:40:58 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 5 Jun 2025 20:40:58 GMT Subject: RFR: 8352075: Perf regression accessing fields [v20] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Tue, 3 Jun 2025 05:51:38 GMT, Radim Vansa wrote: >> Can you explain somewhere how fields are mapped to this? I assume they're sorted, for some reason I expected the packed table to be {name-cp-index, sig-cp-index, offset-in-fieldstream-for-direct-access}. Does every field get 4 ints ? So why is it packed into ```Array``` rather than just use ```Array```? So much packing code that I don't know how anyone could ever debug it. 
> > Yes, in practice these all are of the same size, but in case of the masks (as well as in case of arguments in API) I want to stress out that these are 32 bit numbers. The `unsigned int`s are just 'some not too big number'. > Is there any general guidance on deciding between `unsigned int` (I suppose just `unsigned` is not recommended), `uint32_t` and `u4`? > > I was hoping that the comment on line 68 explains the intended use, but I can be more verbose and document each method. When the packed table is used for fieldinfo, it's { offset-in-fieldstream, index-in-fieldstream }. The Comparator implementation can translate offset-in-fieldstream -> { name, signature } and then do the comparison. The `index-in-fieldstream` is kind of second-class citizen; we need to fill it into `FieldInfo` and it is not encoded in the stream, therefore we need to encode it in the packed table. Reading further, I see what this mapping is and intentionally generalized. I guess a comment like, the key and value are sized to represent the maximum value for each and then compacted, or something like that. But maybe I haven't figured out the packing. Are they increments of u1, u2 or u4 or something in between? 
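To make the { offset-in-fieldstream, index-in-fieldstream } mapping discussed above a bit more concrete, here is a minimal standalone sketch of a lookup whose comparator resolves each table entry back to a { name, signature } pair before comparing. Everything in it is an assumption made for illustration: `FieldRecord`, `TableEntry`, `build_table` and `find_field` are hypothetical names, the "stream" is modeled as a `std::map` keyed by a made-up byte offset, and ordering is by string content. It is not the HotSpot implementation (the CDS discussion in this thread suggests the real table is ordered by Symbol addresses).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for one decoded field record in the stream.
struct FieldRecord {
  std::string name;
  std::string signature;
};

// One search-table element: where the record starts in the stream,
// and which field index it corresponds to.
struct TableEntry {
  uint32_t offset;  // offset-in-fieldstream (key)
  uint32_t index;   // index-in-fieldstream (value)
};

// Builds a table sorted by the (name, signature) of the record each entry points to.
static std::vector<TableEntry> build_table(const std::map<uint32_t, FieldRecord>& stream) {
  std::vector<TableEntry> table;
  uint32_t index = 0;
  for (const auto& kv : stream) {
    table.push_back(TableEntry{kv.first, index++});
  }
  std::sort(table.begin(), table.end(), [&](const TableEntry& a, const TableEntry& b) {
    const FieldRecord& ra = stream.at(a.offset);
    const FieldRecord& rb = stream.at(b.offset);
    return std::tie(ra.name, ra.signature) < std::tie(rb.name, rb.signature);
  });
  return table;
}

// Binary search; the comparison translates offset -> { name, signature } on the fly.
// Returns the field index, or -1 if there is no such field.
static int find_field(const std::map<uint32_t, FieldRecord>& stream,
                      const std::vector<TableEntry>& table,
                      const std::string& name, const std::string& signature) {
  std::size_t lo = 0, hi = table.size();
  while (lo < hi) {
    std::size_t mid = lo + (hi - lo) / 2;
    const FieldRecord& r = stream.at(table[mid].offset);
    int c = name.compare(r.name);
    if (c == 0) c = signature.compare(r.signature);
    if (c == 0) return static_cast<int>(table[mid].index);
    if (c < 0) hi = mid; else lo = mid + 1;
  }
  return -1;
}
```

The index has to be stored next to the offset because, as noted above, it is not encoded in the stream itself and cannot be recovered once the entries are reordered.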
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2130268911 From rvansa at openjdk.org Thu Jun 5 20:55:03 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 5 Jun 2025 20:55:03 GMT Subject: RFR: 8352075: Perf regression accessing fields [v21] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: <4YvUnJZ6lk5sJbTScP2_oX43fcbMKWatEkfiXSFEhsM=.f8a3afd9-c896-49db-9f13-6f651cf7795c@github.com> On Thu, 5 Jun 2025 18:54:07 GMT, Coleen Phillimore wrote: >> Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: >> >> - Moved jtreg test >> - Improved documentation >> - Fix coding style (asterisk placement) > > src/hotspot/share/oops/fieldInfo.cpp line 164: > >> 162: r.read_field_counts(&java_fields, &injected_fields); >> 163: assert(java_fields >= 0, "must be"); >> 164: if (java_fields == 0 || fis->length() == 0 || static_cast(java_fields) < BinarySearchThreshold) { > > I don't know why you only sort Java fields and ignore the injected fields. JavaClasses::compute_offsets calls find_local_field, so might not find an injected field, I assume in the java.lang.Class (mirror). Should this sorted cache exclude classes with injected fields? ie if injected_fields > 0? > If you exclude classes with injected fields, you could remove the javaClasses code (and maybe not have to re-sort any fields during dynamic dumping (?)) I don't build a search table for injected fields because I am trying to fix the performance of `InstanceKlass::find_local_field`, and this uses `JavaFieldStream`, which is/was ignoring injected fields in the iteration as well. Classes with injected fields are not excluded; we just don't build the table for the injected fields. There's no lookup by name+signature for them, just `InstanceKlass::field(int index)`, which iterates through `AllFieldStream`.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2130327017 From rvansa at openjdk.org Thu Jun 5 21:04:55 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 5 Jun 2025 21:04:55 GMT Subject: RFR: 8352075: Perf regression accessing fields [v21] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Thu, 5 Jun 2025 19:02:49 GMT, Coleen Phillimore wrote: >> Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: >> >> - Moved jtreg test >> - Improved documentation >> - Fix coding style (asterisk placement) > > src/hotspot/share/oops/fieldInfo.cpp line 285: > >> 283: FieldInfo fi; >> 284: reader.read_field_info(fi); >> 285: if (fi.field_flags().is_injected()) { > > I thought that above, you only process java fields and not the injected fields? `FieldInfoReader` is limited by the full stream, and after iterating through java fields it would start returning injected fields. For java fields we call the lookup below; we know that injected fields don't have a record in the table, and we know that there won't be any more java fields after we encounter the first injected field; that's why we `break` the cycle here. 
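The early exit described above relies on the stream ordering: all Java fields come first, then the injected ones. A minimal sketch of that loop shape follows, under loudly stated assumptions: `FieldRecord` and `visit_java_fields` are hypothetical names, the stream is modeled as a plain vector, and the "lookup" is replaced by recording the visited names. It only illustrates the `break`, not the actual `FieldInfoReader` code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a decoded field record.
struct FieldRecord {
  std::string name;
  bool injected;
};

// Visits records in stream order and stops at the first injected field.
// Assumes (as described in the thread) that no Java field follows an injected one.
// Returns the number of Java fields visited.
static std::size_t visit_java_fields(const std::vector<FieldRecord>& stream,
                                     std::vector<std::string>* visited) {
  std::size_t count = 0;
  for (const FieldRecord& f : stream) {
    if (f.injected) {
      break;  // injected fields have no record in the search table
    }
    visited->push_back(f.name);  // placeholder for the per-field table lookup
    ++count;
  }
  return count;
}
```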
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2130366971 From rvansa at openjdk.org Thu Jun 5 21:10:01 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 5 Jun 2025 21:10:01 GMT Subject: RFR: 8352075: Perf regression accessing fields [v21] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Thu, 5 Jun 2025 20:28:13 GMT, Coleen Phillimore wrote: >> Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: >> >> - Moved jtreg test >> - Improved documentation >> - Fix coding style (asterisk placement) > > src/hotspot/share/oops/fieldInfo.hpp line 238: > >> 236: >> 237: private: >> 238: uint32_t next_uint() { return _r.next_uint(); } > > Why did you make this change and have the callers expose _r ? AFAIU `_r` is not exposed; it's private. My change removes `next_uint()` because it's at the wrong level of abstraction: `FieldInfoReader` should expose things like the java/injected field counts (and the field info itself), not just some 'uint's. The encapsulation is imperfect, as the methods still have to be called in the correct order, but to me it seemed like a way forward. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2130383028 From rvansa at openjdk.org Thu Jun 5 21:25:56 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 5 Jun 2025 21:25:56 GMT Subject: RFR: 8352075: Perf regression accessing fields [v21] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Tue, 3 Jun 2025 07:16:47 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 .
Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms …
42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: > > - Moved jtreg test > - Improved documentation > - Fix coding style (asterisk placement) > Reading further, I see what this mapping is and intentionally generalized. I guess a comment like, the key and value are sized to represent the maximum value for each and then compacted, or something like that. But maybe I haven't figured out the packing. Are they increments of u1, u2 or u4 or something in between? From the constructor that accepts the maximum number for each we figure out the number of bits required to store those numbers. Imagine that only as bits (not aligned to byte boundary... yet). Then we concatenate the bits for key and value, and then 'add' 1-7 padding zeroes (high-order bits) to align on bytes. So in the end we have each element in the table consuming 1-8 bytes (case with 0 bits for both key and value is ruled out). In case of the fields search table, the key will be at most the stream length (which in turn has at most 65536 fields and each occupies up to 8 varints which is <= 40 bytes, though normally far less) and value is at most 65535 (# fields in class) so this is 1-5 bytes. I thought that the presence of `max_key` and `max_value` in constructors along with the comment on `PackedTableBase` saying > Each element consists of **up to** 32-bit key, and **up to** 32-bit value; these are **packed into a bit-record** with 1-byte alignment. pretty much says that - shall I place extra comments somewhere else?
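The packing scheme explained above can be sketched in a few lines. This is only an illustration under stated assumptions, not the `PackedTableBase` code from the PR: the names `bits_for`, `PackedLayout`, `make_layout`, `pack` and `unpack` are hypothetical, and placing the key in the low-order bits with little-endian byte order is an assumption of the sketch, since the thread does not spell out the ordering.

```cpp
#include <cassert>
#include <cstdint>

// Number of bits needed to represent any value in [0, max_value].
static unsigned bits_for(uint32_t max_value) {
  unsigned bits = 0;
  while (max_value != 0) {
    ++bits;
    max_value >>= 1;
  }
  return bits;
}

// Hypothetical layout descriptor derived from the two maxima.
struct PackedLayout {
  unsigned key_bits;
  unsigned value_bits;
  unsigned element_bytes;  // key_bits + value_bits, rounded up to whole bytes
};

static PackedLayout make_layout(uint32_t max_key, uint32_t max_value) {
  PackedLayout l;
  l.key_bits = bits_for(max_key);
  l.value_bits = bits_for(max_value);
  // The padding zeroes end up in the high-order bits of the last byte.
  l.element_bytes = (l.key_bits + l.value_bits + 7) / 8;
  return l;
}

// Concatenates key and value bits (key in the low bits; an assumption of this sketch)
// and writes element_bytes bytes, least significant byte first.
static void pack(const PackedLayout& l, uint32_t key, uint32_t value, uint8_t* out) {
  uint64_t record = static_cast<uint64_t>(key) | (static_cast<uint64_t>(value) << l.key_bits);
  for (unsigned i = 0; i < l.element_bytes; i++) {
    out[i] = static_cast<uint8_t>(record >> (8 * i));
  }
}

static void unpack(const PackedLayout& l, const uint8_t* in, uint32_t* key, uint32_t* value) {
  uint64_t record = 0;
  for (unsigned i = 0; i < l.element_bytes; i++) {
    record |= static_cast<uint64_t>(in[i]) << (8 * i);
  }
  const uint64_t key_mask = (uint64_t{1} << l.key_bits) - 1;
  const uint64_t value_mask = (uint64_t{1} << l.value_bits) - 1;
  *key = static_cast<uint32_t>(record & key_mask);
  *value = static_cast<uint32_t>((record >> l.key_bits) & value_mask);
}
```

With these definitions, a maximum key of 65535 * 40 needs 22 bits and a maximum value of 65535 needs 16 bits, so one element takes 5 bytes, which matches the upper bound quoted above for the field search table. Two full 32-bit maxima give the 8-byte worst case.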
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2946329561 From rvansa at openjdk.org Thu Jun 5 21:32:50 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 5 Jun 2025 21:32:50 GMT Subject: RFR: 8352075: Perf regression accessing fields [v22] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ? ?): 51.3 ms ? 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min ? max): 45.1 ms ? 
53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min … max): 73.8 ms … 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min … max): 37.7 ms … 42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Removed extra global var and moved the allocation ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/862b264b..14e00d0f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=21 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=20-21 Stats: 10 lines in 1 file changed: 2 ins; 8 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From rvansa at openjdk.org Thu Jun 5 21:32:51 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 5 Jun 2025 21:32:51 GMT Subject: RFR: 8352075: Perf regression accessing fields [v21] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Thu, 5 Jun 2025 18:04:00 GMT, Coleen Phillimore wrote: >> Radim
Vansa has updated the pull request incrementally with three additional commits since the last revision: >> >> - Moved jtreg test >> - Improved documentation >> - Fix coding style (asterisk placement) > > src/hotspot/share/oops/fieldInfo.cpp line 137: > >> 135: int position; >> 136: } field_pos_t; >> 137: field_pos_t* positions = nullptr; > > This is unused. Oops, copy-paste error. Thanks for spotting this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2130466647 From coleenp at openjdk.org Thu Jun 5 21:44:57 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 5 Jun 2025 21:44:57 GMT Subject: RFR: 8352075: Perf regression accessing fields [v22] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Thu, 5 Jun 2025 21:32:50 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. 
In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ? ?): 51.3 ms ? 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min ? max): 45.1 ms ? 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ? ?): 78.2 ms ? 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min ? max): 73.8 ms ? 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ? ?): 38.5 ms ? 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min ? max): 37.7 ms ? 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Removed extra global var and moved the allocation Your explanation above really helps. Can you add that to above the PackedTableBase constructor? It's easier to understand in prose than reading this constructor even though it's short. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2946409749 From coleenp at openjdk.org Thu Jun 5 22:16:54 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 5 Jun 2025 22:16:54 GMT Subject: RFR: 8352075: Perf regression accessing fields [v21] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: <-fNueN0qDKvh7hmAhaqWAvutzLWlPzu6O0gWgQSFVkY=.be36c9ac-671d-4e81-86d1-03b46fc52785@github.com> On Thu, 5 Jun 2025 05:17:48 GMT, Ioi Lam wrote: >> Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: >> >> - Moved jtreg test >> - Improved documentation >> - Fix coding style (asterisk placement) > > I have written a POC that shows that the table must be sorted again when dumping a dynamic CDS archive. See https://github.com/iklam/jdk/commit/dcd53ebaeab7b38be02aa5b896ce9e449a45418f > > Explanations are in [here](https://github.com/iklam/jdk/commit/dcd53ebaeab7b38be02aa5b896ce9e449a45418f#diff-fd7608607ecf305bb3535b500bff5d53ec216d2da25e3bad1a1d699f56b09283R199) > > I will create an RFE for the JDK mainline that adds built-in debugging support for the `(oldSym > newSym_orig)` condition as describe in the POC. Please wait for that before integrating this PR. I can help you write the code for re-sorting the tables. As @iklam was saying in above, the table won't work for dynamic dumping for CDS and fails all these tests. Array* FieldInfoStream::create_search_table(ConstantPool* cp, const Array* fis, ClassLoaderData* loader_data, TRAPS) { + if (CDSConfig::is_dumping_dynamic_archive()) { + // We cannot call validate_search_table. The _fieldinfo_search_table should be sorted by "requested" addresses, + // but validate_search_table will be getting Symbol* addresses from _constants, which has "buffered" addresses. 
+ // + // For background, see new comments inside allocate_node_impl in symbolTable.cpp + return nullptr; + } + This fixes these tests. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2946527410 From iklam at openjdk.org Thu Jun 5 23:14:04 2025 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 5 Jun 2025 23:14:04 GMT Subject: RFR: 8352075: Perf regression accessing fields [v21] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: <0GF59UrB4MHOBgk7V0Ew5R5ekYI4RroXDvxreSu23b8=.b278fd07-3b86-41e6-a05c-a87d6396f841@github.com> On Thu, 5 Jun 2025 05:17:48 GMT, Ioi Lam wrote: >> Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: >> >> - Moved jtreg test >> - Improved documentation >> - Fix coding style (asterisk placement) > > I have written a POC that shows that the table must be sorted again when dumping a dynamic CDS archive. See https://github.com/iklam/jdk/commit/dcd53ebaeab7b38be02aa5b896ce9e449a45418f > > Explanations are in [here](https://github.com/iklam/jdk/commit/dcd53ebaeab7b38be02aa5b896ce9e449a45418f#diff-fd7608607ecf305bb3535b500bff5d53ec216d2da25e3bad1a1d699f56b09283R199) > > I will create an RFE for the JDK mainline that adds built-in debugging support for the `(oldSym > newSym_orig)` condition as describe in the POC. Please wait for that before integrating this PR. I can help you write the code for re-sorting the tables. > As @iklam was saying in above, the table won't work for dynamic dumping for CDS and fails all these tests. > > ``` > Array* FieldInfoStream::create_search_table(ConstantPool* cp, const Array* fis, ClassLoaderData* loader_data, TRAPS) { > + if (CDSConfig::is_dumping_dynamic_archive()) { > + // We cannot call validate_search_table. 
The _fieldinfo_search_table should be sorted by "requested" addresses, > + // but validate_search_table will be getting Symbol* addresses from _constants, which has "buffered" addresses. > + // > + // For background, see new comments inside allocate_node_impl in symbolTable.cpp > + return nullptr; > + } > + > ``` > > This fixes these tests. This fix looks good to me. It also obviates the need to re-sort the table for dynamic CDS dump. I am OK with this. We can implement re-sorting for the dynamic CDS archives in a separate RFE if desired. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2946819592 From amitkumar at openjdk.org Fri Jun 6 03:53:02 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 6 Jun 2025 03:53:02 GMT Subject: RFR: 8358653: [s390] Clean up comments regarding frame manager In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 06:10:00 GMT, Amit Kumar wrote: > Basic comment cleanup; replaces "frame manager" by "template interpreter". Thanks for the review Martin. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25653#issuecomment-2947995094 From amitkumar at openjdk.org Fri Jun 6 03:53:02 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 6 Jun 2025 03:53:02 GMT Subject: Integrated: 8358653: [s390] Clean up comments regarding frame manager In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 06:10:00 GMT, Amit Kumar wrote: > Basic comment cleanup; replaces "frame manager" by "template interpreter". This pull request has now been integrated. 
Changeset: 28acca60 Author: Amit Kumar URL: https://git.openjdk.org/jdk/commit/28acca609bbb8ade0af88b536c8c88b7fa43849a Stats: 15 lines in 6 files changed: 0 ins; 0 del; 15 mod 8358653: [s390] Clean up comments regarding frame manager Reviewed-by: mdoerr ------------- PR: https://git.openjdk.org/jdk/pull/25653 From kbarrett at openjdk.org Fri Jun 6 06:00:48 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 6 Jun 2025 06:00:48 GMT Subject: RFR: 8352565: Add native method implementation of Reference.get() [v10] In-Reply-To: References: Message-ID: > Please review this change which adds a native method providing the > implementation of Reference::get. Referece::get is an intrinsic candidate, so > this native method implementation is only used when the intrinsic is not. > > Currently there is intrinsic support by the interpreter, C1, C2, and graal, > which are always used. With this change we can later remove all the > per-platform interpreter intrinsic implementations, and might also remove the > C1 intrinsic implementation. > > Testing: > (1) mach5 tier1-6 normal (so using all the existing intrinsics). > (2) mach5 tier1-6 with interpreter and C1 Reference::get intrinsics disabled. 
Kim Barrett has updated the pull request incrementally with two additional commits since the last revision: - add pseudo-native entry for Reference.get0 - tidy CallGenerator lookup in Compile ctor ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24315/files - new: https://git.openjdk.org/jdk/pull/24315/files/edd4dec2..46ba079f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24315&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24315&range=08-09 Stats: 6 lines in 2 files changed: 1 ins; 2 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/24315.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24315/head:pull/24315 PR: https://git.openjdk.org/jdk/pull/24315 From kbarrett at openjdk.org Fri Jun 6 06:00:49 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 6 Jun 2025 06:00:49 GMT Subject: RFR: 8352565: Add native method implementation of Reference.get() [v10] In-Reply-To: References: Message-ID: <6VlOZeW27xLro_Q5JSZ-JdbqqAKovTTkWKlW6Vij-qk=.4aa5ec6c-5ec9-4bb5-9023-417df122e411@github.com> On Fri, 6 Jun 2025 05:57:16 GMT, Kim Barrett wrote: >> Please review this change which adds a native method providing the >> implementation of Reference::get. Referece::get is an intrinsic candidate, so >> this native method implementation is only used when the intrinsic is not. >> >> Currently there is intrinsic support by the interpreter, C1, C2, and graal, >> which are always used. With this change we can later remove all the >> per-platform interpreter intrinsic implementations, and might also remove the >> C1 intrinsic implementation. >> >> Testing: >> (1) mach5 tier1-6 normal (so using all the existing intrinsics). >> (2) mach5 tier1-6 with interpreter and C1 Reference::get intrinsics disabled. 
> > Kim Barrett has updated the pull request incrementally with two additional commits since the last revision: > > - add pseudo-native entry for Reference.get0 > - tidy CallGenerator lookup in Compile ctor src/hotspot/share/interpreter/templateInterpreterGenerator.cpp line 231: > 229: // intrinsic is disabled. > 230: native_method_entry(java_lang_Thread_currentThread) > 231: native_method_entry(java_lang_ref_reference_get0) It turned out there was a bug lurking in the change to move the intrinsic to Reference::get0. I had tested it with the interpreter intrinsic made inoperative, but nearly forgot to test the normal case. It turned out that if the interpreter intrinsic was operational but disabled then the interpreter would hit an assert "tried to execute native method as non-native". This line is the fix for that. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24315#discussion_r2131558170 From kbarrett at openjdk.org Fri Jun 6 06:02:59 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 6 Jun 2025 06:02:59 GMT Subject: RFR: 8352565: Add native method implementation of Reference.get() [v10] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 17:39:55 GMT, Chen Liang wrote: >> Kim Barrett has updated the pull request incrementally with two additional commits since the last revision: >> >> - add pseudo-native entry for Reference.get0 >> - tidy CallGenerator lookup in Compile ctor > > src/hotspot/share/opto/compile.cpp line 786: > >> 784: initial_gvn()->set_type_bottom(s); >> 785: verify_start(s); >> 786: if (method()->intrinsic_id() == vmIntrinsics::_Reference_get) { > > Should we remove this now or as part of the redundant intrinsic cleanup for interpreter and c1? I see the interpreter is now kept intact. I'm planning to remove the interpreter intrinsic as a followup. I couldn't figure out how to hit this after making Reference::get0 the intrinsic. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24315#discussion_r2131561433 From mbaesken at openjdk.org Fri Jun 6 06:44:59 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 6 Jun 2025 06:44:59 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Thu, 5 Jun 2025 15:14:05 GMT, Chris Plummer wrote: > I just think it is odd that this test failed to produce a core file, but the other core file tests passed. I don't think they all passed, but can't find a log of the old run ; most likely it was machine specific. I will remove it from this PR because it is not sure at all that it is ASAN related . ------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2948259791 From rvansa at openjdk.org Fri Jun 6 07:14:21 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Fri, 6 Jun 2025 07:14:21 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. 
The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ? ?): 51.3 ms ? 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min ? max): 45.1 ms ? 53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ? ?): 78.2 ms ? 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min ? max): 73.8 ms ? 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ? ?): 38.5 ms ? 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min ? max): 37.7 ms ? 42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! 
This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with two additional commits since the last revision: - Add more comments - Disable search table with dynamic CDS ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/14e00d0f..d75d6240 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=22 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=21-22 Stats: 16 lines in 2 files changed: 16 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From rvansa at openjdk.org Fri Jun 6 07:19:04 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Fri, 6 Jun 2025 07:19:04 GMT Subject: RFR: 8352075: Perf regression accessing fields [v21] In-Reply-To: <0GF59UrB4MHOBgk7V0Ew5R5ekYI4RroXDvxreSu23b8=.b278fd07-3b86-41e6-a05c-a87d6396f841@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <0GF59UrB4MHOBgk7V0Ew5R5ekYI4RroXDvxreSu23b8=.b278fd07-3b86-41e6-a05c-a87d6396f841@github.com> Message-ID: On Thu, 5 Jun 2025 23:11:21 GMT, Ioi Lam wrote: >> I have written a POC that shows that the table must be sorted again when dumping a dynamic CDS archive. See https://github.com/iklam/jdk/commit/dcd53ebaeab7b38be02aa5b896ce9e449a45418f >> >> Explanations are in [here](https://github.com/iklam/jdk/commit/dcd53ebaeab7b38be02aa5b896ce9e449a45418f#diff-fd7608607ecf305bb3535b500bff5d53ec216d2da25e3bad1a1d699f56b09283R199) >> >> I will create an RFE for the JDK mainline that adds built-in debugging support for the `(oldSym > newSym_orig)` condition as describe in the POC. Please wait for that before integrating this PR. 
I can help you write the code for re-sorting the tables.
>
>> As @iklam was saying in above, the table won't work for dynamic dumping for CDS and fails all these tests.
>>
>> ```
>> Array* FieldInfoStream::create_search_table(ConstantPool* cp, const Array* fis, ClassLoaderData* loader_data, TRAPS) {
>> +  if (CDSConfig::is_dumping_dynamic_archive()) {
>> +    // We cannot call validate_search_table. The _fieldinfo_search_table should be sorted by "requested" addresses,
>> +    // but validate_search_table will be getting Symbol* addresses from _constants, which has "buffered" addresses.
>> +    //
>> +    // For background, see new comments inside allocate_node_impl in symbolTable.cpp
>> +    return nullptr;
>> +  }
>> +
>> ```
>>
>> This fixes these tests.
>
> This fix looks good to me. It also obviates the need to re-sort the table for dynamic CDS dump. I am OK with this. We can implement re-sorting for the dynamic CDS archives in a separate RFE if desired.

Great, I thought that @iklam requested to add something more substantial (and I didn't know that there are existing tests that would fail). Any re-sorting with dynamic CDS can be added as a follow-up.

So do you think we could still squeeze this in before the rampdown? (I'll type /integrate just in case, the timezones create some lag).

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2948325055

From mbaesken at openjdk.org Fri Jun 6 08:23:52 2025
From: mbaesken at openjdk.org (Matthias Baesken)
Date: Fri, 6 Jun 2025 08:23:52 GMT
Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v3]
In-Reply-To:
References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com>
Message-ID:

On Thu, 5 Jun 2025 06:58:33 GMT, Matthias Baesken wrote:

>> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes .
>> Those fail when the address sanitizer is configured ( --enable-asan ).
>> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled.
>> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) .
>> While at it, also same is also added for ubsan .
>
> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision:
>
> AOTCodeCompressedOopsTest will be handled separately

build/AbsPathsInImage.java fails with ASAN because it reports a lot of 'unwanted' paths in the binaries:

java.lang.Exception: Test failed
	at AbsPathsInImage.main(AbsPathsInImage.java:122)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:565)
	at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
	at java.base/java.lang.Thread.run(Thread.java:1474)

However, the test is already skipped for debug builds, so we can skip it for ASAN too; we most likely don't want to deliver ASAN-enabled builds to customers.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2948471473

From aph at openjdk.org Fri Jun 6 09:49:58 2025
From: aph at openjdk.org (Andrew Haley)
Date: Fri, 6 Jun 2025 09:49:58 GMT
Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v4]
In-Reply-To:
References:
Message-ID:

On Thu, 5 Jun 2025 05:16:25 GMT, Liming Liu wrote:

> According to perf, post-increment ops help to reduce the access to TLB on Ampere1 in this case.

Hmm, but it's code in a rather odd style in shared code. And from what I see, the intrinsic is only 22% of the runtime (for 128 bytes) anyway, and you're making the code larger. I certainly don't want to see this sort of thing proliferating in the intrinsics.

In general, it's up to CPU designers to make simple, straightforward code work well.

How important is this?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2131885760 From aph at openjdk.org Fri Jun 6 09:49:58 2025 From: aph at openjdk.org (Andrew Haley) Date: Fri, 6 Jun 2025 09:49:58 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v4] In-Reply-To: References: Message-ID: On Fri, 6 Jun 2025 09:44:54 GMT, Andrew Haley wrote: >> According to perf, post-increment ops help to reduce the access to TLB on Ampere1 in this case. > >> According to perf, post-increment ops help to reduce the access to TLB on Ampere1 in this case. > > Hmm, but it's code in a rather odd style in shared code. And from what I see, the intrinsic is only 22% of the runtime (for 128 bytes) anyway, and you're making the code larger. I certainly don't want to see this sort of thing proliferating in the intrinsics. > > In general, it's up to CPU designers to make simple, straightforward code work well. > > How important is this? On the other hand this code already exists in CRC32C, so it's simply unifying the two routines. OK, I won't object. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2131889129 From coleenp at openjdk.org Fri Jun 6 11:03:58 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 6 Jun 2025 11:03:58 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Fri, 6 Jun 2025 07:14:21 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . 
Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 .
>>
>> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields).
>>
>> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either.
>>
>> My measurements on the attached reproducer
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC
>> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms]
>> Range (min … max): 45.1 ms … 53.9 ms 100 runs
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC
>> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms]
>> Range (min … max): 73.8 ms … 79.7 ms 100 runs
>>
>> (the jdk25-master above already contains JDK-8353175)
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC
>> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms]
>> Range (min … max): 37.7 ms …
42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with two additional commits since the last revision: > > - Add more comments > - Disable search table with dynamic CDS To integrate hotspot changes, you need two reviewers and people 'requesting changes' to withdraw their requests. Thank goodness the bots prevented this from being integrated. You need to wait for all the comments to be resolved. This is a P3 bug so you have more time to get this integrated for JDK 25. I posted the schedule in the issue. The process is that this change would be integrated into the main repository (destined for JDK 26 and then slash-backported to JDK 25 a couple days later if testing is clean). My tier 1-7 testing passes with the dynamic CDS patch above. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2948894141 From rvansa at openjdk.org Fri Jun 6 11:15:59 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Fri, 6 Jun 2025 11:15:59 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Fri, 6 Jun 2025 07:14:21 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . 
Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 .
>>
>> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields).
>>
>> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either.
>>
>> My measurements on the attached reproducer
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC
>> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms]
>> Range (min … max): 45.1 ms … 53.9 ms 100 runs
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC
>> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms]
>> Range (min … max): 73.8 ms … 79.7 ms 100 runs
>>
>> (the jdk25-master above already contains JDK-8353175)
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC
>> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms]
>> Range (min … max): 37.7 ms …
42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with two additional commits since the last revision: > > - Add more comments > - Disable search table with dynamic CDS I am not a committer, so I didn't expect any action to be taken until this gets all the approvals and someone sponsors this. (and I've misremembered the date, thinking it's today). ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2948915602 From rvansa at openjdk.org Fri Jun 6 11:16:00 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Fri, 6 Jun 2025 11:16:00 GMT Subject: RFR: 8352075: Perf regression accessing fields [v7] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <0ukRXRaojw8KZXGYN_IpikqnA1kNsMoCWFE-0fOBHjk=.bda5ca70-ed55-4a27-a018-c89331415080@github.com> Message-ID: On Wed, 21 May 2025 15:00:07 GMT, Chris Plummer wrote: >> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix typo > > It looks like you removed the SA changes, so I'm not so sure you still need a review from me. I just ask that you make sure the tests in serviceability/sa and sun/tools/jhsdb all pass. I'm about to be OOO for a week, so I won't be able to responds again until then. @plummercj @rose00 Could I ask you to withdraw the request for changes / clarify further ? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2948919121 From mbaesken at openjdk.org Fri Jun 6 11:27:31 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 6 Jun 2025 11:27:31 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v4] In-Reply-To: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: <04ztddbwMOlo-dq6TQ4_jPWfj-wNhsyDNWQSsohFo-E=.98d51cab-3302-41fd-b214-dbb73f4f6e0d@github.com> > There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . > Those fail when the address sanitizer is configured ( --enable-asan ). > The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. > Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . > While at it, also same is also added for ubsan . 
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: remove the asan disabling from 2 sa tests; they work with asan too on some machines at least ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25575/files - new: https://git.openjdk.org/jdk/pull/25575/files/8b9e3dde..7ce447b3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=02-03 Stats: 5 lines in 2 files changed: 0 ins; 3 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25575.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25575/head:pull/25575 PR: https://git.openjdk.org/jdk/pull/25575 From mbaesken at openjdk.org Fri Jun 6 11:34:04 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 6 Jun 2025 11:34:04 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v5] In-Reply-To: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: > There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . > Those fail when the address sanitizer is configured ( --enable-asan ). > The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. > Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . > While at it, also same is also added for ubsan . 
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: exclude AbsPathInImage test from asan ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25575/files - new: https://git.openjdk.org/jdk/pull/25575/files/7ce447b3..f8458f10 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=03-04 Stats: 4 lines in 2 files changed: 3 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25575.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25575/head:pull/25575 PR: https://git.openjdk.org/jdk/pull/25575 From mbaesken at openjdk.org Fri Jun 6 11:34:04 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 6 Jun 2025 11:34:04 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v4] In-Reply-To: <04ztddbwMOlo-dq6TQ4_jPWfj-wNhsyDNWQSsohFo-E=.98d51cab-3302-41fd-b214-dbb73f4f6e0d@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> <04ztddbwMOlo-dq6TQ4_jPWfj-wNhsyDNWQSsohFo-E=.98d51cab-3302-41fd-b214-dbb73f4f6e0d@github.com> Message-ID: On Fri, 6 Jun 2025 11:27:31 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . 
> > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > remove the asan disabling from 2 sa tests; they work with asan too on some machines at least For the now remaining tests, are you okay with the explanation? Should I add it as comment to the test files, or let's simply keep it here? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2948963076 From coleenp at openjdk.org Fri Jun 6 12:06:28 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 6 Jun 2025 12:06:28 GMT Subject: RFR: 8358326: Use oopFactory array allocation Message-ID: This patch removes cases of direct calls to {type,obj}ArrayKlass->allocate() and calls oopFactory::new_*array instead. It also renames {type,obj}ArrayKlass->allocate functions to allocate_klass and allocate_instance so it's more clear which allocation it's doing and to match InstanceKlass allocate functions, and makes these functions private with friends for Deoptimization and oopFactory. For JEP 401, arrays are being extended to support new formats and attributes and this reduces the call sites. Tested with tier1-7. 
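
A minimal sketch of the access pattern this change describes (allocation functions made private, with a factory as the designated friend caller). All class and function names below are hypothetical illustrations, not the HotSpot declarations, and the length check is only an assumed stand-in for whatever validation the real allocators perform:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

class ArrayFactory;  // hypothetical stand-in for a factory such as oopFactory

// Hypothetical array "klass": it knows how to allocate instances, but does
// not let arbitrary code call the allocator directly.
class IntArrayType {
 public:
  explicit IntArrayType(std::size_t max_length) : _max_length(max_length) {}

 private:
  friend class ArrayFactory;  // the only permitted caller of allocate_instance()

  // Returns nullptr when the requested length exceeds the configured limit.
  std::unique_ptr<std::vector<int>> allocate_instance(std::size_t length) const {
    if (length > _max_length) {
      return nullptr;
    }
    return std::unique_ptr<std::vector<int>>(new std::vector<int>(length));
  }

  std::size_t _max_length;
};

// Single entry point for array creation; call sites go through here, so a
// later change to array layout touches one place instead of many.
class ArrayFactory {
 public:
  static std::unique_ptr<std::vector<int>> new_int_array(const IntArrayType& type,
                                                         std::size_t length) {
    return type.allocate_instance(length);
  }
};
```

With this shape, code outside the factory that tries to call `allocate_instance()` directly fails to compile, which is one way to keep the number of allocation call sites small.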
------------- Commit messages: - 8358326: Use oopFactory array allocation Changes: https://git.openjdk.org/jdk/pull/25590/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25590&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358326 Stats: 66 lines in 10 files changed: 23 ins; 14 del; 29 mod Patch: https://git.openjdk.org/jdk/pull/25590.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25590/head:pull/25590 PR: https://git.openjdk.org/jdk/pull/25590 From iklam at openjdk.org Fri Jun 6 16:17:01 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 6 Jun 2025 16:17:01 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Fri, 6 Jun 2025 07:14:21 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). 
>>
>> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either.
>>
>> My measurements on the attached reproducer
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC
>> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms]
>> Range (min … max): 45.1 ms … 53.9 ms 100 runs
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC
>> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms]
>> Range (min … max): 73.8 ms … 79.7 ms 100 runs
>>
>> (the jdk25-master above already contains JDK-8353175)
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC
>> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms]
>> Range (min … max): 37.7 ms … 42.1 ms 100 runs
>>
>> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement:
>>
>> JDK 17: 1.6 s
>> JDK 21 (no patches): 22 s
>> JDK25-master: 12.3 s
>> JDK25-this-pr: 0.5 s
>
> Radim Vansa has updated the pull request incrementally with two additional commits since the last revision:
>
> - Add more comments
> - Disable search table with dynamic CDS

I don't think there's a rush to get this into JDK 25, especially this PR adds a lot of complexity into the field lookup code. I have to say honestly that I can't understand how it works without tracing in gdb.
While the benefit can be seen in a synthetic benchmark, do we have any data that shows a benefit in real world applications? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2949768200 From cjplummer at openjdk.org Fri Jun 6 16:21:55 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 6 Jun 2025 16:21:55 GMT Subject: RFR: 8352075: Perf regression accessing fields [v7] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <0ukRXRaojw8KZXGYN_IpikqnA1kNsMoCWFE-0fOBHjk=.bda5ca70-ed55-4a27-a018-c89331415080@github.com> Message-ID: On Wed, 21 May 2025 15:00:07 GMT, Chris Plummer wrote: >> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix typo > > It looks like you removed the SA changes, so I'm not so sure you still need a review from me. I just ask that you make sure the tests in serviceability/sa and sun/tools/jhsdb all pass. I'm about to be OOO for a week, so I won't be able to responds again until then. > @plummercj Could I ask you to withdraw the request for changes / clarify further ? As I mentioned earlier: > It looks like you removed the SA changes, so I'm not so sure you still need a review from me. I just ask that you make sure the tests in serviceability/sa and sun/tools/jhsdb all pass. This still seems to be the case. Also, I don't think there is any formal way to withdraw a request for changes. Maybe I could mark it as reviewed, but that would be misleading since after you removed the SA changes there would not actually be any code in this PR that I would be reviewing, so I wouldn't want to be listed as a reviewer. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2949783839 From iklam at openjdk.org Fri Jun 6 16:31:55 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 6 Jun 2025 16:31:55 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Fri, 6 Jun 2025 07:14:21 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. 
>>
>> My measurements on the attached reproducer
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC
>> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms]
>> Range (min … max): 45.1 ms … 53.9 ms 100 runs
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC
>> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms]
>> Range (min … max): 73.8 ms … 79.7 ms 100 runs
>>
>> (the jdk25-master above already contains JDK-8353175)
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC
>> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms]
>> Range (min … max): 37.7 ms … 42.1 ms 100 runs
>>
>> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement:
>>
>> JDK 17: 1.6 s
>> JDK 21 (no patches): 22 s
>> JDK25-master: 12.3 s
>> JDK25-this-pr: 0.5 s
>
> Radim Vansa has updated the pull request incrementally with two additional commits since the last revision:
>
> - Add more comments
> - Disable search table with dynamic CDS

To aid debugging, please add the FieldInfoStream::print_search_table() code (called from InstanceKlass::print_on()) that I included in https://github.com/iklam/jdk/commit/dcd53ebaeab7b38be02aa5b896ce9e449a45418f

-------------

Changes requested by iklam (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/24847#pullrequestreview-2905429817 From coleenp at openjdk.org Fri Jun 6 16:45:02 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 6 Jun 2025 16:45:02 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Fri, 6 Jun 2025 07:14:21 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. 
>>
>> My measurements on the attached reproducer
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC
>> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms]
>> Range (min … max): 45.1 ms … 53.9 ms 100 runs
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC
>> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms]
>> Range (min … max): 73.8 ms … 79.7 ms 100 runs
>>
>> (the jdk25-master above already contains JDK-8353175)
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC
>> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms]
>> Range (min … max): 37.7 ms … 42.1 ms 100 runs
>>
>> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement:
>>
>> JDK 17: 1.6 s
>> JDK 21 (no patches): 22 s
>> JDK25-master: 12.3 s
>> JDK25-this-pr: 0.5 s
>
> Radim Vansa has updated the pull request incrementally with two additional commits since the last revision:
>
> - Add more comments
> - Disable search table with dynamic CDS

I have a few more comments and questions.

src/hotspot/share/oops/fieldInfo.inline.hpp line 126:

> 124: fi._offset = _r.next_uint();
> 125: fi._access_flags = AccessFlags(checked_cast(_r.next_uint()));
> 126: fi._field_flags = FieldInfo::FieldFlags(_r.next_uint());

These callers don't need to know about _r even though they're in FieldInfoStream. And then you don't need to make these changes.
test/hotspot/jtreg/runtime/FieldStream/LocalFieldLookupTest.java line 30: > 28: import static org.objectweb.asm.ClassWriter.COMPUTE_FRAMES; > 29: import static org.objectweb.asm.ClassWriter.COMPUTE_MAXS; > 30: import static org.objectweb.asm.Opcodes.*; Sorry I just noticed this. Is it possible to write this using the Classfile API? I think the Classfile API is available in JDK 21 (checking) https://bugs.openjdk.org/browse/JDK-8294982 ------------- PR Review: https://git.openjdk.org/jdk/pull/24847#pullrequestreview-2905191098 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2132434505 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2132519889 From coleenp at openjdk.org Fri Jun 6 16:45:03 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 6 Jun 2025 16:45:03 GMT Subject: RFR: 8352075: Perf regression accessing fields [v21] In-Reply-To: <4YvUnJZ6lk5sJbTScP2_oX43fcbMKWatEkfiXSFEhsM=.f8a3afd9-c896-49db-9f13-6f651cf7795c@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <4YvUnJZ6lk5sJbTScP2_oX43fcbMKWatEkfiXSFEhsM=.f8a3afd9-c896-49db-9f13-6f651cf7795c@github.com> Message-ID: On Thu, 5 Jun 2025 20:52:10 GMT, Radim Vansa wrote: >> src/hotspot/share/oops/fieldInfo.cpp line 164: >> >>> 162: r.read_field_counts(&java_fields, &injected_fields); >>> 163: assert(java_fields >= 0, "must be"); >>> 164: if (java_fields == 0 || fis->length() == 0 || static_cast(java_fields) < BinarySearchThreshold) { >> >> I don't know why you only sort Java fields and ignore the injected fields. JavaClasses::compute_offsets calls find_local_field, so might not find an injected field, I assume in the java.lang.Class (mirror). Should this sorted cache exclude classes with injected fields? ie if injected_fields > 0? 
>> If you exclude classes with injected fields, you could remove the javaClasses code (and maybe not have to re-sort any fields during dynamic dumping (?)) > > I don't build a search table for injected fields because I am trying to fix performance of `InstanceKlass::find_local_field` and this uses `JavaFieldStream` - that is/was ignoring injected fields in the iteration as well. > Classes with injected fields are not excluded, we just don't build the table for them. There's no lookup by name+signature, just `InstanceKlass::field(int index)` which uses iteration through `AllFieldStream`. Okay, the code I looked at in JavaClasses calls AllFieldStream for injected fields. >> src/hotspot/share/oops/fieldInfo.cpp line 285: >> >>> 283: FieldInfo fi; >>> 284: reader.read_field_info(fi); >>> 285: if (fi.field_flags().is_injected()) { >> >> I thought that above, you only process java fields and not the injected fields? > > `FieldInfoReader` is limited by the full stream, and after iterating through java fields it would start returning injected fields. For java fields we call the lookup below; we know that injected fields don't have a record in the table, and we know that there won't be any more java fields after we encounter the first injected field; that's why we `break` out of the loop here. Okay, this does have a comment about this assumption. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2132355744 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2132361474 From coleenp at openjdk.org Fri Jun 6 16:45:04 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 6 Jun 2025 16:45:04 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Thu, 5 Jun 2025 21:06:44 GMT, Radim Vansa wrote: >> src/hotspot/share/oops/fieldInfo.hpp line 238: >> >>> 236: >>> 237: private: >>> 238: uint32_t next_uint() { return _r.next_uint(); } >> >> Why did you make this change and have the callers expose _r ? > > AFAIU `_r` is not exposed, it's private. My change removes `next_uint()` because it's at wrong level of abstraction: `FieldInfoReader` should expose things java/injected fields counts (and field info itself), not just some 'uint's. > The encapsulation is imperfect as the methods have to be called anyway only in the correct order but to me it seemed as a way forward. Okay I see why you made this change. Removing the friends was good because it removes other callers that shouldn't call next_uint(), but the callers within FieldInfoStream now have to call _r.next_uint() which is extra exposure to the name _r, that could be a lot more descriptive. I don't see what's wrong with FieldInfoStream calling next_uint() a private method that has access to the Unsigned5 stream. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2132424315 From coleenp at openjdk.org Fri Jun 6 16:45:05 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 6 Jun 2025 16:45:05 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Fri, 6 Jun 2025 15:44:28 GMT, Coleen Phillimore wrote: >> Radim Vansa has updated the pull request incrementally with two additional commits since the last revision: >> >> - Add more comments >> - Disable search table with dynamic CDS > > src/hotspot/share/oops/fieldInfo.inline.hpp line 126: > >> 124: fi._offset = _r.next_uint(); >> 125: fi._access_flags = AccessFlags(checked_cast(_r.next_uint())); >> 126: fi._field_flags = FieldInfo::FieldFlags(_r.next_uint()); > > These callers don't need to know about _r even though they're in FieldInfoStream. And then you don't need to make these changes. The addition of read_name_and_signature() is a good level of abstraction. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2132437537 From coleenp at openjdk.org Fri Jun 6 16:45:05 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 6 Jun 2025 16:45:05 GMT Subject: RFR: 8352075: Perf regression accessing fields [v20] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Thu, 5 Jun 2025 20:37:22 GMT, Coleen Phillimore wrote: >> Yes, in practice these all are of the same size, but in case of the masks (as well as in case of arguments in API) I want to stress out that these are 32 bit numbers. The `unsigned int`s are just 'some not too big number'. 
>> Is there any general guidance on deciding between `unsigned int` (I suppose just `unsigned` is not recommended), `uint32_t` and `u4`? >> >> I was hoping that the comment on line 68 explains the intended use, but I can be more verbose and document each method. When the packed table is used for fieldinfo, it's { offset-in-fieldstream, index-in-fieldstream }. The Comparator implementation can translate offset-in-fieldstream -> { name, signature } and then do the comparison. The `index-in-fieldstream` is kind of second-class citizen; we need to fill it into `FieldInfo` and it is not encoded in the stream, therefore we need to encode it in the packed table. > > Reading further, I see what this mapping is and intentionally generalized. I guess a comment like, the key and value are sized to represent the maximum value for each and then compacted, or something like that. But maybe I haven't figured out the packing. Are they increments of u1, u2 or u4 or something in between? Yes this is helpful, but could you move this to the implementation in the cpp file? 
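The packing question above (key and value each sized for their maximum, then compacted) can be illustrated with a small sketch. This is not the `PackedTable` code from the PR: the class name `PackedLayoutSketch` and its methods are hypothetical, and it assumes that the key and value bit widths are added together and rounded up to a whole number of bytes per element. It reads and writes byte by byte, so it needs no aligned 64-bit access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical illustration only; not the PackedTable code from the PR.
class PackedLayoutSketch {
  unsigned _key_bits;     // bits needed for the largest key
  unsigned _value_bits;   // bits needed for the largest value
  size_t _element_bytes;  // whole bytes occupied by one {key, value} element

  // Number of bits needed to represent every value in [0, max].
  static unsigned bits_for(uint32_t max) {
    unsigned bits = 0;
    while (max != 0) { bits++; max >>= 1; }
    return bits;
  }

  static uint64_t mask(unsigned bits) { return (uint64_t(1) << bits) - 1; }

public:
  PackedLayoutSketch(uint32_t max_key, uint32_t max_value)
    : _key_bits(bits_for(max_key)),
      _value_bits(bits_for(max_value)),
      _element_bytes((_key_bits + _value_bits + 7) / 8) {}

  size_t element_bytes() const { return _element_bytes; }

  // Writes element `index` one byte at a time, low-order byte first.
  void store(uint8_t* table, size_t index, uint32_t key, uint32_t value) const {
    assert((key & ~mask(_key_bits)) == 0);      // key must fit in _key_bits
    assert((value & ~mask(_value_bits)) == 0);  // value must fit in _value_bits
    uint64_t element = uint64_t(key) | (uint64_t(value) << _key_bits);
    uint8_t* p = table + index * _element_bytes;
    for (size_t i = 0; i < _element_bytes; i++) {
      p[i] = static_cast<uint8_t>(element >> (8 * i));
    }
  }

  // Reads element `index` back and splits it into key and value.
  void load(const uint8_t* table, size_t index, uint32_t* key, uint32_t* value) const {
    const uint8_t* p = table + index * _element_bytes;
    uint64_t element = 0;
    for (size_t i = 0; i < _element_bytes; i++) {
      element |= uint64_t(p[i]) << (8 * i);
    }
    *key = static_cast<uint32_t>(element & mask(_key_bits));
    *value = static_cast<uint32_t>((element >> _key_bits) & mask(_value_bits));
  }
};
```

Under these assumptions, a maximum key of 1000 (10 bits) and a maximum value of 70 (7 bits) give 17 bits, so each element takes 3 bytes, which is neither a u2 nor a u4.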
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2132440496 From coleenp at openjdk.org Fri Jun 6 17:27:58 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 6 Jun 2025 17:27:58 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Fri, 6 Jun 2025 16:41:25 GMT, Coleen Phillimore wrote: >> Radim Vansa has updated the pull request incrementally with two additional commits since the last revision: >> >> - Add more comments >> - Disable search table with dynamic CDS > > test/hotspot/jtreg/runtime/FieldStream/LocalFieldLookupTest.java line 30: > >> 28: import static org.objectweb.asm.ClassWriter.COMPUTE_FRAMES; >> 29: import static org.objectweb.asm.ClassWriter.COMPUTE_MAXS; >> 30: import static org.objectweb.asm.Opcodes.*; > > Sorry I just noticed this. Is it possible to write this using the Classfile API? I think the Classfile API is available in JDK 21 (checking) https://bugs.openjdk.org/browse/JDK-8294982 No, never mind this. The Classfile API won't be backportable to JDK 21. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2132585135 From cjplummer at openjdk.org Fri Jun 6 18:51:54 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 6 Jun 2025 18:51:54 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v4] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> <04ztddbwMOlo-dq6TQ4_jPWfj-wNhsyDNWQSsohFo-E=.98d51cab-3302-41fd-b214-dbb73f4f6e0d@github.com> Message-ID: On Fri, 6 Jun 2025 11:31:36 GMT, Matthias Baesken wrote: > For the now remaining tests, are you okay with the explanation? 
Should I add it as comment to the test files, or let's simply keep it here? I like @dholmes-ora idea of adding an `@comment`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2950180918 From iklam at openjdk.org Fri Jun 6 18:54:05 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 6 Jun 2025 18:54:05 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Fri, 6 Jun 2025 07:14:21 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. 
>> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with two additional commits since the last revision: > > - Add more comments > - Disable search table with dynamic CDS src/hotspot/share/oops/fieldInfo.cpp line 137: > 135: int index; > 136: int position; > 137: } field_pos_t; The naming of the type and fields is not consistent with HotSpot conventions. 
Suggestion: struct FieldPosition { Symbol* _name; Symbol* _signature; int _index; int _position; }; ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2132711029 From vlivanov at openjdk.org Fri Jun 6 18:55:57 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Fri, 6 Jun 2025 18:55:57 GMT Subject: RFR: 8352565: Add native method implementation of Reference.get() [v10] In-Reply-To: References: Message-ID: On Fri, 6 Jun 2025 06:00:48 GMT, Kim Barrett wrote: >> Please review this change which adds a native method providing the >> implementation of Reference::get. Reference::get is an intrinsic candidate, so >> this native method implementation is only used when the intrinsic is not. >> >> Currently there is intrinsic support by the interpreter, C1, C2, and Graal, >> which are always used. With this change we can later remove all the >> per-platform interpreter intrinsic implementations, and might also remove the >> C1 intrinsic implementation. >> >> Testing: >> (1) mach5 tier1-6 normal (so using all the existing intrinsics). >> (2) mach5 tier1-6 with interpreter and C1 Reference::get intrinsics disabled. > > Kim Barrett has updated the pull request incrementally with two additional commits since the last revision: > > - add pseudo-native entry for Reference.get0 > - tidy CallGenerator lookup in Compile ctor Marked as reviewed by vlivanov (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/24315#pullrequestreview-2905801629 From mbaesken at openjdk.org Fri Jun 6 19:23:55 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 6 Jun 2025 19:23:55 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Tue, 3 Jun 2025 00:51:04 GMT, David Holmes wrote: >> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: >> >> TestBreakSignalThreadDump has issues with asan > > Changes look fine but I agree with Chris that we need to document why these tests don't work with ASAN, though I think I'd prefer to see an `@comment` before the `@requires !vm.asan` in the actual test files - assuming the reason can be stated clearly and succinctly. > I like @dholmes-ora idea of adding an @comment. Why not, I am fine with this ! Do you think the current comment suggestions are okay ? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2950403319 From coleenp at openjdk.org Fri Jun 6 19:25:58 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 6 Jun 2025 19:25:58 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Fri, 6 Jun 2025 07:14:21 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . 
Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 
42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with two additional commits since the last revision: > > - Add more comments > - Disable search table with dynamic CDS I have a couple more comments for today. src/hotspot/share/utilities/packedTable.cpp line 49: > 47: assert((key & ~_key_mask) == 0, "key out of bounds"); > 48: assert((value & ~_value_mask) == 0, "value out of bounds: %x vs. %x (%x)", value, _value_mask, ~_value_mask); > 49: *reinterpret_cast(data + offset) = static_cast(key) | (static_cast(value) << _value_shift); How does this line not get a signal for unaligned write? src/hotspot/share/utilities/packedTable.cpp line 83: > 81: assert(mid >= low && mid < high, "integer overflow?"); > 82: uint64_t element = read_element(data, length, _element_bytes * mid); > 83: uint32_t key = element & _key_mask; All this casting is hard to follow so I added this at the beginning of the file: #ifndef _WIN32 #pragma GCC diagnostic warning "-Wconversion" #endif and this line, 102, and 87 complain: warning: conversion from 'uint64_t' {aka 'long unsigned int'} to 'uint32_t' {aka 'unsigned int'} may change value [-Wconversion] 87 | uint32_t key = element & _key_mask; | ~~~~~~~~^~~~~~~~~~~ If the value is okay to cast to uint32_t, which I believe it is, use checked_cast from checkedCast.hpp. ------------- Changes requested by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24847#pullrequestreview-2905861595 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2132749485 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2132790669 From stuefe at openjdk.org Fri Jun 6 19:54:53 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 6 Jun 2025 19:54:53 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v5] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Fri, 6 Jun 2025 11:34:04 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes. >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later). >> While at it, the same is also added for ubsan. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > exclude AbsPathInImage test from asan Like @dholmes-ora , I would appreciate a short comment explaining the reasons for these exclusions. If those are bugs that can be fixed, it's better to have a JBS issue describing the problem in order to eventually fix it, and possibly problemlist the test (for asan) in the meantime. Note that even if you don't know the exact reasons why e.g. libjsig + asan fails, and have no time to investigate it, it's okay to make that a JBS task as in "Investigate whether libjsig tests can work with asan". Those are good starter issues for folks with a bit of time on their hands. 
test/hotspot/jtreg/runtime/XCheckJniJsig/XCheckJSig.java line 32: > 30: * java.management > 31: * @requires os.family == "linux" | os.family == "mac" > 32: * @requires !vm.asan I would like to understand this. What part of asan interferes with the libjsig? Asan interposes memory APIs, libjsig interposes signal functions, both should be able to coexist. If it's a bug, we should have a JBS issue to (eventually) fix it, and problemlist the test for asan in the meantime. Otherwise, a short comment explaining the issue would be good. test/hotspot/jtreg/serviceability/dcmd/vm/SystemDumpMapTest.java line 39: > 37: * @library /test/lib > 38: * @requires (os.family == "linux" | os.family == "windows" | os.family == "mac") > 39: * @requires !vm.asan If this is easy to fix, e.g. like Jiangli did for the static build, I would rather have a fix inside the test than to completely disable it. Otherwise, a short comment would be good explaining why the test cannot run with asan. Preexisting: I have no clue why this is excluded for riscv qemu, but SystemMapTest is not; both tests are functionally almost identical. 
------------- PR Review: https://git.openjdk.org/jdk/pull/25575#pullrequestreview-2905989713 PR Review Comment: https://git.openjdk.org/jdk/pull/25575#discussion_r2132815020 PR Review Comment: https://git.openjdk.org/jdk/pull/25575#discussion_r2132822857 From iklam at openjdk.org Fri Jun 6 20:57:59 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 6 Jun 2025 20:57:59 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Fri, 6 Jun 2025 07:14:21 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. 
The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with two additional commits since the last revision: > > - Add more comments > - Disable search table with dynamic CDS I found the `PackedTableLookup` needing too many type casts. Also, there's inconsistent naming for `table` and `data`, `table_length` vs `length`. We should use `table` and `table_length`. There's no need to pass the table around in the APIs. It should be set once in the constructor. Suggested fix: https://github.com/iklam/jdk/commit/4e51ce86f296c5bd3b50ec4636db0dea991ca869 BTW, I concur with Coleen that this will trigger unaligned access traps (on some platforms, SPARC for sure), so it should be removed: 
uint64_t PackedTableLookup::read_element(size_t offset) const { if (offset + sizeof(uint64_t) <= _table_length) { return *reinterpret_cast(_table + offset); ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2950773472 From cjplummer at openjdk.org Fri Jun 6 21:12:53 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 6 Jun 2025 21:12:53 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v2] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Tue, 3 Jun 2025 00:51:04 GMT, David Holmes wrote: >> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: >> >> TestBreakSignalThreadDump has issues with asan > > Changes look fine but I agree with Chris that we need to document why these tests don't work with ASAN, though I think I'd prefer to see an `@comment` before the `@requires !vm.asan` in the actual test files - assuming the reason can be stated clearly and succinctly. > > I like @dholmes-ora idea of adding an @comment. > > Why not, I am fine with this ! Do you think the current comment suggestions are okay ? If you are talking about the 1-line reasons given in the comment above, I'm fine with that. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2950805942 From lucy at openjdk.org Sat Jun 7 19:13:54 2025 From: lucy at openjdk.org (Lutz Schmidt) Date: Sat, 7 Jun 2025 19:13:54 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v5] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Fri, 6 Jun 2025 11:34:04 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . 
>> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later). >> While at it, the same is also added for ubsan. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > exclude AbsPathInImage test from asan Changes look good technically. No additional requirements from my side. As already discussed, some reasoning why a particular test does not work with asan is more than desirable - it's required. As @plummercj said, @comments are ok if the reason can be condensed into a one-liner. I'll approve once the requests from others are honored. ------------- Changes requested by lucy (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25575#pullrequestreview-2907694852 From kbarrett at openjdk.org Sun Jun 8 18:19:01 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 8 Jun 2025 18:19:01 GMT Subject: RFR: 8352140: UBSAN: fix the left shift of negative value in klass.hpp, array_layout_helper() [v3] In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 10:22:34 GMT, Kim Barrett wrote: >> Afshin Zafari has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: >> >> - Merge remote-tracking branch 'origin/master' into _8352140_lshift_klass_hpp >> - minimum change. >> - Merge remote-tracking branch 'origin/master' into _8352140_lshift_klass_hpp >> - 8352140: UBSAN: fix the left shift of negative value in klass.hpp, array_layout_helper() > > Changes requested by kbarrett (Reviewer). 
> I made the `int->uint32` change at the minimum level, because the layout_helper constants are `public` in the `Klass` class and are used all around Hotspot code, particularly in architecture-dependent code. Changing them to unsigned breaks the build (at SYMBOL generation phase). In addition, the two most-significant bits of the layout are used for `array` types and `lh < 0` (or more precisely `lh < _lh_neutral_value`) is used in C++ and assembly language of arch-dep code. @kimbarrett , can we limit the change here to `array_layout_helper` or should we proceed to use `uint32_t` instead? I'm inclined against uglifying the code for a ubsan issue just to keep the change small. I'd rather we did the "right" fix instead, even though there's some fan-out. I don't know how urgent we consider fixing the ubsan issue though. But if we're going to do a minimal change, I like the @dean-long suggestion here: https://github.com/openjdk/jdk/pull/24184/files#r2090835514 The latest proposal also doesn't do anything about the issue I mentioned at the end of this comment: https://github.com/openjdk/jdk/pull/24184#discussion_r2055739014 where there's a related problem with layout_helper_boolean_diffbit. That could be treated as a separate bug, in which case just file it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24184#issuecomment-2954209375 From dholmes at openjdk.org Mon Jun 9 03:37:58 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 9 Jun 2025 03:37:58 GMT Subject: RFR: 8284017: Improve handshake filtering mechanism In-Reply-To: References: Message-ID: On Wed, 28 May 2025 15:10:09 GMT, Anton Artemov wrote: > Hi, please consider the following enhancement: > > In this PR a new way of supplying multiple arguments to filter out / skip operations in handshake/safepoint poll is given. 
Multiple boolean arguments are combined in a hash table, where keys are taken from a new enum `HandshakeOperationProperty`, which is to be modified when there is a need for a new argument. > > Tested in GHA and tiers 1 - 3. This seems rather heavyweight - having to create a ResourceHashtable for every call site, dynamically - and quite cumbersome to write out at the call-site versus the simple boolean args. Do we have candidates for expanding the current set of "filters"? Two flags is quite manageable. Three is a stretch but still okay if we can take advantage of default args. Four or more would definitely cry out for some better mechanism, but I'm not sure this is it. Sorry. ------------- PR Review: https://git.openjdk.org/jdk/pull/25497#pullrequestreview-2908845333 From lliu at openjdk.org Mon Jun 9 05:16:57 2025 From: lliu at openjdk.org (Liming Liu) Date: Mon, 9 Jun 2025 05:16:57 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v4] In-Reply-To: References: Message-ID: <5LnngODhG88eM8S-Umh4b_nD4fi45IdeMZp2_mu8788=.9a61d55a-eb6f-452f-8bb4-a0522d23c15f@github.com> On Fri, 6 Jun 2025 09:47:17 GMT, Andrew Haley wrote: >>> According to perf, post-increment ops help to reduce the access to TLB on Ampere1 in this case. >> >> Hmm, but it's code in a rather odd style in shared code. And from what I see, the intrinsic is only 22% of the runtime (for 128 bytes) anyway, and you're making the code larger. I certainly don't want to see this sort of thing proliferating in the intrinsics. >> >> In general, it's up to CPU designers to make simple, straightforward code work well. >> >> How important is this? > > On the other hand this code already exists in CRC32C, so it's simply unifying the two routines. OK, I won't object. > you're making the code larger. I don't think this makes the code larger. > How important is this? As I mentioned in problem 1, this causes a regression (~-14%) on Ampere1 when handling 64 bytes. 
No obvious effects in other cases though. > so it's simply unifying the two routines. Yes. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2135041760 From rvansa at openjdk.org Mon Jun 9 06:43:59 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 9 Jun 2025 06:43:59 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Fri, 6 Jun 2025 15:46:31 GMT, Coleen Phillimore wrote: >> src/hotspot/share/oops/fieldInfo.inline.hpp line 126: >> >>> 124: fi._offset = _r.next_uint(); >>> 125: fi._access_flags = AccessFlags(checked_cast(_r.next_uint())); >>> 126: fi._field_flags = FieldInfo::FieldFlags(_r.next_uint()); >> >> These callers don't need to know about _r even though they're in FieldInfoStream. And then you don't need to make these changes. > > The addition of read_name_and_signature() is a good level of abstraction. We must have some misunderstanding. This is not `FieldInfoStream`, this is `FieldInfoReader::read_field_info`, therefore I don't see any issue accessing private `FieldInfoReader::_r`. I've removed all friend classes from `FieldInfoReader` - noone outside `FieldInfoReader` can see `_r`. 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135129495

From rvansa at openjdk.org Mon Jun 9 06:55:00 2025
From: rvansa at openjdk.org (Radim Vansa)
Date: Mon, 9 Jun 2025 06:55:00 GMT
Subject: RFR: 8352075: Perf regression accessing fields [v23]
In-Reply-To:
References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com>
Message-ID:

On Fri, 6 Jun 2025 19:04:40 GMT, Coleen Phillimore wrote:

>> Radim Vansa has updated the pull request incrementally with two additional commits since the last revision:
>>
>>  - Add more comments
>>  - Disable search table with dynamic CDS
>
> src/hotspot/share/utilities/packedTable.cpp line 49:
>
>> 47:   assert((key & ~_key_mask) == 0, "key out of bounds");
>> 48:   assert((value & ~_value_mask) == 0, "value out of bounds: %x vs. %x (%x)", value, _value_mask, ~_value_mask);
>> 49:   *reinterpret_cast(data + offset) = static_cast(key) | (static_cast(value) << _value_shift);
>
> How does this line not get a signal for unaligned write?

From what I could find, strict alignment checking must be explicitly enabled on aarch64. x86_64 does not require alignment either. In both cases, there might be a performance penalty.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135145604

From stefank at openjdk.org Mon Jun 9 07:41:52 2025
From: stefank at openjdk.org (Stefan Karlsson)
Date: Mon, 9 Jun 2025 07:41:52 GMT
Subject: RFR: 8358326: Use oopFactory array allocation
In-Reply-To:
References:
Message-ID:

On Mon, 2 Jun 2025 14:07:10 GMT, Coleen Phillimore wrote:

> This patch removes cases of direct calls to {type,obj}ArrayKlass->allocate() and calls oopFactory::new_*array instead.
It also renames {type,obj}ArrayKlass->allocate functions to allocate_klass and allocate_instance so it's more clear which allocation it's doing and to match InstanceKlass allocate functions, and makes these functions private with friends for Deoptimization and oopFactory. For JEP 401, arrays are being extended to support new formats and attributes and this reduces the call sites.
> Tested with tier1-7.

This looks good to me. I have a few suggestions that you could take if you want to.

src/hotspot/share/oops/objArrayKlass.hpp line 39:

> 37:   friend class JVMCIVMStructs;
> 38:   friend class oopFactory;
> 39:   friend class Deoptimization;

If you want, you could consider sorting the friend declarations (here and in the other place where you added it).

src/hotspot/share/oops/objArrayKlass.hpp line 81:

> 79:       int n, Klass* element_klass, TRAPS);
> 80:
> 81:   objArrayOop allocate(int length, TRAPS);

Do you think `multi_allocate` will need a better name in the future?

src/hotspot/share/runtime/reflection.cpp line 352:

> 350:   if (type == T_VOID) {
> 351:     THROW_NULL(vmSymbols::java_lang_IllegalArgumentException());
> 352:   }

I was first wondering where this came from but I now see that this was duplicated from `basic_type_mirror_to_arrayklass`.
I wonder if this could be deduplicated by moving this check into `basic_type_mirror_to_basic_type` and then removing it from `basic_type_mirror_to_arrayklass`:

    static BasicType basic_type_mirror_to_basic_type(oop basic_type_mirror, TRAPS) {
      assert(java_lang_Class::is_primitive(basic_type_mirror), "just checking");
      // Resolve the primitive type first so that the T_VOID check has a value to test.
      BasicType type = java_lang_Class::primitive_type(basic_type_mirror);
      if (type == T_VOID) {
        THROW_NULL(vmSymbols::java_lang_IllegalArgumentException());
      }
      return type;
    }

    static Klass* basic_type_mirror_to_arrayklass(oop basic_type_mirror, TRAPS) {
      BasicType type = basic_type_mirror_to_basic_type(basic_type_mirror);
      return Universe::typeArrayKlass(type);
    }

And then this code could be a two-liner again:

    if (java_lang_Class::is_primitive(element_mirror)) {
      BasicType type = basic_type_mirror_to_basic_type(element_mirror);
      return oopFactory::new_typeArray(type, length, CHECK_NULL);
    }

-------------

Marked as reviewed by stefank (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25590#pullrequestreview-2909210112
PR Review Comment: https://git.openjdk.org/jdk/pull/25590#discussion_r2135207589
PR Review Comment: https://git.openjdk.org/jdk/pull/25590#discussion_r2135211890
PR Review Comment: https://git.openjdk.org/jdk/pull/25590#discussion_r2135202824

From aph at openjdk.org Mon Jun 9 09:01:01 2025
From: aph at openjdk.org (Andrew Haley)
Date: Mon, 9 Jun 2025 09:01:01 GMT
Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v4]
In-Reply-To: <32uuLeizjdx7p5TeOzMvoyj0Smmra-DV4qhPZy7z-bE=.78485dd1-cc6f-4fbf-88a3-a4f78c164b0c@github.com>
References: <32uuLeizjdx7p5TeOzMvoyj0Smmra-DV4qhPZy7z-bE=.78485dd1-cc6f-4fbf-88a3-a4f78c164b0c@github.com>
Message-ID:

On Thu, 5 Jun 2025 07:15:34 GMT, Liming Liu wrote:

>> This PR is to enable the use of crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU. There is an option UseCryptoPmullForCRC32 that can enable crypto pmull, but directly enabling it on Ampere CPU will cause the following problems.
>>
>> 1.
There will be regressions (-14% ~ -8%) on Ampere1 when the length is 64. When <= 128, both kernel_crc32_using_crc32 and kernel_crc32_using_crypto_pmull use the loop labeled as CRC_by32_loop, but their implements are a little different, and the loop in kernel_crc32_using_crc32 is better at hiding latency on Ampere1. So this PR takes the loop in kernel_crc32_using_crc32 to kernel_crc32_using_crypto_pmull, and does the same for CRC32C intrinsic. >> >> 2. The intrinsics only use crypto pmull when the length is higher than 383, while the loop in kernel_crc32_common_fold_using_crypto_pmull looks able to handle 256, and if it handles 256 on Ampere1, the improvements can be as high as 110% compared with kernel_crc32_using_crc32/kernel_crc32c_using_crc32c. However, there are regressions (~-6%) on Neoverse V1 when the length is 256. So this PR introduces a new option named CryptoPmullForCRC32LowLimit. It defaults to 256 since the code could handle 256, while it is set to 384 for V1/V2 to keep the old behavior on these platforms. >> >> The performance regressions and improvements were measured with the following microbenchmarks: >> org.openjdk.bench.java.util.TestCRC32.testCRC32Update >> org.openjdk.bench.java.util.TestCRC32C.testCRC32CUpdate >> >> Ran the following JTReg tests on Ampere1 and did not find problems: >> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java >> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32C.java > > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Add the message for the assertions Marked as reviewed by aph (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25609#pullrequestreview-2909415431 From rvansa at openjdk.org Mon Jun 9 09:08:59 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 9 Jun 2025 09:08:59 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Fri, 6 Jun 2025 19:22:10 GMT, Coleen Phillimore wrote: >> Radim Vansa has updated the pull request incrementally with two additional commits since the last revision: >> >> - Add more comments >> - Disable search table with dynamic CDS > > src/hotspot/share/utilities/packedTable.cpp line 83: > >> 81: assert(mid >= low && mid < high, "integer overflow?"); >> 82: uint64_t element = read_element(data, length, _element_bytes * mid); >> 83: uint32_t key = element & _key_mask; > > All this casting is hard to follow so I added this at the beginning of the file: > > #ifndef _WIN32 > #pragma GCC diagnostic warning "-Wconversion" > #endif > > and this line, 102, and 87 complain: > > > warning: conversion from 'uint64_t' {aka 'long unsigned int'} to 'uint32_t' {aka 'unsigned int'} may change value [-Wconversion] > 87 | uint32_t key = element & _key_mask; > | ~~~~~~~~^~~~~~~~~~~ > > > If the value is okay to cast to uint32_t, which I believe it is, use checked_cast from checkedCast.hpp. We cannot use `checked_cast`, because `element` can contain higher-order bits; but we want to ignore those. I can do an explicit `static_cast` and add a comment. 
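The mask-then-narrow idea described above can be sketched in isolation. This is only an illustration, not the code from the PR: `kKeyMask`, `kValueShift`, `pack_element` and `extract_key` are hypothetical names, and the 20-bit key width is an assumption chosen for the example. The point is that the mask is applied while the element is still 64 bits wide, so the following `static_cast` drops only bits that were already cleared.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout (an assumption for this sketch): the key occupies the
// low 20 bits of a packed element and the value sits directly above it.
constexpr uint32_t kKeyMask    = (1u << 20) - 1;
constexpr unsigned kValueShift = 20;

// Pack a key and a value into a single 64-bit element.
inline uint64_t pack_element(uint32_t key, uint32_t value) {
  assert((key & ~kKeyMask) == 0 && "key out of bounds");
  return static_cast<uint64_t>(key) | (static_cast<uint64_t>(value) << kValueShift);
}

// Recover the key. The higher-order bits hold the value and are discarded on
// purpose by the mask; after masking, the result fits in 32 bits, so the
// explicit narrowing cast cannot change it.
inline uint32_t extract_key(uint64_t element) {
  return static_cast<uint32_t>(element & kKeyMask);
}
```

A round-trip check on the whole element would reject every entry with a non-zero value, which is why an explicit cast after masking is the narrower tool here.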
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135343070 From azafari at openjdk.org Mon Jun 9 09:24:57 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 9 Jun 2025 09:24:57 GMT Subject: RFR: 8352140: UBSAN: fix the left shift of negative value in klass.hpp, array_layout_helper() [v3] In-Reply-To: References: Message-ID: On Sun, 8 Jun 2025 18:16:04 GMT, Kim Barrett wrote: >> Changes requested by kbarrett (Reviewer). > >> I made the `int->uint32` change at the minimum level, because the layout_helper constants are `public` in the `Kalss` class and are used all around Hotspot code, particularly in architecture dependent codes. Changing them to unsigned breaks the build (at SYMBOL generation phase). In addition, the two most-significant bits of the layout are used for `array` types and `lh < 0` (or more precisely `lh < _lh_neutral_value`) is used in C++ and assembly language of arch-dep code. @kimbarrett , can we limit the change here to `array_layout_helper` or should we proceed to use `uint32_t` instead? > > I'm inclined against uglifying the code for a ubsan issue just to keep the > change small. I'd rather we did the "right" fix instead, even though there's > some fannout. I don't know how urgent we consider fixing the ubsan issue > though. > > But if we're going to do a minimal change, I like the @dean-long suggestion here: > https://github.com/openjdk/jdk/pull/24184/files#r2090835514 > > The latest proposal also doesn't do anything about the issue I mentioned at > the end of this comment: > https://github.com/openjdk/jdk/pull/24184#discussion_r2055739014 > where there's a related problem with layout_helper_boolean_diffbit. > That could be treated as a separate bug, in which case just file it. > > I made the `int->uint32` change at the minimum level, because the layout_helper constants are `public` in the `Kalss` class and are used all around Hotspot code, particularly in architecture dependent codes. 
Changing them to unsigned breaks the build (at SYMBOL generation phase). In addition, the two most-significant bits of the layout are used for `array` types and `lh < 0` (or more precisely `lh < _lh_neutral_value`) is used in C++ and assembly language of arch-dep code. @kimbarrett , can we limit the change here to `array_layout_helper` or should we proceed to use `uint32_t` instead? > > I'm inclined against uglifying the code for a ubsan issue just to keep the change small. I'd rather we did the "right" fix instead, even though there's some fannout. I don't know how urgent we consider fixing the ubsan issue though. > > But if we're going to do a minimal change, I like the @dean-long suggestion here: https://github.com/openjdk/jdk/pull/24184/files#r2090835514 > > The latest proposal also doesn't do anything about the issue I mentioned at the end of this comment: [#24184 (comment)](https://github.com/openjdk/jdk/pull/24184#discussion_r2055739014) where there's a related problem with layout_helper_boolean_diffbit. That could be treated as a separate bug, in which case just file it. Thanks for your input. OK. I also agree. Then since the required changes are not in my expertise, I withdraw this PR and let qualified developers fix the issue. For the bogus code in `layout_helper_boolean_diffbit`, [this](https://bugs.openjdk.org/browse/JDK-8358957) is filed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24184#issuecomment-2955195732 From azafari at openjdk.org Mon Jun 9 09:25:00 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 9 Jun 2025 09:25:00 GMT Subject: RFR: 8352140: UBSAN: fix the left shift of negative value in klass.hpp, array_layout_helper() [v3] In-Reply-To: References: Message-ID: On Tue, 13 May 2025 12:54:11 GMT, Afshin Zafari wrote: >> The `array_layout_helper()` with `jint tag` as its first arg, is called with a `tag` whose sign-bit is always set and considered as negative. This negative value is UB in left-shift operation. 
Changing the type to `juint` fixes this. >> >> Tests: >> linux-x64-debug tier1 with UBSAN enabled. > > Afshin Zafari has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Merge remote-tracking branch 'origin/master' into _8352140_lshift_klass_hpp > - minimum change. > - Merge remote-tracking branch 'origin/master' into _8352140_lshift_klass_hpp > - 8352140: UBSAN: fix the left shift of negative value in klass.hpp, array_layout_helper() Withdrawn ------------- PR Comment: https://git.openjdk.org/jdk/pull/24184#issuecomment-2955197748 From azafari at openjdk.org Mon Jun 9 09:25:00 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 9 Jun 2025 09:25:00 GMT Subject: Withdrawn: 8352140: UBSAN: fix the left shift of negative value in klass.hpp, array_layout_helper() In-Reply-To: References: Message-ID: On Mon, 24 Mar 2025 09:52:43 GMT, Afshin Zafari wrote: > The `array_layout_helper()` with `jint tag` as its first arg, is called with a `tag` whose sign-bit is always set and considered as negative. This negative value is UB in left-shift operation. Changing the type to `juint` fixes this. > > Tests: > linux-x64-debug tier1 with UBSAN enabled. This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/jdk/pull/24184 From rvansa at openjdk.org Mon Jun 9 10:13:38 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 9 Jun 2025 10:13:38 GMT Subject: RFR: 8352075: Perf regression accessing fields [v24] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ? ?): 51.3 ms ? 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min ? max): 45.1 ms ? 
53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ? ?): 78.2 ms ? 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min ? max): 73.8 ms ? 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ? ?): 38.5 ms ? 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min ? max): 37.7 ms ? 42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with five additional commits since the last revision: - Add debugging aids - Move comments to the implementation - Replace unaligned access with __builtin_memcpy - Add table to PackedTableLookup ctor - Helper struct naming conventions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/d75d6240..0226f470 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=23 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=22-23 Stats: 119 lines in 5 files changed: 59 ins; 19 del; 41 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From rvansa at openjdk.org Mon Jun 9 10:19:58 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 9 Jun 2025 10:19:58 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: References: 
<0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Fri, 6 Jun 2025 16:13:11 GMT, Ioi Lam wrote: > While the benefit can be seen in a synthetic benchmark, do we have any data that shows a benefit in real world applications? @iklam Regrettably I cannot disclose the reproducer that came from a customer, but it is not synthetic - it caused problems when migrating to 21. I admit that I was not looking on the unaligned behaviour of platforms besides x86_64 and aarch64, where this was supposed to be OK (with perf penalty at worst). I've replaced this with `__builtin_memcpy` as I haven't found a different platform-agnostic way to convince GCC handle an unaligned read/write. I don't see any difference in performance on the reproducer, so it's probably good enough. Added the debugging print method as well. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2955355271 From epeter at openjdk.org Mon Jun 9 10:28:30 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 9 Jun 2025 10:28:30 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity Message-ID: **Past Work** With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. **This PR** I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. My approach was to aggressively bail-out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered. 
**Future Work:** In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Idea / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. Testing passed tier1-3, with extra timeout factor 20. ------------- Commit messages: - missing return false - StoreNode Identity - StrEquals - rm rehash for Load, we have a general exception already - Merge branch 'master' into JDK-8347273-verify-IGVN-Ideal-Identity - CallJava in general - CallDynamicJava - Check unique, handle CmpP - fix Region and add worklist empty checks - fix a few more cases - ... and 58 more: https://git.openjdk.org/jdk/compare/65fda5c0...a12d49a0 Changes: https://git.openjdk.org/jdk/pull/22970/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347273 Stats: 859 lines in 5 files changed: 844 ins; 0 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/22970.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22970/head:pull/22970 PR: https://git.openjdk.org/jdk/pull/22970 From rvansa at openjdk.org Mon Jun 9 11:15:21 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 9 Jun 2025 11:15:21 GMT Subject: RFR: 8352075: Perf regression accessing fields [v25] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . 
Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ? ?): 51.3 ms ? 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min ? max): 45.1 ms ? 53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ? ?): 78.2 ms ? 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min ? max): 73.8 ms ? 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ? ?): 38.5 ms ? 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min ? max): 37.7 ms ? 
42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Fallback to memcpy on MSVC ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/0226f470..909fe85e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=24 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=23-24 Stats: 6 lines in 1 file changed: 6 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From rvansa at openjdk.org Mon Jun 9 11:33:00 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 9 Jun 2025 11:33:00 GMT Subject: RFR: 8352075: Perf regression accessing fields [v26] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. 
The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ? ?): 51.3 ms ? 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min ? max): 45.1 ms ? 53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ? ?): 78.2 ms ? 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min ? max): 73.8 ms ? 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ? ?): 38.5 ms ? 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min ? max): 37.7 ms ? 42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! 
This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Print debugging info for InstanceKlass ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/909fe85e..c3730b6e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=25 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=24-25 Stats: 5 lines in 1 file changed: 5 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From jsjolen at openjdk.org Mon Jun 9 12:02:54 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 9 Jun 2025 12:02:54 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v2] In-Reply-To: References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: On Thu, 22 May 2025 17:55:07 GMT, Johan Sj?len wrote: >> Hi, >> >> The constant pool currently has a lot of methods specific to extracting parts of the operands array. What this array actually is, is a sequence of bootstrap method attribute entries, where each entry has the following components: >> >> ```c++ >> struct BSMAE { >> u2 bootstrap_method_index; >> u2 argument_count; >> u2 arguments[argument_count]; >> } >> >> >> We can removes some of these operands array specific methods, and instead allows you to extract BSMAttributeEntrys which you can then use to extract its piece wise components. This makes for a nicer interface, and a bit easier to come into as a reader of the code, as it more closely mirrors the JVMS. >> >> Please consider! 
>> >> Testing: Currently GHA, running tier1-tier3 > > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > Lois's comments Hi, > I think this looks like a nice improvement but I thought operands was going to turn into two arrays? Also there's probably a better name than 'operands' maybe bsm_operands? Yeah, I hope to do that in a future PR. Re: SA mirror constants. OK, I can do that, that makes sense. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25298#issuecomment-2955581389 From jsjolen at openjdk.org Mon Jun 9 12:09:38 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 9 Jun 2025 12:09:38 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v3] In-Reply-To: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: > Hi, > > The constant pool currently has a lot of methods specific to extracting parts of the operands array. What this array actually is, is a sequence of bootstrap method attribute entries, where each entry has the following components: > > ```c++ > struct BSMAE { > u2 bootstrap_method_index; > u2 argument_count; > u2 arguments[argument_count]; > } > > > We can removes some of these operands array specific methods, and instead allows you to extract BSMAttributeEntrys which you can then use to extract its piece wise components. This makes for a nicer interface, and a bit easier to come into as a reader of the code, as it more closely mirrors the JVMS. > > Please consider! 
> > Testing: Currently GHA, running tier1-tier3 Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: Coleen's comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25298/files - new: https://git.openjdk.org/jdk/pull/25298/files/13e27259..2a5c820c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25298&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25298&range=01-02 Stats: 23 lines in 3 files changed: 19 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25298.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25298/head:pull/25298 PR: https://git.openjdk.org/jdk/pull/25298 From coleenp at openjdk.org Mon Jun 9 12:19:04 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 9 Jun 2025 12:19:04 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Mon, 9 Jun 2025 06:52:41 GMT, Radim Vansa wrote: >> src/hotspot/share/utilities/packedTable.cpp line 49: >> >>> 47: assert((key & ~_key_mask) == 0, "key out of bounds"); >>> 48: assert((value & ~_value_mask) == 0, "value out of bounds: %x vs. %x (%x)", value, _value_mask, ~_value_mask); >>> 49: *reinterpret_cast(data + offset) = static_cast(key) | (static_cast(value) << _value_shift); >> >> How does this line not get a signal for unaligned write? > > From what I could find, strict alignment checking must be explicitly enabled an aarch64. x86_64 does not require alignment either. In both cases, there might be a performance penalty. Once I turned on hard signals for these unaligned accesses to find some performance problems (and I think I was debugging something on sparc). The OS handles these signals silently but it does/can cause performance loss. 
There must be a way to write this without all this C-style casting with C++ syntax or a special memcpy. It would still be performant for field access even if the array was copied as a byte stream. There must be a more readable way to do this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135613762 From aboldtch at openjdk.org Mon Jun 9 13:22:52 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 9 Jun 2025 13:22:52 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: <6iTb9oggPOb1xvGPSKmDir1kBRXx4bmQfj74rXymA-w=.2e3dd383-ed46-4d29-bc8d-57cdfb1e3e8c@github.com> <6lZvtF9H-3j_YpE9nYULVbO2KoedCqmgSHTIs6jy6iQ=.9b84c38e-bdef-4662-9bb3-65bff7659082@github.com> Message-ID: On Thu, 5 Jun 2025 14:34:38 GMT, Johan Sjölen wrote: >> I think that is equivalent, but I did not want to disrupt the general control flow of the algorithm so that it still closely resembles the original fdlibm code. >> >> I also initially misread your suggestion and now I am wondering if I can actually simplify it to a simple two line change: >> >> unsigned u_k = k + n; // avoid UB signed integer overflow >> k = (int) u_k; // safely assign to k >> >> does that bypass any UB? > > @dholmes-ora , if `k = (int) u_k;` does not say to the compiler that it can assume that `0 <= u_k < 2**31 - 1`, then this seems like a good changeset. Frankly, I would not trust myself in this matter; I find this very finicky. Initially I thought it would be nice if we could have expressed this as simply `k = wrapping_add(k, n);`. But it would require implementing such a method correctly, which seems non-trivial. (Some compilers have intrinsics for this I think). It might be the case that casting the number here is also UB (if the unsigned number is too large).
What we really need is a bit_cast (and use two's complement, which I think is only guaranteed in C++20, but I think we already make assumptions about this, or we could use int32_t which at least should be two's complement, not 100%). But then again looking at this code we also read from non-active members of a union, which I think is a compiler language extension. (Only valid in C). If we keep the structure as it is, I would find the code more readable if it was `u_k <= (unsigned)std::numeric_limits::max()` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2135710875 From jsjolen at openjdk.org Mon Jun 9 13:32:01 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 9 Jun 2025 13:32:01 GMT Subject: RFR: 8352075: Perf regression accessing fields [v26] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: <5ukCvH91tRfknvYy4d_8mLHGULPYSG71oZrWKNG4BhE=.034e2efe-9a11-4192-8cd6-fba1c7f0a0e4@github.com> On Mon, 9 Jun 2025 11:33:00 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields).
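[Editorial sketch] The lookup scheme described in the paragraph above (sorted fields, one table entry per 16 fields, then a short linear scan) might look roughly like the following. This is a simplified illustration under stated assumptions, not the PR's code: `SortedFields`, `kStride` and `find` are hypothetical names, plain vector indices stand in for the variable-length-encoded stream offsets, and fields are compared by name only rather than by name and signature.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the field stream: field names sorted ascending.
struct SortedFields {
  static constexpr std::size_t kStride = 16;  // one table entry per 16 fields
  std::vector<std::string> names;             // sorted field names
  std::vector<std::size_t> table;             // positions 16, 32, ... in names

  explicit SortedFields(std::vector<std::string> sorted)
      : names(std::move(sorted)) {
    for (std::size_t i = kStride; i < names.size(); i += kStride) {
      table.push_back(i);
    }
  }

  // Returns the index of `name`, or -1 if it is not present.
  long find(const std::string& name) const {
    // Walk the table to find the last recorded field that is <= name.
    std::size_t start = 0;
    for (std::size_t pos : table) {
      if (names[pos] > name) break;
      start = pos;
    }
    // Continue field-by-field from there: at most kStride comparisons.
    std::size_t end = std::min(start + kStride, names.size());
    for (std::size_t i = start; i < end; i++) {
      if (names[i] == name) return static_cast<long>(i);
    }
    return -1;
  }
};
```

With this layout a class with fewer than 17 fields has an empty table and falls back to a plain linear scan, which matches the description's split between small and large classes.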
>> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Print debugging info for InstanceKlass There are more instances of these style errors that I've noted, please fix the other instances as well.
src/hotspot/share/oops/fieldInfo.cpp line 132: > 130: // We use both name and signature during the comparison; while JLS require unique > 131: // names for fields, JVMS requires only unique name + signature combination. > 132: typedef struct { Style: Use C++ struct def (no typedef), don't use trailing _t in the name and write it as FieldPos. The fields are public, so no underscore as name prefix for them. src/hotspot/share/oops/fieldInfo.cpp line 140: > 138: > 139: class FieldInfoSupplier: public PackedTableBuilder::Supplier { > 140: private: Unnecessary private src/hotspot/share/oops/fieldInfo.cpp line 143: > 141: const field_pos_t* _positions; > 142: size_t _elements; > 143: public: Insert newline between public: and _elements ------------- PR Review: https://git.openjdk.org/jdk/pull/24847#pullrequestreview-2910061487 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135716601 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135716801 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135717226 From jsjolen at openjdk.org Mon Jun 9 13:39:59 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 9 Jun 2025 13:39:59 GMT Subject: RFR: 8352075: Perf regression accessing fields [v26] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: <3bF8Ui8lil4ndeHVS6jIWhT1qi2TzJKgsIg7xPPrL4Y=.47570e4b-8d3f-4308-aab3-e0a712f3dae6@github.com> On Mon, 9 Jun 2025 11:33:00 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . 
>> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)!
This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Print debugging info for InstanceKlass src/hotspot/share/oops/fieldStreams.hpp line 177: > 175: > 176: // Performs either a linear search or binary search through the stream > 177: // looking for a matchin name/signature combo matchin -> matching ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135735115 From rvansa at openjdk.org Mon Jun 9 13:43:59 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 9 Jun 2025 13:43:59 GMT Subject: RFR: 8352075: Perf regression accessing fields [v26] In-Reply-To: <5ukCvH91tRfknvYy4d_8mLHGULPYSG71oZrWKNG4BhE=.034e2efe-9a11-4192-8cd6-fba1c7f0a0e4@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <5ukCvH91tRfknvYy4d_8mLHGULPYSG71oZrWKNG4BhE=.034e2efe-9a11-4192-8cd6-fba1c7f0a0e4@github.com> Message-ID: <_L71mAfYE529CQldfa4pHtuU3bETQ3_2LmJTrH64zfg=.5d68285f-107d-41a3-bbd5-10bd99699b5f@github.com> On Mon, 9 Jun 2025 13:23:40 GMT, Johan Sj?len wrote: >> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: >> >> Print debugging info for InstanceKlass > > src/hotspot/share/oops/fieldInfo.cpp line 132: > >> 130: // We use both name and signature during the comparison; while JLS require unique >> 131: // names for fields, JVMS requires only unique name + signature combination. >> 132: typedef struct { > > Style: Use C++ struct def (no typedef), don't use trailing _t in the name and write it as FieldPos. The fields are public, so no underscore as name prefix for them. Now I am confused; @iklam just requested to use the underscores. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135747029 From coleenp at openjdk.org Mon Jun 9 13:54:51 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 9 Jun 2025 13:54:51 GMT Subject: RFR: 8358326: Use oopFactory array allocation In-Reply-To: References: Message-ID: On Mon, 9 Jun 2025 07:36:32 GMT, Stefan Karlsson wrote: >> This patch removes cases of direct calls to {type,obj}ArrayKlass->allocate() and calls oopFactory::new_*array instead. It also renames {type,obj}ArrayKlass->allocate functions to allocate_klass and allocate_instance so it's more clear which allocation it's doing and to match InstanceKlass allocate functions, and makes these functions private with friends for Deoptimization and oopFactory. For JEP 401, arrays are being extended to support new formats and attributes and this reduces the call sites. >> Tested with tier1-7. > > src/hotspot/share/oops/objArrayKlass.hpp line 39: > >> 37: friend class JVMCIVMStructs; >> 38: friend class oopFactory; >> 39: friend class Deoptimization; > > If you want you could consider sorting the friend declarations (here and in the other place where you added it) Do we sort friends? The sorting looks funny since VMStructs is usually at the beginning. class ObjArrayKlass : public ArrayKlass { - friend class VMStructs; + friend class Deoptimization; friend class JVMCIVMStructs; friend class oopFactory; - friend class Deoptimization; + friend class VMStructs; > src/hotspot/share/oops/objArrayKlass.hpp line 81: > >> 79: int n, Klass* element_klass, TRAPS); >> 80: >> 81: objArrayOop allocate(int length, TRAPS); > > Do you think `multi_allocate` will need a better name in the future? I thought if changing multi_allocate_instance so that it's clear that it's an instance, but decided to limit this. Maybe this would be helpful but the allocate() function was the most confusing to me, that's why I picked that one. 
> src/hotspot/share/runtime/reflection.cpp line 352: > >> 350: if (type == T_VOID) { >> 351: THROW_NULL(vmSymbols::java_lang_IllegalArgumentException()); >> 352: } > > I was first wondering where this came from but I now see that this was duplicated from `basic_type_mirror_to_arrayklass`. I wonder if this could could be deduplicated by moving this check into `basic_type_mirror_to_basic_type` and then removed from -`basic_type_mirror_to_arrayklass`: > > > static BasicType basic_type_mirror_to_basic_type(oop basic_type_mirror, TRAPS) { > assert(java_lang_Class::is_primitive(basic_type_mirror), > "just checking"); > > if (type == T_VOID) { > THROW_NULL(vmSymbols::java_lang_IllegalArgumentException()); > } > > return java_lang_Class::primitive_type(basic_type_mirror); > } > > static Klass* basic_type_mirror_to_arrayklass(oop basic_type_mirror, TRAPS) { > BasicType type = basic_type_mirror_to_basic_type(basic_type_mirror, CHECK_NULL); > return Universe::typeArrayKlass(type); > } > > And then this code could be a two-liner again: > > if (java_lang_Class::is_primitive(element_mirror)) { > BasicType type = basic_type_mirror_to_basic_type(element_mirror, CHECK_NULL); > return oopFactory::new_typeArray(type, length, CHECK_NULL); > } Unfortunately the caller to basic_type_mirror_to_basic_type() can legitimately return T_VOID for the caller in reflect_method, so that's why I had to duplicate the exception code. Maybe a future enhancement would be to move these to javaClasses.hpp in java_lang_Class, where it knows all about is_primitive types, and boxing classes, which I guess boxing T_VOID is a thing. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25590#discussion_r2135760802 PR Review Comment: https://git.openjdk.org/jdk/pull/25590#discussion_r2135763584 PR Review Comment: https://git.openjdk.org/jdk/pull/25590#discussion_r2135758517 From rvansa at openjdk.org Mon Jun 9 13:56:00 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 9 Jun 2025 13:56:00 GMT Subject: RFR: 8352075: Perf regression accessing fields [v26] In-Reply-To: <_L71mAfYE529CQldfa4pHtuU3bETQ3_2LmJTrH64zfg=.5d68285f-107d-41a3-bbd5-10bd99699b5f@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <5ukCvH91tRfknvYy4d_8mLHGULPYSG71oZrWKNG4BhE=.034e2efe-9a11-4192-8cd6-fba1c7f0a0e4@github.com> <_L71mAfYE529CQldfa4pHtuU3bETQ3_2LmJTrH64zfg=.5d68285f-107d-41a3-bbd5-10bd99699b5f@github.com> Message-ID: On Mon, 9 Jun 2025 13:41:01 GMT, Radim Vansa wrote: >> src/hotspot/share/oops/fieldInfo.cpp line 132: >> >>> 130: // We use both name and signature during the comparison; while JLS require unique >>> 131: // names for fields, JVMS requires only unique name + signature combination. >>> 132: typedef struct { >> >> Style: Use C++ struct def (no typedef), don't use trailing _t in the name and write it as FieldPos. The fields are public, so no underscore as name prefix for them. > > Now I am confused; @iklam just requested to use the underscores. From https://wiki.openjdk.org/display/HotSpot/StyleGuide : > [#Names](https://wiki.openjdk.org/display/HotSpot/StyleGuide#StyleGuide-Names) Instance variable names start with underscore "_", classes start with upper case letter, local functions are all lower case, all must have meaningful names.
No mention of distinction based on public/private ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135769155 From rvansa at openjdk.org Mon Jun 9 13:56:00 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 9 Jun 2025 13:56:00 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Mon, 9 Jun 2025 12:16:38 GMT, Coleen Phillimore wrote: >> From what I could find, strict alignment checking must be explicitly enabled an aarch64. x86_64 does not require alignment either. In both cases, there might be a performance penalty. > > Once I turned on hard signals for these unaligned accesses to find some performance problems (and I think I was debugging something on sparc). The OS handles these signals silently but it does/can cause performance loss. There must be a way to write this without all this C style casting with C++ syntax or a special memcpy. It would still be performant for field access even if the array was copied as a byte stream. There must be a more readable way to do this. What's wrong about `memcpy`, or rather the builtin version? Naturally I could write a for cycle copying the bytes, and rely on the compiler to optimize that out anyway, but I think that this makes the intention clear. If the handling was done through OS, I guess that the penalty would be actually quite severe. I could have tested the previous version on aarch64 e.g. in AWS, though now there's no casting of pointers anymore. When we have a final version, I could set up a build in AWS and report performance data from there. 
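[Editorial sketch] The `memcpy` approach discussed in this sub-thread can be illustrated as below. This is a minimal sketch, not the `PackedTableBuilder` code from the PR: `store_packed` and `load_packed` are hypothetical helpers, and copying only the first `element_bytes` bytes of the 64-bit value keeps the low-order bits only on a little-endian host, which is assumed here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: packs `key` into the low bits and `value` above
// `value_shift`, then stores the element at an arbitrary byte offset.
// memcpy has no alignment requirement, so no pointer cast is needed.
// Assumes a little-endian host (the low-order bytes come first in memory).
inline void store_packed(std::uint8_t* table, std::size_t offset,
                         unsigned element_bytes, unsigned value_shift,
                         std::uint32_t key, std::uint32_t value) {
  assert(element_bytes >= 1 && element_bytes <= sizeof(std::uint64_t));
  assert(value_shift < 64);
  std::uint64_t element = static_cast<std::uint64_t>(key) |
                          (static_cast<std::uint64_t>(value) << value_shift);
  std::memcpy(table + offset, &element, element_bytes);
}

// Hypothetical counterpart: reads `element_bytes` bytes back into a
// zero-initialised 64-bit value, again without any aligned access.
inline std::uint64_t load_packed(const std::uint8_t* table, std::size_t offset,
                                 unsigned element_bytes) {
  assert(element_bytes >= 1 && element_bytes <= sizeof(std::uint64_t));
  std::uint64_t element = 0;
  std::memcpy(&element, table + offset, element_bytes);
  return element;
}
```

For small fixed sizes, compilers typically lower such a `memcpy` to plain load/store instructions, which is why it tends to cost no more than the casted store while staying well-defined for unaligned addresses.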
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135764757 From jsjolen at openjdk.org Mon Jun 9 13:58:09 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 9 Jun 2025 13:58:09 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v4] In-Reply-To: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: > Hi, > > The constant pool currently has a lot of methods specific to extracting parts of the operands array. What this array actually is, is a sequence of bootstrap method attribute entries, where each entry has the following components: > > ```c++ > struct BSMAE { > u2 bootstrap_method_index; > u2 argument_count; > u2 arguments[argument_count]; > } > > > We can removes some of these operands array specific methods, and instead allows you to extract BSMAttributeEntrys which you can then use to extract its piece wise components. This makes for a nicer interface, and a bit easier to come into as a reader of the code, as it more closely mirrors the JVMS. > > Please consider! 
> > Testing: Currently GHA, running tier1-tier3 Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: Move it to public ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25298/files - new: https://git.openjdk.org/jdk/pull/25298/files/2a5c820c..1c7484d7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25298&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25298&range=02-03 Stats: 15 lines in 1 file changed: 7 ins; 8 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25298.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25298/head:pull/25298 PR: https://git.openjdk.org/jdk/pull/25298 From rvansa at openjdk.org Mon Jun 9 14:14:00 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 9 Jun 2025 14:14:00 GMT Subject: RFR: 8352075: Perf regression accessing fields [v27] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. 
On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min … max): 45.1 ms … 53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min … max): 73.8 ms … 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min … max): 37.7 ms … 42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)!
This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Fix coding style ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/c3730b6e..bd186c4b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=26 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=25-26 Stats: 12 lines in 2 files changed: 2 ins; 2 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From coleenp at openjdk.org Mon Jun 9 14:16:39 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 9 Jun 2025 14:16:39 GMT Subject: RFR: 8358326: Use oopFactory array allocation [v2] In-Reply-To: References: Message-ID: > This patch removes cases of direct calls to {type,obj}ArrayKlass->allocate() and calls oopFactory::new_*array instead. It also renames {type,obj}ArrayKlass->allocate functions to allocate_klass and allocate_instance so it's more clear which allocation it's doing and to match InstanceKlass allocate functions, and makes these functions private with friends for Deoptimization and oopFactory. For JEP 401, arrays are being extended to support new formats and attributes and this reduces the call sites. > Tested with tier1-7. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Sort our friends. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25590/files - new: https://git.openjdk.org/jdk/pull/25590/files/2da2b11b..2ccaf422 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25590&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25590&range=00-01 Stats: 6 lines in 2 files changed: 2 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25590.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25590/head:pull/25590 PR: https://git.openjdk.org/jdk/pull/25590 From coleenp at openjdk.org Mon Jun 9 14:24:59 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 9 Jun 2025 14:24:59 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Mon, 9 Jun 2025 06:40:48 GMT, Radim Vansa wrote: >> The addition of read_name_and_signature() is a good level of abstraction. > > We must have some misunderstanding. This is not `FieldInfoStream`, this is `FieldInfoReader::read_field_info`, therefore I don't see any issue accessing private `FieldInfoReader::_r`. I've removed all friend classes from `FieldInfoReader` - noone outside `FieldInfoReader` can see `_r`. Sorry, it's FieldInfoReader. I think the next_uint() is a helpful function in that class to keep the _r.thing() pattern to a minimum. I don't think the changes here are an improvement. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135820196 From jsjolen at openjdk.org Mon Jun 9 14:35:57 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 9 Jun 2025 14:35:57 GMT Subject: RFR: 8352075: Perf regression accessing fields [v26] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <5ukCvH91tRfknvYy4d_8mLHGULPYSG71oZrWKNG4BhE=.034e2efe-9a11-4192-8cd6-fba1c7f0a0e4@github.com> <_L71mAfYE529CQldfa4pHtuU3bETQ3_2LmJTrH64zfg=.5d68285f-107d-41a3-bbd5-10bd99699b5f@github.com> Message-ID: On Mon, 9 Jun 2025 13:53:26 GMT, Radim Vansa wrote: >> Now I am confused; @iklam just requested to use the underscores. > > From https://wiki.openjdk.org/display/HotSpot/StyleGuide : >> [#Names](https://wiki.openjdk.org/display/HotSpot/StyleGuide#StyleGuide-Names) Instance variable names start with underscore "_", classes start with upper case letter, local functions are all lower case, all must have meaningful names. > > No mention of distinction based on public/private Well, now I am surprised :-). Thank you for looking that up, I've probably misunderstood the style and applied my own advice incorrectly. Sorry about the conflicting review comments. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135839913 From coleenp at openjdk.org Mon Jun 9 14:41:59 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 9 Jun 2025 14:41:59 GMT Subject: RFR: 8352075: Perf regression accessing fields [v27] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: <836kML8ZJKP4kPwN6OX9Z4Nuq8Iz39-2GYLdmcx5xbY=.ba8459e8-811c-4dbd-9f8e-49d633b2cd4f@github.com> On Mon, 9 Jun 2025 14:14:00 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ? ?): 51.3 ms ? 
2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min ? max): 45.1 ms ? 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ? ?): 78.2 ms ? 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min ? max): 73.8 ms ? 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ? ?): 38.5 ms ? 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min ? max): 37.7 ms ? 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Fix coding style src/hotspot/share/utilities/packedTable.cpp line 64: > 62: assert((value & ~_value_mask) == 0, "value out of bounds: %x vs. %x (%x)", value, _value_mask, ~_value_mask); > 63: uint64_t element = static_cast(key) | (static_cast(value) << _value_shift); > 64: __builtin_memcpy(table + offset, &element, _element_bytes); Looking at this is context makes more sense. We don't have __builtin_memcpy in the sources, just memcpy. Assuming the platform will do the right thing. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135850157 From coleenp at openjdk.org Mon Jun 9 14:48:58 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 9 Jun 2025 14:48:58 GMT Subject: RFR: 8352075: Perf regression accessing fields [v26] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <5ukCvH91tRfknvYy4d_8mLHGULPYSG71oZrWKNG4BhE=.034e2efe-9a11-4192-8cd6-fba1c7f0a0e4@github.com> <_L71mAfYE529CQldfa4pHtuU3bETQ3_2LmJTrH64zfg=.5d68285f-107d-41a3-bbd5-10bd99699b5f@github.com> Message-ID: On Mon, 9 Jun 2025 14:32:52 GMT, Johan Sjölen wrote: >> From https://wiki.openjdk.org/display/HotSpot/StyleGuide : >>> [#Names](https://wiki.openjdk.org/display/HotSpot/StyleGuide#StyleGuide-Names) Instance variable names start with underscore "_", classes start with upper case letter, local functions are all lower case, all must have meaningful names. >> >> No mention of distinction based on public/private > > Well, now I am surprised :-). Thank you for looking that up, I've probably misunderstood the style and applied my own advice incorrectly. Sorry about the conflicting review comments. This typedef didn't really bother me because it's a local internal class. The coding style might be that it has to be declared with camel case though.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135862604 From coleenp at openjdk.org Mon Jun 9 14:49:00 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 9 Jun 2025 14:49:00 GMT Subject: RFR: 8352075: Perf regression accessing fields [v27] In-Reply-To: <836kML8ZJKP4kPwN6OX9Z4Nuq8Iz39-2GYLdmcx5xbY=.ba8459e8-811c-4dbd-9f8e-49d633b2cd4f@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <836kML8ZJKP4kPwN6OX9Z4Nuq8Iz39-2GYLdmcx5xbY=.ba8459e8-811c-4dbd-9f8e-49d633b2cd4f@github.com> Message-ID: <3Ncrgi9_PdaKzWTf5bgop2DFVKWLNgk7VjiL0tRgjMw=.09fffa29-3471-49fa-8a4e-5ff84176de41@github.com> On Mon, 9 Jun 2025 14:38:47 GMT, Coleen Phillimore wrote: >> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix coding style > > src/hotspot/share/utilities/packedTable.cpp line 64: > >> 62: assert((value & ~_value_mask) == 0, "value out of bounds: %x vs. %x (%x)", value, _value_mask, ~_value_mask); >> 63: uint64_t element = static_cast<uint64_t>(key) | (static_cast<uint64_t>(value) << _value_shift); >> 64: __builtin_memcpy(table + offset, &element, _element_bytes); > > Looking at this in context makes more sense. We don't have __builtin_memcpy in the sources, just memcpy. Assuming the platform will do the right thing. The memcpy is better than what was there before.
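For readers following the memcpy discussion, here is a minimal sketch of the store being reviewed, written with the standard `std::memcpy` instead of the compiler builtin. The struct `PackedElementSketch`, its member names, and the `load` helper are hypothetical stand-ins chosen for illustration; only the pack-and-copy expression mirrors the quoted snippet. The sketch assumes a little-endian host: copying fewer than 8 bytes out of a `uint64_t` keeps the low-order bytes only there.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the packer in packedTable.cpp. The field names
// mirror the quoted snippet, but the layout values are assumptions.
struct PackedElementSketch {
  unsigned value_shift;    // bits reserved for the key; the value sits above them
  uint32_t key_mask;       // bits a key may occupy
  uint32_t value_mask;     // bits a value may occupy (before shifting)
  size_t   element_bytes;  // bytes stored per element, at most 8

  // Pack key and value into one integer and copy its low bytes into the table.
  void store(uint8_t* table, size_t offset, uint32_t key, uint32_t value) const {
    assert((key & ~key_mask) == 0 && "key out of bounds");
    assert((value & ~value_mask) == 0 && "value out of bounds");
    uint64_t element = static_cast<uint64_t>(key) |
                       (static_cast<uint64_t>(value) << value_shift);
    // Standard memcpy, available on every toolchain. With element_bytes < 8
    // this keeps the low-order bytes only on a little-endian host.
    std::memcpy(table + offset, &element, element_bytes);
  }

  // Read an element back; the bytes that were not stored stay zero.
  uint64_t load(const uint8_t* table, size_t offset) const {
    uint64_t element = 0;
    std::memcpy(&element, table + offset, element_bytes);
    return element;
  }
};
```

With a 12-bit key and a 12-bit value, each element occupies three bytes, so a partial copy like this is exactly where host byte order starts to matter.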
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135863534 From rvansa at openjdk.org Mon Jun 9 15:34:44 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 9 Jun 2025 15:34:44 GMT Subject: RFR: 8352075: Perf regression accessing fields [v28] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min … max): 45.1 ms … 53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min … max): 73.8 ms … 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min … max): 37.7 ms … 42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Revert removing FieldInfoReader::next_uint() ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/bd186c4b..5d646376 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=27 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=26-27 Stats: 12 lines in 2 files changed: 1 ins; 0 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From rvansa at openjdk.org Mon Jun 9 15:39:09 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 9 Jun 2025 15:39:09 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: References:
<0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: <9REPDxd-Ua3sLPfTKU-ASHNGupdvzV5Jo5Ji1yGi5rE=.35c11385-5fd4-4763-a68d-283199f9e145@github.com> On Fri, 6 Jun 2025 11:01:34 GMT, Coleen Phillimore wrote: >> Radim Vansa has updated the pull request incrementally with two additional commits since the last revision: >> >> - Add more comments >> - Disable search table with dynamic CDS > > To integrate hotspot changes, you need two reviewers and people 'requesting changes' to withdraw their requests. Thank goodness the bots prevented this from being integrated. You need to wait for all the comments to be resolved. > This is a P3 bug so you have more time to get this integrated for JDK 25. I posted the schedule in the issue. The process is that this change would be integrated into the main repository (destined for JDK 26 and then slash-backported to JDK 25 a couple days later if testing is clean). > My tier 1-7 testing passes with the dynamic CDS patch above. @coleenp Can't find the comment to reply... I've replaced all `_r.next_uint()` with just `next_uint()`, it's a bikeshed argument. 
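For readers new to this thread, the lookup scheme in the PR description (fields sorted by name, plus a table entry for every 16th field) can be sketched roughly as follows. This is a simplified model, not the patch itself: `FieldIndexSketch`, `kStride`, and the use of plain strings and indices are assumptions made for illustration, whereas the real code works on a compressed field-info stream and records byte offsets.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One table entry per 16 fields, following the PR description.
constexpr size_t kStride = 16;

// Hypothetical simplified model: field names kept sorted in a vector, and a
// side table holding the positions 16, 32, ... within that vector.
struct FieldIndexSketch {
  std::vector<std::string> sorted_names;
  std::vector<size_t> table;

  explicit FieldIndexSketch(std::vector<std::string> names)
      : sorted_names(std::move(names)) {
    std::sort(sorted_names.begin(), sorted_names.end());
    for (size_t i = kStride; i < sorted_names.size(); i += kStride) {
      table.push_back(i);
    }
  }

  // Returns the index of name in sorted_names, or -1 if it is absent.
  long find(const std::string& name) const {
    // Walk the table to find the last recorded position whose name does not
    // sort after the one we are looking for.
    size_t start = 0;
    for (size_t pos : table) {
      if (sorted_names[pos] > name) break;
      start = pos;
    }
    // Then scan field by field; at most kStride entries are visited.
    size_t end = std::min(start + kStride, sorted_names.size());
    for (size_t i = start; i < end; i++) {
      if (sorted_names[i] == name) return static_cast<long>(i);
    }
    return -1;
  }
};
```

For a class with 16 or fewer fields the table stays empty and the lookup degenerates to the plain sequential scan, which matches the description's claim that only larger classes pay for the extra table.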
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2956157579 From coleenp at openjdk.org Mon Jun 9 15:53:06 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 9 Jun 2025 15:53:06 GMT Subject: RFR: 8352075: Perf regression accessing fields [v28] In-Reply-To: <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> Message-ID: On Mon, 9 Jun 2025 15:34:44 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either.
>> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Revert removing FieldInfoReader::next_uint() It's somewhat of a bikeshed argument but thank you for making it since there are many of us that work on these sources.
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2956195206 From kbarrett at openjdk.org Mon Jun 9 16:20:07 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 9 Jun 2025 16:20:07 GMT Subject: RFR: 8352075: Perf regression accessing fields [v28] In-Reply-To: <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> Message-ID: On Mon, 9 Jun 2025 15:34:44 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either.
>> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Revert removing FieldInfoReader::next_uint() Not really reviewing, just a couple of drive-by comments because I was asked to take a quick look.
------------- PR Review: https://git.openjdk.org/jdk/pull/24847#pullrequestreview-2910515862 From kbarrett at openjdk.org Mon Jun 9 16:20:08 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 9 Jun 2025 16:20:08 GMT Subject: RFR: 8352075: Perf regression accessing fields [v26] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <5ukCvH91tRfknvYy4d_8mLHGULPYSG71oZrWKNG4BhE=.034e2efe-9a11-4192-8cd6-fba1c7f0a0e4@github.com> <_L71mAfYE529CQldfa4pHtuU3bETQ3_2LmJTrH64zfg=.5d68285f-107d-41a3-bbd5-10bd99699b5f@github.com> Message-ID: On Mon, 9 Jun 2025 14:45:24 GMT, Coleen Phillimore wrote: >> Well, now I am surprised :-). Thank you for looking that up, I've probably misunderstood the style and applied my own advice incorrectly. Sorry about the conflicting review comments. > > This typedef didn't really bother me because it's a local internal class. The coding style might be that it has to be declared with camel case though. This is C++, not C. There's no good reason to use a typedef of an anonymous struct. And yes to the underscores on even public members of PODs. There are lots of violations of both of those, but that's the current style after various discussions.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135990161 From kbarrett at openjdk.org Mon Jun 9 16:20:09 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 9 Jun 2025 16:20:09 GMT Subject: RFR: 8352075: Perf regression accessing fields [v27] In-Reply-To: <3Ncrgi9_PdaKzWTf5bgop2DFVKWLNgk7VjiL0tRgjMw=.09fffa29-3471-49fa-8a4e-5ff84176de41@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <836kML8ZJKP4kPwN6OX9Z4Nuq8Iz39-2GYLdmcx5xbY=.ba8459e8-811c-4dbd-9f8e-49d633b2cd4f@github.com> <3Ncrgi9_PdaKzWTf5bgop2DFVKWLNgk7VjiL0tRgjMw=.09fffa29-3471-49fa-8a4e-5ff84176de41@github.com> Message-ID: On Mon, 9 Jun 2025 14:45:53 GMT, Coleen Phillimore wrote: >> src/hotspot/share/utilities/packedTable.cpp line 64: >> >>> 62: assert((value & ~_value_mask) == 0, "value out of bounds: %x vs. %x (%x)", value, _value_mask, ~_value_mask); >>> 63: uint64_t element = static_cast<uint64_t>(key) | (static_cast<uint64_t>(value) << _value_shift); >>> 64: __builtin_memcpy(table + offset, &element, _element_bytes); >> >> Looking at this in context makes more sense. We don't have __builtin_memcpy in the sources, just memcpy. Assuming the platform will do the right thing. > > The memcpy is better than what was there before. I wouldn't expect using __builtin_memcpy to even compile in Windows. Use memcpy. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2135997724 From fparain at openjdk.org Mon Jun 9 17:19:00 2025 From: fparain at openjdk.org (Frederic Parain) Date: Mon, 9 Jun 2025 17:19:00 GMT Subject: RFR: 8358326: Use oopFactory array allocation [v2] In-Reply-To: References: Message-ID: <3YetAk_NA9GY0VmiXbizzYK9ASxuUUGczeQvkE8k798=.2d6ab3b2-6358-4dcc-ab2f-69c8cd100d8e@github.com> On Mon, 9 Jun 2025 14:16:39 GMT, Coleen Phillimore wrote: >> This patch removes cases of direct calls to {type,obj}ArrayKlass->allocate() and calls oopFactory::new_*array instead.
It also renames {type,obj}ArrayKlass->allocate functions to allocate_klass and allocate_instance so it's more clear which allocation it's doing and to match InstanceKlass allocate functions, and makes these functions private with friends for Deoptimization and oopFactory. For JEP 401, arrays are being extended to support new formats and attributes and this reduces the call sites. >> Tested with tier1-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Sort our friends. LGTM. Thank you for this cleanup. ------------- Marked as reviewed by fparain (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25590#pullrequestreview-2910737057 From coleenp at openjdk.org Mon Jun 9 18:35:56 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 9 Jun 2025 18:35:56 GMT Subject: RFR: 8358326: Use oopFactory array allocation [v2] In-Reply-To: References: Message-ID: On Mon, 9 Jun 2025 14:16:39 GMT, Coleen Phillimore wrote: >> This patch removes cases of direct calls to {type,obj}ArrayKlass->allocate() and calls oopFactory::new_*array instead. It also renames {type,obj}ArrayKlass->allocate functions to allocate_klass and allocate_instance so it's more clear which allocation it's doing and to match InstanceKlass allocate functions, and makes these functions private with friends for Deoptimization and oopFactory. For JEP 401, arrays are being extended to support new formats and attributes and this reduces the call sites. >> Tested with tier1-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Sort our friends. Thanks for reviewing Stefan and Frederic. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25590#issuecomment-2956607253 From coleenp at openjdk.org Mon Jun 9 18:35:57 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 9 Jun 2025 18:35:57 GMT Subject: Integrated: 8358326: Use oopFactory array allocation In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 14:07:10 GMT, Coleen Phillimore wrote: > This patch removes cases of direct calls to {type,obj}ArrayKlass->allocate() and calls oopFactory::new_*array instead. It also renames {type,obj}ArrayKlass->allocate functions to allocate_klass and allocate_instance so it's more clear which allocation it's doing and to match InstanceKlass allocate functions, and makes these functions private with friends for Deoptimization and oopFactory. For JEP 401, arrays are being extended to support new formats and attributes and this reduces the call sites. > Tested with tier1-7. This pull request has now been integrated. Changeset: eb256deb Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/eb256deb8021d5b243ef782eb9e2622472909e97 Stats: 67 lines in 10 files changed: 23 ins; 14 del; 30 mod 8358326: Use oopFactory array allocation Reviewed-by: fparain, stefank ------------- PR: https://git.openjdk.org/jdk/pull/25590 From mablakatov at openjdk.org Mon Jun 9 19:25:52 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Mon, 9 Jun 2025 19:25:52 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches Message-ID: In the A64 ISA, the B (direct branch) instruction can encode a target within a ±128MB range relative to the instruction. Due to this limitation, when generating static stubs, HotSpot conservatively emits indirect branches for calls to c2i interface stubs. These indirect branches are implemented using a four-instruction sequence: three instructions to materialize the target address in a register, followed by a BR instruction to perform the jump.
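To make the range limit concrete: B encodes a signed 26-bit word offset, so the reachable byte offsets run from -128 MB to +128 MB - 4 in steps of 4. The following is a minimal sketch of the two checks this implies. Both function names are hypothetical, and the second check additionally assumes that the branch and its target both live inside one contiguous code cache; it is not taken from the patch.

```cpp
#include <cstdint>

// 128 MB, the reach of an A64 B instruction in either direction.
constexpr int64_t kBranchRange = int64_t{128} * 1024 * 1024;

// Hypothetical helper: true if a B at branch_pc can reach target directly.
// Valid offsets are multiples of 4 in [-128 MB, +128 MB - 4].
constexpr bool reachable_by_direct_branch(uint64_t branch_pc, uint64_t target) {
  const int64_t offset =
      static_cast<int64_t>(target) - static_cast<int64_t>(branch_pc);
  return offset % 4 == 0 && offset >= -kBranchRange && offset < kBranchRange;
}

// Hypothetical conservative check: if the whole code cache spans at most
// 128 MB, any two instruction addresses inside it are within direct range,
// so no per-branch distance test is needed.
constexpr bool code_cache_allows_direct_branches(uint64_t code_cache_size) {
  return code_cache_size <= static_cast<uint64_t>(kBranchRange);
}
```

The size-based check is deliberately coarse: it can be evaluated once, before any stub is emitted, at the cost of rejecting larger caches in which a particular branch would still have been in range.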
This patch optimizes static stub generation when the code cache is small enough to guarantee that the target entry point of the c2i interface stub lies within the direct branch range. In such cases, a single direct B instruction can be used instead of the indirect sequence, saving 3 instructions (12 bytes) per static stub. Below is an example of the optimization's impact, measured using the movie-lens benchmark from the Renaissance benchmark suite:

| Metric      | Before        | After         | Difference |
|-------------|---------------|---------------|------------|
| totalInHeap | Avg: 1883.875 | Avg: 1871.667 | -0.65%     |
|             | Sum: 6653848  | Sum: 6616344  | -0.56%     |
| stubCode    | Avg: 103.164  | Avg: 87.285   | -15.38%    |
|             | Sum: 364376   | Sum: 308552   | -15.33%    |

Full jtreg passed on AArch64. ------------- Commit messages: - 8358329: AArch64: emit direct branches in static stubs for small code caches Changes: https://git.openjdk.org/jdk/pull/25702/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25702&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358329 Stats: 171 lines in 4 files changed: 166 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/25702.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25702/head:pull/25702 PR: https://git.openjdk.org/jdk/pull/25702 From iklam at openjdk.org Mon Jun 9 21:44:41 2025 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 9 Jun 2025 21:44:41 GMT Subject: RFR: 8352075: Perf regression accessing fields [v28] In-Reply-To: <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> Message-ID: On Mon, 9 Jun 2025 15:34:44 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced
in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms …
42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Revert removing FieldInfoReader::next_uint() src/hotspot/share/utilities/packedTable.cpp line 83: > 81: assert((element & ~((uint64_t) _key_mask | ((uint64_t) _value_mask << _value_shift))) == 0, "read too much"); > 82: return element; > 83: } Since `_element_bytes` can be smaller than 8, memcpy will not work on big endian. Why are you trying to optimize this? This PR already cuts down the iterations from O(n) to O(log(n)). You are already doing a lot at each iteration -- decoding the name_index and signature_index from the unsigned5 stream, looking up the symbols from the constant pool, etc. So a few bit operations in read_element isn't going to make any substantial difference: uint64_t element = 0; for (int i = 0; i < _elements_bytes; i++) { element <<= 8; element |= _table[offset++]; // Need to rewrite fill() accordingly.
} ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2136554610 From iklam at openjdk.org Mon Jun 9 23:21:46 2025 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 9 Jun 2025 23:21:46 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v5] In-Reply-To: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: > This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: > > Excerpt from the test case ExceptionsTest.java. > > > [0.038s][info][exceptions ] Exception > [ ] thrown in interpreter method <{method} {0x00007113d0400718} 'bar2' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 9 for thread 0x000071142c02c7b0 (main) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar2(ExceptionsTest.java:127) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo2(ExceptionsTest.java:120) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:103) > [0.038s][info][exceptions ] Exception > [ ] thrown in interpreter method <{method} {0x00007113d0400650} 'foo2' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 0 for thread 0x000071142c02c7b0 (main) > [0.038s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "foo2" at BCI: 6 > Exception 2 caught. > > > - Note that the old `[exceptions]` log is triggered by `InterpreterRuntime::exception_handler_for_exception()` and prints one level of the stack while the interpreter is looking for a handler. However, once a handler is found (inside `foo()`), the `[exceptions]` log terminates, and we do not know about the `main()` method. 
> > - The `[exceptions,stacktrace]` log tries to omit duplicated stack traces -- the stack is printed only the first time when the exception is seen by the logging code (inside `bar2()`). > > **Concurrent Exceptions** > > Neither the old `[exceptions]` or the new `[exceptions,stacktrace]` logs distinguish between multiple exceptions that are thrown and handled concurrently. The users should use something like `-Xlog:exceptions,exceptions+stack::tags,tid` to include the thread ID into the log output, and use an external tool (such as a script) to demultiplex the logs. > > **Remaining Issues** > > 1. `_last_logged_exception` remembers only one exception. Concurrent exceptions may cause the stack to be printed repeatedly. This can be fixed by remembering the exception for each thread (by adding a field into `JavaThread`). Probably i... Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Avoid printing the same stack trace over and over again ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25522/files - new: https://git.openjdk.org/jdk/pull/25522/files/39839d51..540b2da3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=03-04 Stats: 55 lines in 4 files changed: 45 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/25522.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25522/head:pull/25522 PR: https://git.openjdk.org/jdk/pull/25522 From iklam at openjdk.org Mon Jun 9 23:27:28 2025 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 9 Jun 2025 23:27:28 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v4] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Fri, 30 May 2025 06:47:28 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last 
revision: >> >> Use -Xlog:exceptions+stacktrace instead; fixed typo in comments > > test/hotspot/jtreg/runtime/logging/ExceptionsTest.java line 48: > >> 46: static void analyzeOutputOn(ProcessBuilder pb) throws Exception { >> 47: OutputAnalyzer output = new OutputAnalyzer(pb.start()); >> 48: System.out.println(output.getStdout()); > > Debugging code? > > If you really want to always print the output then the more common pattern is to use `out.reportDiagnosticSummary()` after all the checks have been done and so the test has passed (failing tests will print it anyway). The output is printed only if the failure happens in `OutputAnalyzer::shouldMatch()`, etc. I've updated the test so failures can happen other ways, so I cannot rely on this anymore. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25522#discussion_r2136662352 From iklam at openjdk.org Mon Jun 9 23:35:49 2025 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 9 Jun 2025 23:35:49 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v6] In-Reply-To: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: > This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: > > Excerpt from the test case ExceptionsTest.java. 
> > > [0.038s][info][exceptions ] Exception > [ ] thrown in interpreter method <{method} {0x00007113d0400718} 'bar2' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 9 for thread 0x000071142c02c7b0 (main) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar2(ExceptionsTest.java:127) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo2(ExceptionsTest.java:120) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:103) > [0.038s][info][exceptions ] Exception > [ ] thrown in interpreter method <{method} {0x00007113d0400650} 'foo2' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 0 for thread 0x000071142c02c7b0 (main) > [0.038s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "foo2" at BCI: 6 > Exception 2 caught. > > > - Note that the old `[exceptions]` log is triggered by `InterpreterRuntime::exception_handler_for_exception()` and prints one level of the stack while the interpreter is looking for a handler. However, once a handler is found (inside `foo()`), the `[exceptions]` log terminates, and we do not know about the `main()` method. > > - The `[exceptions,stacktrace]` log tries to omit duplicated stack traces -- the stack is printed only the first time when the exception is seen by the logging code (inside `bar2()`). > > **Concurrent Exceptions** > > Neither the old `[exceptions]` or the new `[exceptions,stacktrace]` logs distinguish between multiple exceptions that are thrown and handled concurrently. The users should use something like `-Xlog:exceptions,exceptions+stack::tags,tid` to include the thread ID into the log output, and use an external tool (such as a script) to demultiplex the logs. > > **Remaining Issues** > > 1. `_last_logged_exception` remembers only one exception. Concurrent exceptions may cause the stack to be printed repeatedly. 
This can be fixed by remembering the exception for each thread (by adding a field into `JavaThread`). Probably i... Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @dholmes-ora comments - use JavaThread::current() instead ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25522/files - new: https://git.openjdk.org/jdk/pull/25522/files/540b2da3..547171ae Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=04-05 Stats: 7 lines in 1 file changed: 1 ins; 3 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25522.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25522/head:pull/25522 PR: https://git.openjdk.org/jdk/pull/25522 From iklam at openjdk.org Tue Jun 10 00:05:54 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 10 Jun 2025 00:05:54 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v7] In-Reply-To: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: <5hRU2QAv0w4NNzqUzIzTcz9YwJBPWg0MCCPifqHe_V8=.a7053f19-6303-42f2-88c5-7609fd083092@github.com> > This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: > > Excerpt from the test case ExceptionsTest.java. 
> > > [0.038s][info][exceptions ] Exception > [ ] thrown in interpreter method <{method} {0x000074c408400810} 'baz2' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 9 for thread 0x000074c46402c7b0 (main) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.baz2(ExceptionsTest.java:142) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar2(ExceptionsTest.java:135) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo2(ExceptionsTest.java:127) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:110) > [0.038s][info][exceptions ] Exception > [ ] thrown in interpreter method <{method} {0x000074c408400748} 'bar2' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 0 for thread 0x000074c46402c7b0 (main) > [0.038s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar2" at BCI: 6 > [0.038s][info][exceptions ] Exception > [ ] thrown in interpreter method <{method} {0x000074c408400748} 'bar2' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 8 for thread 0x000074c46402c7b0 (main) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar2(ExceptionsTest.java:137) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo2(ExceptionsTest.java:127) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:110) > [0.038s][info][exceptions ] Exception > [ ] thrown in interpreter method <{method} {0x000074c408400670} 'foo2' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 0 for thread 0x000074c46402c7b0 (main) > [0.038s][info][exceptions ] Found m... 
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Print callstack for rethrown exceptions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25522/files - new: https://git.openjdk.org/jdk/pull/25522/files/547171ae..cc451e7e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=05-06 Stats: 40 lines in 4 files changed: 25 ins; 0 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/25522.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25522/head:pull/25522 PR: https://git.openjdk.org/jdk/pull/25522 From iklam at openjdk.org Tue Jun 10 00:05:55 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 10 Jun 2025 00:05:55 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v4] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Fri, 30 May 2025 06:45:35 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Use -Xlog:exceptions+stacktrace instead; fixed typo in comments > > src/hotspot/share/utilities/exceptions.cpp line 620: > >> 618: if (st.is_enabled()) { >> 619: Thread* t = Thread::current_or_null(); >> 620: if (t != nullptr && t->is_Java_thread()) { // sanity > > Do we need this? If we just rely on assertions then all we need is to call `JavaThread::current`. Fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25522#discussion_r2136685631 From kbarrett at openjdk.org Tue Jun 10 03:49:32 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 10 Jun 2025 03:49:32 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: <2ddvhqT5lC6N6SZ80TDN1HDfHjcrFYL7aLInjMZw9Po=.2c3c7b91-ed39-4420-9b90-33580a5d9a51@github.com> On Thu, 5 Jun 2025 07:48:03 GMT, David Holmes wrote: > This fixes address a problem with signed integer overflow in the C fdlibm scalbnA function. > > Testing this code is extremely difficult. First, the only time this code will get executed is if intrinsics have been disabled by `-XX:-InlineIntrinsics`. Second, finding the math routines and the arguments thereto which actually reach this function is also difficult. I have found 3 tests only that hit the `scalbnA` function at the point where the potential overflow occurs, but beyond that I cannot determine what arguments will cause the different code paths to be taken. Consequently the only testing I could do here was to make a copy of the original `scalbnA` function and then place a check in the callers that the old and new code produced the same result. Again how much coverage this actually gave is not known. That test code still remains in the PR as the initial commit. 
> > Due to the testing problem this test relies on detailed code inspection and analysis, so here are the changes and the reasoning for them: > > // Convert to unsigned to avoid signed integer overflow > [1] unsigned u_k = ((unsigned) k) + n; > > [2] if (u_k > 0x7fe && u_k <= 0x7fffffff) return hugeX*copysignA(hugeX,x); /* overflow */ > [3] if (u_k > 0 && u_k <= 0x7fe) { /* normal result */ > [4] set_high(&x, (hx&0x800fffff)|((k+n)<<20)); > return x; > } > > [5] if (u_k <= (unsigned)-54) { > if (n > 50000) /* in case integer overflow in n+k */ > return hugeX*copysignA(hugeX,x); /*overflow*/ > else return tiny*copysignA(tiny,x); /*underflow*/ > } > [6] k = u_k + 54; /* subnormal result */ > set_high(&x, (hx&0x800fffff)|(k<<20)); > return x*twom54; > > > [1] We use an unsigned variable, `u_k`, for the potentially overflowing addition > > [2] We check the value of `u_k` adjusting the bounds to emulate a signed-int range > > [3] Again we check `u_k` and adjust the range > > [4] We know `k+n` is in range so we use that directly. I didn't use `u_k` here because I didn't want to have to reason about whether the use of an unsigned type would change anything in the expression > > [5] We check if `u_k` is logically less than what -54 would be > > [6] We bring `u_k` back into positive range by adding 54 and then store safely into `k` > > Thanks. I think I've come up with a simpler solution. See detailed comments. src/hotspot/share/runtime/sharedRuntimeMath.hpp line 118: > 116: unsigned u_k = ((unsigned) k) + n; > 117: > 118: if (u_k > 0x7fe && u_k <= 0x7fffffff) return hugeX*copysignA(hugeX,x); /* overflow */ I think `(unsigned)INT_MAX` would be more explicit about what's going on. This is also starting to push my limits on sufficiently simple to be a one-line `if`, and even more-so with my suggested change. I note that this isn't distinguishing between (1) `n > 0` and `k + n` overflows and wraps around to negative `int` vs (2) `n < 0` and `k + n` is negative. 
And that makes later code (both pre-existing and changed) harder to understand. I _think_ better here would be `u_k > 0x7fe && n > 0` => overflow, with some later adjustments. Then, if the test fails and we're not huge, `k = (int)u_k;` and use `k` as before, dropping `u_k`, so discarding the remainder of the currently proposed changes. src/hotspot/share/runtime/sharedRuntimeMath.hpp line 126: > 124: if (u_k <= (unsigned)-54) { > 125: if (n > 50000) /* in case integer overflow in n+k */ > 126: return hugeX*copysignA(hugeX,x); /*overflow*/ This case isn't possible with my suggested change to limit the usage scope for `u_k`, and can be deleted. ------------- Changes requested by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25656#pullrequestreview-2911807382 PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2136803034 PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2136848223 From kbarrett at openjdk.org Tue Jun 10 03:49:34 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 10 Jun 2025 03:49:34 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: <6iTb9oggPOb1xvGPSKmDir1kBRXx4bmQfj74rXymA-w=.2e3dd383-ed46-4d29-bc8d-57cdfb1e3e8c@github.com> <6lZvtF9H-3j_YpE9nYULVbO2KoedCqmgSHTIs6jy6iQ=.9b84c38e-bdef-4662-9bb3-65bff7659082@github.com> Message-ID: On Mon, 9 Jun 2025 13:20:16 GMT, Axel Boldt-Christmas wrote: > Might it is the case that casting the number here is also UB The casting is implementation-defined until C++20, and all supported platforms define it as the "obvious" two's-complement conversion. `INT_MAX` is equivalent to, but less wordy than, using `std::numeric_limits<int>::max()`. But this doesn't matter with my suggested change to limit the scope of use of `u_k`.
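For readers following the conversion discussion, here is a minimal sketch, assuming a 32-bit `int`, of doing the addition in unsigned arithmetic. The helper names `wrapping_add` and `above_int_max` are hypothetical and are not HotSpot code. The sketch only illustrates that a wrapped positive overflow and a genuinely negative sum both land above `(unsigned)INT_MAX`, so that comparison alone cannot tell the two cases apart.

```cpp
#include <climits>

static_assert(sizeof(int) == 4, "sketch assumes a 32-bit int");

// Hypothetical helper, not HotSpot code: add two ints without signed
// overflow. int -> unsigned conversion and unsigned addition are both
// defined to wrap modulo 2^32, so no undefined behaviour is involved.
inline unsigned wrapping_add(int k, int n) {
  return static_cast<unsigned>(k) + static_cast<unsigned>(n);
}

// True when the wrapped sum lies outside the non-negative int range.
// This holds both for a positive sum that exceeded INT_MAX and for a
// sum that is mathematically negative.
inline bool above_int_max(unsigned u) {
  return u > static_cast<unsigned>(INT_MAX);
}
```

For example, `wrapping_add(0x7fe, INT_MAX)` (true sum above `INT_MAX`) and `wrapping_add(1, -100)` (true sum -99) both satisfy `above_int_max`, which is why the sign of `n` is needed to separate them.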
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2136813714 From kbarrett at openjdk.org Tue Jun 10 03:57:28 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 10 Jun 2025 03:57:28 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 07:48:03 GMT, David Holmes wrote: > This fixes address a problem with signed integer overflow in the C fdlibm scalbnA function. > > Testing this code is extremely difficult. First, the only time this code will get executed is if intrinsics have been disabled by `-XX:-InlineIntrinsics`. Second, finding the math routines and the arguments thereto which actually reach this function is also difficult. I have found 3 tests only that hit the `scalbnA` function at the point where the potential overflow occurs, but beyond that I cannot determine what arguments will cause the different code paths to be taken. Consequently the only testing I could do here was to make a copy of the original `scalbnA` function and then place a check in the callers that the old and new code produced the same result. Again how much coverage this actually gave is not known. That test code still remains in the PR as the initial commit. 
> > Due to the testing problem this test relies on detailed code inspection and analysis, so here are the changes and the reasoning for them: > > // Convert to unsigned to avoid signed integer overflow > [1] unsigned u_k = ((unsigned) k) + n; > > [2] if (u_k > 0x7fe && u_k <= 0x7fffffff) return hugeX*copysignA(hugeX,x); /* overflow */ > [3] if (u_k > 0 && u_k <= 0x7fe) { /* normal result */ > [4] set_high(&x, (hx&0x800fffff)|((k+n)<<20)); > return x; > } > > [5] if (u_k <= (unsigned)-54) { > if (n > 50000) /* in case integer overflow in n+k */ > return hugeX*copysignA(hugeX,x); /*overflow*/ > else return tiny*copysignA(tiny,x); /*underflow*/ > } > [6] k = u_k + 54; /* subnormal result */ > set_high(&x, (hx&0x800fffff)|(k<<20)); > return x*twom54; > > > [1] We use an unsigned variable, `u_k`, for the potentially overflowing addition > > [2] We check the value of `u_k` adjusting the bounds to emulate a signed-int range > > [3] Again we check `u_k` and adjust the range > > [4] We know `k+n` is in range so we use that directly. I didn't use `u_k` here because I didn't want to have to reason about whether the use of an unsigned type would change anything in the expression > > [5] We check if `u_k` is logically less than what -54 would be > > [6] We bring `u_k` back into positive range by adding 54 and then store safely into `k` > > Thanks. The JBS issue also talks about `copysignA` and suggests we should just use `copysign` if we're keeping `scalbnA`. Please either address that here or file a new issue for `copysignA`. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25656#issuecomment-2957602229 From darcy at openjdk.org Tue Jun 10 05:54:29 2025 From: darcy at openjdk.org (Joe Darcy) Date: Tue, 10 Jun 2025 05:54:29 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: <2ddvhqT5lC6N6SZ80TDN1HDfHjcrFYL7aLInjMZw9Po=.2c3c7b91-ed39-4420-9b90-33580a5d9a51@github.com> References: <2ddvhqT5lC6N6SZ80TDN1HDfHjcrFYL7aLInjMZw9Po=.2c3c7b91-ed39-4420-9b90-33580a5d9a51@github.com> Message-ID: On Tue, 10 Jun 2025 03:36:10 GMT, Kim Barrett wrote: >> This fixes address a problem with signed integer overflow in the C fdlibm scalbnA function. >> >> Testing this code is extremely difficult. First, the only time this code will get executed is if intrinsics have been disabled by `-XX:-InlineIntrinsics`. Second, finding the math routines and the arguments thereto which actually reach this function is also difficult. I have found 3 tests only that hit the `scalbnA` function at the point where the potential overflow occurs, but beyond that I cannot determine what arguments will cause the different code paths to be taken. Consequently the only testing I could do here was to make a copy of the original `scalbnA` function and then place a check in the callers that the old and new code produced the same result. Again how much coverage this actually gave is not known. That test code still remains in the PR as the initial commit. 
>> >> Due to the testing problem this test relies on detailed code inspection and analysis, so here are the changes and the reasoning for them: >> >> // Convert to unsigned to avoid signed integer overflow >> [1] unsigned u_k = ((unsigned) k) + n; >> >> [2] if (u_k > 0x7fe && u_k <= 0x7fffffff) return hugeX*copysignA(hugeX,x); /* overflow */ >> [3] if (u_k > 0 && u_k <= 0x7fe) { /* normal result */ >> [4] set_high(&x, (hx&0x800fffff)|((k+n)<<20)); >> return x; >> } >> >> [5] if (u_k <= (unsigned)-54) { >> if (n > 50000) /* in case integer overflow in n+k */ >> return hugeX*copysignA(hugeX,x); /*overflow*/ >> else return tiny*copysignA(tiny,x); /*underflow*/ >> } >> [6] k = u_k + 54; /* subnormal result */ >> set_high(&x, (hx&0x800fffff)|(k<<20)); >> return x*twom54; >> >> >> [1] We use an unsigned variable, `u_k`, for the potentially overflowing addition >> >> [2] We check the value of `u_k` adjusting the bounds to emulate a signed-int range >> >> [3] Again we check `u_k` and adjust the range >> >> [4] We know `k+n` is in range so we use that directly. I didn't use `u_k` here because I didn't want to have to reason about whether the use of an unsigned type would change anything in the expression >> >> [5] We check if `u_k` is logically less than what -54 would be >> >> [6] We bring `u_k` back into positive range by adding 54 and then store safely into `k` >> >> Thanks. > > src/hotspot/share/runtime/sharedRuntimeMath.hpp line 126: > >> 124: if (u_k <= (unsigned)-54) { >> 125: if (n > 50000) /* in case integer overflow in n+k */ >> 126: return hugeX*copysignA(hugeX,x); /*overflow*/ > > This case (`n > 50000`) isn't possible with my suggest change to limit the usage scope for `u_k`, and > can be deleted. FYI, as an alternative, there is a Java-only implementation of scalb (and supporting functionality) in java.lang.Math that could be ported to C as another way to avoid this issue. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2136965576 From kbarrett at openjdk.org Tue Jun 10 06:32:28 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 10 Jun 2025 06:32:28 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: <2ddvhqT5lC6N6SZ80TDN1HDfHjcrFYL7aLInjMZw9Po=.2c3c7b91-ed39-4420-9b90-33580a5d9a51@github.com> Message-ID: On Tue, 10 Jun 2025 05:51:39 GMT, Joe Darcy wrote: >> src/hotspot/share/runtime/sharedRuntimeMath.hpp line 126: >> >>> 124: if (u_k <= (unsigned)-54) { >>> 125: if (n > 50000) /* in case integer overflow in n+k */ >>> 126: return hugeX*copysignA(hugeX,x); /*overflow*/ >> >> This case (`n > 50000`) isn't possible with my suggest change to limit the usage scope for `u_k`, and >> can be deleted. > > FYI, as an alternative, there is a Java-only implementation of scalb (and supporting functionality) in java.lang.Math that could be ported to C as another way to avoid this issue. `java.lang.Math.scalb()` doesn't seem useful here. It just transforms the scale factor into a double power of 2 and then multiplies. It's not clear that would result in exactly the same result for all arguments as this. And this is (mostly) avoiding doing a double multiply for (perceived) performance reasons. (For all I know, the complexity here could swamp the cost of a double multiply.) Being certain of keeping the same results (on edge cases) is the only reason to stay with this (but fixed to remove UB), rather than just switching to using the C/C++ library `scalbn`. (And I don't know that there *is* any potential difference between `scalbnA` and `scalbn` that would actually matter. That would require more analysis than I've had time to do.) 
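As a point of reference for the library alternative mentioned above, a minimal sketch of what the standard `std::scalbn` does on an IEEE 754 double platform (an assumption of this sketch). The wrapper name `scale_pow2` is hypothetical. Nothing here claims that `scalbnA` and `scalbn` agree on every edge case; that is the open question in the message above.

```cpp
#include <cmath>

// Hypothetical wrapper, not HotSpot code: scale x by 2^n using the
// standard library. Exponent overflow yields an infinity with the sign
// of x; exponent underflow yields zero (under default rounding).
inline double scale_pow2(double x, int n) {
  return std::scalbn(x, n);
}
```

With this, `scale_pow2(1.0, 10)` is 1024.0 and `scale_pow2(3.0, -1)` is 1.5, while a very large or very small `n` saturates to infinity or zero instead of requiring any integer exponent arithmetic in the caller.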
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2137020078 From mbaesken at openjdk.org Tue Jun 10 06:42:46 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 10 Jun 2025 06:42:46 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v6] In-Reply-To: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: > There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . > Those fail when the address sanitizer is configured ( --enable-asan ). > The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. > Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . > While at it, also same is also added for ubsan . 
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Add comments to tests ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25575/files - new: https://git.openjdk.org/jdk/pull/25575/files/f8458f10..3f64dffe Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=04-05 Stats: 6 lines in 6 files changed: 6 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25575.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25575/head:pull/25575 PR: https://git.openjdk.org/jdk/pull/25575 From mbaesken at openjdk.org Tue Jun 10 06:42:46 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 10 Jun 2025 06:42:46 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v5] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Fri, 6 Jun 2025 11:34:04 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > exclude AbsPathInImage test from asan Hi David/Chris/Lucy - I added the comments to the tests. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2957857105 From mbaesken at openjdk.org Tue Jun 10 06:58:28 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 10 Jun 2025 06:58:28 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v6] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Tue, 10 Jun 2025 06:42:46 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Add comments to tests I created https://bugs.openjdk.org/browse/JDK-8359091 Tests using libjsig lib do not work when asan is enabled. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2957893175 From kbarrett at openjdk.org Tue Jun 10 07:10:28 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 10 Jun 2025 07:10:28 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 07:48:03 GMT, David Holmes wrote: > This fixes address a problem with signed integer overflow in the C fdlibm scalbnA function. > > Testing this code is extremely difficult. First, the only time this code will get executed is if intrinsics have been disabled by `-XX:-InlineIntrinsics`. Second, finding the math routines and the arguments thereto which actually reach this function is also difficult. 
I have found 3 tests only that hit the `scalbnA` function at the point where the potential overflow occurs, but beyond that I cannot determine what arguments will cause the different code paths to be taken. Consequently the only testing I could do here was to make a copy of the original `scalbnA` function and then place a check in the callers that the old and new code produced the same result. Again how much coverage this actually gave is not known. That test code still remains in the PR as the initial commit. > > Due to the testing problem this test relies on detailed code inspection and analysis, so here are the changes and the reasoning for them: > > // Convert to unsigned to avoid signed integer overflow > [1] unsigned u_k = ((unsigned) k) + n; > > [2] if (u_k > 0x7fe && u_k <= 0x7fffffff) return hugeX*copysignA(hugeX,x); /* overflow */ > [3] if (u_k > 0 && u_k <= 0x7fe) { /* normal result */ > [4] set_high(&x, (hx&0x800fffff)|((k+n)<<20)); > return x; > } > > [5] if (u_k <= (unsigned)-54) { > if (n > 50000) /* in case integer overflow in n+k */ > return hugeX*copysignA(hugeX,x); /*overflow*/ > else return tiny*copysignA(tiny,x); /*underflow*/ > } > [6] k = u_k + 54; /* subnormal result */ > set_high(&x, (hx&0x800fffff)|(k<<20)); > return x*twom54; > > > [1] We use an unsigned variable, `u_k`, for the potentially overflowing addition > > [2] We check the value of `u_k` adjusting the bounds to emulate a signed-int range > > [3] Again we check `u_k` and adjust the range > > [4] We know `k+n` is in range so we use that directly. I didn't use `u_k` here because I didn't want to have to reason about whether the use of an unsigned type would change anything in the expression > > [5] We check if `u_k` is logically less than what -54 would be > > [6] We bring `u_k` back into positive range by adding 54 and then store safely into `k` > > Thanks. 
For the record, I still think it would be better to just delete `scalbnA` (and `copysignA`) and simply use the standard C `scalbn`. But I'm not going to insist on that, given the narrow scope of use and the challenges involved in figuring out whether that could result in any compatibility issue. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25656#issuecomment-2957921835 From kbarrett at openjdk.org Tue Jun 10 07:15:28 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 10 Jun 2025 07:15:28 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 07:07:39 GMT, Kim Barrett wrote: > use the standard C `scalbn`. Long ago we couldn't use the standard C `scalbn` because (1) Visual Studio didn't provide it at all, and (2) we were using C++98/03 / C89 for gcc/clang and it was version conditionalized out as being a C99 function. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25656#issuecomment-2957934925 From sspitsyn at openjdk.org Tue Jun 10 07:22:11 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 10 Jun 2025 07:22:11 GMT Subject: RFR: 8358815: Exception event spec has stale reference to catch_klass parameter Message-ID: The JVMTI Exception event callback spec refers to the `catch_klass` parameter which does not exist anymore. Instead the Exception event callback spec should refer to the `catch_method` and `catch_location` parameters. I treat this as a bug and doubt we need a CSR for this issue. 
Testing: N/A ------------- Commit messages: - 8358815: Exception event spec has stale reference to catch_klass parameter Changes: https://git.openjdk.org/jdk/pull/25710/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25710&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358815 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25710.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25710/head:pull/25710 PR: https://git.openjdk.org/jdk/pull/25710 From duke at openjdk.org Tue Jun 10 07:32:29 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 10 Jun 2025 07:32:29 GMT Subject: RFR: 8284017: Improve handshake filtering mechanism In-Reply-To: References: Message-ID: On Mon, 9 Jun 2025 03:35:05 GMT, David Holmes wrote: >> Hi, please consider the following enhancement: >> >> In this PR a new way of supplying multiple arguments to filter out / skip operations in handshake/safepoint poll is given. Multiple boolean arguments are combined in a hash table, where keys are taken from a new enum `HandshakeOperationProperty`, which is to be modified when there is a need for a new argument. >> >> Tested in GHA and tiers 1 - 3. > > This seems rather heavyweight - having to create a ResourceHashtable for every call site, dynamically - and quite cumbersome to write out at the call-site versus the simple boolean args. > > Do we have candidates for expanding the current set of "filters"? Two flags is quite manageable. Three is a stretch but still okay if we can take advantage of default args. Four or more would definitely cry out for some better mechanism, but I'm not sure this is it. > > Sorry. Thanks @dholmes-ora, You are correct, there have been no new candidates for expansions of filter set since the bug report was created. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25497#issuecomment-2957978101 From duke at openjdk.org Tue Jun 10 07:47:37 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 10 Jun 2025 07:47:37 GMT Subject: Withdrawn: 8284017: Improve handshake filtering mechanism In-Reply-To: References: Message-ID: On Wed, 28 May 2025 15:10:09 GMT, Anton Artemov wrote: > Hi, please consider the following enhancement: > > In this PR a new way of supplying multiple arguments to filter out / skip operations in handshake/safepoint poll is given. Multiple boolean arguments are combined in a hash table, where keys are taken from a new enum `HandshakeOperationProperty`, which is to be modified when there is a need for a new argument. > > Tested in GHA and tiers 1 - 3. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/25497 From rvansa at openjdk.org Tue Jun 10 07:58:38 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 10 Jun 2025 07:58:38 GMT Subject: RFR: 8352075: Perf regression accessing fields [v28] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> Message-ID: On Mon, 9 Jun 2025 21:42:03 GMT, Ioi Lam wrote: >> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert removing FieldInfoReader::next_uint() > > src/hotspot/share/utilities/packedTable.cpp line 83: > >> 81: assert((element & ~((uint64_t) _key_mask | ((uint64_t) _value_mask << _value_shift))) == 0, "read too much"); >> 82: return element; >> 83: } > > Since `_element_bytes` can be smaller than 8, memcpy will not work on big endian. > > Why are you trying to optimize this? This PR already cuts down the iterations from O(n) to O(log(n)). 
You are already doing a lot at each iteration -- decoding the name_index and signature_index from the unsigned5 stream, looking up the symbols from the constant pool, etc. So a few bit operations in read_element isn't going to make any substantial difference: > > > uint64_t element = 0; > for (int i = 0; i < _elements_bytes; i++) { > element <<= 8; > element |= _table[offset++]; // Need to rewrite fill() accordingly. > } The idea comes from this comment: https://github.com/openjdk/jdk/pull/24847#discussion_r2106110163 > you can load 1..8 bytes in a single (misaligned) memory operation, loading garbage into unused bytes, and then using shift or mask to clear the garbage. That may be faster than asking C++ to do a bunch of branchy logic and byte assembly on every access. I like that idea, though it appears that the plethora of platforms that JDK supports (but one cannot simply test) makes it difficult to express. Let's rely on the compiler to get the idea from for cycle and do the right thing on each platform, then... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2137178186 From epeter at openjdk.org Tue Jun 10 08:21:29 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 08:21:29 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 14:43:40 GMT, Emanuel Peter wrote: > **Past Work** > With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. > > **This PR** > I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. > > I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. > > My approach was to aggressively bail-out for all nodes that had an issue. 
This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered. > > **Future Work:** > In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Ideal / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. > > Testing passed tier1-3, with extra timeout factor 20. @mhaessig Thanks for the idea! ------------- PR Comment: https://git.openjdk.org/jdk/pull/22970#issuecomment-2958129493 From ayang at openjdk.org Tue Jun 10 08:40:14 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 10 Jun 2025 08:40:14 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v11] In-Reply-To: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: > This patch refines Parallel's sizing strategy to improve overall memory management and performance. > > The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`.
> > `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. > > GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. > > ## Performance evaluation > > - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). > - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). > - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. > > PS: I have opportunistically set the obsolete/expired version to ~~25/26~~ 26/27 for now. I will update them accordingly before merging. > > Test: tier1-8 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: - version - Merge branch 'master' into pgc-size-policy - revert-aliases - Merge branch 'master' into pgc-size-policy - merge - merge-fix - merge - Merge branch 'master' into pgc-size-policy - Merge branch 'master' into pgc-size-policy - review - ... 
and 6 more: https://git.openjdk.org/jdk/compare/7c9c8ba3...3a0502c3 ------------- Changes: https://git.openjdk.org/jdk/pull/25000/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25000&range=10 Stats: 4373 lines in 31 files changed: 522 ins; 3452 del; 399 mod Patch: https://git.openjdk.org/jdk/pull/25000.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25000/head:pull/25000 PR: https://git.openjdk.org/jdk/pull/25000 From mhaessig at openjdk.org Tue Jun 10 08:48:32 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Tue, 10 Jun 2025 08:48:32 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 14:43:40 GMT, Emanuel Peter wrote: > **Past Work** > With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. > > **This PR** > I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. > > I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. > > My approach was to aggressively bail-out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered. > > **Future Work:** > In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Ideal / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well.
> > Testing passed tier1-3, with extra timeout factor 20. Thank you for working on this and diligently noting the exceptions. That is quite the todo list you discovered. It is great to see more verification going in. I noted some minor things, mostly in the comments. With or without those, this looks good to me. src/hotspot/share/opto/c2_globals.hpp line 685: > 683: " B: verify that type(n) == n->Value() after IGVN" \ > 684: " C: verify all Node::Ideal were applied in IGVN" \ > 685: " D: verify all Node::Identity were applied in IGVN" \ Suggestion: " C: verify Node::Ideal did not miss opportunities" \ " D: verify Node::Identity did not miss opportunities" \ For me, "all `Node::Ideal` were applied" parsed weirdly, so I tried my hand at an alternative formulation. Feel free to ignore. src/hotspot/share/opto/phaseX.cpp line 1188: > 1186: } > 1187: > 1188: // Check that all Ideal optimizations that could be done were done. Suggestion: // Check that all Ideal optimizations that could be done were done. // Returns true if it found missed optimization opportunities and false otherwise and for exceptions. The return value was not immediately clear to me. src/hotspot/share/opto/phaseX.cpp line 1803: > 1801: } > 1802: tty->print_cr("The result after Ideal:"); > 1803: i->dump_bfs(1, nullptr, ""); Perhaps taking the tty lock might be appropriate, due to the amount of printing? Or do we know that nothing else is printing? src/hotspot/share/opto/phaseX.cpp line 1807: > 1805: } > 1806: > 1807: // Check that all Identity optimizations that could be done were done. Suggestion: // Check that all Identity optimizations that could be done were done. // Returns true if it found missed optimization opportunities and false otherwise and for exceptions. As above.
src/hotspot/share/opto/phaseX.cpp line 1948: > 1946: > 1947: if (n->is_Load()) { > 1948: // LoadNode::Identity tries to look for an earier store value via Suggestion: // LoadNode::Identity tries to look for an earlier store value via src/hotspot/share/opto/phaseX.cpp line 1991: > 1989: n->dump_bfs(1, nullptr, ""); > 1990: tty->print_cr("New node:"); > 1991: i->dump_bfs(1, nullptr, ""); Suggestion: // The verificatin just found a new Identity that was not found during IGVN. tty->cr(); tty->print_cr("Missed Identity optimization:"); tty->print_cr("Old node:"); n->dump_bfs(1, nullptr, ""); tty->print_cr("New node:"); i->dump_bfs(1, nullptr, ""); The wording of the comment confused me a bit. Also, perhaps taking the tty lock might be appropriate since you are printing a lot here? Or do we know that only verification is printing at this point? ------------- Marked as reviewed by mhaessig (Author). PR Review: https://git.openjdk.org/jdk/pull/22970#pullrequestreview-2912495849 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137284714 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137242097 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137266075 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137243146 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137259432 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137257629 From epeter at openjdk.org Tue Jun 10 08:56:46 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 08:56:46 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v2] In-Reply-To: References: Message-ID: > **Past Work** > With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. > > **This PR** > I'm now adding verification for `Ideal` and `Identity`.
I'm adding two bits to the flag `VerifyIterativeGVN`. > > I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. > > My approach was to aggressively bail-out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered. > > **Future Work:** > In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Ideal / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. > > I filed: > [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) > (We can file subtasks for the nodes we want to fix. I don't want to file them all now, but we should file them as we are investigating, so that there is no duplicate work.) > > Testing passed tier1-3, with extra timeout factor 20.
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: Apply suggestions from code review Co-authored-by: Manuel Hässig ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22970/files - new: https://git.openjdk.org/jdk/pull/22970/files/a12d49a0..1042ef54 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=00-01 Stats: 6 lines in 2 files changed: 2 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/22970.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22970/head:pull/22970 PR: https://git.openjdk.org/jdk/pull/22970 From mhaessig at openjdk.org Tue Jun 10 09:12:35 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Tue, 10 Jun 2025 09:12:35 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v2] In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 08:56:46 GMT, Emanuel Peter wrote: >> **Past Work** >> With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. >> >> **This PR** >> I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. >> >> I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. >> >> My approach was to aggressively bail-out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered. >> >> **Future Work:** >> In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases.
And there are surely cases where verification will be impossible, because it is possible that the Ideal / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. >> >> I filed: >> [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) >> (We can file subtasks for the nodes we want to fix. I don't want to file them all now, but we should file them as we are investigating, so that there is no duplicate work.) >> >> Testing passed tier1-3, with extra timeout factor 20. > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > Apply suggestions from code review > > Co-authored-by: Manuel Hässig src/hotspot/share/opto/phaseX.cpp line 1987: > 1985: } > 1986: > 1987: // The verificatin just found a new Identity that was not found during IGVN. Suggestion: // The verification just found a new Identity that was not found during IGVN. I guess, I suggested a typo... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137340707 From rvansa at openjdk.org Tue Jun 10 09:30:31 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 10 Jun 2025 09:30:31 GMT Subject: RFR: 8352075: Perf regression accessing fields [v29] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: <-T7PM80uoFrT7wol7RWu6ufJefRPZ-cWuW-qi646dHM=.996db406-7246-49b2-951a-603512194e34@github.com> > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 .
Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min … max): 45.1 ms … 53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min … max): 73.8 ms … 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min … max): 37.7 ms …
42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with two additional commits since the last revision: - Remove __builtin_memcpy - Fix coding style ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/5d646376..14f9bdf7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=28 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=27-28 Stats: 127 lines in 5 files changed: 5 ins; 14 del; 108 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From epeter at openjdk.org Tue Jun 10 09:32:46 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 09:32:46 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v3] In-Reply-To: References: Message-ID: > **Past Work** > With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. > > **This PR** > I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. > > I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. > > My approach was to aggressively bail-out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered. 
> > **Future Work:** > In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Ideal / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. > > I filed: > [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) > (We can file subtasks for the nodes we want to fix. I don't want to file them all now, but we should file them as we are investigating, so that there is no duplicate work.) > > Testing passed tier1-3, with extra timeout factor 20. Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision: - Update src/hotspot/share/opto/phaseX.cpp Co-authored-by: Manuel Hässig - review suggestions, and handled a few more edge cases ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22970/files - new: https://git.openjdk.org/jdk/pull/22970/files/1042ef54..5aa5444d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=01-02 Stats: 45 lines in 1 file changed: 35 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/22970.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22970/head:pull/22970 PR: https://git.openjdk.org/jdk/pull/22970 From epeter at openjdk.org Tue Jun 10 09:32:46 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 09:32:46 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v3] In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 08:46:04 GMT, Manuel Hässig wrote: >> Emanuel Peter has updated the pull request incrementally
with two additional commits since the last revision: >> >> - Update src/hotspot/share/opto/phaseX.cpp >> >> Co-authored-by: Manuel Hässig >> - review suggestions, and handled a few more edge cases > > Thank you for working on this and diligently noting the exceptions. That is quite the todo list you discovered. It is great to see more verification going in. > > I noted some minor things, mostly in the comments. With or without those, this looks good to me. @mhaessig Thanks for reviewing! Yes, this was rather a lot of gruesome work, actually. But worth it, I think. I applied your suggestions. And I found some more cases in tier4 and stress testing, so I handled those as well now. > src/hotspot/share/opto/phaseX.cpp line 1803: > >> 1801: } >> 1802: tty->print_cr("The result after Ideal:"); >> 1803: i->dump_bfs(1, nullptr, ""); > > Perhaps taking the tty lock might be appropriate, due to the amount of printing? Or do we know that nothing else is printing? good idea! ------------- PR Comment: https://git.openjdk.org/jdk/pull/22970#issuecomment-2958370829 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137375811 From aph at openjdk.org Tue Jun 10 10:06:28 2025 From: aph at openjdk.org (Andrew Haley) Date: Tue, 10 Jun 2025 10:06:28 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches In-Reply-To: References: Message-ID: On Mon, 9 Jun 2025 19:17:53 GMT, Mikhail Ablakatov wrote: > In the A64 ISA, the B (direct branch) instruction can encode a target within a ±128MB range relative to the instruction. Due to this limitation, when generating static stubs, HotSpot conservatively emits indirect branches for calls to c2i interface stubs. These indirect branches are implemented using a four-instruction sequence: three instructions to materialize the target address in a register, followed by a BR instruction to perform the jump.
> > This patch optimizes static stub generation when the code cache is small enough to guarantee that the target entry point of the c2i interface stub lies within the direct branch range. In such cases, a single direct B instruction can be used instead of the indirect sequence, saving 3 instructions (12 bytes) per static stub. > > Below is an example of the optimization's impact, measured using the movie-lens benchmark from the Renaissance benchmark suite: > > | Metric | Before | After | Difference | > |-------------|---------------|---------------|------------| > | totalInHeap | Avg: 1883.875 | Avg: 1871.667 | -0.65% | > | | Sum: 6653848 | Sum: 6616344 | -0.56% | > | stubCode | Avg: 103.164 | Avg: 87.285 | -15.38% | > | | Sum: 364376 | Sum: 308552 | -15.33% | > > Full jtreg passed on AArch64. src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp line 106: > 104: } else { > 105: NativeJump::insert(method_holder->next_instruction_address(), entry); > 106: } Suggestion: MacroAssembler::pd_patch_instruction(method_holder->next_instruction_address(), entry); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25702#discussion_r2137449276 From aph at openjdk.org Tue Jun 10 10:06:29 2025 From: aph at openjdk.org (Andrew Haley) Date: Tue, 10 Jun 2025 10:06:29 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 10:03:27 GMT, Andrew Haley wrote: >> In the A64 ISA, the B (direct branch) instruction can encode a target within a ±128MB range relative to the instruction. Due to this limitation, when generating static stubs, HotSpot conservatively emits indirect branches for calls to c2i interface stubs. These indirect branches are implemented using a four-instruction sequence: three instructions to materialize the target address in a register, followed by a BR instruction to perform the jump.
>> >> This patch optimizes static stub generation when the code cache is small enough to guarantee that the target entry point of the c2i interface stub lies within the direct branch range. In such cases, a single direct B instruction can be used instead of the indirect sequence, saving 3 instructions (12 bytes) per static stub. >> >> Below is an example of the optimization's impact, measured using the movie-lens benchmark from the Renaissance benchmark suite: >> >> | Metric | Before | After | Difference | >> |-------------|---------------|---------------|------------| >> | totalInHeap | Avg: 1883.875 | Avg: 1871.667 | -0.65% | >> | | Sum: 6653848 | Sum: 6616344 | -0.56% | >> | stubCode | Avg: 103.164 | Avg: 87.285 | -15.38% | >> | | Sum: 364376 | Sum: 308552 | -15.33% | >> >> Full jtreg passed on AArch64. > > src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp line 106: > >> 104: } else { >> 105: NativeJump::insert(method_holder->next_instruction_address(), entry); >> 106: } > > Suggestion: > > MacroAssembler::pd_patch_instruction(method_holder->next_instruction_address(), entry); Please also delete `NativeGeneralJump::insert_unconditional`, which is no longer used. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25702#discussion_r2137450703 From chagedorn at openjdk.org Tue Jun 10 10:16:32 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Tue, 10 Jun 2025 10:16:32 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v3] In-Reply-To: References: Message-ID: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> On Tue, 10 Jun 2025 09:32:46 GMT, Emanuel Peter wrote: >> **Past Work** >> With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. >> >> **This PR** >> I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. 
>> >> I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. >> >> My approach was to aggressively bail-out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered. >> >> **Future Work:** >> In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Ideal / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. >> >> I filed: >> [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) >> (We can file subtasks for the nodes we want to fix. I don't want to file them all now, but we should file them as we are investigating, so that there is no duplicate work.) >> >> Testing passed tier1-3, with extra timeout factor 20. > > Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision: > > - Update src/hotspot/share/opto/phaseX.cpp > > Co-authored-by: Manuel Hässig > - review suggestions, and handled a few more edge cases Great effort and analysis! I have some comments/questions. src/hotspot/share/opto/c2_globals.hpp line 686: > 684: " C: verify Node::Ideal did not miss opportunities" \ > 685: " D: verify Node::Identity did not miss opportunities" \ > 686: "A, B, C, and D in 0=off; 1=on") \ Why don't you use ABCD? It seems strange/unexpected to reverse the alphabetical order.
src/hotspot/share/opto/phaseX.cpp line 1090: > 1088: if (is_verify_Ideal()) { failure |= verify_node_Ideal(n, false); } > 1089: if (is_verify_Ideal()) { failure |= verify_node_Ideal(n, true); } > 1090: if (is_verify_Identity()) { failure |= verify_node_Identity(n); } Suggestion: How about naming them `verify_Value/Ideal/Identity_for(n)`? src/hotspot/share/opto/phaseX.cpp line 1126: > 1124: node->dump(); > 1125: } > 1126: assert(_worklist.size() == 0, "igvn worklist must still be empty after verify"); The `_worklist` size does not seem to change after the bailout on L1114. So, we know that here the worklist is non-empty. Would `assert(false)` fit better? src/hotspot/share/opto/phaseX.cpp line 1196: > 1194: // Returns true if it found missed optimization opportunities and > 1195: // false otherwise (no missed optimization, or skipped verification). > 1196: bool PhaseIterGVN::verify_node_Ideal(Node* n, bool can_reshape) { General comment about your analysis for Ideal and Identity for why you disabled some of the verification. Very thorough and nicely explained! I'm wondering though if we should just open a tracking JBS issue (we could use JDK-8347273), dump the analysis there and refer to that JBS issue from the code for further details. This would allow us to use some permalinks from GitHub (we should probably not post them in the code directly) or extend the analysis with additional images etc. You also included a lot of best guesses (which is totally understandable!) which we might want to extend, comment further on in a discussion, or update because we know more about them. For that, we would need to update the actual code each time which seems unfortunate - and we might not fix some things because it does not seem worth the effort for tiny mistakes or updates. In JBS this comes for free. What do you think about that? Of course, in the end, it's also a trade-off. 
------------- PR Review: https://git.openjdk.org/jdk/pull/22970#pullrequestreview-2912749936 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137403056 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137472659 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137441404 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137434095 From chagedorn at openjdk.org Tue Jun 10 10:16:33 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Tue, 10 Jun 2025 10:16:33 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v3] In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 09:28:21 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/phaseX.cpp line 1803: >> >>> 1801: } >>> 1802: tty->print_cr("The result after Ideal:"); >>> 1803: i->dump_bfs(1, nullptr, ""); >> >> Perhaps taking the tty lock might be appropriate, due to the amount of printing? Or do we know that nothing else is printing? > > good idea! You could also define a `stringStream ss` and pass that one to `dump_bfs()`. We do a similar thing for `print_ideal_ir()` to keep everything in a block. As a bonus: We don't suffer from a tty lock being broken - even though that would not affect correctness. 
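The buffering idea can be sketched outside HotSpot as follows. This is only an illustration under stated assumptions: `std::ostringstream` stands in for HotSpot's `stringStream`, and `dump_node`, `build_report` and `print_report` are hypothetical helpers, not the actual `dump_bfs()` or `tty` API. The point it shows is that the whole multi-line report is assembled privately and then emitted with a single write, so it stays in one block even when other threads print.

```cpp
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a node dump routine: writes the node name
// followed by one indented line per input to the given stream.
void dump_node(std::ostream& out, const std::string& name, int inputs) {
  out << name << "\n";
  for (int i = 0; i < inputs; i++) {
    out << "  in(" << i << ")\n";
  }
}

// Assemble the complete report in a private buffer. Nothing is visible to
// other threads until the caller decides to print the returned string.
std::string build_report(const std::string& old_node, const std::string& new_node) {
  std::ostringstream ss;
  ss << "Missed Identity optimization:\n";
  ss << "Old node:\n";
  dump_node(ss, old_node, 2);
  ss << "New node:\n";
  dump_node(ss, new_node, 1);
  return ss.str();
}

// Emit the buffered report with one write. The lock is held only for that
// single write instead of for the whole dump.
void print_report(std::ostream& sink, std::mutex& sink_lock, const std::string& report) {
  std::lock_guard<std::mutex> guard(sink_lock);
  sink << report;
}
```

A caller would build the report first and then call `print_report(std::cout, lock, report)`. Compared with holding a lock around many separate print calls, the critical section shrinks to one write.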
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137461406 From jsjolen at openjdk.org Tue Jun 10 10:17:38 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 10 Jun 2025 10:17:38 GMT Subject: RFR: 8352075: Perf regression accessing fields [v29] In-Reply-To: <-T7PM80uoFrT7wol7RWu6ufJefRPZ-cWuW-qi646dHM=.996db406-7246-49b2-951a-603512194e34@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <-T7PM80uoFrT7wol7RWu6ufJefRPZ-cWuW-qi646dHM=.996db406-7246-49b2-951a-603512194e34@github.com> Message-ID: On Tue, 10 Jun 2025 09:30:31 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. 
>> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with two additional commits since the last revision: > > - Remove __builtin_memcpy > - Fix coding style Some more comments.
------------- PR Review: https://git.openjdk.org/jdk/pull/24847#pullrequestreview-2912537540 From jsjolen at openjdk.org Tue Jun 10 10:17:38 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 10 Jun 2025 10:17:38 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Mon, 9 Jun 2025 13:50:54 GMT, Radim Vansa wrote: >What's wrong about memcpy, or rather the builtin version? Doesn't regular `memcpy` compile into the builtin anyway? Aren't there LE/BE concerns when you do this type of computation? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2137475323 From jsjolen at openjdk.org Tue Jun 10 10:17:40 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 10 Jun 2025 10:17:40 GMT Subject: RFR: 8352075: Perf regression accessing fields [v28] In-Reply-To: <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> Message-ID: On Mon, 9 Jun 2025 15:34:44 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. 
The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)!
This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Revert removing FieldInfoReader::next_uint() src/hotspot/share/utilities/packedTable.hpp line 56: > 54: // Packed table does NOT support duplicate keys. > 55: virtual bool next(uint32_t* key, uint32_t* value) = 0; > 56: }; Does it make sense to take the cost of an indirect call for each kv pair? You can't inline it, so the stack frame needs to be popped and pushed, and you're taking 2 registers (16 bytes) to give 8 bytes and 1 bit of information. We can amortize the cost by implementing this signature instead: virtual uint32_t next(Pair* kvs, uint32_t kvs_size); src/hotspot/share/utilities/packedTable.hpp line 69: > 67: // by the supplier (when Supplier::next() returns false the whole array should > 68: // be filled). > 69: void fill(u1* table, size_t table_length, Supplier &supplier) const; Let the ampersand hug the type. 
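For illustration, the batched signature proposed above could be used roughly as follows. This is a minimal sketch under assumptions: `Pair`, `BatchSupplier`, `RangeSupplier` and `sum_values` are hypothetical names chosen for the example and are not taken from `packedTable.hpp`.

```c++
#include <cassert>
#include <cstdint>

// Hypothetical key/value pair handed over in batches.
struct Pair {
  uint32_t key;
  uint32_t value;
};

// Batched variant of the supplier interface: one virtual call fills up to
// kvs_size pairs and returns how many were written (0 means exhausted).
class BatchSupplier {
public:
  virtual ~BatchSupplier() = default;
  virtual uint32_t next(Pair* kvs, uint32_t kvs_size) = 0;
};

// Example supplier producing the pairs (i, 2 * i) for i in [0, count).
class RangeSupplier : public BatchSupplier {
  uint32_t _pos = 0;
  uint32_t _count;
public:
  explicit RangeSupplier(uint32_t count) : _count(count) {}
  uint32_t next(Pair* kvs, uint32_t kvs_size) override {
    uint32_t n = 0;
    while (n < kvs_size && _pos < _count) {
      kvs[n++] = Pair{_pos, 2 * _pos};
      _pos++;
    }
    return n;
  }
};

// Consumer: the indirect call is paid once per batch of 16 pairs,
// not once per pair.
uint64_t sum_values(BatchSupplier& supplier) {
  Pair batch[16];
  uint64_t sum = 0;
  uint32_t n;
  while ((n = supplier.next(batch, 16)) > 0) {
    for (uint32_t i = 0; i < n; i++) {
      sum += batch[i].value;
    }
  }
  return sum;
}
```

With 40 pairs and a batch size of 16, the consumer makes four virtual calls (16, 16, 8 and a final 0) instead of 41 with the per-pair interface.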
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2137289775 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2137269715 From jsjolen at openjdk.org Tue Jun 10 10:17:41 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 10 Jun 2025 10:17:41 GMT Subject: RFR: 8352075: Perf regression accessing fields [v28] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> Message-ID: On Tue, 10 Jun 2025 08:34:49 GMT, Johan Sjölen wrote: >> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert removing FieldInfoReader::next_uint() > > src/hotspot/share/utilities/packedTable.hpp line 69: > >> 67: // by the supplier (when Supplier::next() returns false the whole array should >> 68: // be filled). >> 69: void fill(u1* table, size_t table_length, Supplier &supplier) const; > > Let the ampersand hug the type. Also seems like this can be a static method? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2137331314 From rvansa at openjdk.org Tue Jun 10 10:21:38 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 10 Jun 2025 10:21:38 GMT Subject: RFR: 8352075: Perf regression accessing fields [v28] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> Message-ID: On Tue, 10 Jun 2025 09:05:00 GMT, Johan Sjölen wrote: >> src/hotspot/share/utilities/packedTable.hpp line 69: >> >>> 67: // by the supplier (when Supplier::next() returns false the whole array should >>> 68: // be filled).
>>> 69: void fill(u1* table, size_t table_length, Supplier &supplier) const; >> >> Let the ampersand hug the type. > > Also seems like this can be a static method? The class constructor calculates the sizes, masks and bitshift for the arguments. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2137482532 From rvansa at openjdk.org Tue Jun 10 10:25:34 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 10 Jun 2025 10:25:34 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> Message-ID: On Tue, 10 Jun 2025 10:14:21 GMT, Johan Sjölen wrote: >> What's wrong about `memcpy`, or rather the builtin version? Naturally I could write a for cycle copying the bytes, and rely on the compiler to optimize that out anyway, but I think that this makes the intention clear. >> >> If the handling was done through OS, I guess that the penalty would be actually quite severe. I could have tested the previous version on aarch64 e.g. in AWS, though now there's no casting of pointers anymore. >> >> When we have a final version, I could set up a build in AWS and report performance data from there. > >>What's wrong about memcpy, or rather the builtin version? > > Doesn't regular `memcpy` compile into the builtin anyway? Aren't there LE/BE concerns when you do this type of computation? From what I read `memcpy` should be treated as builtin but in debugger I've seen deeper stacks. Anyway, this code is gone, I didn't really think about big endian archs.
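For illustration, the LE/BE concern raised above comes down to the difference between the two loads below. This is a minimal sketch with hypothetical helper names (`load_le32`, `load_native32`), not code from the PR.

```c++
#include <cassert>
#include <cstdint>
#include <cstring>

// Assembles the value byte by byte, so the result is the little-endian
// interpretation of the buffer on every host, big-endian ones included.
uint32_t load_le32(const uint8_t* p) {
  return static_cast<uint32_t>(p[0])
       | static_cast<uint32_t>(p[1]) << 8
       | static_cast<uint32_t>(p[2]) << 16
       | static_cast<uint32_t>(p[3]) << 24;
}

// Copies the bytes as they are, so the result depends on the byte order of
// the host. Safe with respect to alignment, but not portable as a format.
uint32_t load_native32(const uint8_t* p) {
  uint32_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}
```

On a little-endian host both functions return the same value. On a big-endian host only `load_le32` keeps the meaning of data that was written in little-endian order, which matters whenever the packed bytes are later masked and shifted as an integer.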
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2137489958 From aph at openjdk.org Tue Jun 10 10:39:28 2025 From: aph at openjdk.org (Andrew Haley) Date: Tue, 10 Jun 2025 10:39:28 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches In-Reply-To: References: Message-ID: On Mon, 9 Jun 2025 19:17:53 GMT, Mikhail Ablakatov wrote: > In the A64 ISA, the B (direct branch) instruction can encode a target within a ±128MB range relative to the instruction. Due to this limitation, when generating static stubs, HotSpot conservatively emits indirect branches for calls to c2i interface stubs. These indirect branches are implemented using a four-instruction sequence: three instructions to materialize the target address in a register, followed by a BR instruction to perform the jump. > > This patch optimizes static stub generation when the code cache is small enough to guarantee that the target entry point of the c2i interface stub lies within the direct branch range. In such cases, a single direct B instruction can be used instead of the indirect sequence, saving 3 instructions (12 bytes) per static stub. > > Below is an example of the optimization's impact, measured using the movie-lens benchmark from the Renaissance benchmark suite: > > | Metric | Before | After | Difference | > |-------------|---------------|---------------|------------| > | totalInHeap | Avg: 1883.875 | Avg: 1871.667 | -0.65% | > | | Sum: 6653848 | Sum: 6616344 | -0.56% | > | stubCode | Avg: 103.164 | Avg: 87.285 | -15.38% | > | | Sum: 364376 | Sum: 308552 | -15.33% | > > Full jtreg passed on AArch64. src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp line 106: > 104: } else { > 105: NativeJump::insert(method_holder->next_instruction_address(), entry); > 106: } We're also calling `ICache::invalidate_range` twice, which is kinda lame. That perhaps doesn't matter because calls to `CompiledDirectCall::set_to_interpreted()` are fairly rare.
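For illustration, the range argument in the PR description above can be written down as a small check. This is a minimal sketch: the helper names are hypothetical and this is not the HotSpot implementation. It only encodes the fact that `B` carries a signed 26-bit word offset.

```c++
#include <cassert>
#include <cstdint>

// B encodes a signed 26-bit immediate that is scaled by 4, so the byte
// offset lies in [-128 MiB, +128 MiB - 4] relative to the branch itself.
constexpr int64_t kBranchRangeMin = -(int64_t{1} << 27);
constexpr int64_t kBranchRangeMax = (int64_t{1} << 27) - 4;

// Hypothetical helper: true if a single B placed at branch_pc can reach target.
bool reachable_by_direct_branch(uint64_t branch_pc, uint64_t target) {
  int64_t offset = static_cast<int64_t>(target - branch_pc);
  return (offset & 3) == 0 &&
         offset >= kBranchRangeMin &&
         offset <= kBranchRangeMax;
}

// Hypothetical helper: if the whole code cache spans at most 128 MiB, any two
// 4-byte-aligned instruction addresses inside it are at most 128 MiB - 4
// apart, so every branch between them fits without checking each call site.
bool code_cache_allows_direct_branches(uint64_t cache_size_bytes) {
  return cache_size_bytes <= (uint64_t{1} << 27);
}
```

The second helper is the reason the optimization can be decided once from the code cache size: it is a sufficient condition for the first check to hold for every stub and target located inside the cache.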
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25702#discussion_r2137520422 From stuefe at openjdk.org Tue Jun 10 10:45:45 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 10 Jun 2025 10:45:45 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v6] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: <2hZyAjtrXS2E4mkwrWTI6j4_mnLWYTBBN97-hQpuUDw=.4347bb71-7681-414e-8076-63cec0fb3f60@github.com> On Tue, 10 Jun 2025 06:56:16 GMT, Matthias Baesken wrote: > I created https://bugs.openjdk.org/browse/JDK-8359091 Tests using libjsig lib do not work when asan is enabled. Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2958620216 From alanb at openjdk.org Tue Jun 10 11:06:28 2025 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 10 Jun 2025 11:06:28 GMT Subject: RFR: 8358815: Exception event spec has stale reference to catch_klass parameter In-Reply-To: References: Message-ID: <3e2U0lztN76iKtiYtQaDlsPTknjRuubBJCq1azuj1Lk=.0a5e463b-9012-4a03-9a69-b2b0b0c6c601@github.com> On Tue, 10 Jun 2025 07:17:17 GMT, Serguei Spitsyn wrote: > The JVMTI Exception event callback spec refers to the `catch_klass` parameter which does not exist anymore. Instead the Exception event callback spec should refer to the `catch_method` and `catch_location` parameters. > I treat this as a bug and doubt we need a CSR for this issue. > > Testing: N/A src/hotspot/share/prims/jvmti.xml line 12874: > 12872: be reset by one of those native methods. > 12873: Similarly, exceptions that are reported as uncaught (catch_method > 12874: and catch_location set to 0) may in fact be caught by native code. catch_method is a jmethodID so null if uncaught.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25710#discussion_r2137595271 From dholmes at openjdk.org Tue Jun 10 11:33:29 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 10 Jun 2025 11:33:29 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 07:13:20 GMT, Kim Barrett wrote: >> For the record, I still think it would be better to just delete `scalbnA` (and >> `copysignA`) and simply use the standard C `scalbn`. But I'm not going to >> insist on that, given the narrow scope of use and the challenges involved in >> figuring out whether that could result in any compatibility issue. > >> use the standard C `scalbn`. > > Long ago we couldn't use the standard C `scalbn` because (1) Visual Studio > didn't provide it at all, and (2) we were using C++98/03 / C89 for gcc/clang > and it was version conditionalized out as being a C99 function. Lots to process here. I'm away for a few days so will get back to this next week. Thanks @kimbarrett ! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25656#issuecomment-2958809257 From epeter at openjdk.org Tue Jun 10 11:38:31 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 11:38:31 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v3] In-Reply-To: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> References: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> Message-ID: On Tue, 10 Jun 2025 09:42:50 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision: >> >> - Update src/hotspot/share/opto/phaseX.cpp >> >> Co-authored-by: Manuel Hässig >> - review suggestions, and handled a few more edge cases > > src/hotspot/share/opto/c2_globals.hpp line 686: > >> 684: " C: verify Node::Ideal did not miss opportunities" \ >> 685: " D: verify Node::Identity did not miss opportunities" \ >> 686: "A, B, C, and D in 0=off; 1=on") \ > > Why don't you use ABCD? It seems strange/unexpected to reverse the alphabetical order. It would allow us to extend it further with a most significant bit of `E`, so that the order is `EDCBA`. If I do it in alphabetical order, then I would have to rename them. What do you think?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137655858 From epeter at openjdk.org Tue Jun 10 11:51:41 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 11:51:41 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v3] In-Reply-To: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> References: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> Message-ID: <2KUfq2OXSed2NMT3tYyh6wP2tQxAyUJmW38vXWUAURk=.2eb438a8-8b77-43a6-a478-0326f2b850cb@github.com> On Tue, 10 Jun 2025 10:00:12 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision: >> >> - Update src/hotspot/share/opto/phaseX.cpp >> >> Co-authored-by: Manuel Hässig >> - review suggestions, and handled a few more edge cases > > src/hotspot/share/opto/phaseX.cpp line 1126: > >> 1124: node->dump(); >> 1125: } >> 1126: assert(_worklist.size() == 0, "igvn worklist must still be empty after verify"); > > The `_worklist` size does not seem to change after the bailout on L1114. So, we know that here the worklist is non-empty. Would `assert(false)` fit better? @chhagedorn While `assert(false)` would be correct, I think the check is a bit more expressive, that is why I left it in. But I guess the comment also says the same. Let me know what you prefer, I'm undecided. > src/hotspot/share/opto/phaseX.cpp line 1196: > >> 1194: // Returns true if it found missed optimization opportunities and >> 1195: // false otherwise (no missed optimization, or skipped verification). >> 1196: bool PhaseIterGVN::verify_node_Ideal(Node* n, bool can_reshape) { > > General comment about your analysis for Ideal and Identity for why you disabled some of the verification. Very thorough and nicely explained!
I'm wondering though if we should just open a tracking JBS issue (we could use JDK-8347273), dump the analysis there and refer to that JBS issue from the code for further details. This would allow us to use some permalinks from GitHub (we should probably not post them in the code directly) or extend the analysis with additional images etc. You also included a lot of best guesses (which is totally understandable!) which we might want to extend, comment further on in a discussion, or update because we know more about them. For that, we would need to update the actual code each time which seems unfortunate - and we might not fix some things because it does not seem worth the effort for tiny mistakes or updates. In JBS this comes for free. > > What do you think about that? Of course, in the end, it's also a trade-off. Having the whole conversation in a single JBS issue sounds a bit tricky... it is more like 100 different issues each with their own conversation. And I don't yet know which nodes have to be fixed together, and which nodes have multiple problems. I would also prefer if the comments were in the code - it's not that bad to create a JBS issue and commit the comments. That way, everything is in the code, and not spread over multiple JBS issues and GitHub conversations. My suggestion is this: - Use the umbrella issue: [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) - There, we can do some basic triaging, and then file subtasks. - In the end, everything interesting to know needs to be committed back. Including text and pictures (ASCII). @chhagedorn Would that work for you? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137676530 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137674037 From epeter at openjdk.org Tue Jun 10 11:56:34 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 11:56:34 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v3] In-Reply-To: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> References: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> Message-ID: <49Ml0-TqyJe2aM-U82FuTy7dO6RKT6aqxSZKawJxvTA=.a554f0aa-64e4-419f-b823-152e5409414e@github.com> On Tue, 10 Jun 2025 10:12:50 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision: >> >> - Update src/hotspot/share/opto/phaseX.cpp >> >> Co-authored-by: Manuel Hässig >> - review suggestions, and handled a few more edge cases > > src/hotspot/share/opto/phaseX.cpp line 1090: > >> 1088: if (is_verify_Ideal()) { failure |= verify_node_Ideal(n, false); } >> 1089: if (is_verify_Ideal()) { failure |= verify_node_Ideal(n, true); } >> 1090: if (is_verify_Identity()) { failure |= verify_node_Identity(n); } > > Suggestion: How about naming them `verify_Value/Ideal/Identity_for(n)`? I don't care. For me they are equally good :) I'll change it to your suggestion. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137685509 From epeter at openjdk.org Tue Jun 10 12:02:14 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 12:02:14 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v4] In-Reply-To: References: Message-ID: > **Past Work** > With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification.
> > **This PR** > I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. > > I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. > > My approach was to aggressively bail-out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered. > > **Future Work:** > In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Ideal / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. > > I filed: > [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) > (We can file subtasks for the nodes we want to fix. I don't want to file them all now, but we should file them as we are investigating, so that there is no duplicate work.) > > Testing passed tier1-3, with extra timeout factor 20.
Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision: - assert(false) for Christian - rename for Christian ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22970/files - new: https://git.openjdk.org/jdk/pull/22970/files/5aa5444d..875ad17d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=02-03 Stats: 13 lines in 2 files changed: 0 ins; 0 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/22970.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22970/head:pull/22970 PR: https://git.openjdk.org/jdk/pull/22970 From epeter at openjdk.org Tue Jun 10 12:02:14 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 12:02:14 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v3] In-Reply-To: <2KUfq2OXSed2NMT3tYyh6wP2tQxAyUJmW38vXWUAURk=.2eb438a8-8b77-43a6-a478-0326f2b850cb@github.com> References: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> <2KUfq2OXSed2NMT3tYyh6wP2tQxAyUJmW38vXWUAURk=.2eb438a8-8b77-43a6-a478-0326f2b850cb@github.com> Message-ID: On Tue, 10 Jun 2025 11:48:31 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/phaseX.cpp line 1126: >> >>> 1124: node->dump(); >>> 1125: } >>> 1126: assert(_worklist.size() == 0, "igvn worklist must still be empty after verify"); >> >> The `_worklist` size does not seem to change after the bailout on L1114. So, we know that here the worklist is non-empty. Would `assert(false)` fit better? > > @chhagedorn While `assert(false)` would be correct, I think the check is a bit more expressive, that is why I left it in. But I guess the comment also says the same. Let me know what you prefer, I'm undecided. Boah, I'll just change it. 
I don't care and that way we don't have to discuss it ;) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137695070 From epeter at openjdk.org Tue Jun 10 12:05:30 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 12:05:30 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v4] In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 10:07:53 GMT, Christian Hagedorn wrote: >> good idea! > > You could also define a `stringStream ss` and pass that one to `dump_bfs()`. We do a similar thing for `print_ideal_ir()` to keep everything in a block. As a bonus: We don't suffer from a tty lock being broken - even though that would not affect correctness. Sure, I can make the change! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137706000 From epeter at openjdk.org Tue Jun 10 12:16:13 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 12:16:13 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v5] In-Reply-To: References: Message-ID: > **Past Work** > With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. > > **This PR** > I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. > > I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. > > My approach was to aggressively bail-out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered. > > **Future Work:** > In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. 
But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Ideal / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. > > I filed: > [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) > (We can file subtasks for the nodes we want to fix. I don't want to file them all now, but we should file them as we are investigating, so that there is no duplicate work.) > > Testing passed tier1-3, with extra timeout factor 20. Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: use stringStream instead of ttyLocker ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22970/files - new: https://git.openjdk.org/jdk/pull/22970/files/875ad17d..d50775b8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=03-04 Stats: 41 lines in 1 file changed: 5 ins; 0 del; 36 mod Patch: https://git.openjdk.org/jdk/pull/22970.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22970/head:pull/22970 PR: https://git.openjdk.org/jdk/pull/22970 From epeter at openjdk.org Tue Jun 10 12:16:14 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 12:16:14 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v3] In-Reply-To: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> References: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> Message-ID: On Tue, 10 Jun 2025 10:13:25 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request incrementally with two additional commits since the last
revision: >> >> - Update src/hotspot/share/opto/phaseX.cpp >> >> Co-authored-by: Manuel Hässig >> - review suggestions, and handled a few more edge cases > > Great effort and analysis! I have some comments/questions. @chhagedorn Thanks for reviewing! I addressed all your comments and suggestions :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/22970#issuecomment-2958970465 From dholmes at openjdk.org Tue Jun 10 12:45:28 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 10 Jun 2025 12:45:28 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v7] In-Reply-To: <5hRU2QAv0w4NNzqUzIzTcz9YwJBPWg0MCCPifqHe_V8=.a7053f19-6303-42f2-88c5-7609fd083092@github.com> References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> <5hRU2QAv0w4NNzqUzIzTcz9YwJBPWg0MCCPifqHe_V8=.a7053f19-6303-42f2-88c5-7609fd083092@github.com> Message-ID: On Tue, 10 Jun 2025 00:05:54 GMT, Ioi Lam wrote: >> This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: >> >> Excerpt from the test case ExceptionsTest.java.
>> >> >> [0.038s][info][exceptions ] Exception >> [ ] thrown in interpreter method <{method} {0x000074c408400810} 'baz2' '()V' in 'ExceptionsTest$InternalClass'> >> [ ] at bci 9 for thread 0x000074c46402c7b0 (main) >> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.baz2(ExceptionsTest.java:142) >> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar2(ExceptionsTest.java:135) >> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo2(ExceptionsTest.java:127) >> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:110) >> [0.038s][info][exceptions ] Exception >> [ ] thrown in interpreter method <{method} {0x000074c408400748} 'bar2' '()V' in 'ExceptionsTest$InternalClass'> >> [ ] at bci 0 for thread 0x000074c46402c7b0 (main) >> [0.038s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar2" at BCI: 6 >> [0.038s][info][exceptions ] Exception >> [ ] thrown in interpreter method <{method} {0x000074c408400748} 'bar2' '()V' in 'ExceptionsTest$InternalClass'> >> [ ] at bci 8 for thread 0x000074c46402c7b0 (main) >> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar2(ExceptionsTest.java:137) >> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo2(ExceptionsTest.java:127) >> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:110) >> [0.038s][info][exceptions ] Exception >> [ ] thrown in interpreter method <{method} {0x000074c408400670} 'foo2' '()V' in 'ExceptionsTest$InternalClass'> >> [ ] at bci 0 for thread 0x000074c46402c7b0 (m... > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Print callstack for rethrown exceptions I'm not sure about the de-duplication checks. 
It isn't clear to me which code structures will cause you to see a single stacktrace and which will cause you to see many stacktraces. For example, will a try/finally cause a repeated stacktrace? I think yes. Sorry I am away for a few days so will have to re-examine this next week. ------------- PR Review: https://git.openjdk.org/jdk/pull/25522#pullrequestreview-2913388476 From dholmes at openjdk.org Tue Jun 10 12:45:29 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 10 Jun 2025 12:45:29 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v4] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Mon, 9 Jun 2025 23:24:23 GMT, Ioi Lam wrote: >> test/hotspot/jtreg/runtime/logging/ExceptionsTest.java line 48: >> >>> 46: static void analyzeOutputOn(ProcessBuilder pb) throws Exception { >>> 47: OutputAnalyzer output = new OutputAnalyzer(pb.start()); >>> 48: System.out.println(output.getStdout()); >> >> Debugging code? >> >> If you really want to always print the output then the more common pattern is to use `out.reportDiagnosticSummary()` after all the checks have been done and so the test has passed (failing tests will print it anyway). > > The output is printed only if the failure happens in `OutputAnalyzer::shouldMatch()`, etc. I've updated the test so failures can happen other ways, so I cannot rely on this anymore. Sorry I don't follow. I'm saying there is no need to print stdout here because upon failure it will be printed anyway. If you want to see stdout even in passing cases then do this after all the `shouldXXX` checks.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25522#discussion_r2137788195 From jsjolen at openjdk.org Tue Jun 10 12:52:45 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 10 Jun 2025 12:52:45 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: <9REPDxd-Ua3sLPfTKU-ASHNGupdvzV5Jo5Ji1yGi5rE=.35c11385-5fd4-4763-a68d-283199f9e145@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> <9REPDxd-Ua3sLPfTKU-ASHNGupdvzV5Jo5Ji1yGi5rE=.35c11385-5fd4-4763-a68d-283199f9e145@github.com> Message-ID: On Mon, 9 Jun 2025 15:36:14 GMT, Radim Vansa wrote: >> To integrate hotspot changes, you need two reviewers and people 'requesting changes' to withdraw their requests. Thank goodness the bots prevented this from being integrated. You need to wait for all the comments to be resolved. >> This is a P3 bug so you have more time to get this integrated for JDK 25. I posted the schedule in the issue. The process is that this change would be integrated into the main repository (destined for JDK 26 and then slash-backported to JDK 25 a couple days later if testing is clean). >> My tier 1-7 testing passes with the dynamic CDS patch above. > > @coleenp Can't find the comment to reply... I've replaced all `_r.next_uint()` with just `next_uint()`, it's a bikeshed argument. Hi @rvansa , What about this type of API for dealing with the compressed table? We do the 8 byte accesses as unsigned chars (important so that they're 0-extended and not sign extended) and write the compressed KV down in little-endian. We use bitwise OR to not squash whatever was there before. Comparing GCC x64 and PPC (power), it looks good. 
```c++
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

// For this hard-coded example we use 2 byte keys
// and 3 byte values -- for an element_bytes of 5
uint32_t key_mask = (1 << 16) - 1;
uint32_t value_mask = (1 << 24) - 1;
uint32_t value_shift = 16;

uint32_t element_bytes = 5;

uint64_t* kv_array;

uint32_t unpack_key(uint64_t kv) {
  return kv & key_mask;
}
uint32_t unpack_value(uint64_t kv) { return (kv >> value_shift) & value_mask; }

uint64_t pack_kv(uint64_t k, uint64_t v) {
  return k | (v << value_shift);
}

uint32_t align_down(uint32_t x, uint32_t align) { return x & -align; }
uint32_t align_up(uint32_t x, uint32_t align) {
  return (x + (align - 1)) & -align;
}

uint64_t u64s_required(int n) {
  // read_u8/write_u8 always touch 8 bytes, so the last element needs
  // (8 - element_bytes) bytes of tail padding to stay in bounds.
  uint32_t bytes_required = element_bytes * n + (8 - element_bytes);
  return align_up(bytes_required, 8) / 8;
}

// Assemble 8 bytes in little-endian order, independent of host endianness.
uint64_t read_u8(uint8_t* p) {
  uint64_t result = 0;
  result |= ((uint64_t)*(p + 0)) << (0 * 8);
  result |= ((uint64_t)*(p + 1)) << (1 * 8);
  result |= ((uint64_t)*(p + 2)) << (2 * 8);
  result |= ((uint64_t)*(p + 3)) << (3 * 8);
  result |= ((uint64_t)*(p + 4)) << (4 * 8);
  result |= ((uint64_t)*(p + 5)) << (5 * 8);
  result |= ((uint64_t)*(p + 6)) << (6 * 8);
  result |= ((uint64_t)*(p + 7)) << (7 * 8);
  return result;
}

// OR the bytes in so that neighbouring elements are not overwritten.
void write_u8(uint8_t* p, uint64_t u8) {
  p[0] |= u8 & 0xFF;
  p[1] |= (u8 >> 8*1) & 0xFF;
  p[2] |= (u8 >> 8*2) & 0xFF;
  p[3] |= (u8 >> 8*3) & 0xFF;
  p[4] |= (u8 >> 8*4) & 0xFF;
  p[5] |= (u8 >> 8*5) & 0xFF;
  p[6] |= (u8 >> 8*6) & 0xFF;
  p[7] |= (u8 >> 8*7) & 0xFF;
}

uint64_t read(int n) {
  uint32_t byte_index = element_bytes * n;
  return read_u8(&reinterpret_cast<uint8_t*>(kv_array)[byte_index]);
}

void fill(int n, uint64_t kv) {
  uint32_t byte_index = element_bytes * n;
  write_u8(&reinterpret_cast<uint8_t*>(kv_array)[byte_index], kv);
  return;
}

int main() {
  int num_elts = 65536;
  uint64_t sz = u64s_required(num_elts);
  kv_array = (uint64_t*)malloc(sz * 8);
  for (uint64_t i = 0; i < sz; i++) {
    kv_array[i] = 0;
  }

  for (int i = 0; i < num_elts; i++) {
    uint64_t kv = pack_kv(i, 2 * i);
    fill(i, kv);
  }
  for (int i = 0; i < num_elts; i++) {
    uint64_t kv = read(i);
    uint32_t k = unpack_key(kv);
    uint32_t v = unpack_value(kv);
    printf("K: %u, V: %u\n", k, v);
  }
}
```

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2959079093

From coleenp at openjdk.org Tue Jun 10 13:04:39 2025
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Tue, 10 Jun 2025 13:04:39 GMT
Subject: RFR: 8352075: Perf regression accessing fields [v23]
In-Reply-To:
References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> <9REPDxd-Ua3sLPfTKU-ASHNGupdvzV5Jo5Ji1yGi5rE=.35c11385-5fd4-4763-a68d-283199f9e145@github.com>
Message-ID:

On Tue, 10 Jun 2025 12:49:55 GMT, Johan Sjölen wrote:

>> @coleenp Can't find the comment to reply... I've replaced all `_r.next_uint()` with just `next_uint()`, it's a bikeshed argument.
>
> Hi @rvansa ,
>
> What about this type of API for dealing with the compressed table? We do the 8 byte accesses as unsigned chars (important so that they're 0-extended and not sign extended) and write the compressed KV down in little-endian. We use bitwise OR to not squash whatever was there before. Comparing GCC x64 and PPC (power), it looks good.
> > ```c++ > #include > #include > #include > #include > > > // For this hard-coded example we use 2 byte keys > // and 3 byte values -- for an element_bytes of 5 > uint32_t key_mask = (1 << 16) - 1; > uint32_t value_mask = (1 << 24) - 1; > uint32_t value_shift = 16; > > uint32_t element_bytes = 5; > > uint64_t* kv_array; > > uint32_t unpack_key(uint64_t kv) { > return kv & key_mask; > } > uint32_t unpack_value(uint64_t kv) { return (kv >> value_shift) & value_mask; } > > uint64_t pack_kv(uint64_t k, uint64_t v) { > return k | (v << value_shift); > } > > > uint32_t align_down(uint32_t x, uint32_t align) { return x & -align; } > uint32_t align_up(uint32_t x, uint32_t align) { > return (x + (align - 1)) & -align; > } > > uint64_t u64s_required(int n) { > uint32_t bytes_required = element_bytes * n; > return align_up(bytes_required, 8) / 8; > } > > uint64_t read_u8(uint8_t* p) { > uint64_t result = 0; > result |= ((uint64_t)*(p + 0)) << (0 * 8); > result |= ((uint64_t)*(p + 1)) << (1 * 8); > result |= ((uint64_t)*(p + 2)) << (2 * 8); > result |= ((uint64_t)*(p + 3)) << (3 * 8); > result |= ((uint64_t)*(p + 4)) << (4 * 8); > result |= ((uint64_t)*(p + 5)) << (5 * 8); > result |= ((uint64_t)*(p + 6)) << (6 * 8); > result |= ((uint64_t)*(p + 7)) << (7 * 8); > return result; > } > > void write_u8(uint8_t* p, uint64_t u8) { > p[0] |= u8 & 0xFF; > p[1] |= (u8 >> 8*1) & 0xFF; > p[2] |= (u8 >> 8*2) & 0xFF; > p[3] |= (u8 >> 8*3) & 0xFF; > p[4] |= (u8 >> 8*4) & 0xFF; > p[5] |= (u8 >> 8*5) & 0xFF; > p[6] |= (u8 >> 8*6) & 0xFF; > p[7] |= (u8 >> 8*7) & 0xFF; > } > > uint64_t read(int n) { > uint32_t byte_index = element_bytes * n; > return read_u8(&reinterpret_cast(kv_array)[byte_index]); > } > > void fill(int n, uint64_t kv) { > uint32_t byte_index = element_bytes * n; > write_u8(&reinterpret_cast(kv_array)[byte_index], kv); > return; > } > > int main() { > int num_elts = 65536; > uint64_t sz = u64s_required(num_elts); > kv_array = (uint64_t*)malloc(sz * 8); > for (int i = 0; i < 
sz; i++) { > kv_array[i] = 0; > } > > for (int i = 0; i < num_elts; i++) { > uint64_t kv = pack_kv(i, 2 * i); > fill(i, kv); > } > for (int i = 0; i < num_... Shouldn't the Copy package or memcpy do all of this? @jdksjolen I don't think this should be in the packedTable code. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2959112151 From coleenp at openjdk.org Tue Jun 10 13:04:40 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 10 Jun 2025 13:04:40 GMT Subject: RFR: 8352075: Perf regression accessing fields [v28] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> Message-ID: On Tue, 10 Jun 2025 08:44:05 GMT, Johan Sj?len wrote: >> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert removing FieldInfoReader::next_uint() > > src/hotspot/share/utilities/packedTable.hpp line 56: > >> 54: // Packed table does NOT support duplicate keys. >> 55: virtual bool next(uint32_t* key, uint32_t* value) = 0; >> 56: }; > > Does it make sense to take the cost of an indirect call for each kv pair? You can't inline it, so the stack frame needs to be popped and pushed, and you're taking 2 registers (16 bytes) to give 8 bytes and 1 bit of information. > > We can amortize the cost by implementing this signature instead: > > > virtual uint32_t next(Pair* kvs, uint32_t kvs_size); This was done this way with a "Supplier" because this package would be useful for other Unsigned5 packed types. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2137846594 From duke at openjdk.org Tue Jun 10 13:06:50 2025 From: duke at openjdk.org (Manjunath S Matti.) 
Date: Tue, 10 Jun 2025 13:06:50 GMT Subject: RFR: 8359114: [s390x] Add z17 detection code Message-ID: Add support to detect the new generation of Z machine (z17). ------------- Commit messages: - Add support to detect the new generation of Z machine (z17). Changes: https://git.openjdk.org/jdk/pull/25718/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25718&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359114 Stats: 19 lines in 2 files changed: 12 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/25718.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25718/head:pull/25718 PR: https://git.openjdk.org/jdk/pull/25718 From chagedorn at openjdk.org Tue Jun 10 13:07:37 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Tue, 10 Jun 2025 13:07:37 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v3] In-Reply-To: References: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> Message-ID: On Tue, 10 Jun 2025 11:36:17 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/c2_globals.hpp line 686: >> >>> 684: " C: verify Node::Ideal did not miss opportunities" \ >>> 685: " D: verify Node::Identity did not miss opportunities" \ >>> 686: "A, B, C, and D in 0=off; 1=on") \ >> >> Why don't you use ABCD? It seems strange/unexpected to reverse the alphabetical order. > > It would allow us to extend it further with a most significant bit of `E`, so that the order is `EDCBA`. If I do it in alphabetical order, then I would have to rename them. What do you think? Okay, I only looked at it from a user-perspective that you might mismatch the description to the value passed to the flag. What could help here is reversing the order you mention the modes: first D:, then C: etc. >> src/hotspot/share/opto/phaseX.cpp line 1196: >> >>> 1194: // Returns true if it found missed optimization opportunities and >>> 1195: // false otherwise (no missed optimization, or skipped verification). 
>>> 1196: bool PhaseIterGVN::verify_node_Ideal(Node* n, bool can_reshape) { >> >> General comment about your analysis for Ideal and Identity for why you disabled some of the verification. Very thorough and nicely explained! I'm wondering though if we should just open a tracking JBS issue (we could use JDK-8359103 >> )), dump the analysis there and refer to that JBS issue from the code for further details. This would allow us to use some permalinks from GitHub (we should probably not post them in the code directly) or extend the analysis with additional images etc. You also included a lot of best guesses (which is totally understandable!) which we might want to extend, comment further on in a discussion, or update because we know more about them. For that, we would need to update the actual code each time which seems unfortunate - and we might not fix some things because it does not seem worth the effort for tiny mistakes or updates. In JBS this comes for free. >> >> What do you think about that? Of course, in the end, it's also a trade-off. > > Having the whole conversation in a single JBS issue sounds a bit tricky... it is more like 100 different issues each with their own conversation. > > And I don't yet know which nodes have to be fixed together, and which nodes have multiple problems. > > I would also prefer if the comments were in the code - it's not that bad to create a JBS issue and commit the comments. That way, everything is in the code, and not spread over multiple JBS issues and GitHub conversations. > > My suggestion is this: > - Use the umbrella issue: [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) > C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) > - There, we can do some basic triaging, and then file subtasks. > - In the end, everything interesting to know needs to be committed back. Including text and pictures (ASCII). > > @chhagedorn Would that work for you? 
I'm honestly not sure what the best way is. Currently, it feels a bit too verbose when also mentioning reproducers with command line options and failing tests which sounds more like things to keep track of in JBS. But I also see your point that having everything in the comments is quite handy and keeps everything in one part. Maybe we can find some middle ground when you move the "how to reproduce" to the umbrella JBS? The rest you can keep in the comments, I'm fine with that and see its benefit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137814182 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137840332 From chagedorn at openjdk.org Tue Jun 10 13:07:37 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Tue, 10 Jun 2025 13:07:37 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v3] In-Reply-To: References: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> <2KUfq2OXSed2NMT3tYyh6wP2tQxAyUJmW38vXWUAURk=.2eb438a8-8b77-43a6-a478-0326f2b850cb@github.com> Message-ID: On Tue, 10 Jun 2025 11:57:11 GMT, Emanuel Peter wrote: >> @chhagedorn While `assert(false)` would be correct, I think the check is a bit more expressive, that is why I left it in. But I guess the comment also says the same. Let me know what you prefer, I'm undecided. > > Boah, I'll just change it. I don't care and that way we don't have to discuss it ;) If you don't agree and the code is not wrong, I leave it up to you to decide for these subjective suggestions. I think what made me add the suggestion was that the assert in the end suggested that there is some logic that tries to empty the list between the bailout and the assert which was not the case. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137826499 From chagedorn at openjdk.org Tue Jun 10 13:07:39 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Tue, 10 Jun 2025 13:07:39 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v5] In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 12:16:13 GMT, Emanuel Peter wrote: >> **Past Work** >> With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. >> >> **This PR** >> I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. >> >> I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. >> >> My approach was to aggressively bail-out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered. >> >> **Future Work:** >> In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Idea / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. >> >> I filed: >> [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) >> (We can file subtasks for the nodes we want to fix. I don't want to file them all now, but we should file them as we are investigating, so that there is no duplicate work.) 
>> >> Testing passed tier1-3, with extra timeout factor 20. > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > use stringStream instead of ttyLocker src/hotspot/share/runtime/flags/jvmFlagConstraintsCompiler.cpp line 303: > 301: JVMFlag::Error VerifyIterativeGVNConstraintFunc(uint value, bool verbose) { > 302: uint original_value = value; > 303: for (int i = 0; i < 4; i++) { You might want to consider adding a `const int max_modes = 4` or something like that and use it also below in the error message. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137843279 From duke at openjdk.org Tue Jun 10 13:19:29 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 10 Jun 2025 13:19:29 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v4] In-Reply-To: References: Message-ID: On Fri, 30 May 2025 08:24:42 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value type for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static size_t available_memory(); >> static julong used_memory(); --> static size_t used_memory(); >> static julong free_memory(); --> static size_t free_memory(); >> static jlong total_swap_space(); --> static ptrdiff_t total_swap_space(); >> static jlong free_swap_space(); --> static ptrdiff_t free_swap_space(); >> static julong physical_memory(); --> static size_t physical_memory(); >> >> >> The changes are done so that the other parts of the code have minimal impact. >> Tested in GHA and Tiers 1-4. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8357086: Fixed spaces in formatting in gc-related code. @tstuefe do you have any suggestions? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2959230269 From epeter at openjdk.org Tue Jun 10 13:19:32 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 13:19:32 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v3] In-Reply-To: References: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> Message-ID: On Tue, 10 Jun 2025 12:46:48 GMT, Christian Hagedorn wrote: >> It would allow us to extend it further with a most significant bit of `E`, so that the order is `EDCBA`. If I do it in alphabetical order, then I would have to rename them. What do you think? > > Okay, I only looked at it from a user-perspective that you might mismatch the description to the value passed to the flag. What could help here is reversing the order you mention the modes: first D:, then C: etc. @chhagedorn If I mention `D` on the same line as `=DCBA, with `, then we have to change 2 lines next time. But I suppose we will have to change all lines anyway because the indentation would change... What I would really want to avoid is to have to change the parsing. So the lowest significant bits have to stay where they are, but I can rename them. Why don't you make a suggestion how you would like it to look, and then I can apply it :) >> Having the whole conversation in a single JBS issue sounds a bit tricky... it is more like 100 different issues each with their own conversation. >> >> And I don't yet know which nodes have to be fixed together, and which nodes have multiple problems. >> >> I would also prefer if the comments were in the code - it's not that bad to create a JBS issue and commit the comments. That way, everything is in the code, and not spread over multiple JBS issues and GitHub conversations. 
>> >> My suggestion is this: >> - Use the umbrella issue: [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) >> C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) >> - There, we can do some basic triaging, and then file subtasks. >> - In the end, everything interesting to know needs to be committed back. Including text and pictures (ASCII). >> >> @chhagedorn Would that work for you? > > I'm honestly not sure what the best way is. Currently, it feels a bit too verbose when also mentioning reproducers with command line options and failing tests which sounds more like things to keep track of in JBS. But I also see your point that having everything in the comments is quite handy and keeps everything in one part. Maybe we can find some middle ground when you move the "how to reproduce" to the umbrella JBS? The rest you can keep in the comments, I'm fine with that and see its benefit. @chhagedorn And how do we link from code <-> JBS issue? How do we make sure that this stays up to date when the code changes around? Because I predict that this will all move a lot over the next months. Personally, I prefer the verbose character here. I spent a lot of time finding reproducers, and if we put them in JBS, they will most likely just get lost. That would be sad. 
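
To make the `DCBA` layout discussed above a bit more concrete, here is a minimal sketch of validating and decoding such a flag value. It assumes, based on the `0=off; 1=on` description and the quoted `for (int i = 0; i < 4; i++)` loop, that each decimal digit of the value is one on/off switch and that `A` is the least significant digit. `max_modes` is the constant name suggested in the review; `is_valid_verify_flag` and `mode_enabled` are hypothetical helpers for illustration, not HotSpot code.

```c++
#include <cassert>
#include <cstdint>

// Number of verification modes (A, B, C, D); name taken from the review suggestion.
const int max_modes = 4;

// Hypothetical helper: true if each of the lowest max_modes decimal digits
// is 0 or 1 and no higher digit is set.
bool is_valid_verify_flag(uint32_t value) {
  for (int i = 0; i < max_modes; i++) {
    if (value % 10 > 1) {
      return false;  // digit at position i is neither 0 nor 1
    }
    value /= 10;
  }
  return value == 0;
}

// Hypothetical helper: mode 0 is A (least significant digit), mode 3 is D.
bool mode_enabled(uint32_t value, int mode) {
  assert(mode >= 0 && mode < max_modes);
  for (int i = 0; i < mode; i++) {
    value /= 10;
  }
  return value % 10 == 1;
}
```

With this layout, adding a fifth mode `E` would only mean raising `max_modes`; the positions of the existing digits stay where they are, which is the property the parsing argument above relies on.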
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137865665 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137879139 From epeter at openjdk.org Tue Jun 10 13:19:33 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 13:19:33 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v3] In-Reply-To: References: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> <2KUfq2OXSed2NMT3tYyh6wP2tQxAyUJmW38vXWUAURk=.2eb438a8-8b77-43a6-a478-0326f2b850cb@github.com> Message-ID: On Tue, 10 Jun 2025 12:52:26 GMT, Christian Hagedorn wrote: >> Boah, I'll just change it. I don't care and that way we don't have to discuss it ;) > > If you don't agree and the code is not wrong, I leave it up to you to decide for these subjective suggestions. I think what made me add the suggestion was that the assert in the end suggested that there is some logic that tries to empty the list between the bailout and the assert which was not the case. 
It is already changed :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137882060 From epeter at openjdk.org Tue Jun 10 13:19:34 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 13:19:34 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v5] In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 13:00:40 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> use stringStream instead of ttyLocker > > src/hotspot/share/runtime/flags/jvmFlagConstraintsCompiler.cpp line 303: > >> 301: JVMFlag::Error VerifyIterativeGVNConstraintFunc(uint value, bool verbose) { >> 302: uint original_value = value; >> 303: for (int i = 0; i < 4; i++) { > > You might want to consider adding a `const int max_modes = 4` or something like that and use it also below in the error message. Sure, I can do that :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2137883591 From epeter at openjdk.org Tue Jun 10 13:24:52 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 13:24:52 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v6] In-Reply-To: References: Message-ID: > **Past Work** > With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. > > **This PR** > I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. > > I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. > > My approach was to aggressively bail-out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. 
For many, I did some initial assessment, and left some comments about what issues I encountered. > > **Future Work:** > In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Idea / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. > > I filed: > [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) > (We can file subtasks for the nodes we want to fix. I don't want to file them all now, but we should file them as we are investigating, so that there is no duplicate work.) > > Testing passed tier1-3, with extra timeout factor 20. Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: max_modes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22970/files - new: https://git.openjdk.org/jdk/pull/22970/files/d50775b8..97af8205 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=04-05 Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/22970.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22970/head:pull/22970 PR: https://git.openjdk.org/jdk/pull/22970 From tschatzl at openjdk.org Tue Jun 10 13:33:27 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 10 Jun 2025 13:33:27 GMT Subject: RFR: 8342382: Implementation of JEP G1: Improve Application Throughput with a More Efficient Write-Barrier [v39] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that implements (currently Draft) JEP: G1: 
Improve Application Throughput with a More Efficient Write-Barrier. > > The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. > > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. > * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, > * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. > > These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad; // synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. 
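
The filtering and card-tracking steps in the pseudo code above can be sketched as a small, single-threaded toy model. Everything here is an illustrative assumption rather than G1 code: the region and card sizes, the `ToyBarrier` type, the card values, and the use of plain heap offsets (with 0 standing for null) instead of real addresses.

```c++
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed sizes for the toy model only.
constexpr uintptr_t kRegionShift = 20;  // 1 MiB regions
constexpr uintptr_t kCardShift = 9;     // 512-byte cards

enum CardValue : uint8_t { kClean = 0, kDirty = 1, kYoung = 2 };

struct ToyBarrier {
  std::vector<uint8_t> card_table;  // one byte per card
  std::vector<size_t> local_queue;  // stands in for the thread-local dirty card queue
  size_t queue_capacity;
  size_t runtime_calls = 0;         // how often a full queue was handed to the "runtime"

  ToyBarrier(size_t heap_bytes, size_t capacity)
    : card_table(heap_bytes >> kCardShift, kClean), queue_capacity(capacity) {}

  // Models the post-write barrier for `x.a = y`: field_addr is @x.a,
  // new_value is y (0 for null). Returns true if a card was enqueued.
  bool post_write(uintptr_t field_addr, uintptr_t new_value) {
    // Filtering
    if ((field_addr >> kRegionShift) == (new_value >> kRegionShift)) return false;  // same region
    if (new_value == 0) return false;                                               // null value
    size_t card = field_addr >> kCardShift;
    if (card_table[card] == kYoung) return false;                                   // write to young gen
    std::atomic_thread_fence(std::memory_order_seq_cst);                            // StoreLoad
    if (card_table[card] == kDirty) return false;                                   // already dirty

    card_table[card] = kDirty;

    // Card tracking
    local_queue.push_back(card);
    if (local_queue.size() >= queue_capacity) {
      runtime_calls++;      // stands in for the runtime call that moves the queue
      local_queue.clear();
    }
    return true;
  }
};
```

Even in this simplified form, every store that passes the filters pays for a fence, a card-table update, and queue bookkeeping, which gives a feel for why the inlined sequence is so much longer than a plain card mark.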
> > The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. > > There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). > > The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se... Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 55 commits: - Merge branch 'master' into 8342382-card-table-instead-of-dcq - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * ayang review: remove sweep_epoch - Merge branch 'master' into card-table-as-dcq-merge - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * ayang review (part 2 - yield duration changes) - * ayang review (part 1) - * indentation fix - * remove support for 32 bit x86 in the barrier generation code, following latest changes from @shade - Merge branch 'master' into 8342382-card-table-instead-of-dcq - ... 
and 45 more: https://git.openjdk.org/jdk/compare/0582bd29...c07a73db ------------- Changes: https://git.openjdk.org/jdk/pull/23739/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=38 Stats: 7085 lines in 111 files changed: 2568 ins; 3599 del; 918 mod Patch: https://git.openjdk.org/jdk/pull/23739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739 PR: https://git.openjdk.org/jdk/pull/23739 From tschatzl at openjdk.org Tue Jun 10 13:35:28 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 10 Jun 2025 13:35:28 GMT Subject: RFR: 8358294: Remove unnecessary GenAlignment In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 08:36:08 GMT, Albert Mingkun Yang wrote: > Simple replacement of `GenAlignment` with `SpaceAlignment`, because they always have the same value. Removing the former to reduce complexity. > > Test: tier1-3 Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25577#pullrequestreview-2913607949 From duke at openjdk.org Tue Jun 10 13:36:50 2025 From: duke at openjdk.org (Jonas Norlinder) Date: Tue, 10 Jun 2025 13:36:50 GMT Subject: RFR: 8359110: Log accumulated GC and process CPU time upon VM exit Message-ID: Add support to log CPU cost for GC during VM exit with `-Xlog:gc`. 
------------- Commit messages: - Measure GC vtime on VM exit Changes: https://git.openjdk.org/jdk/pull/25724/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25724&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359110 Stats: 178 lines in 15 files changed: 178 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25724.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25724/head:pull/25724 PR: https://git.openjdk.org/jdk/pull/25724 From rvansa at openjdk.org Tue Jun 10 13:40:38 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 10 Jun 2025 13:40:38 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> <9REPDxd-Ua3sLPfTKU-ASHNGupdvzV5Jo5Ji1yGi5rE=.35c11385-5fd4-4763-a68d-283199f9e145@github.com> Message-ID: On Tue, 10 Jun 2025 12:49:55 GMT, Johan Sj?len wrote: >> @coleenp Can't find the comment to reply... I've replaced all `_r.next_uint()` with just `next_uint()`, it's a bikeshed argument. > > Hi @rvansa , > > What about this type of API for dealing with the compressed table? We do the 8 byte accesses as unsigned chars (important so that they're 0-extended and not sign extended) and write the compressed KV down in little-endian. We use bitwise OR to not squash whatever was there before. Comparing GCC x64 and PPC (power), it looks good. 
> > ```c++ > #include > #include > #include > #include > > > // For this hard-coded example we use 2 byte keys > // and 3 byte values -- for an element_bytes of 5 > uint32_t key_mask = (1 << 16) - 1; > uint32_t value_mask = (1 << 24) - 1; > uint32_t value_shift = 16; > > uint32_t element_bytes = 5; > > uint64_t* kv_array; > > uint32_t unpack_key(uint64_t kv) { > return kv & key_mask; > } > uint32_t unpack_value(uint64_t kv) { return (kv >> value_shift) & value_mask; } > > uint64_t pack_kv(uint64_t k, uint64_t v) { > return k | (v << value_shift); > } > > > uint32_t align_down(uint32_t x, uint32_t align) { return x & -align; } > uint32_t align_up(uint32_t x, uint32_t align) { > return (x + (align - 1)) & -align; > } > > uint64_t u64s_required(int n) { > uint32_t bytes_required = element_bytes * n; > return align_up(bytes_required, 8) / 8; > } > > uint64_t read_u8(uint8_t* p) { > uint64_t result = 0; > result |= ((uint64_t)*(p + 0)) << (0 * 8); > result |= ((uint64_t)*(p + 1)) << (1 * 8); > result |= ((uint64_t)*(p + 2)) << (2 * 8); > result |= ((uint64_t)*(p + 3)) << (3 * 8); > result |= ((uint64_t)*(p + 4)) << (4 * 8); > result |= ((uint64_t)*(p + 5)) << (5 * 8); > result |= ((uint64_t)*(p + 6)) << (6 * 8); > result |= ((uint64_t)*(p + 7)) << (7 * 8); > return result; > } > > void write_u8(uint8_t* p, uint64_t u8) { > p[0] |= u8 & 0xFF; > p[1] |= (u8 >> 8*1) & 0xFF; > p[2] |= (u8 >> 8*2) & 0xFF; > p[3] |= (u8 >> 8*3) & 0xFF; > p[4] |= (u8 >> 8*4) & 0xFF; > p[5] |= (u8 >> 8*5) & 0xFF; > p[6] |= (u8 >> 8*6) & 0xFF; > p[7] |= (u8 >> 8*7) & 0xFF; > } > > uint64_t read(int n) { > uint32_t byte_index = element_bytes * n; > return read_u8(&reinterpret_cast(kv_array)[byte_index]); > } > > void fill(int n, uint64_t kv) { > uint32_t byte_index = element_bytes * n; > write_u8(&reinterpret_cast(kv_array)[byte_index], kv); > return; > } > > int main() { > int num_elts = 65536; > uint64_t sz = u64s_required(num_elts); > kv_array = (uint64_t*)malloc(sz * 8); > for (int i = 0; i < 
sz; i++) { > kv_array[i] = 0; > } > > for (int i = 0; i < num_elts; i++) { > uint64_t kv = pack_kv(i, 2 * i); > fill(i, kv); > } > for (int i = 0; i < num_... @jdksjolen We'd need to solve the case without padding at the end of the table (or add the padding to ensure that we're not accessing past allocated area). However, I did not really get what problem are you trying to solve? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2959298843 From rvansa at openjdk.org Tue Jun 10 13:48:37 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 10 Jun 2025 13:48:37 GMT Subject: RFR: 8352075: Perf regression accessing fields [v28] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> Message-ID: On Tue, 10 Jun 2025 13:02:15 GMT, Coleen Phillimore wrote: >> src/hotspot/share/utilities/packedTable.hpp line 56: >> >>> 54: // Packed table does NOT support duplicate keys. >>> 55: virtual bool next(uint32_t* key, uint32_t* value) = 0; >>> 56: }; >> >> Does it make sense to take the cost of an indirect call for each kv pair? You can't inline it, so the stack frame needs to be popped and pushed, and you're taking 2 registers (16 bytes) to give 8 bytes and 1 bit of information. >> >> We can amortize the cost by implementing this signature instead: >> >> >> virtual uint32_t next(Pair* kvs, uint32_t kvs_size); > > This was done this way with a "Supplier" because this package would be useful for other Unsigned5 packed types. But then you'd need create an array of `Pair` in `create_search_table` and copy the data into that. You wouldn't need a Supplier at all, just pass the array and its length to the `fill` method. If you are worried about the virtual calls, I could do that. Alternatively, we could replace virtual calls with templated `fill` method. 
In that case the Comparator should be a template method as well, since that one is even more perf-critical? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2137952293 From jsjolen at openjdk.org Tue Jun 10 14:22:39 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 10 Jun 2025 14:22:39 GMT Subject: RFR: 8352075: Perf regression accessing fields [v23] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3rzU9-HHk_qfFkI8DRejJ-8RR4N9ys12NjQvYKQV8-s=.b1c9959e-86c7-4ed0-8d81-04ea943f6d7a@github.com> <9REPDxd-Ua3sLPfTKU-ASHNGupdvzV5Jo5Ji1yGi5rE=.35c11385-5fd4-4763-a68d-283199f9e145@github.com> Message-ID: <2KPXTyX4hin3fF8MGa-7NgPkjYAG9UW11MVfPX8OAK0=.9dddca09-f542-42b0-9677-c73409fd52fb@github.com> On Tue, 10 Jun 2025 12:49:55 GMT, Johan Sjölen wrote: >> @coleenp Can't find the comment to reply... I've replaced all `_r.next_uint()` with just `next_uint()`, it's a bikeshed argument. > > Hi @rvansa , > > What about this type of API for dealing with the compressed table? We do the 8 byte accesses as unsigned chars (important so that they're 0-extended and not sign extended) and write the compressed KV down in little-endian. We use bitwise OR to not squash whatever was there before. Comparing GCC x64 and PPC (power), it looks good.
> > ```c++ > #include > #include > #include > #include > > > // For this hard-coded example we use 2 byte keys > // and 3 byte values -- for an element_bytes of 5 > uint32_t key_mask = (1 << 16) - 1; > uint32_t value_mask = (1 << 24) - 1; > uint32_t value_shift = 16; > > uint32_t element_bytes = 5; > > uint64_t* kv_array; > > uint32_t unpack_key(uint64_t kv) { > return kv & key_mask; > } > uint32_t unpack_value(uint64_t kv) { return (kv >> value_shift) & value_mask; } > > uint64_t pack_kv(uint64_t k, uint64_t v) { > return k | (v << value_shift); > } > > > uint32_t align_down(uint32_t x, uint32_t align) { return x & -align; } > uint32_t align_up(uint32_t x, uint32_t align) { > return (x + (align - 1)) & -align; > } > > uint64_t u64s_required(int n) { > uint32_t bytes_required = element_bytes * n; > return align_up(bytes_required, 8) / 8; > } > > uint64_t read_u8(uint8_t* p) { > uint64_t result = 0; > result |= ((uint64_t)*(p + 0)) << (0 * 8); > result |= ((uint64_t)*(p + 1)) << (1 * 8); > result |= ((uint64_t)*(p + 2)) << (2 * 8); > result |= ((uint64_t)*(p + 3)) << (3 * 8); > result |= ((uint64_t)*(p + 4)) << (4 * 8); > result |= ((uint64_t)*(p + 5)) << (5 * 8); > result |= ((uint64_t)*(p + 6)) << (6 * 8); > result |= ((uint64_t)*(p + 7)) << (7 * 8); > return result; > } > > void write_u8(uint8_t* p, uint64_t u8) { > p[0] |= u8 & 0xFF; > p[1] |= (u8 >> 8*1) & 0xFF; > p[2] |= (u8 >> 8*2) & 0xFF; > p[3] |= (u8 >> 8*3) & 0xFF; > p[4] |= (u8 >> 8*4) & 0xFF; > p[5] |= (u8 >> 8*5) & 0xFF; > p[6] |= (u8 >> 8*6) & 0xFF; > p[7] |= (u8 >> 8*7) & 0xFF; > } > > uint64_t read(int n) { > uint32_t byte_index = element_bytes * n; > return read_u8(&reinterpret_cast(kv_array)[byte_index]); > } > > void fill(int n, uint64_t kv) { > uint32_t byte_index = element_bytes * n; > write_u8(&reinterpret_cast(kv_array)[byte_index], kv); > return; > } > > int main() { > int num_elts = 65536; > uint64_t sz = u64s_required(num_elts); > kv_array = (uint64_t*)malloc(sz * 8); > for (int i = 0; i < 
sz; i++) { > kv_array[i] = 0; > } > > for (int i = 0; i < num_elts; i++) { > uint64_t kv = pack_kv(i, 2 * i); > fill(i, kv); > } > for (int i = 0; i < num_... > Shouldn't the Copy package or memcpy do all of this? @jdksjolen I don't think this should be in the packedTable code. > @jdksjolen We'd need to solve the case without padding at the end of the table (or add the padding to ensure that we're not accessing past allocated area). However, I did not really get what problem are you trying to solve? My worry was that `memcpy` would have incorrect semantics on a big-endian (BE) system. As the code no longer uses memcpy, we're fine. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2959448541 From jsjolen at openjdk.org Tue Jun 10 14:32:37 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 10 Jun 2025 14:32:37 GMT Subject: RFR: 8352075: Perf regression accessing fields [v28] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> Message-ID: On Tue, 10 Jun 2025 13:45:52 GMT, Radim Vansa wrote: >> This was done this way with a "Supplier" because this package would be useful for other Unsigned5 packed types. > > But then you'd need create an array of `Pair` in `create_search_table` and copy the data into that. You wouldn't need a Supplier at all, just pass the array and its length to the `fill` method. If you are worried about the virtual calls, I could do that. > > Alternatively, we could replace virtual calls with templated `fill` method. In that case the Comparator should be a template method as well, since that one is even more perf-critical? I think you misunderstand me, as what I'm proposing wouldn't require creating an array in `create_search_table`.
I'm saying that you do this:

```c++
// Note: we require the supplier to provide the elements in the final order as we can't easily sort
// within this method - qsort() accepts only pure function as comparator.
void PackedTableBuilder::fill(u1* table, size_t table_length, Supplier &supplier) const {
  Pair kvs[4];

  size_t offset = 0;
  size_t len_read = 0;
  while ((len_read = supplier.next(kvs, 4)) > 0) {
    // len_read tells you how many of the full capacity of kvs was used.
  }
}
```

Now each call of `Supplier::next` gives you up to 4 elements, quartering the number of virtual calls necessary.

> Alternatively, we could replace virtual calls with templated fill method. In that case the Comparator should be a template method as well, since that one is even more perf-critical?

Sure, you can try that out.

To be clear, I am only asking: are virtual calls expensive enough here that removing them, or amortizing their cost, would make the performance noticeably better?

------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2138061151 From epeter at openjdk.org Tue Jun 10 14:44:48 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 14:44:48 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v7] In-Reply-To: References: Message-ID: > **Past Work** > With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. > > **This PR** > I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. > > I found many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. > > My approach was to aggressively bail out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered.
> > **Future Work:** > In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Ideal / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. > > I filed: > [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) > (We can file subtasks for the nodes we want to fix. I don't want to file them all now, but we should file them as we are investigating, so that there is no duplicate work.) > > Testing passed tier1-3, with extra timeout factor 20. Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: reorder flags for Christian ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22970/files - new: https://git.openjdk.org/jdk/pull/22970/files/97af8205..ffc54f6e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=05-06 Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/22970.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22970/head:pull/22970 PR: https://git.openjdk.org/jdk/pull/22970 From epeter at openjdk.org Tue Jun 10 14:44:48 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 14:44:48 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v3] In-Reply-To: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> References: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> Message-ID: On Tue, 10 Jun 2025 10:13:25
GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision: >> >> - Update src/hotspot/share/opto/phaseX.cpp >> >> Co-authored-by: Manuel Hässig >> - review suggestions, and handled a few more edge cases > > Great effort and analysis! I have some comments/questions. @chhagedorn I think I addressed all your concerns. The only question remaining is if we should have the "reproducers" in the code comments. Let's see what @TobiHartmann says. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22970#issuecomment-2959528486 From epeter at openjdk.org Tue Jun 10 15:02:31 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 15:02:31 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v3] In-Reply-To: References: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> Message-ID: On Tue, 10 Jun 2025 13:10:04 GMT, Emanuel Peter wrote: >> Okay, I only looked at it from a user-perspective that you might mismatch the description to the value passed to the flag. What could help here is reversing the order you mention the modes: first D:, then C: etc. > > @chhagedorn If I mention `D` on the same line as `=DCBA, with `, then we have to change 2 lines next time. > > But I suppose we will have to change all lines anyway because the indentation would change... > > What I would really want to avoid is to have to change the parsing. So the lowest significant bits have to stay where they are, but I can rename them.
> > Why don't you make a suggestion how you would like it to look, and then I can apply it :) I updated it to what we discussed offline :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2138136742 From rvansa at openjdk.org Tue Jun 10 15:18:38 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 10 Jun 2025 15:18:38 GMT Subject: RFR: 8352075: Perf regression accessing fields [v28] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> Message-ID: On Tue, 10 Jun 2025 14:30:06 GMT, Johan Sjölen wrote: >> But then you'd need create an array of `Pair` in `create_search_table` and copy the data into that. You wouldn't need a Supplier at all, just pass the array and its length to the `fill` method. If you are worried about the virtual calls, I could do that. >> >> Alternatively, we could replace virtual calls with templated `fill` method. In that case the Comparator should be a template method as well, since that one is even more perf-critical? > > I think you misunderstand me, as what I'm proposing wouldn't require creating an array in `create_search_table`. > > I'm saying that you do this: > > ```c++ > // Note: we require the supplier to provide the elements in the final order as we can't easily sort > // within this method - qsort() accepts only pure function as comparator. > void PackedTableBuilder::fill(u1* table, size_t table_length, Supplier &supplier) const { > Pair kvs[4]; > > size_t offset = 0; > size_t len_read = 0; > while ((len_read = supplier.next(kvs, 4)) > 0) { > // len_read tells you how many of the full capacity of kvs was used. > } > } > ``` > > Now each call of `Supplier::next` gives you up to 4 elements, quartering the number of virtual calls necessary. > >>Alternatively, we could replace virtual calls with templated fill method.
In that case the Comparator should be a template method as well, since that one is even more perf-critical? > > Sure, you can try that out. > > To be clear, I am only asking: are virtual calls expensive enough here that removing them, or amortizing their cost, would make the performance noticeably better? The earlier version of this PR was not abstracting out PackedTable* so there were no virtual calls. In both the CCC.java and the undisclosed customer reproducer I don't see a significant difference in performance - these involve whole JVM startup, so the efficiency of code with linear complexity is probably under the radar. If we want to optimize the hack out of it - yes, there would be space for that, maybe at the cost of maintainability and/or reusability. TLDR: I don't have a (micro)benchmark that would prove one thing is better than the other. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2138174164 From epeter at openjdk.org Tue Jun 10 15:38:31 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 10 Jun 2025 15:38:31 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v3] In-Reply-To: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> References: <3XjHLWBEYmn-1otPBnpKH3xLz100BR17x9_rGMAlQus=.e2331b5b-3197-407c-97bf-857ea0bd951c@github.com> Message-ID: On Tue, 10 Jun 2025 10:13:25 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision: >> >> - Update src/hotspot/share/opto/phaseX.cpp >> >> Co-authored-by: Manuel Hässig >> - review suggestions, and handled a few more edge cases > > Great effort and analysis! I have some comments/questions. @chhagedorn I checked with @TobiHartmann : he said he does not have a strong opinion, but if he had to make a decision, he would prefer having everything in the comments.
------------- PR Comment: https://git.openjdk.org/jdk/pull/22970#issuecomment-2959743354 From jsjolen at openjdk.org Tue Jun 10 15:47:44 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 10 Jun 2025 15:47:44 GMT Subject: RFR: 8352075: Perf regression accessing fields [v28] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <8c1fQTBShvspmR9rxAag6KBqGfLScIAM_fnxObToXpw=.e9e29682-581d-4aeb-bf9f-0bb4538cccea@github.com> Message-ID: On Tue, 10 Jun 2025 15:15:34 GMT, Radim Vansa wrote: >> I think you misunderstand me, as what I'm proposing wouldn't require creating an array in `create_search_table`. >> >> I'm saying that you do this: >> >> ```c++ >> // Note: we require the supplier to provide the elements in the final order as we can't easily sort >> // within this method - qsort() accepts only pure function as comparator. >> void PackedTableBuilder::fill(u1* table, size_t table_length, Supplier &supplier) const { >> Pair kvs[4]; >> >> size_t offset = 0; >> size_t len_read = 0; >> while ((len_read = supplier.next(kvs, 4)) > 0) { >> // len_read tells you how many of the full capacity of kvs was used. >> } >> } >> ``` >> >> Now each call of `Supplier::next` gives you up to 4 elements, quartering the number of virtual calls necessary. >> >>>Alternatively, we could replace virtual calls with templated fill method. In that case the Comparator should be a template method as well, since that one is even more perf-critical? >> >> Sure, you can try that out. >> >> To be clear, I am only asking: are virtual calls expensive enough here that removing them, or amortizing their cost, would make the performance noticeably better? > > The earlier version of this PR was not abstracting out PackedTable* so there were no virtual calls.
In both the CCC.java and undisclosed customer reproducer I don't see a significant difference in performance - these involve whole JVM startup, so the efficiency of code with linear complexity is probably under the radar. If we want to optimize the hack out of it - yes, there would be space for that, maybe at the cost of maintainability and/or reusability. > TLDR: I don't have a (micro)benchmark that would prove one thing is better than the other. Alright, if you can't observe a difference then don't change a thing :-). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2138248072 From ayang at openjdk.org Tue Jun 10 17:47:47 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 10 Jun 2025 17:47:47 GMT Subject: RFR: 8358294: Remove unnecessary GenAlignment [v2] In-Reply-To: References: Message-ID: > Simple replacement of `GenAlignment` with `SpaceAlignment`, because they always have the same value. Removing the former to reduce complexity. > > Test: tier1-3 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: - Merge branch 'master' into remove-gen-alignment - remove-gen-alignment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25577/files - new: https://git.openjdk.org/jdk/pull/25577/files/987f3d51..d21a02e3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25577&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25577&range=00-01 Stats: 29666 lines in 802 files changed: 23009 ins; 4069 del; 2588 mod Patch: https://git.openjdk.org/jdk/pull/25577.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25577/head:pull/25577 PR: https://git.openjdk.org/jdk/pull/25577 From iklam at openjdk.org Tue Jun 10 18:34:00 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 10 Jun 2025 18:34:00 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v8] In-Reply-To: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: > This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: > > Excerpt from the test case ExceptionsTest.java. 
> > > [0.038s][info][exceptions ] Exception > [ ] thrown in interpreter method <{method} {0x000074c408400810} 'baz2' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 9 for thread 0x000074c46402c7b0 (main) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.baz2(ExceptionsTest.java:142) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar2(ExceptionsTest.java:135) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo2(ExceptionsTest.java:127) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:110) > [0.038s][info][exceptions ] Exception > [ ] thrown in interpreter method <{method} {0x000074c408400748} 'bar2' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 0 for thread 0x000074c46402c7b0 (main) > [0.038s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar2" at BCI: 6 > [0.038s][info][exceptions ] Exception > [ ] thrown in interpreter method <{method} {0x000074c408400748} 'bar2' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 8 for thread 0x000074c46402c7b0 (main) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar2(ExceptionsTest.java:137) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo2(ExceptionsTest.java:127) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:110) > [0.038s][info][exceptions ] Exception > [ ] thrown in interpreter method <{method} {0x000074c408400670} 'foo2' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 0 for thread 0x000074c46402c7b0 (main) > [0.038s][info][exceptions ] Found m... 
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @dholmes-ora comments -- removed printing of output.getStdout() from test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25522/files - new: https://git.openjdk.org/jdk/pull/25522/files/cc451e7e..10e94797 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=06-07 Stats: 24 lines in 1 file changed: 7 ins; 12 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/25522.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25522/head:pull/25522 PR: https://git.openjdk.org/jdk/pull/25522 From iklam at openjdk.org Tue Jun 10 18:36:34 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 10 Jun 2025 18:36:34 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v4] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Tue, 10 Jun 2025 12:38:43 GMT, David Holmes wrote: >> The output is printed only if the failure happens in `OutputAnalyzer::shouldMatch()`, etc. I've updated the test so failures can happen other ways, so I cannot rely on this anymore. > > Sorry I don't follow. I'm saying there is no need to print stdout here because upon failure it will be printed anyway. If you want to see stdout even in passing cases then do this after all the `shouldXXX` checks. I fixed the test case to do all checks with `shouldXXX()`, and removed the stdout printing. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25522#discussion_r2138546558 From iklam at openjdk.org Tue Jun 10 18:57:35 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 10 Jun 2025 18:57:35 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v7] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> <5hRU2QAv0w4NNzqUzIzTcz9YwJBPWg0MCCPifqHe_V8=.a7053f19-6303-42f2-88c5-7609fd083092@github.com> Message-ID: <1G_TnUgW4ldNxNY7JM7bvEaVz3m0Ij-wT-RqZqqLOCA=.af511907-c067-4f2d-a128-5ee82fa951d8@github.com> On Tue, 10 Jun 2025 12:43:12 GMT, David Holmes wrote: > I'm not sure about the de-duplication checks. It isn't clear to me what code structures will cause you to see a single stacktrace and which will cause you to see many stacktraces. For example, will a try/finally cause a repeated stacktrace? I think yes.

Yes, because a `finally` block looks like a `catch` block that has an `athrow` bytecode at the end. So with the latest version, the stack trace will be duplicated.

From looking at javac output, it looks like `finally` blocks have this pattern:

    try {
      b();
    } finally {
      ...
    }

    catch t0 #0; // <--- note that exception type is #0
    stack_frame_type stack1;
    stack_map class java/lang/Throwable;
    .... body of finally ...
    aload_0;
    athrow; // <-- last bytecode is athrow

Whereas a `catch` block with an explicit `throw` statement at the end has a different pattern:

    try {
      b();
    } catch (Throwable t) {
      ...
      throw t;
    }

    catch t0 java/lang/Throwable;; // <--- note that exception type is NOT #0
    stack_frame_type stack1;
    stack_map class java/lang/Throwable;
    .... body of catch ...
    aload_0;
    athrow;

So perhaps we can avoid printing stack traces if we can detect the `finally` blocks.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25522#issuecomment-2960295097 From coleenp at openjdk.org Tue Jun 10 19:11:44 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 10 Jun 2025 19:11:44 GMT Subject: RFR: 8352075: Perf regression accessing fields [v29] In-Reply-To: <-T7PM80uoFrT7wol7RWu6ufJefRPZ-cWuW-qi646dHM=.996db406-7246-49b2-951a-603512194e34@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <-T7PM80uoFrT7wol7RWu6ufJefRPZ-cWuW-qi646dHM=.996db406-7246-49b2-951a-603512194e34@github.com> Message-ID: On Tue, 10 Jun 2025 09:30:31 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. 
>> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ? ?): 51.3 ms ? 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min ? max): 45.1 ms ? 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ? ?): 78.2 ms ? 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min ? max): 73.8 ms ? 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ? ?): 38.5 ms ? 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min ? max): 37.7 ms ? 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with two additional commits since the last revision: > > - Remove __builtin_memcpy > - Fix coding style These files: classFileParser.hpp, fieldStreams.hpp, fieldStreams.inline.hpp and unsigned5.hpp need a copyright update. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2960325865 From cjplummer at openjdk.org Tue Jun 10 19:37:29 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 10 Jun 2025 19:37:29 GMT Subject: RFR: 8358815: Exception event spec has stale reference to catch_klass parameter In-Reply-To: <3e2U0lztN76iKtiYtQaDlsPTknjRuubBJCq1azuj1Lk=.0a5e463b-9012-4a03-9a69-b2b0b0c6c601@github.com> References: <3e2U0lztN76iKtiYtQaDlsPTknjRuubBJCq1azuj1Lk=.0a5e463b-9012-4a03-9a69-b2b0b0c6c601@github.com> Message-ID: On Tue, 10 Jun 2025 11:03:42 GMT, Alan Bateman wrote: >> The JVMTI Exception event callback spec refers to the `catch_klass` parameter which does not exist anymore. Instead the Exception event callback spec should refer to the `catch_method` and `catch_location` parameters. >> I treat this as a bug and doubt we need a CSR for this issue. >> >> Testing: N/A > > src/hotspot/share/prims/jvmti.xml line 12874: > >> 12872: be reset by one of those native methods. >> 12873: Similarly, exceptions that are reported as uncaught (catch_method >> 12874: and catch_location set to 0) may in fact be caught by native code. > > catch_method is a jmethodID so null if uncaught. Then you also need to fix: "If there is no such catch clause, each field is set to 0." Also, technically speaking, can't `catch_location` be 0 even if caught (caught at the first bytecode of the method)? Although I doubt javac would ever generate such code, it seems it is allowed. If so, then `catch_method == null` is the only check a user should make.
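
The check suggested above could look roughly like the sketch below. It is an illustration under stated assumptions, not agent code from the PR: `is_reported_uncaught` is a hypothetical helper, and the two type aliases are local stand-ins that mirror the `jmethodID` and `jlocation` types from the JDK headers so that the snippet compiles without `<jvmti.h>`.

```c++
#include <cstdint>

// Stand-ins for the JNI/JVMTI types; real agent code gets these from <jvmti.h>.
struct _jmethodID;
using jmethodID = _jmethodID*;
using jlocation = std::int64_t;

// Hypothetical helper for an Exception event callback: treats the exception as
// reported-uncaught only when catch_method is null. catch_location is ignored
// on purpose, because location 0 can also be a valid handler position.
bool is_reported_uncaught(jmethodID catch_method, jlocation /*catch_location*/) {
  return catch_method == nullptr;
}
```

A callback would pass its `catch_method` and `catch_location` arguments straight through; a non-null `catch_method` with `catch_location == 0` is then still classified as caught.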
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25710#discussion_r2138638748 From cjplummer at openjdk.org Tue Jun 10 19:42:29 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 10 Jun 2025 19:42:29 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v5] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Tue, 10 Jun 2025 06:39:39 GMT, Matthias Baesken wrote: > Hi David/Chris/Lucy - I added the comments to the tests. I think the `@comment` should be before the `@requires`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2960393458 From rvansa at openjdk.org Tue Jun 10 19:44:03 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Tue, 10 Jun 2025 19:44:03 GMT Subject: RFR: 8352075: Perf regression accessing fields [v30] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. 
On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min … max): 45.1 ms … 53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min … max): 73.8 ms … 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min … max): 37.7 ms … 42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)!
This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Copyright update ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/14f9bdf7..36510e22 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=29 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=28-29 Stats: 4 lines in 4 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From ayang at openjdk.org Tue Jun 10 20:13:32 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 10 Jun 2025 20:13:32 GMT Subject: RFR: 8358294: Remove unnecessary GenAlignment [v2] In-Reply-To: References: Message-ID: <4-LjRq2jPjNeeH4KZq4NYgzkIM8QM3K-a9fz0b67IJY=.de37d756-35cd-4b00-b80f-bac5fe861f42@github.com> On Tue, 10 Jun 2025 17:47:47 GMT, Albert Mingkun Yang wrote: >> Simple replacement of `GenAlignment` with `SpaceAlignment`, because they always have the same value. Removing the former to reduce complexity. >> >> Test: tier1-3 > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge branch 'master' into remove-gen-alignment > - remove-gen-alignment Thanks for review. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25577#issuecomment-2960461165 From ayang at openjdk.org Tue Jun 10 20:13:33 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 10 Jun 2025 20:13:33 GMT Subject: Integrated: 8358294: Remove unnecessary GenAlignment In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 08:36:08 GMT, Albert Mingkun Yang wrote: > Simple replacement of `GenAlignment` with `SpaceAlignment`, because they always have the same value. Removing the former to reduce complexity. > > Test: tier1-3 This pull request has now been integrated. Changeset: 38b877e9 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/38b877e941918cc5f0463b256d4672d765d40302 Stats: 105 lines in 16 files changed: 0 ins; 46 del; 59 mod 8358294: Remove unnecessary GenAlignment Reviewed-by: iwalulya, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/25577 From sspitsyn at openjdk.org Tue Jun 10 21:15:28 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 10 Jun 2025 21:15:28 GMT Subject: RFR: 8358815: Exception event spec has stale reference to catch_klass parameter In-Reply-To: References: <3e2U0lztN76iKtiYtQaDlsPTknjRuubBJCq1azuj1Lk=.0a5e463b-9012-4a03-9a69-b2b0b0c6c601@github.com> Message-ID: <1_ZzX5NOC8YIn_oKpPBpI62WMNVdVI8JZLMwqpif0Vs=.d4840bcd-80ae-4b70-8b04-ab2424eb7ba5@github.com> On Tue, 10 Jun 2025 19:35:16 GMT, Chris Plummer wrote: >> src/hotspot/share/prims/jvmti.xml line 12874: >> >>> 12872: be reset by one of those native methods. >>> 12873: Similarly, exceptions that are reported as uncaught (catch_method >>> 12874: and catch_location set to 0) may in fact be caught by native code. >> >> catch_method is a jmethodID so null if uncaught. > > Then you also need to fix: > > "If there is no such catch clause, each field is set to 0." > > Also, technically speaking, can't `catch_location` be 0 even if caught (caught first the bytecode of the method)? 
Although I doubt javac would ever generate such code, it seems it is allowed. If so, then `catch_method == null` is the only check a user should make. Thank you for the comments! Yes, I've also come to the same conclusion about the only `catch_method == null` check. > Then you also need to fix: > "If there is no such catch clause, each field is set to 0." Good catch, thanks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25710#discussion_r2138777360 From sspitsyn at openjdk.org Tue Jun 10 21:28:07 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 10 Jun 2025 21:28:07 GMT Subject: RFR: 8358815: Exception event spec has stale reference to catch_klass parameter [v2] In-Reply-To: References: Message-ID: > The JVMTI Exception event callback spec refers to the `catch_klass` parameter which does not exist anymore. Instead the Exception event callback spec should refer to the `catch_method` and `catch_location` parameters. > I treat this as a bug and doubt we need a CSR for this issue. 
> > Testing: N/A Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: 1) only check for catch_method != null 2) replace field with parameter ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25710/files - new: https://git.openjdk.org/jdk/pull/25710/files/ba775955..021756c5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25710&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25710&range=00-01 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25710.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25710/head:pull/25710 PR: https://git.openjdk.org/jdk/pull/25710 From cjplummer at openjdk.org Tue Jun 10 21:39:27 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 10 Jun 2025 21:39:27 GMT Subject: RFR: 8358815: Exception event spec has stale reference to catch_klass parameter [v2] In-Reply-To: References: Message-ID: <7lNHdHcyFSaz7pzv_DK7Et_OCv7UrNptxBI3UsvmIkA=.c89d8e57-9de1-4360-9e8e-2d7616125c88@github.com> On Tue, 10 Jun 2025 21:28:07 GMT, Serguei Spitsyn wrote: >> The JVMTI Exception event callback spec refers to the `catch_klass` parameter which does not exist anymore. Instead the Exception event callback spec should refer to the `catch_method` and `catch_location` parameters. >> I treat this as a bug and doubt we need a CSR for this issue. >> >> Testing: N/A > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: 1) only check for catch_method != null 2) replace field with parameter Looks good. ------------- Marked as reviewed by cjplummer (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25710#pullrequestreview-2915094367 From duke at openjdk.org Wed Jun 11 03:41:33 2025 From: duke at openjdk.org (duke) Date: Wed, 11 Jun 2025 03:41:33 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v4] In-Reply-To: <32uuLeizjdx7p5TeOzMvoyj0Smmra-DV4qhPZy7z-bE=.78485dd1-cc6f-4fbf-88a3-a4f78c164b0c@github.com> References: <32uuLeizjdx7p5TeOzMvoyj0Smmra-DV4qhPZy7z-bE=.78485dd1-cc6f-4fbf-88a3-a4f78c164b0c@github.com> Message-ID: <7YAkiTl9Xks9NwaPnICWkC78V4DtdNa1w0H0H7xeqDA=.4748aed9-22b7-4730-a7ce-a6a7591c86ab@github.com> On Thu, 5 Jun 2025 07:15:34 GMT, Liming Liu wrote: >> This PR is to enable the use of crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU. There is an option UseCryptoPmullForCRC32 that can enable crypto pmull, but directly enabling it on Ampere CPU will cause the following problems. >> >> 1. There will be regressions (-14% ~ -8%) on Ampere1 when the length is 64. When <= 128, both kernel_crc32_using_crc32 and kernel_crc32_using_crypto_pmull use the loop labeled as CRC_by32_loop, but their implements are a little different, and the loop in kernel_crc32_using_crc32 is better at hiding latency on Ampere1. So this PR takes the loop in kernel_crc32_using_crc32 to kernel_crc32_using_crypto_pmull, and does the same for CRC32C intrinsic. >> >> 2. The intrinsics only use crypto pmull when the length is higher than 383, while the loop in kernel_crc32_common_fold_using_crypto_pmull looks able to handle 256, and if it handles 256 on Ampere1, the improvements can be as high as 110% compared with kernel_crc32_using_crc32/kernel_crc32c_using_crc32c. However, there are regressions (~-6%) on Neoverse V1 when the length is 256. So this PR introduces a new option named CryptoPmullForCRC32LowLimit. It defaults to 256 since the code could handle 256, while it is set to 384 for V1/V2 to keep the old behavior on these platforms. 
>> >> The performance regressions and improvements were measured with the following microbenchmarks: >> org.openjdk.bench.java.util.TestCRC32.testCRC32Update >> org.openjdk.bench.java.util.TestCRC32C.testCRC32CUpdate >> >> Ran the following JTReg tests on Ampere1 and did not find problems: >> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java >> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32C.java > > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Add the message for the assertions @limingliu-ampere Your change (at version df9f920a4279b5be28ea5e4bf6977f907ae50b3c) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25609#issuecomment-2961147642 From amitkumar at openjdk.org Wed Jun 11 05:04:27 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Wed, 11 Jun 2025 05:04:27 GMT Subject: RFR: 8359114: [s390x] Add z17 detection code In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 10:48:48 GMT, Manjunath S Matti. wrote: > Add support to detect the new generation of Z machine (z17). src/hotspot/cpu/s390/vm_version_s390.hpp line 124: > 122: // ---------------------------------------------- > 123: #define BEAREnhFacilityMask 0x4000000000000000UL // z16, BEAR-enhancement facility, Bit: 193 > 124: #define ConcurrentFunFacilityMask 0x0040000000000000UL // z17, Concurrent-functions facility, Bit: 201 I think above comment in Line 121 is incorrect about bits covered in DW[3] and seems like comment about DW[2] is also incorrect. Can you update those as well. 
DW[2] should cover bits: 128–191 DW[3] should cover bits: 192–255 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25718#discussion_r2139194137 From alanb at openjdk.org Wed Jun 11 06:03:46 2025 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 11 Jun 2025 06:03:46 GMT Subject: RFR: 8358815: Exception event spec has stale reference to catch_klass parameter [v2] In-Reply-To: <1_ZzX5NOC8YIn_oKpPBpI62WMNVdVI8JZLMwqpif0Vs=.d4840bcd-80ae-4b70-8b04-ab2424eb7ba5@github.com> References: <3e2U0lztN76iKtiYtQaDlsPTknjRuubBJCq1azuj1Lk=.0a5e463b-9012-4a03-9a69-b2b0b0c6c601@github.com> <1_ZzX5NOC8YIn_oKpPBpI62WMNVdVI8JZLMwqpif0Vs=.d4840bcd-80ae-4b70-8b04-ab2424eb7ba5@github.com> Message-ID: <1WQwUmMEMQNcadwD_-27loMxuOwCppkF7-tWKqoSbkw=.59d255ad-9dc1-4a7a-83d6-bf972064e69a@github.com> On Tue, 10 Jun 2025 21:13:11 GMT, Serguei Spitsyn wrote: >> Then you also need to fix: >> >> "If there is no such catch clause, each field is set to 0." >> >> Also, technically speaking, can't `catch_location` be 0 even if caught (caught first the bytecode of the method)? Although I doubt javac would ever generate such code, it seems it is allowed. If so, then `catch_method == null` is the only check a user should make. > > Thank you for the comments! > Yes, I've also come to the same conclusion about the only `catch_method == null` check. > >> Then you also need to fix: >> "If there is no such catch clause, each field is set to 0." > > Good catch, thanks. > > The suggestions above have been addressed now. > Also, I've replaced the term `field` with `parameter` for consistency in two spots. I think what you have looks okay now. I'm just wondering about the description for the catch_location parameter, which has "zero if no known catch". So if catch_method is null then catch_location must be 0.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25710#discussion_r2139275112 From mbaesken at openjdk.org Wed Jun 11 06:31:20 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 11 Jun 2025 06:31:20 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v7] In-Reply-To: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: > There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . > Those fail when the address sanitizer is configured ( --enable-asan ). > The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. > Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . > While at it, also same is also added for ubsan . 
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Move test comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25575/files - new: https://git.openjdk.org/jdk/pull/25575/files/3f64dffe..ea831d95 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=05-06 Stats: 12 lines in 6 files changed: 6 ins; 6 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25575.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25575/head:pull/25575 PR: https://git.openjdk.org/jdk/pull/25575 From mbaesken at openjdk.org Wed Jun 11 06:31:20 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 11 Jun 2025 06:31:20 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v5] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: <_tH2_EGdEhvKQ_kpUR5VmamiHrB3Kz7BJJ1bn-1RGRw=.b82617f8-7fb0-4736-922f-ff807982d32f@github.com> On Tue, 10 Jun 2025 19:39:33 GMT, Chris Plummer wrote: > > I think the `@comment` should be before the `@requires`. Yes it seems at most places the comment is in front. I moved the comments . 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2961406162 From stefank at openjdk.org Wed Jun 11 08:38:32 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 11 Jun 2025 08:38:32 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v4] In-Reply-To: References: Message-ID: <5jpQGssAmQuPcUPgK5wbXQjpBXXCjyY2yiXlNdD-tsI=.886a2462-65f5-4ac8-990e-4a92a2d362d8@github.com> On Fri, 30 May 2025 08:24:42 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value type for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static size_t available_memory(); >> static julong used_memory(); --> static size_t used_memory(); >> static julong free_memory(); --> static size_t free_memory(); >> static jlong total_swap_space(); --> static ptrdiff_t total_swap_space(); >> static jlong free_swap_space(); --> static ptrdiff_t free_swap_space(); >> static julong physical_memory(); --> static size_t physical_memory(); >> >> >> The changes are done so that the other parts of the code have minimal impact. >> Tested in GHA and Tiers 1-4. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8357086: Fixed spaces in formatting in gc-related code. Are you up for making an experiment that changes `total_swap_space` and `free_swap_space` to return two values: one the actual value in `size_t` and the other an error code that gets set whenever we hit an error? The two proposed alternatives for this would be: 1) returning something containing the two values (struct, Pair, Tuple, array, ...) 2) Return the size_t and using an an out put parameter to signal an error (or vice versa) I think the fan-out of that will not be too bad and the likely outcome is that it is clearer that the code is propagating errors. ------------- Changes requested by stefank (Reviewer). 
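The two alternatives proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SwapResult`, `total_swap_space_v1` and `total_swap_space_v2` are hypothetical names, and the `raw` argument stands in for the platform query, with a negative value assumed to mean that the query failed.

```cpp
#include <cstddef>

// Alternative 1: return both values together in a small struct.
struct SwapResult {
  std::size_t bytes;  // meaningful only when ok is true
  bool ok;            // false if the query failed
};

// Hypothetical stand-in for the OS query. `raw` plays the role of a signed
// result where a negative value is assumed to signal failure.
SwapResult total_swap_space_v1(long long raw) {
  if (raw < 0) {
    return SwapResult{0, false};
  }
  return SwapResult{static_cast<std::size_t>(raw), true};
}

// Alternative 2: return the size and report failure via an output parameter.
std::size_t total_swap_space_v2(long long raw, bool* failed) {
  *failed = raw < 0;
  return *failed ? 0 : static_cast<std::size_t>(raw);
}
```

In both variants the size itself can stay `size_t`, and callers have to look at the error indicator explicitly rather than compare against a sentinel such as -1.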
PR Review: https://git.openjdk.org/jdk/pull/25450#pullrequestreview-2916176235 From aph at openjdk.org Wed Jun 11 08:42:32 2025 From: aph at openjdk.org (Andrew Haley) Date: Wed, 11 Jun 2025 08:42:32 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v2] In-Reply-To: <5BGf1eIVeMQIaLXIoOvcuQlBiaPeWojv8HAnfuOiW_E=.8c39a6ac-c0f2-40fe-bef3-be0a6bd71c07@github.com> References: <5BGf1eIVeMQIaLXIoOvcuQlBiaPeWojv8HAnfuOiW_E=.8c39a6ac-c0f2-40fe-bef3-be0a6bd71c07@github.com> Message-ID: <92tLsaNGJp5FfkSgIYjJTQs10O86YxU5tt_FdDk_ipY=.b06e132f-f051-4df1-89f0-8aef6163ce9c@github.com> On Wed, 4 Jun 2025 08:30:55 GMT, Emanuel Peter wrote: > Generally looks reasonable to me as a non expert in crypto intrinsics. But we definitively need an expert to approve this in the end. I have a few comments below. I'm happy enough, and it seems also to improvements on Apple silicon. However, the title is misleading: while this PR may have started as something purely Ampere-specific, it no longer is. Something like "AArch64: CRC32/CRC32 enhancements" is perhaps rather vague, but at least it's true. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25609#issuecomment-2961746523 From wenanjian at openjdk.org Wed Jun 11 09:21:14 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Wed, 11 Jun 2025 09:21:14 GMT Subject: RFR: 8359105: RISC-V: No need for acquire fence in safepoint poll during JNI calls Message-ID: backport of acquire fence removal in safepoint poll during JNI calls as aarch64 At least in jdk11, when comes to safepoint::end, it will invoke SafepointMechanism::disarm_local_poll to change the polling_word_offset, which may cause a race when thread come to visit polling_word_offset in native_trans state, so we use acquire fence. Since the disarm_local_poll has been removed from SafepointSynchronize::end, Thread disarm itself in JavaThread::check_special_condition_for_native_trans when trans from native. 
it seems that there is no need for acquire fence. ------------- Commit messages: - RISC-V: No need for acquire fence in safepoint poll during JNI calls Changes: https://git.openjdk.org/jdk/pull/25709/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25709&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359105 Stats: 18 lines in 2 files changed: 0 ins; 14 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25709.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25709/head:pull/25709 PR: https://git.openjdk.org/jdk/pull/25709 From wenanjian at openjdk.org Wed Jun 11 09:42:27 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Wed, 11 Jun 2025 09:42:27 GMT Subject: RFR: 8359105: RISC-V: No need for acquire fence in safepoint poll during JNI calls In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 06:33:34 GMT, Anjian Wen wrote: > backport of acquire fence removal in safepoint poll during JNI calls as aarch64[0] > > At least in jdk11, when comes to safepoint::end, it will invoke SafepointMechanism::disarm_local_poll to change the polling_word_offset, which may cause a race when thread come to visit polling_word_offset in native_trans state, so we use acquire fence. > > Since the disarm_local_poll has been removed from SafepointSynchronize::end, Thread disarm itself in > JavaThread::check_special_condition_for_native_trans when trans from native. it seems that there is no need for acquire fence. > > [0] https://github.com/openjdk/jdk/pull/20420 @robehn Hi, Can you help to review the patch? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25709#issuecomment-2961953614 From wenanjian at openjdk.org Wed Jun 11 09:49:07 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Wed, 11 Jun 2025 09:49:07 GMT Subject: RFR: 8359218: RISC-V: Only enable CRC32 intrinsic when AvoidUnalignedAccess == false Message-ID: <1Pfa6_yopqSW7HLaFAVY1xMAuMsbzeJMcKSVFOHYE_s=.2c61b131-5288-4dde-ab36-0f7c9ef8ccb3@github.com> When test **Specjvm** in p550, we can find the compress test result shown below. before patch -XX:-UseCompactObjectHeaders Warmup (30s) begins: Wed Jun 11 16:10:18 CST 2025 Warmup (30s) ends: Wed Jun 11 16:10:53 CST 2025 Warmup (30s) result: 68.98 ops/m Iteration 1 (60s) begins: Wed Jun 11 16:10:53 CST 2025 Iteration 1 (60s) ends: Wed Jun 11 16:11:57 CST 2025 Iteration 1 (60s) result: 71.25 ops/m -XX:+UseCompactObjectHeaders Warmup (30s) begins: Wed Jun 11 16:13:03 CST 2025 Warmup (30s) ends: Wed Jun 11 16:13:42 CST 2025 Warmup (30s) result: 31.87 ops/m Iteration 1 (60s) begins: Wed Jun 11 16:13:42 CST 2025 Iteration 1 (60s) ends: Wed Jun 11 16:14:56 CST 2025 Iteration 1 (60s) result: 29.13 ops/m The reason is that when enable compactObjectHeaders, the arraylist header turn to 4 Byte, but the copy instruction used in CRC intrinsic loop is 8 Byte one loop, which may reduce the performance when hardware is sensitive to unaligned access, we can close it when AvoidUnalignedAccesses is true after patch -XX:-UseCompactObjectHeaders Warmup (30s) begins: Wed Jun 11 16:23:22 CST 2025 Warmup (30s) ends: Wed Jun 11 16:23:57 CST 2025 Warmup (30s) result: 68.61 ops/m Iteration 1 (60s) begins: Wed Jun 11 16:23:57 CST 2025 Iteration 1 (60s) ends: Wed Jun 11 16:25:00 CST 2025 Iteration 1 (60s) result: 71.57 ops/m -XX:+UseCompactObjectHeaders Warmup (30s) begins: Wed Jun 11 16:25:28 CST 2025 Warmup (30s) ends: Wed Jun 11 16:26:03 CST 2025 Warmup (30s) result: 68.36 ops/m Iteration 1 (60s) begins: Wed Jun 11 16:26:03 CST 2025 Iteration 1 (60s) ends: Wed Jun 11 
16:27:08 CST 2025 Iteration 1 (60s) result: 70.85 ops/m ------------- Commit messages: - RISC-V: Only enable CRC32 intrinsic when AvoidUnalignedAccess == false Changes: https://git.openjdk.org/jdk/pull/25743/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25743&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359218 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25743.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25743/head:pull/25743 PR: https://git.openjdk.org/jdk/pull/25743 From duke at openjdk.org Wed Jun 11 09:53:17 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 11 Jun 2025 09:53:17 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v5] In-Reply-To: References: Message-ID: <12CGSySmAQzzQVoAa8EVIvQo2AxbQIWjuSPcKYOtqDg=.48c3c4f1-a770-455d-bb08-c48ba34c7018@github.com> > Hi, > > in this PR the output value type for functions which return memory are changed, namely: > > > static julong available_memory(); --> static size_t available_memory(); > static julong used_memory(); --> static size_t used_memory(); > static julong free_memory(); --> static size_t free_memory(); > static jlong total_swap_space(); --> static ptrdiff_t total_swap_space(); > static jlong free_swap_space(); --> static ptrdiff_t free_swap_space(); > static julong physical_memory(); --> static size_t physical_memory(); > > > The changes are done so that the other parts of the code have minimal impact. > Tested in GHA and Tiers 1-4. Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision: - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs - 8357086: Fixed spaces in formatting in gc-related code. - 8357086: Fixed formatting. 
- 8357086: Addressed reviewer's comments. - 8357086: More work. - 8357086: More work. - 8357086: More work. - 8357086: More work. - 8357086: More work. - 8357086: More work - ... and 2 more: https://git.openjdk.org/jdk/compare/f7c8a5a9...f3a5f61c ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25450/files - new: https://git.openjdk.org/jdk/pull/25450/files/f8a9d608..f3a5f61c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=03-04 Stats: 87128 lines in 1504 files changed: 53638 ins; 21091 del; 12399 mod Patch: https://git.openjdk.org/jdk/pull/25450.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25450/head:pull/25450 PR: https://git.openjdk.org/jdk/pull/25450 From fyang at openjdk.org Wed Jun 11 10:49:30 2025 From: fyang at openjdk.org (Fei Yang) Date: Wed, 11 Jun 2025 10:49:30 GMT Subject: RFR: 8359218: RISC-V: Only enable CRC32 intrinsic when AvoidUnalignedAccess == false In-Reply-To: <1Pfa6_yopqSW7HLaFAVY1xMAuMsbzeJMcKSVFOHYE_s=.2c61b131-5288-4dde-ab36-0f7c9ef8ccb3@github.com> References: <1Pfa6_yopqSW7HLaFAVY1xMAuMsbzeJMcKSVFOHYE_s=.2c61b131-5288-4dde-ab36-0f7c9ef8ccb3@github.com> Message-ID: On Wed, 11 Jun 2025 08:47:49 GMT, Anjian Wen wrote: > When test **Specjvm** in p550, we can find the compress test result shown below. 
> > > before patch > -XX:-UseCompactObjectHeaders > Warmup (30s) begins: Wed Jun 11 16:10:18 CST 2025 > Warmup (30s) ends: Wed Jun 11 16:10:53 CST 2025 > Warmup (30s) result: 68.98 ops/m > > Iteration 1 (60s) begins: Wed Jun 11 16:10:53 CST 2025 > Iteration 1 (60s) ends: Wed Jun 11 16:11:57 CST 2025 > Iteration 1 (60s) result: 71.25 ops/m > > > -XX:+UseCompactObjectHeaders > Warmup (30s) begins: Wed Jun 11 16:13:03 CST 2025 > Warmup (30s) ends: Wed Jun 11 16:13:42 CST 2025 > Warmup (30s) result: 31.87 ops/m > > Iteration 1 (60s) begins: Wed Jun 11 16:13:42 CST 2025 > Iteration 1 (60s) ends: Wed Jun 11 16:14:56 CST 2025 > Iteration 1 (60s) result: 29.13 ops/m > > > The reason is that when enable compactObjectHeaders, the arraylist header turn to 4 Byte, but the copy instruction used in CRC intrinsic loop is 8 Byte one loop, which may reduce the performance when hardware is sensitive to unaligned access, we can close it when AvoidUnalignedAccesses is true > > after patch > > -XX:-UseCompactObjectHeaders > Warmup (30s) begins: Wed Jun 11 16:23:22 CST 2025 > Warmup (30s) ends: Wed Jun 11 16:23:57 CST 2025 > Warmup (30s) result: 68.61 ops/m > > Iteration 1 (60s) begins: Wed Jun 11 16:23:57 CST 2025 > Iteration 1 (60s) ends: Wed Jun 11 16:25:00 CST 2025 > Iteration 1 (60s) result: 71.57 ops/m > > > -XX:+UseCompactObjectHeaders > Warmup (30s) begins: Wed Jun 11 16:25:28 CST 2025 > Warmup (30s) ends: Wed Jun 11 16:26:03 CST 2025 > Warmup (30s) result: 68.36 ops/m > > Iteration 1 (60s) begins: Wed Jun 11 16:26:03 CST 2025 > Iteration 1 (60s) ends: Wed Jun 11 16:27:08 CST 2025 > Iteration 1 (60s) result: 70.85 ops/m Thanks for finding this! ------------- Marked as reviewed by fyang (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25743#pullrequestreview-2916614388 From fjiang at openjdk.org Wed Jun 11 11:01:29 2025 From: fjiang at openjdk.org (Feilong Jiang) Date: Wed, 11 Jun 2025 11:01:29 GMT Subject: RFR: 8359218: RISC-V: Only enable CRC32 intrinsic when AvoidUnalignedAccess == false In-Reply-To: <1Pfa6_yopqSW7HLaFAVY1xMAuMsbzeJMcKSVFOHYE_s=.2c61b131-5288-4dde-ab36-0f7c9ef8ccb3@github.com> References: <1Pfa6_yopqSW7HLaFAVY1xMAuMsbzeJMcKSVFOHYE_s=.2c61b131-5288-4dde-ab36-0f7c9ef8ccb3@github.com> Message-ID: On Wed, 11 Jun 2025 08:47:49 GMT, Anjian Wen wrote: > When test **Specjvm** in p550, we can find the compress test result shown below. > > > before patch > -XX:-UseCompactObjectHeaders > Warmup (30s) begins: Wed Jun 11 16:10:18 CST 2025 > Warmup (30s) ends: Wed Jun 11 16:10:53 CST 2025 > Warmup (30s) result: 68.98 ops/m > > Iteration 1 (60s) begins: Wed Jun 11 16:10:53 CST 2025 > Iteration 1 (60s) ends: Wed Jun 11 16:11:57 CST 2025 > Iteration 1 (60s) result: 71.25 ops/m > > > -XX:+UseCompactObjectHeaders > Warmup (30s) begins: Wed Jun 11 16:13:03 CST 2025 > Warmup (30s) ends: Wed Jun 11 16:13:42 CST 2025 > Warmup (30s) result: 31.87 ops/m > > Iteration 1 (60s) begins: Wed Jun 11 16:13:42 CST 2025 > Iteration 1 (60s) ends: Wed Jun 11 16:14:56 CST 2025 > Iteration 1 (60s) result: 29.13 ops/m > > > The reason is that when enable compactObjectHeaders, the arraylist header turn to 4 Byte, but the copy instruction used in CRC intrinsic loop is 8 Byte one loop, which may reduce the performance when hardware is sensitive to unaligned access, we can close it when AvoidUnalignedAccesses is true > > after patch > > -XX:-UseCompactObjectHeaders > Warmup (30s) begins: Wed Jun 11 16:23:22 CST 2025 > Warmup (30s) ends: Wed Jun 11 16:23:57 CST 2025 > Warmup (30s) result: 68.61 ops/m > > Iteration 1 (60s) begins: Wed Jun 11 16:23:57 CST 2025 > Iteration 1 (60s) ends: Wed Jun 11 16:25:00 CST 2025 > Iteration 1 (60s) result: 71.57 ops/m > > > 
-XX:+UseCompactObjectHeaders > Warmup (30s) begins: Wed Jun 11 16:25:28 CST 2025 > Warmup (30s) ends: Wed Jun 11 16:26:03 CST 2025 > Warmup (30s) result: 68.36 ops/m > > Iteration 1 (60s) begins: Wed Jun 11 16:26:03 CST 2025 > Iteration 1 (60s) ends: Wed Jun 11 16:27:08 CST 2025 > Iteration 1 (60s) result: 70.85 ops/m Looks fine. ------------- Marked as reviewed by fjiang (Committer). PR Review: https://git.openjdk.org/jdk/pull/25743#pullrequestreview-2916648187 From ayang at openjdk.org Wed Jun 11 11:01:23 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 11 Jun 2025 11:01:23 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v12] In-Reply-To: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: <8ydE-ymScXytYj65qVxSOmFikkvS90bH5NcvLsfAa7c=.b410d110-0db6-4750-8603-d2de21436263@github.com> > This patch refines Parallel's sizing strategy to improve overall memory management and performance. > > The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. > > `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. 
> > GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. > > ## Performance evaluation > > - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). > - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). > - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. > > PS: I have opportunistically set the obsolete/expired version to ~~25/26~~ 26/27 for now. I will update them accordingly before merging. > > Test: tier1-8 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 17 commits: - merge - version - Merge branch 'master' into pgc-size-policy - revert-aliases - Merge branch 'master' into pgc-size-policy - merge - merge-fix - merge - Merge branch 'master' into pgc-size-policy - Merge branch 'master' into pgc-size-policy - ... 
and 7 more: https://git.openjdk.org/jdk/compare/56ce70c5...8689b54c ------------- Changes: https://git.openjdk.org/jdk/pull/25000/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25000&range=11 Stats: 4373 lines in 31 files changed: 522 ins; 3452 del; 399 mod Patch: https://git.openjdk.org/jdk/pull/25000.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25000/head:pull/25000 PR: https://git.openjdk.org/jdk/pull/25000 From cjplummer at openjdk.org Wed Jun 11 11:24:29 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 11 Jun 2025 11:24:29 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v6] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Tue, 10 Jun 2025 06:42:46 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Add comments to tests test/hotspot/jtreg/serviceability/dcmd/vm/SystemMapTest.java line 52: > 50: * @library /test/lib > 51: * @requires vm.gc.Z & (os.family == "linux" | os.family == "windows" | os.family == "mac") > 52: * @requires !vm.asan You should duplicate the comment here. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25575#discussion_r2138641241 From coleenp at openjdk.org Wed Jun 11 11:32:36 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 11 Jun 2025 11:32:36 GMT Subject: RFR: 8352075: Perf regression accessing fields [v30] In-Reply-To: <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> Message-ID: <5nq6TlYli8aYrxOrwRWy5LFM2vvoc21L1PxjS24x4dw=.25030547-4def-4afe-9630-1d7e11fe687c@github.com> On Tue, 10 Jun 2025 19:44:03 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. 
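The lookup scheme described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the HotSpot implementation: `FieldTable`, `find_field`, and `kStride` are hypothetical names, plain strings stand in for the compressed field stream, vector indexes stand in for the variable-length-encoded offsets, and only names (not signatures) are compared.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the field stream: field names only, sorted.
struct FieldTable {
  static constexpr std::size_t kStride = 16;  // one table entry per 16 fields

  std::vector<std::string> sorted_names;  // fields sorted by name
  std::vector<std::size_t> checkpoints;   // positions 16, 32, ... in sorted_names

  explicit FieldTable(std::vector<std::string> names)
      : sorted_names(std::move(names)) {
    std::sort(sorted_names.begin(), sorted_names.end());
    for (std::size_t i = kStride; i < sorted_names.size(); i += kStride) {
      checkpoints.push_back(i);
    }
  }

  // Returns the index of `name` in sorted order, or -1 if it is absent.
  long find_field(const std::string& name) const {
    // Step 1: walk the checkpoint table and remember the last checkpoint
    // whose field name does not exceed the name we are looking for.
    std::size_t start = 0;
    for (std::size_t cp : checkpoints) {
      if (sorted_names[cp] <= name) {
        start = cp;
      } else {
        break;
      }
    }
    // Step 2: field-by-field scan from that checkpoint, at most kStride fields.
    std::size_t end = std::min(start + kStride, sorted_names.size());
    for (std::size_t i = start; i < end; ++i) {
      if (sorted_names[i] == name) {
        return static_cast<long>(i);
      }
    }
    return -1;
  }
};
```

With 40 fields, the sketch keeps two checkpoints (positions 16 and 32), so a lookup inspects at most two table entries plus 16 fields instead of up to 40 fields.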
>> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Copyright update I do not have any further comments. This change looks good and I believe this is needed to resolve the performance problem. ------------- Marked as reviewed by coleenp (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/24847#pullrequestreview-2916760401 From mbaesken at openjdk.org Wed Jun 11 11:42:13 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 11 Jun 2025 11:42:13 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v8] In-Reply-To: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: <5V3KDrOL5tc0ay5zbkGeGhY59491l5USEZjCG1kjjWA=.1e124b44-da11-4af4-a5b6-94c53e7860a8@github.com> > There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . > Those fail when the address sanitizer is configured ( --enable-asan ). > The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. > Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . > While at it, also same is also added for ubsan . 
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Add comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25575/files - new: https://git.openjdk.org/jdk/pull/25575/files/ea831d95..9aa58582 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=06-07 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25575.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25575/head:pull/25575 PR: https://git.openjdk.org/jdk/pull/25575 From thartmann at openjdk.org Wed Jun 11 11:59:41 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 11 Jun 2025 11:59:41 GMT Subject: RFR: 8359200: Memory corruption in MStack::push Message-ID: I found this by accident when running testing with non-default `-XX:OptoNodeListSize` (see JBS for details). The problem is that `MStack::push` assumes that `Node_Stack::grow` will always grow the stack by at least two (line 69) and it then proceeds to put two items in: https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/matcher.hpp#L67-L74 But after [JDK-8336999](https://bugs.openjdk.org/browse/JDK-8336999), `Node_Stack::grow` will only grow the stack if needed, i.e., in this case it will only grow the stack if there's <= 1 empty slot: https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/node.cpp#L3027-L3031 However, if there's **one** empty slot, the stack will not be grown and `MStack::push` then puts **two** items on the stack which leads to memory corruption. I refactored the push method to delegate to `Node_Stack::push` which does the right thing and, similar to [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), also added `maybe_grow` methods for all the containers affected by the original change. 
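The failure mode described above can be illustrated with a small sketch. It assumes a hypothetical `TinyStack` class (not the HotSpot `Node_Stack` or `MStack`) whose `maybe_grow` only grows when no slot is left; a caller that checks capacity once and then stores two items would overflow whenever exactly one slot is free, whereas delegating each store to `push` stays in bounds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical simplified stack, used only to illustrate the pattern.
class TinyStack {
  std::vector<int> _slots;  // backing storage; its size is the capacity
  std::size_t _top = 0;     // number of used slots

  // Grows only when the stack is completely full ("grow only if needed").
  void maybe_grow() {
    if (_top >= _slots.size()) {
      _slots.resize(std::max<std::size_t>(1, _slots.size() * 2));
    }
  }

 public:
  explicit TinyStack(std::size_t capacity) : _slots(capacity) {}

  std::size_t size() const { return _top; }
  std::size_t capacity() const { return _slots.size(); }
  int at(std::size_t i) const { return _slots.at(i); }

  // Single-element push: one capacity check per stored element.
  void push(int v) {
    maybe_grow();
    _slots[_top++] = v;
  }

  // Safe two-element push: delegates to push(), so capacity is checked twice.
  void push_pair(int a, int b) {
    push(a);
    push(b);
  }

  // True when "check once, then store two items" would write past the end:
  // the single check sees a free slot, but two slots are needed.
  bool single_check_pair_would_overflow() const {
    return _top < _slots.size() && _top + 2 > _slots.size();
  }
};
```

In this sketch an odd capacity with pairwise pushes eventually leaves exactly one free slot, which is the situation in which a single capacity check is not enough.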
For additional coverage, I moved the `_nesting.check` calls to before the check that determines if we grow. I only ever observed this with a non-default and odd value for `-XX:OptoNodeListSize` but I'm not 100% convinced that it can't happen with the default value, so I'm treating this as P2 and will backport the fix to JDK 25. @shipilev Since you worked on [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), could you please take a look at this? Thanks, Tobias ------------- Commit messages: - Trailing whitespace - 8359200: Memory corruption in MStack::push Changes: https://git.openjdk.org/jdk/pull/25751/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25751&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359200 Stats: 94 lines in 8 files changed: 73 ins; 14 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/25751.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25751/head:pull/25751 PR: https://git.openjdk.org/jdk/pull/25751 From mchevalier at openjdk.org Wed Jun 11 12:11:30 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Wed, 11 Jun 2025 12:11:30 GMT Subject: RFR: 8359200: Memory corruption in MStack::push In-Reply-To: References: Message-ID: On Wed, 11 Jun 2025 11:49:08 GMT, Tobias Hartmann wrote: > I found this by accident when running testing with non-default `-XX:OptoNodeListSize` (see JBS for details). 
The problem is that `MStack::push` assumes that `Node_Stack::grow` will always grow the stack by at least two (line 69) and it then proceeds to put two items in: > https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/matcher.hpp#L67-L74 > > But after [JDK-8336999](https://bugs.openjdk.org/browse/JDK-8336999), `Node_Stack::grow` will only grow the stack if needed, i.e., in this case it will only grow the stack if there's <= 1 empty slot: > https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/node.cpp#L3027-L3031 > > However, if there's **one** empty slot, the stack will not be grown and `MStack::push` then puts **two** items on the stack which leads to memory corruption. > > I refactored the push method to delegate to `Node_Stack::push` which does the right thing and, similar to [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), also added `maybe_grow` methods for all the containers affected by the original change. For additional coverage, I moved the `_nesting.check` calls to before the check that determines if we grow. > > I only ever observed this with a non-default and odd value for `-XX:OptoNodeListSize` but I'm not 100% convinced that it can't happen with the default value, so I'm treating this as P2 and will backport the fix to JDK 25. > > @shipilev Since you worked on [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), could you please take a look at this? > > Thanks, > Tobias There is something I don't understand: > in this case it will only grow the stack if there's <= 1 empty slot and > if there's one empty slot, the stack will not be grown But one empty slot <= 1 empty slot, so, should it grow? But the code look like it grows if there is 0 empty slots, but I haven't read much of it yet. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25751#issuecomment-2962432873 From thartmann at openjdk.org Wed Jun 11 12:15:30 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 11 Jun 2025 12:15:30 GMT Subject: RFR: 8359200: Memory corruption in MStack::push In-Reply-To: References: Message-ID: On Wed, 11 Jun 2025 11:49:08 GMT, Tobias Hartmann wrote: > I found this by accident when running testing with non-default `-XX:OptoNodeListSize` (see JBS for details). The problem is that `MStack::push` assumes that `Node_Stack::grow` will always grow the stack by at least two (line 69) and it then proceeds to put two items in: > https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/matcher.hpp#L67-L74 > > But after [JDK-8336999](https://bugs.openjdk.org/browse/JDK-8336999), `Node_Stack::grow` will only grow the stack if needed: > https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/node.cpp#L3027-L3031 > > However, if there's **one** empty slot, the stack will not be grown and `MStack::push` then puts **two** items on the stack which leads to memory corruption. > > I refactored the push method to delegate to `Node_Stack::push` which does the right thing and, similar to [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), also added `maybe_grow` methods for all the containers affected by the original change. For additional coverage, I moved the `_nesting.check` calls to before the check that determines if we grow. > > I only ever observed this with a non-default and odd value for `-XX:OptoNodeListSize` but I'm not 100% convinced that it can't happen with the default value, so I'm treating this as P2 and will backport the fix to JDK 25. > > @shipilev Since you worked on [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), could you please take a look at this? > > Thanks, > Tobias Thanks for looking at this Marc. 
That sentence indeed did not make much sense; I fixed it. Before [JDK-8336999](https://bugs.openjdk.org/browse/JDK-8336999), the grow method would always grow and after [JDK-8336999](https://bugs.openjdk.org/browse/JDK-8336999) it would only grow if there's no space left. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25751#issuecomment-2962448571 From rehn at openjdk.org Wed Jun 11 12:17:29 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 11 Jun 2025 12:17:29 GMT Subject: RFR: 8359105: RISC-V: No need for acquire fence in safepoint poll during JNI calls In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 06:33:34 GMT, Anjian Wen wrote: > backport of acquire fence removal in safepoint poll during JNI calls as aarch64[0] > > At least in jdk11, when comes to safepoint::end, it will invoke SafepointMechanism::disarm_local_poll to change the polling_word_offset, which may cause a race when thread come to visit polling_word_offset in native_trans state, so we used to use acquire fence. > > Since the disarm_local_poll has been removed from SafepointSynchronize::end, Thread disarm itself in > JavaThread::check_special_condition_for_native_trans when trans from native. it seems that there is no need for acquire fence. > > [0] https://github.com/openjdk/jdk/pull/20420 Hey, sure. The description says "backport of", but as you are changing master I think that description is wrong. The downcallLinker_riscv.cpp is the same case as the native transition, so I don't see why you would keep that acquire? AFAICT no one should use acquire, which should then mean that we can remove the "bool acquire" argument from safepoint_poll(). That Arm still has the acquire in its downcall linker seems like an oversight? 8337657 only has one reviewer; I think it should have been caught there. (Please note https://wiki.openjdk.org/display/HotSpot/Pushing+a+HotSpot+change, two reviewers required.) @dchuyko can you open a new issue and look at the acquire in the downcall linker for aarch64?
Thanks, Robbin ------------- PR Comment: https://git.openjdk.org/jdk/pull/25709#issuecomment-2962456420 From kbarrett at openjdk.org Wed Jun 11 12:25:32 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 11 Jun 2025 12:25:32 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v4] In-Reply-To: <5jpQGssAmQuPcUPgK5wbXQjpBXXCjyY2yiXlNdD-tsI=.886a2462-65f5-4ac8-990e-4a92a2d362d8@github.com> References: <5jpQGssAmQuPcUPgK5wbXQjpBXXCjyY2yiXlNdD-tsI=.886a2462-65f5-4ac8-990e-4a92a2d362d8@github.com> Message-ID: On Wed, 11 Jun 2025 08:35:30 GMT, Stefan Karlsson wrote: > Are you up for making an experiment that changes `total_swap_space` and `free_swap_space` to return two values: one the actual value in `size_t` and the other an error code that gets set whenever we hit an error? > > The two proposed alternatives for this would be: > > 1. returning something containing the two values (struct, Pair, Tuple, array, ...) > > 2. Return the size_t and using an an out put parameter to signal an error (or vice versa) > > > I think the fan-out of that will not be too bad and the likely outcome is that it is clearer that the code is propagating errors. I think the concern about the documented range for ssize_t is perhaps overblown. I think that specification is driven in part by continuing support for non-two's-complement. I think that for any platform we're going to support, ssize_t is equivalent to std::make_unsigned_t. We could ensure that with a static_assert somewhere. That some functions use -1 for a non-error and perhaps some other negative value to indicate an error seems not particularly different to me than that some functions return an error code (usually an int, so who knows whether positive or negative). So I would probably be okay with using ssize_t. I'm not generally a fan of out parameters. I think there are examples in HotSpot of functions that return a dedicated struct containing a result and an error indication. 
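A minimal sketch of the dedicated-struct alternative, assuming hypothetical names (`SwapSpaceResult`, `to_swap_space_result`) that do not exist in HotSpot, and assuming the legacy convention that a negative raw value signals an error:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type: the size and the error indication travel
// together, each under a meaningful name.
struct SwapSpaceResult {
  std::size_t bytes;  // meaningful only when ok is true
  bool ok;            // false when the underlying query failed
};

// Hypothetical adapter from a legacy signed return value, where any
// negative value is treated as an error, to the result struct.
inline SwapSpaceResult to_swap_space_result(long long raw) {
  if (raw < 0) {
    return SwapSpaceResult{0, false};
  }
  return SwapSpaceResult{static_cast<std::size_t>(raw), true};
}
```

A caller would test `ok` before reading `bytes`, which makes the error propagation explicit at each call site instead of relying on a sentinel value.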
The standard uses std::pair for that sort of thing (until C++23's std::expected, which is likely a long way off for us). I prefer the dedicated struct approach, as it provides meaningful names. That's also why I'm not a fan of tuples or arrays for this purpose. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2962482033 From duke at openjdk.org Wed Jun 11 12:30:30 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 11 Jun 2025 12:30:30 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v4] In-Reply-To: <5jpQGssAmQuPcUPgK5wbXQjpBXXCjyY2yiXlNdD-tsI=.886a2462-65f5-4ac8-990e-4a92a2d362d8@github.com> References: <5jpQGssAmQuPcUPgK5wbXQjpBXXCjyY2yiXlNdD-tsI=.886a2462-65f5-4ac8-990e-4a92a2d362d8@github.com> Message-ID: On Wed, 11 Jun 2025 08:35:30 GMT, Stefan Karlsson wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8357086: Fixed spaces in formatting in gc-related code. > > Are you up for making an experiment that changes `total_swap_space` and `free_swap_space` to return two values: one the actual value in `size_t` and the other an error code that gets set whenever we hit an error? > > The two proposed alternatives for this would be: > 1) returning something containing the two values (struct, Pair, Tuple, array, ...) > 2) Return the size_t and using an an out put parameter to signal an error (or vice versa) > > I think the fan-out of that will not be too bad and the likely outcome is that it is clearer that the code is propagating errors. @stefank I inspected the container-related code once again, and came to conclusion that it is safe to use ssize_t, as you suggested above initially. The `OSCONTAINER_ERROR` return value will not be returned by `free_swap_space()` in os_linux.cpp as well as in anywhere in that file, because in all places there is a check for non-negativity. 
If negative, then the flow falls back to the host method, which can return value >= -1, i.e. fitting `ssize_t`. I will push these changes soon once tested. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2962496029 From kbarrett at openjdk.org Wed Jun 11 12:30:33 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 11 Jun 2025 12:30:33 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v5] In-Reply-To: <12CGSySmAQzzQVoAa8EVIvQo2AxbQIWjuSPcKYOtqDg=.48c3c4f1-a770-455d-bb08-c48ba34c7018@github.com> References: <12CGSySmAQzzQVoAa8EVIvQo2AxbQIWjuSPcKYOtqDg=.48c3c4f1-a770-455d-bb08-c48ba34c7018@github.com> Message-ID: <5gociSqv0IebuTsiWQKtOayx_7V9rX76OwTHQZtz_8Y=.218fb7b7-c925-4f61-ad88-6637840a7b73@github.com> On Wed, 11 Jun 2025 09:53:17 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value type for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static size_t available_memory(); >> static julong used_memory(); --> static size_t used_memory(); >> static julong free_memory(); --> static size_t free_memory(); >> static jlong total_swap_space(); --> static ptrdiff_t total_swap_space(); >> static jlong free_swap_space(); --> static ptrdiff_t free_swap_space(); >> static julong physical_memory(); --> static size_t physical_memory(); >> >> >> The changes are done so that the other parts of the code have minimal impact. >> Tested in GHA and Tiers 1-4. > > Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision: > > - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs > - 8357086: Fixed spaces in formatting in gc-related code. > - 8357086: Fixed formatting. > - 8357086: Addressed reviewer's comments. 
> - 8357086: More work. > - 8357086: More work. > - 8357086: More work. > - 8357086: More work. > - 8357086: More work. > - 8357086: More work > - ... and 2 more: https://git.openjdk.org/jdk/compare/430aa5b8...f3a5f61c Windows doesn't seem to have ssize_t natively, so we define it and SSIZE_MAX and SSIZE_MIN ourselves (in globalDefinitions_visCPP.hpp), with the obvious definitions. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2962496669 From mchevalier at openjdk.org Wed Jun 11 13:00:28 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Wed, 11 Jun 2025 13:00:28 GMT Subject: RFR: 8359200: Memory corruption in MStack::push In-Reply-To: References: Message-ID: On Wed, 11 Jun 2025 11:49:08 GMT, Tobias Hartmann wrote: > I found this by accident when running testing with non-default `-XX:OptoNodeListSize` (see JBS for details). The problem is that `MStack::push` assumes that `Node_Stack::grow` will always grow the stack by at least two (line 69) and it then proceeds to put two items in: > https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/matcher.hpp#L67-L74 > > But after [JDK-8336999](https://bugs.openjdk.org/browse/JDK-8336999), `Node_Stack::grow` will only grow the stack if needed: > https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/node.cpp#L3027-L3031 > > However, if there's **one** empty slot, the stack will not be grown and `MStack::push` then puts **two** items on the stack which leads to memory corruption. > > I refactored the push method to delegate to `Node_Stack::push` which does the right thing and, similar to [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), also added `maybe_grow` methods for all the containers affected by the original change. For additional coverage, I moved the `_nesting.check` calls to before the check that determines if we grow. 
> > I only ever observed this with a non-default and odd value for `-XX:OptoNodeListSize` but I'm not 100% convinced that it can't happen with the default value, so I'm treating this as P2 and will backport the fix to JDK 25. > > @shipilev Since you worked on [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), could you please take a look at this? > > Thanks, > Tobias I like the double push, it looks simpler, safer. Hopefully, the C++ compiler is smart enough to make that just as efficient. Overall, a lot of checks (like `_nesting.check(_set_arena);`) are moved from `grow` to `maybe_grow`, as stated in the description, but also things that looks more necessary like `if (i >= Max())` in `Block_Array::maybe_grow`. So, I suppose one shouldn't use `grow` directly, and you indeed changed `Block_Array::map` this way. But what about derived classes? `grow` is only protected in most (all?) of these classes, so some derived classes could call it. Is it worth making it private? Or maybe calling `grow` directly is not so wrong? Overall, it's not clear to me whether calling `grow` directly is wrong (and then not very discouraged, maybe making it private or commenting about not using it would help), or whether I just lack imagination and `grow` makes sense to be called not only through `maybe_grow`. And indeed, if `grow` must be called only through `maybe_grow` one could inline it, and yet, it's not (for classes that had both). Or maybe it's a well known thing not to do that, and I'm just inventing mistakes nobody would make in real life. Just asking! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25751#issuecomment-2962602947 From duke at openjdk.org Wed Jun 11 13:09:45 2025 From: duke at openjdk.org (Manjunath S Matti.) 
Date: Wed, 11 Jun 2025 13:09:45 GMT Subject: RFR: 8359114: [s390x] Add z17 detection code [v2] In-Reply-To: References: Message-ID: <_fiM-Nhm3q5S2hCxa3quxpodBRmeIsCIBcA7AB4Hmcc=.2005b23d-fc05-4821-90b4-cf22a8d2442e@github.com> > Add support to detect the new generation of Z machine (z17). Manjunath S Matti. has updated the pull request incrementally with one additional commit since the last revision: Correct the comments for the bits covered in DW[2] and DW[3]. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25718/files - new: https://git.openjdk.org/jdk/pull/25718/files/c4a81e7b..95c2fa9e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25718&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25718&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25718.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25718/head:pull/25718 PR: https://git.openjdk.org/jdk/pull/25718 From lucy at openjdk.org Wed Jun 11 13:28:31 2025 From: lucy at openjdk.org (Lutz Schmidt) Date: Wed, 11 Jun 2025 13:28:31 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v8] In-Reply-To: <5V3KDrOL5tc0ay5zbkGeGhY59491l5USEZjCG1kjjWA=.1e124b44-da11-4af4-a5b6-94c53e7860a8@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> <5V3KDrOL5tc0ay5zbkGeGhY59491l5USEZjCG1kjjWA=.1e124b44-da11-4af4-a5b6-94c53e7860a8@github.com> Message-ID: <9hnk0aeI-KGmzzxAAjkgjuuZT44nOxT73X3rK_86sso=.3b030902-1113-4d26-b9d5-6656732e15fb@github.com> On Wed, 11 Jun 2025 11:42:13 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. 
>> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Add comment Looks good with the nicely placed @comments. ------------- Marked as reviewed by lucy (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25575#pullrequestreview-2917152613 From duke at openjdk.org Wed Jun 11 14:59:51 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 11 Jun 2025 14:59:51 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v6] In-Reply-To: References: Message-ID: > Hi, > > in this PR the output value type for functions which return memory are changed, namely: > > > static julong available_memory(); --> static size_t available_memory(); > static julong used_memory(); --> static size_t used_memory(); > static julong free_memory(); --> static size_t free_memory(); > static jlong total_swap_space(); --> static ssize_t total_swap_space(); > static jlong free_swap_space(); --> static ssize_t free_swap_space(); > static julong physical_memory(); --> static size_t physical_memory(); > > > The changes are done so that the other parts of the code have minimal impact. > Tested in GHA and Tiers 1-4. 
Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8357086: Changed return type for total_swap_space and free_swap_space to ssize_t ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25450/files - new: https://git.openjdk.org/jdk/pull/25450/files/f3a5f61c..00d60415 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=04-05 Stats: 22 lines in 5 files changed: 0 ins; 0 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/25450.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25450/head:pull/25450 PR: https://git.openjdk.org/jdk/pull/25450 From sgehwolf at openjdk.org Wed Jun 11 15:03:38 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Wed, 11 Jun 2025 15:03:38 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v5] In-Reply-To: <12CGSySmAQzzQVoAa8EVIvQo2AxbQIWjuSPcKYOtqDg=.48c3c4f1-a770-455d-bb08-c48ba34c7018@github.com> References: <12CGSySmAQzzQVoAa8EVIvQo2AxbQIWjuSPcKYOtqDg=.48c3c4f1-a770-455d-bb08-c48ba34c7018@github.com> Message-ID: On Wed, 11 Jun 2025 09:53:17 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value type for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static size_t available_memory(); >> static julong used_memory(); --> static size_t used_memory(); >> static julong free_memory(); --> static size_t free_memory(); >> static jlong total_swap_space(); --> static ssize_t total_swap_space(); >> static jlong free_swap_space(); --> static ssize_t free_swap_space(); >> static julong physical_memory(); --> static size_t physical_memory(); >> >> >> The changes are done so that the other parts of the code have minimal impact. >> Tested in GHA and Tiers 1-4. > > Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision: > > - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs > - 8357086: Fixed spaces in formatting in gc-related code. > - 8357086: Fixed formatting. > - 8357086: Addressed reviewer's comments. > - 8357086: More work. > - 8357086: More work. > - 8357086: More work. > - 8357086: More work. > - 8357086: More work. > - 8357086: More work > - ... and 2 more: https://git.openjdk.org/jdk/compare/afaa2a88...f3a5f61c src/hotspot/os/linux/os_linux.cpp line 261: > 259: } > 260: log_trace(os)("available memory: " JULONG_FORMAT, avail_mem); > 261: return static_cast(avail_mem); Line 243 should probably receive the same treatment (of `static_cast`)? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2140360647 From sgehwolf at openjdk.org Wed Jun 11 15:03:34 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Wed, 11 Jun 2025 15:03:34 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v6] In-Reply-To: References: Message-ID: <7sJowvjYMrjixJMblk0uL8ACr6WXBO6ce2ySxKCk9ds=.10df8bbd-9f81-444a-88f6-3caba0cca270@github.com> On Wed, 11 Jun 2025 14:59:51 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value type for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static size_t available_memory(); >> static julong used_memory(); --> static size_t used_memory(); >> static julong free_memory(); --> static size_t free_memory(); >> static jlong total_swap_space(); --> static ssize_t total_swap_space(); >> static jlong free_swap_space(); --> static ssize_t free_swap_space(); >> static julong physical_memory(); --> static size_t physical_memory(); >> >> >> The changes are done so that the other parts of the code have minimal impact. >> Tested in GHA and Tiers 1-4. 
> > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8357086: Changed return type for total_swap_space and free_swap_space to ssize_t +1 to the `ssize_t` change. Much better than the `ptrdiff_t` version. ------------- PR Review: https://git.openjdk.org/jdk/pull/25450#pullrequestreview-2917501538 From kbarrett at openjdk.org Wed Jun 11 15:08:33 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 11 Jun 2025 15:08:33 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v4] In-Reply-To: References: <5jpQGssAmQuPcUPgK5wbXQjpBXXCjyY2yiXlNdD-tsI=.886a2462-65f5-4ac8-990e-4a92a2d362d8@github.com> Message-ID: On Wed, 11 Jun 2025 12:23:13 GMT, Kim Barrett wrote: > for any platform we're going to support, ssize_t is equivalent to std::make_unsigned_t. Oops, I meant s/std::make_unsigned/std::make_signed/ ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2963196968 From mablakatov at openjdk.org Wed Jun 11 15:37:49 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Wed, 11 Jun 2025 15:37:49 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v2] In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 10:04:03 GMT, Andrew Haley wrote: >> src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp line 106: >> >>> 104: } else { >>> 105: NativeJump::insert(method_holder->next_instruction_address(), entry); >>> 106: } >> >> Suggestion: >> >> MacroAssembler::pd_patch_instruction(method_holder->next_instruction_address(), entry); > > Please also delete `NativeGeneralJump::insert_unconditional`, which is no longer used. Done, thank you. 
See https://github.com/openjdk/jdk/pull/25702/commits/7ef1c4aec9531914bfae576e01a85edb7e197f1b ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25702#discussion_r2140503056 From mablakatov at openjdk.org Wed Jun 11 15:37:47 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Wed, 11 Jun 2025 15:37:47 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v2] In-Reply-To: References: Message-ID: > In the A64 ISA, the B (direct branch) instruction can encode a target within a ±128MB range relative to the instruction. Due to this limitation, when generating static stubs, HotSpot conservatively emits indirect branches for calls to c2i interface stubs. These indirect branches are implemented using a four-instruction sequence: three instructions to materialize the target address in a register, followed by a BR instruction to perform the jump. > > This patch optimizes static stub generation when the code cache is small enough to guarantee that the target entry point of the c2i interface stub lies within the direct branch range. In such cases, a single direct B instruction can be used instead of the indirect sequence, saving 3 instructions (12 bytes) per static stub. > > Below is an example of the optimization's impact, measured using the movie-lens benchmark from the Renaissance benchmark suite: > > | Metric | Before | After | Difference | > |-------------|---------------|---------------|------------| > | totalInHeap | Avg: 1883.875 | Avg: 1871.667 | -0.65% | > | | Sum: 6653848 | Sum: 6616344 | -0.56% | > | stubCode | Avg: 103.164 | Avg: 87.285 | -15.38% | > | | Sum: 364376 | Sum: 308552 | -15.33% | > > Full jtreg passed on AArch64. Mikhail Ablakatov has updated the pull request incrementally with one additional commit since the last revision: address review comments: use pd_patch_instruction directly MacroAssembler::pd_patch_instruction can distinguish between the `b` and `movk movz movz br` sequences.
Strictly speaking, the method patches not a single instruction but a semantically joint sequence of instructions. Use it directly instead of `NativeJump` and `NativeGeneralJump` wrapper classes to simplify the implementation and get rid of an extra icache invalidation. Other changes in the patch simply clean up code that became redundant. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25702/files - new: https://git.openjdk.org/jdk/pull/25702/files/a904f1c1..7ef1c4ae Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25702&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25702&range=00-01 Stats: 28 lines in 3 files changed: 0 ins; 25 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25702.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25702/head:pull/25702 PR: https://git.openjdk.org/jdk/pull/25702 From mablakatov at openjdk.org Wed Jun 11 15:37:49 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Wed, 11 Jun 2025 15:37:49 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v2] In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 10:37:00 GMT, Andrew Haley wrote: >> Mikhail Ablakatov has updated the pull request incrementally with one additional commit since the last revision: >> >> address review comments: use pd_patch_instruction directly >> >> MacroAssembler::pd_patch_instruction can distinguish between the `b` >> and `movk movz movz br` sequences. Strictly speaking, the method >> patches not a single instruction but a semantically joint sequence of >> instructions. Use it directly instead of `NativeJump` and >> `NativeGeneralJump` wrapper classes to simplify the implementation and >> get rid of an extra icache invalidation. >> >> Other changes in the patch simply clean up code that became redundant. 
> > src/hotspot/cpu/aarch64/compiledIC_aarch64.cpp line 106: > >> 104: } else { >> 105: NativeJump::insert(method_holder->next_instruction_address(), entry); >> 106: } > > We're also calling `ICache::invalidate_range` twice, which is kinda lame. That perhaps doesn't matter because calls to `CompiledDirectCall::set_to_interpreted()` are fairly rare. Agreed, thank you for spotting that. Should not be the case any longer. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25702#discussion_r2140504282 From rvansa at openjdk.org Wed Jun 11 16:37:37 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Wed, 11 Jun 2025 16:37:37 GMT Subject: RFR: 8352075: Perf regression accessing fields [v30] In-Reply-To: <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> Message-ID: <-XuFmpo5Oj6x-mKlpgSseo5jJUZdY9GXo2Hm17pue0I=.4da01a2c-2c37-4bb3-a038-467746ad1582@github.com> On Tue, 10 Jun 2025 19:44:03 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. 
On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Copyright update Thank you! I'll mark this for integration and I would appreciate if I can get your sponsorship.
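For readers following the thread, the lookup scheme described in the quoted PR text can be sketched roughly as below. This is only an illustration under stated assumptions: plain strings stand in for field names, vector positions stand in for the varint-encoded stream offsets, and `FieldIndex`, `build_index`, `find_field` and `kStride` are hypothetical names, not HotSpot code. The actual implementation may walk the table differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Illustrative only: these names are hypothetical and are not HotSpot APIs.
constexpr std::size_t kStride = 16;  // one table record per 16 fields

struct FieldIndex {
  std::vector<std::string> sorted_names;  // field names, sorted
  std::vector<std::size_t> table;         // positions 16, 32, ... in sorted_names
};

// Sort the names and record the position of every 16th field.
FieldIndex build_index(std::vector<std::string> names) {
  std::sort(names.begin(), names.end());
  FieldIndex idx;
  idx.sorted_names = std::move(names);
  for (std::size_t i = kStride; i < idx.sorted_names.size(); i += kStride) {
    idx.table.push_back(i);
  }
  return idx;
}

// Returns the position of `name` in sorted_names, or -1 if it is absent.
long find_field(const FieldIndex& idx, const std::string& name) {
  // Walk the table and keep the last record whose name is <= the target.
  std::size_t start = 0;
  for (std::size_t pos : idx.table) {
    if (idx.sorted_names[pos] > name) break;
    start = pos;
  }
  // Continue field by field from that record; at most kStride comparisons.
  std::size_t end = std::min(start + kStride, idx.sorted_names.size());
  for (std::size_t i = start; i < end; ++i) {
    if (idx.sorted_names[i] == name) return static_cast<long>(i);
  }
  return -1;
}
```

With 40 fields the table holds two records (positions 16 and 32), so a lookup compares against at most two table entries and then at most 16 fields, instead of up to 40 fields in a purely sequential scan.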
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2963485542 From duke at openjdk.org Wed Jun 11 16:37:37 2025 From: duke at openjdk.org (duke) Date: Wed, 11 Jun 2025 16:37:37 GMT Subject: RFR: 8352075: Perf regression accessing fields [v30] In-Reply-To: <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> Message-ID: <3dgCL2a6SXMjBzUYCWSPQefrzOKsmWWUZU1A61vLLJA=.5f044d95-02c7-4faa-9dcb-9d075ce33586@github.com> On Tue, 10 Jun 2025 19:44:03 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. 
>> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Copyright update @rvansa Your change (at version 36510e22b25f1792bc53f867c62b1f0f58a2c8fd) is now ready to be sponsored by a Committer.
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2963487411 From iklam at openjdk.org Wed Jun 11 16:40:39 2025 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 11 Jun 2025 16:40:39 GMT Subject: RFR: 8352075: Perf regression accessing fields [v30] In-Reply-To: <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> Message-ID: On Tue, 10 Jun 2025 19:44:03 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. 
>> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Copyright update Marked as reviewed by iklam (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24847#pullrequestreview-2917965617 From yzheng at openjdk.org Wed Jun 11 16:49:53 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Wed, 11 Jun 2025 16:49:53 GMT Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for hosted compiled nmethod [v5] In-Reply-To: References: Message-ID: > Hosted Truffle compilations are installed on the OptimizedCallTarget#profiledPERoot method.
Any deoptimization contributes to its decompile count, which can easily exceed the PerMethodRecompilationCutoff threshold, permanently preventing highest tier compilation on this method. This PR exempts hosted compilations from this cutoff by ensuring their decompile count is not incremented for hosted compiled nmethods. Yudi Zheng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: - address comments - Merge remote-tracking branch 'upstream/master' into JDK-8357424 - address comments - address comments - update copyright - [JVMCI] Avoid incrementing decompilation count for hosted compiled nmethod ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25356/files - new: https://git.openjdk.org/jdk/pull/25356/files/b72213ae..57fe5307 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25356&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25356&range=03-04 Stats: 123326 lines in 1943 files changed: 80552 ins; 28041 del; 14733 mod Patch: https://git.openjdk.org/jdk/pull/25356.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25356/head:pull/25356 PR: https://git.openjdk.org/jdk/pull/25356 From dchuyko at openjdk.org Wed Jun 11 16:52:31 2025 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Wed, 11 Jun 2025 16:52:31 GMT Subject: RFR: 8359105: RISC-V: No need for acquire fence in safepoint poll during JNI calls In-Reply-To: References: Message-ID: On Wed, 11 Jun 2025 12:14:54 GMT, Robbin Ehn wrote: > @dchuyko can you open a new issue and look at the acquire in downcall linker for aarch64? Thanks for the reminder, created JDK-8359252. It was intentional to limit the scope of the original change to JNI (due to the amount of testing and usages).
------------- PR Comment: https://git.openjdk.org/jdk/pull/25709#issuecomment-2963526953 From yzheng at openjdk.org Wed Jun 11 16:56:28 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Wed, 11 Jun 2025 16:56:28 GMT Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for hosted compiled nmethod [v4] In-Reply-To: References: Message-ID: On Tue, 27 May 2025 17:30:09 GMT, Tom Rodriguez wrote: >> Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision: >> >> address comments > > I think there are two levels of counters that we might want to disable. We definitely want to stop deopts and recompilations from marking the method not compilable which the current change does. Additionally JVMCIRuntime::register_method will perform this logic if validate_compile_task_dependencies fails and I don't think we want that. I think the new `!is_default` guard idiom should be in a helper like `nmethod::is_jvmci_hosted`. Do we use the hosted language elsewhere? > > The second level is to stop all counter updates in hosted compiles, for similar reasons. Those updates won't lead to disabling compilation but they will quickly lead to saturating of all the counters which is fairly pointless but probably benign. This would be done by setting `update_trap_state` to false for hosted nmethods. That also has the effect of keeping `inc_recompile_count` false. I think that's the right thing to do but I'd want to make sure that we test truffle workloads with those changes before making that change to make sure there isn't some subtle problem with that change. @tkrodriguez by `stop all counter updates in hosted compiles` you mean the trap-related counters, right? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25356#issuecomment-2963536923 From kvn at openjdk.org Wed Jun 11 17:06:28 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 11 Jun 2025 17:06:28 GMT Subject: RFR: 8359200: Memory corruption in MStack::push In-Reply-To: References: Message-ID: <2onH159vapanW54pdmzkjXrzcjJbHaVdMGi19_PGjUo=.307863f8-3bb1-4dd8-b110-44ae93fc2b67@github.com> On Wed, 11 Jun 2025 11:49:08 GMT, Tobias Hartmann wrote: > I found this by accident when running testing with non-default `-XX:OptoNodeListSize` (see JBS for details). The problem is that `MStack::push` assumes that `Node_Stack::grow` will always grow the stack by at least two (line 69) and it then proceeds to put two items in: > https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/matcher.hpp#L67-L74 > > But after [JDK-8336999](https://bugs.openjdk.org/browse/JDK-8336999), `Node_Stack::grow` will only grow the stack if needed: > https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/node.cpp#L3027-L3031 > > However, if there's **one** empty slot, the stack will not be grown and `MStack::push` then puts **two** items on the stack which leads to memory corruption. > > I refactored the push method to delegate to `Node_Stack::push` which does the right thing and, similar to [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), also added `maybe_grow` methods for all the containers affected by the original change. For additional coverage, I moved the `_nesting.check` calls to before the check that determines if we grow. > > I only ever observed this with a non-default and odd value for `-XX:OptoNodeListSize` but I'm not 100% convinced that it can't happen with the default value, so I'm treating this as P2 and will backport the fix to JDK 25. > > @shipilev Since you worked on [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), could you please take a look at this? 
> > Thanks, > Tobias Looks good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25751#pullrequestreview-2918037996 From stefank at openjdk.org Wed Jun 11 17:07:33 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 11 Jun 2025 17:07:33 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v4] In-Reply-To: <5jpQGssAmQuPcUPgK5wbXQjpBXXCjyY2yiXlNdD-tsI=.886a2462-65f5-4ac8-990e-4a92a2d362d8@github.com> References: <5jpQGssAmQuPcUPgK5wbXQjpBXXCjyY2yiXlNdD-tsI=.886a2462-65f5-4ac8-990e-4a92a2d362d8@github.com> Message-ID: On Wed, 11 Jun 2025 08:35:30 GMT, Stefan Karlsson wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8357086: Fixed spaces in formatting in gc-related code. > > Are you up for making an experiment that changes `total_swap_space` and `free_swap_space` to return two values: one the actual value in `size_t` and the other an error code that gets set whenever we hit an error? > > The two proposed alternatives for this would be: > 1) returning something containing the two values (struct, Pair, Tuple, array, ...) > 2) returning the size_t and using an output parameter to signal an error (or vice versa) > > I think the fan-out of that will not be too bad and the likely outcome is that it is clearer that the code is propagating errors. > @stefank I inspected the container-related code once again, and came to the conclusion that it is safe to use ssize_t, as you suggested above initially. The `OSCONTAINER_ERROR` return value will not be returned by `free_swap_space()` in os_linux.cpp or anywhere else in that file, because in all places there is a check for non-negativity. If negative, then the flow falls back to the host method, which can return a value >= -1, i.e. fitting `ssize_t`. I will push these changes soon once tested. Hmm.
Maybe I'm reading this wrong, but to me it looks like you can return -2 via this code: jlong CgroupV1MemoryController::read_memory_limit_in_bytes(julong phys_mem) { julong memlimit; CONTAINER_READ_NUMBER_CHECKED(reader(), "/memory.limit_in_bytes", "Memory Limit", memlimit); if (memlimit >= phys_mem) { verbose_log(memlimit, phys_mem); return (jlong)-1; } else { verbose_log(memlimit, phys_mem); return (jlong)memlimit; } } Note how `CONTAINER_READ_NUMBER_CHECKED` is a macro with a return statement: #define CONTAINER_READ_NUMBER_CHECKED(controller, filename, log_string, retval) \ { \ bool is_ok; \ is_ok = controller->read_number(filename, &retval); \ if (!is_ok) { \ log_trace(os, container)(log_string " failed: %d", OSCONTAINER_ERROR); \ return OSCONTAINER_ERROR; \ } \ log_trace(os, container)(log_string " is: " JULONG_FORMAT, retval); \ } (and `#define OSCONTAINER_ERROR (-2)`) I think that the compiler would have caught that if we were to perform the experiment to change the cgroup code to return the proposed (size_t, error) pair and fix all callers to deal with that. With that said, if we have Kim's buy-in to put `-2` in `ssize_t` variables then I think that your patch retains the current behavior of the code and then we could defer an experiment like this to a PR that cleans up the cgroup code. 
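The two alternatives floated earlier in this thread (returning the value together with an error, or returning the value and signalling the error through an output parameter) can be sketched as follows. This is a minimal illustration, not the cgroup or `os` code: `SwapError`, `SwapResult`, `free_swap_space_pair` and `free_swap_space_out` are hypothetical names, and the `readable` flag merely simulates whether the underlying read succeeded.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative names only; none of these are HotSpot APIs.
enum class SwapError { ok, read_failed };

// Alternative 1: return the value and the error together.
struct SwapResult {
  std::size_t bytes;  // meaningful only when error == SwapError::ok
  SwapError   error;
};

// `readable` simulates whether reading the underlying value succeeded.
SwapResult free_swap_space_pair(bool readable, std::size_t raw_bytes) {
  if (!readable) {
    return {0, SwapError::read_failed};
  }
  return {raw_bytes, SwapError::ok};
}

// Alternative 2: return the value and report the error via an output parameter.
std::size_t free_swap_space_out(bool readable, std::size_t raw_bytes,
                                SwapError* error) {
  *error = readable ? SwapError::ok : SwapError::read_failed;
  return readable ? raw_bytes : 0;
}
```

In either shape the error travels in its own typed slot, so a negative sentinel such as `-2` never has to fit into the size type, and a caller that ignores the error is easier to spot in review.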
------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2963566574 From iklam at openjdk.org Wed Jun 11 17:09:42 2025 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 11 Jun 2025 17:09:42 GMT Subject: RFR: 8352075: Perf regression accessing fields [v30] In-Reply-To: <-XuFmpo5Oj6x-mKlpgSseo5jJUZdY9GXo2Hm17pue0I=.4da01a2c-2c37-4bb3-a038-467746ad1582@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> <-XuFmpo5Oj6x-mKlpgSseo5jJUZdY9GXo2Hm17pue0I=.4da01a2c-2c37-4bb3-a038-467746ad1582@github.com> Message-ID: On Wed, 11 Jun 2025 16:35:01 GMT, Radim Vansa wrote: >> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: >> >> Copyright update > > Thank you! I'll mark this for integration and I would appreciate if I can get your sponsorship. @rvansa the base version of this PR is quite old. Please merge with the latest JDK repo before we integrate it into the mainline. How much testing have you done on your side? On what platforms? After you merge, someone at Oracle will run it on our testing pipeline and will sponsor it after all tests are clean. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2963573573 From stefank at openjdk.org Wed Jun 11 17:11:35 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 11 Jun 2025 17:11:35 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v5] In-Reply-To: References: <12CGSySmAQzzQVoAa8EVIvQo2AxbQIWjuSPcKYOtqDg=.48c3c4f1-a770-455d-bb08-c48ba34c7018@github.com> Message-ID: On Wed, 11 Jun 2025 14:30:54 GMT, Severin Gehwolf wrote: >> Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 12 additional commits since the last revision: >> >> - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs >> - 8357086: Fixed spaces in formatting in gc-related code. >> - 8357086: Fixed formatting. >> - 8357086: Addressed reviewer's comments. >> - 8357086: More work. >> - 8357086: More work. >> - 8357086: More work. >> - 8357086: More work. >> - 8357086: More work. >> - 8357086: More work >> - ... and 2 more: https://git.openjdk.org/jdk/compare/058486c5...f3a5f61c > > src/hotspot/os/linux/os_linux.cpp line 261: > >> 259: } >> 260: log_trace(os)("available memory: " JULONG_FORMAT, avail_mem); >> 261: return static_cast(avail_mem); > > Line 243 should probably receive the same treatment (of `static_cast`)? And 258 as well. Maybe we don't need the static cast here? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2140692110 From stefank at openjdk.org Wed Jun 11 17:15:31 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 11 Jun 2025 17:15:31 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v5] In-Reply-To: References: <12CGSySmAQzzQVoAa8EVIvQo2AxbQIWjuSPcKYOtqDg=.48c3c4f1-a770-455d-bb08-c48ba34c7018@github.com> Message-ID: On Wed, 11 Jun 2025 17:08:54 GMT, Stefan Karlsson wrote: >> src/hotspot/os/linux/os_linux.cpp line 261: >> >>> 259: } >>> 260: log_trace(os)("available memory: " JULONG_FORMAT, avail_mem); >>> 261: return static_cast(avail_mem); >> >> Line 243 should probably receive the same treatment (of `static_cast`)? > > And 258 as well. Maybe we don't need the static cast here? Hmm. Isn't 257 redundant because we already check for this on line 241 and the code between should never set the `avail_mem` to `-1`. 
Maybe this code needs some extra scrutiny as well (as a follow-up) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2140698836 From never at openjdk.org Wed Jun 11 17:18:47 2025 From: never at openjdk.org (Tom Rodriguez) Date: Wed, 11 Jun 2025 17:18:47 GMT Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for hosted compiled nmethod [v5] In-Reply-To: References: Message-ID: <0g305fzs1X2bbkjCduwcEC8XHf6hVsWXuDpfW5Hq9TI=.878a8aaf-e376-4fae-a559-c0b60bba7d4b@github.com> On Wed, 11 Jun 2025 16:49:53 GMT, Yudi Zheng wrote: >> Hosted Truffle compilations are installed on the OptimizedCallTarget#profiledPERoot method. Any deoptimization contributes to its decompile count, which can easily exceed the PerMethodRecompilationCutoff threshold, permanently preventing highest tier compilation on this method. This PR exempts hosted compilations from this cutoff by ensuring their decompile count is not incremented for hosted compiled nmethods. > > Yudi Zheng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: > > - address comments > - Merge remote-tracking branch 'upstream/master' into JDK-8357424 > - address comments > - address comments > - update copyright > - [JVMCI] Avoid incrementing decompilation count for hosted compiled nmethod Yes. Any update to the MDO for hosted compiles doesn't seem useful I think. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25356#issuecomment-2963594199 From coleenp at openjdk.org Wed Jun 11 17:18:40 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 11 Jun 2025 17:18:40 GMT Subject: RFR: 8352075: Perf regression accessing fields [v30] In-Reply-To: <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> Message-ID: On Tue, 10 Jun 2025 19:44:03 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. 
>> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Copyright update I've been testing this against the mainline source base with just this change patched in, and it seems fine. I don't know if it's worth merging. Also, if you do merge, use 'git merge', not 'git rebase'. I'm rerunning tier1-7 now.
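The lookup scheme described in the quoted PR text (fields sorted by name, plus one table entry for every 16th field) can be sketched roughly as follows. This is only an illustration under stated assumptions: `FieldTable`, `make_field_names` and the use of plain `std::string` names are hypothetical, and the real change works on a variable-length-encoded field info stream rather than on vectors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the field info stream: field names only.
struct FieldTable {
  static constexpr std::size_t kStride = 16;  // one table entry per 16 fields, as in the PR text
  std::vector<std::string> sorted_names;      // fields sorted by name
  std::vector<std::size_t> checkpoints;       // positions 16, 32, ... (the PR stores stream offsets)

  explicit FieldTable(std::vector<std::string> names)
      : sorted_names(std::move(names)) {
    std::sort(sorted_names.begin(), sorted_names.end());
    for (std::size_t i = kStride; i < sorted_names.size(); i += kStride) {
      checkpoints.push_back(i);
    }
  }

  // Returns the position of `name` in sorted order, or -1 if it is absent.
  long find(const std::string& name) const {
    // Step 1: walk the small table to find the last checkpoint whose
    // field name is not greater than the name we are looking for.
    std::size_t start = 0;
    for (std::size_t cp : checkpoints) {
      if (sorted_names[cp] <= name) {
        start = cp;
      } else {
        break;
      }
    }
    // Step 2: field-by-field scan from there, at most kStride entries.
    std::size_t end = std::min(start + kStride, sorted_names.size());
    for (std::size_t i = start; i < end; ++i) {
      if (sorted_names[i] == name) {
        return static_cast<long>(i);
      }
    }
    return -1;
  }
};

// Hypothetical helper: "f000", "f001", ... so that name order matches index order.
std::vector<std::string> make_field_names(std::size_t n) {
  std::vector<std::string> names;
  for (std::size_t i = 0; i < n; ++i) {
    std::string digits = std::to_string(i);
    names.push_back("f" + std::string(3 - digits.size(), '0') + digits);
  }
  return names;
}
```

With 40 fields the sketch keeps two table entries (positions 16 and 32), so a lookup compares at most two checkpoint names and then scans at most 16 fields. Classes with 16 or fewer fields get an empty table and fall back to the plain sequential scan.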
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2963594231 From yzheng at openjdk.org Wed Jun 11 17:18:44 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Wed, 11 Jun 2025 17:18:44 GMT Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for hosted compiled nmethod [v6] In-Reply-To: References: Message-ID: <3VLvf-Aw_4Xbqf4EIs29YKxTamuA21D6pNRs9UkAWAM=.06906268-6067-412f-922a-3c3a3ec896e4@github.com> > Hosted Truffle compilations are installed on the OptimizedCallTarget#profiledPERoot method. Any deoptimization contributes to its decompile count, which can easily exceed the PerMethodRecompilationCutoff threshold, permanently preventing highest tier compilation on this method. This PR exempts hosted compilations from this cutoff by ensuring their decompile count is not incremented for hosted compiled nmethods. Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision: fix compilation error ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25356/files - new: https://git.openjdk.org/jdk/pull/25356/files/57fe5307..9d24428e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25356&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25356&range=04-05 Stats: 7 lines in 2 files changed: 4 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25356.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25356/head:pull/25356 PR: https://git.openjdk.org/jdk/pull/25356 From cjplummer at openjdk.org Wed Jun 11 17:45:33 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 11 Jun 2025 17:45:33 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v8] In-Reply-To: <5V3KDrOL5tc0ay5zbkGeGhY59491l5USEZjCG1kjjWA=.1e124b44-da11-4af4-a5b6-94c53e7860a8@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> 
<5V3KDrOL5tc0ay5zbkGeGhY59491l5USEZjCG1kjjWA=.1e124b44-da11-4af4-a5b6-94c53e7860a8@github.com> Message-ID: On Wed, 11 Jun 2025 11:42:13 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Add comment I'm just giving my approval for how the comments are being handled. I haven't looked at the implementation details, nor for the correctness of the comments. With the updates svc tests are no longer part of this PR. ------------- Marked as reviewed by cjplummer (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25575#pullrequestreview-2918156391 From iklam at openjdk.org Wed Jun 11 17:46:37 2025 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 11 Jun 2025 17:46:37 GMT Subject: RFR: 8352075: Perf regression accessing fields [v30] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> Message-ID: On Wed, 11 Jun 2025 17:15:33 GMT, Coleen Phillimore wrote: > I've been testing this against mainline source base -with just this change patched in and it seems fine. I don't know if it's worth merging. Also if you do merge it do 'git merge' not 'git rebase' I'm rerunning tier1-7 now. @rvansa since Coleen is already testing against the mainline, there's no need for you to merge now. 
We will sponsor once the tests come out clean. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2963670758 From sspitsyn at openjdk.org Wed Jun 11 18:12:49 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 11 Jun 2025 18:12:49 GMT Subject: RFR: 8358815: Exception event spec has stale reference to catch_klass parameter [v3] In-Reply-To: References: Message-ID: > The JVMTI Exception event callback spec refers to the `catch_klass` parameter which does not exist anymore. Instead the Exception event callback spec should refer to the `catch_method` and `catch_location` parameters. > I treat this as a bug and doubt we need a CSR for this issue. > > Testing: N/A Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: add clarification for catch_location == 0 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25710/files - new: https://git.openjdk.org/jdk/pull/25710/files/021756c5..ad760f49 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25710&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25710&range=01-02 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25710.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25710/head:pull/25710 PR: https://git.openjdk.org/jdk/pull/25710 From sspitsyn at openjdk.org Wed Jun 11 18:12:50 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 11 Jun 2025 18:12:50 GMT Subject: RFR: 8358815: Exception event spec has stale reference to catch_klass parameter [v3] In-Reply-To: <1WQwUmMEMQNcadwD_-27loMxuOwCppkF7-tWKqoSbkw=.59d255ad-9dc1-4a7a-83d6-bf972064e69a@github.com> References: <3e2U0lztN76iKtiYtQaDlsPTknjRuubBJCq1azuj1Lk=.0a5e463b-9012-4a03-9a69-b2b0b0c6c601@github.com> <1_ZzX5NOC8YIn_oKpPBpI62WMNVdVI8JZLMwqpif0Vs=.d4840bcd-80ae-4b70-8b04-ab2424eb7ba5@github.com> 
<1WQwUmMEMQNcadwD_-27loMxuOwCppkF7-tWKqoSbkw=.59d255ad-9dc1-4a7a-83d6-bf972064e69a@github.com> Message-ID: On Wed, 11 Jun 2025 06:00:31 GMT, Alan Bateman wrote: >> Thank you for the comments! >> Yes, I've also come to the same conclusion about the only `catch_method == null` check. >> >>> Then you also need to fix: >>> "If there is no such catch clause, each field is set to 0." >> >> Good catch, thanks. >> >> The suggestions above have addressed now. >> Also, I've replaced the term `field` with `parameter` for consistency in two spots. > > I think what you have looks okay now. I'm just wondering about the description for the catch_location parameter has "zero if no known catch". So if catch_method is null then catch_location must be 0. Thank you, Alan. I've updated the event description. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25710#discussion_r2140791431 From alanb at openjdk.org Wed Jun 11 18:39:32 2025 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 11 Jun 2025 18:39:32 GMT Subject: RFR: 8358815: Exception event spec has stale reference to catch_klass parameter [v3] In-Reply-To: References: Message-ID: <7LgMMgIW-mm0MIJ5kcr6xLwqooO4qI1zDNRptlhvdhE=.b17d1ec0-721b-47c3-af91-924d632c1877@github.com> On Wed, 11 Jun 2025 18:12:49 GMT, Serguei Spitsyn wrote: >> The JVMTI Exception event callback spec refers to the `catch_klass` parameter which does not exist anymore. Instead the Exception event callback spec should refer to the `catch_method` and `catch_location` parameters. >> I treat this as a bug and doubt we need a CSR for this issue. >> >> Testing: N/A > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: add clarification for catch_location == 0 Marked as reviewed by alanb (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25710#pullrequestreview-2918308260 From cjplummer at openjdk.org Wed Jun 11 18:45:27 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 11 Jun 2025 18:45:27 GMT Subject: RFR: 8358815: Exception event spec has stale reference to catch_klass parameter [v3] In-Reply-To: References: Message-ID: <2UEDtJBtuJ73dOL46-iH8fA1L9ay1UXHTVnmW3prHM4=.6cad3a30-2d07-485b-96bc-a5df047d6171@github.com> On Wed, 11 Jun 2025 18:12:49 GMT, Serguei Spitsyn wrote: >> The JVMTI Exception event callback spec refers to the `catch_klass` parameter which does not exist anymore. Instead the Exception event callback spec should refer to the `catch_method` and `catch_location` parameters. >> I treat this as a bug and doubt we need a CSR for this issue. >> >> Testing: N/A > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: add clarification for catch_location == 0 Marked as reviewed by cjplummer (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25710#pullrequestreview-2918324098 From sspitsyn at openjdk.org Wed Jun 11 18:54:33 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 11 Jun 2025 18:54:33 GMT Subject: RFR: 8358815: Exception event spec has stale reference to catch_klass parameter [v3] In-Reply-To: References: Message-ID: On Wed, 11 Jun 2025 18:12:49 GMT, Serguei Spitsyn wrote: >> The JVMTI Exception event callback spec refers to the `catch_klass` parameter which does not exist anymore. Instead the Exception event callback spec should refer to the `catch_method` and `catch_location` parameters. >> I treat this as a bug and doubt we need a CSR for this issue. >> >> Testing: N/A > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: add clarification for catch_location == 0 Alan and Chris, thank you for review! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25710#issuecomment-2963840971 From sspitsyn at openjdk.org Wed Jun 11 18:54:33 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 11 Jun 2025 18:54:33 GMT Subject: Integrated: 8358815: Exception event spec has stale reference to catch_klass parameter In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 07:17:17 GMT, Serguei Spitsyn wrote: > The JVMTI Exception event callback spec refers to the `catch_klass` parameter which does not exist anymore. Instead the Exception event callback spec should refer to the `catch_method` and `catch_location` parameters. > I treat this as a bug and doubt we need a CSR for this issue. > > Testing: N/A This pull request has now been integrated. Changeset: 8f733570 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/8f733570040a7d7a24775e72244f47e946af191b Stats: 6 lines in 1 file changed: 1 ins; 0 del; 5 mod 8358815: Exception event spec has stale reference to catch_klass parameter Reviewed-by: cjplummer, alanb ------------- PR: https://git.openjdk.org/jdk/pull/25710 From stuefe at openjdk.org Wed Jun 11 19:20:31 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 11 Jun 2025 19:20:31 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v3] In-Reply-To: References: Message-ID: <14leaLRIJu4zPu7wBc5tQIDiqJEIkHmesxRy9yRUkUw=.8e4619c9-d164-4e63-99fe-033f5ec32e5b@github.com> On Thu, 29 May 2025 06:37:36 GMT, David Holmes wrote: > > I talked to Anton offline about the ptrdiff_t. That type has the correct signedness and number of bits on all our platforms, but to me that type carries a semantic meaning about what pointer diffs / indices. Because of that I found it inappropriate to use that type. > > Those were my thoughts as well when I saw that change made. I don't have a good answer either. It is not at all clear how/why the swap functions were allowed to report errors when none of the others do. 
It is not documented in os.hpp nor required by the `OperatingSystemMXBean` API! > Even disregarding the Linux container case, looking at `available_memory()`, `total_swap_space()` and `free_swap_space()` I see that error handling is inconsistent. At the moment: - on Windows, we call `GlobalMemoryStatusEx` and then ignore any errors, potentially returning whatever garbage had been in the `MEMORYSTATUSEX` structures when we called the function. - on AIX, we use the libperfstat for all three. If it fails, we return UINT64_MAX for error in `available_memory()`. `total_swap_space()` and `free_swap_space()` both return -1. - on BSD, we always return physical memory / 2 for `available_memory()` (?!) Feels similar to the logic of heap size default? - on Mac, we use host_statistics64() for `available_memory()`, but return physical memory / 2 in case of error (?!). Both swap functions return -1. - on bare metal Linux, - `available_memory()` attempts to read /proc/meminfo; failing that, we return - eventually - free memory info from `sysinfo(2)`; that can also fail, but we ignore that. This may be fine; the only documented way sysinfo can fail is via programmer error. - `total_swap_space()` and `free_swap_space()` both read `sysinfo(2)` too, but here we do not ignore the error and return -1 instead, which is inconsistent with `available_memory()`. This may have been more effort than I envisioned when creating this issue, sorry for that. I would love it if we could unify these behaviours. I dislike that the swap APIs return signed values. This cuts off half of the value range and feels unnatural and out of sync with the other memory APIs. Admittedly, a machine with more than half its address range as swap space is unlikely (BTW, an interesting question is 32-bit: e.g. on Windows, with address space extension, we have 3GB user address space; could we have >2GB swap space? Maybe not). What we could do: A) return (size_t)-1.
That leaves almost the full value range and excludes a value that will never appear in earnest. We could codify that as a handy constant (similar to MAP_FAILED on Posix) B) return 0. Similar reasoning to above. C) return a "sensible default" like some APIs do today. D) a separate error reporting variable. It feels a bit overengineered, though. E) swallow errors in release, assert in debug. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2963901553 From shade at openjdk.org Wed Jun 11 19:22:32 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 11 Jun 2025 19:22:32 GMT Subject: RFR: 8359200: Memory corruption in MStack::push In-Reply-To: References: Message-ID: On Wed, 11 Jun 2025 11:49:08 GMT, Tobias Hartmann wrote: > I found this by accident when running testing with non-default `-XX:OptoNodeListSize` (see JBS for details). The problem is that `MStack::push` assumes that `Node_Stack::grow` will always grow the stack by at least two (line 69) and it then proceeds to put two items in: > https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/matcher.hpp#L67-L74 > > But after [JDK-8336999](https://bugs.openjdk.org/browse/JDK-8336999), `Node_Stack::grow` will only grow the stack if needed: > https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/node.cpp#L3027-L3031 > > However, if there's **one** empty slot, the stack will not be grown and `MStack::push` then puts **two** items on the stack which leads to memory corruption. > > I refactored the push method to delegate to `Node_Stack::push` which does the right thing and, similar to [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), also added `maybe_grow` methods for all the containers affected by the original change. For additional coverage, I moved the `_nesting.check` calls to before the check that determines if we grow. 
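A minimal sketch of the failure mode and of the fix direction described in the paragraph above, assuming a simplified integer stack. `PairStack`, `push_one` and `push_pair` are hypothetical names for illustration only; this is not HotSpot's `Node_Stack` or `MStack`. The point is that every single write goes through its own capacity check, so a two-entry push stays safe even when exactly one slot is free.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a stack whose backing storage grows on demand.
class PairStack {
  std::vector<int> slots_;  // backing storage, sized to the current capacity
  std::size_t top_ = 0;     // number of slots in use

  // Grows only when the stack is completely full (grow-if-needed semantics).
  void maybe_grow() {
    if (top_ == slots_.size()) {
      slots_.resize(slots_.empty() ? 1 : slots_.size() * 2);
    }
  }

 public:
  explicit PairStack(std::size_t initial_capacity) : slots_(initial_capacity) {}

  // Single-entry push: checks capacity before the write.
  void push_one(int value) {
    maybe_grow();
    slots_[top_++] = value;
  }

  // Two-entry push: delegates to push_one, so each write is checked.
  // Calling maybe_grow() once and then writing two entries would overrun
  // the storage whenever exactly one free slot is left.
  void push_pair(int first, int second) {
    push_one(first);
    push_one(second);
  }

  std::size_t size() const { return top_; }
  std::size_t capacity() const { return slots_.size(); }
  int at(std::size_t i) const { return slots_.at(i); }
};
```

An odd initial capacity, such as 3, is the interesting case: after two single pushes one slot remains, and `push_pair` must trigger a grow between its two writes.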
> > I only ever observed this with a non-default and odd value for `-XX:OptoNodeListSize` but I'm not 100% convinced that it can't happen with the default value, so I'm treating this as P2 and will backport the fix to JDK 25. > > @shipilev Since you worked on [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), could you please take a look at this? > > Thanks, > Tobias Aw, fish, these bugs are annoying. I have a further suggestion: src/hotspot/share/opto/block.cpp line 41: > 39: > 40: void Block_Array::grow(uint i) { > 41: assert(i >= Max(), "must be an overflow"); Assert message here is misleading: it is more likely someone had called `grow` when they intended `maybe_grow`. See how it is done elsewhere: void Node_Array::grow(uint i) { _nesting.check(_a); // Check if a potential reallocation in the arena is safe assert(i >= _max, "Should have been checked before, use maybe_grow?"); ------------- PR Review: https://git.openjdk.org/jdk/pull/25751#pullrequestreview-2918411519 PR Review Comment: https://git.openjdk.org/jdk/pull/25751#discussion_r2140899439 From shade at openjdk.org Wed Jun 11 19:22:33 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 11 Jun 2025 19:22:33 GMT Subject: RFR: 8359200: Memory corruption in MStack::push In-Reply-To: References: Message-ID: On Wed, 11 Jun 2025 19:16:57 GMT, Aleksey Shipilev wrote: >> I found this by accident when running testing with non-default `-XX:OptoNodeListSize` (see JBS for details). 
The problem is that `MStack::push` assumes that `Node_Stack::grow` will always grow the stack by at least two (line 69) and it then proceeds to put two items in: >> https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/matcher.hpp#L67-L74 >> >> But after [JDK-8336999](https://bugs.openjdk.org/browse/JDK-8336999), `Node_Stack::grow` will only grow the stack if needed: >> https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/node.cpp#L3027-L3031 >> >> However, if there's **one** empty slot, the stack will not be grown and `MStack::push` then puts **two** items on the stack which leads to memory corruption. >> >> I refactored the push method to delegate to `Node_Stack::push` which does the right thing and, similar to [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), also added `maybe_grow` methods for all the containers affected by the original change. For additional coverage, I moved the `_nesting.check` calls to before the check that determines if we grow. >> >> I only ever observed this with a non-default and odd value for `-XX:OptoNodeListSize` but I'm not 100% convinced that it can't happen with the default value, so I'm treating this as P2 and will backport the fix to JDK 25. >> >> @shipilev Since you worked on [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), could you please take a look at this? >> >> Thanks, >> Tobias > > src/hotspot/share/opto/block.cpp line 41: > >> 39: >> 40: void Block_Array::grow(uint i) { >> 41: assert(i >= Max(), "must be an overflow"); > > Assert message here is misleading: it is more likely someone had called `grow` when they intended `maybe_grow`. 
See how it is done elsewhere: > > > void Node_Array::grow(uint i) { > _nesting.check(_a); // Check if a potential reallocation in the arena is safe > assert(i >= _max, "Should have been checked before, use maybe_grow?"); Speaking of, we should probably move `_nesting.check(_a);` to `Node_Array::maybe_grow` as well. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25751#discussion_r2140901654 From lmesnik at openjdk.org Wed Jun 11 20:33:29 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 11 Jun 2025 20:33:29 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v8] In-Reply-To: <5V3KDrOL5tc0ay5zbkGeGhY59491l5USEZjCG1kjjWA=.1e124b44-da11-4af4-a5b6-94c53e7860a8@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> <5V3KDrOL5tc0ay5zbkGeGhY59491l5USEZjCG1kjjWA=.1e124b44-da11-4af4-a5b6-94c53e7860a8@github.com> Message-ID: On Wed, 11 Jun 2025 11:42:13 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Add comment Marked as reviewed by lmesnik (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25575#pullrequestreview-2918587183 From serb at openjdk.org Wed Jun 11 20:34:39 2025 From: serb at openjdk.org (Sergey Bylokhov) Date: Wed, 11 Jun 2025 20:34:39 GMT Subject: RFR: 8358666: [REDO] Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: References: Message-ID: <1X9_-qoQz-8rsRd3ru68GLPH3d68f3ySLU6NhzWU0IE=.4e6ef66a-0c3c-4d2c-80e1-71d3a286a208@github.com> On Thu, 5 Jun 2025 06:10:09 GMT, Johannes Bechberger wrote: > This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509). > > Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This profiles the [Renaissance](https://renaissance.dev/) benchmark with > > - ... different heap sizes > - ... different GCs > - ... different samplers (the standard JFR and the new CPU Time Sampler and both) > - ... different JFR recording durations > - ... different chunk-sizes It seems the new test TestCPUTimeSampleMultipleRecordings fails most of the time on systems with 90+G of memory.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25654#issuecomment-2964069653 From sspitsyn at openjdk.org Wed Jun 11 21:14:30 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 11 Jun 2025 21:14:30 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v8] In-Reply-To: <5V3KDrOL5tc0ay5zbkGeGhY59491l5USEZjCG1kjjWA=.1e124b44-da11-4af4-a5b6-94c53e7860a8@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> <5V3KDrOL5tc0ay5zbkGeGhY59491l5USEZjCG1kjjWA=.1e124b44-da11-4af4-a5b6-94c53e7860a8@github.com> Message-ID: <96XsVjWh2yOjtHrdPEzBDmiTe0t3TdrNq288WZ4Dyg8=.6ba138e9-e616-4fda-b012-14c2d9c7805c@github.com> On Wed, 11 Jun 2025 11:42:13 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Add comment This looks good. I've posted one nit though. src/hotspot/share/prims/whitebox.hpp line 77: > 75: static bool is_asan_enabled(); > 76: static bool is_ubsan_enabled(); > 77: }; Nit: I'd suggest to add a short comments explaining what `asan` and `ubsan` mean. ------------- Marked as reviewed by sspitsyn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25575#pullrequestreview-2918761842 PR Review Comment: https://git.openjdk.org/jdk/pull/25575#discussion_r2141125695 From kvn at openjdk.org Wed Jun 11 23:17:05 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 11 Jun 2025 23:17:05 GMT Subject: RFR: 8358690: Some initialization code asks for AOT cache status way too early Message-ID: Thanks to @shipilev for catching the issue. [JDK-8350209](https://bugs.openjdk.org/browse/JDK-8350209) introduced a bootstrapping problem by checking the AOT cache status way too early. Before the full AOT cache init sequence runs, these checks would always reply that the AOT cache is off. As a result, initial stubs are practically never restored/dumped. This does not affect JDK 25 because [JDK-8357514](https://github.com/openjdk/jdk/commit/8184ce39a8a732352ee841fed09cae905d27643c) switched off AOT stubs generation. We can't resolve the bootstrap issue as is because `initial_stubs_init()` is called before `universe_init()`, where the AOT code cache is created. I looked into why it is required (based on comments) that `initial_stubs_init()` be called before `universe_init()`. And I found that we had a special stub during HotSpot development (1997) which was used for Vtable entries population when we ran with -Xcomp (or whatever was equivalent back then). We still have a reference to it in the comment: [stubRoutines.cpp#L185](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stubRoutines.cpp#L185). We don't have that code anymore. I moved `initial_stubs_init()` after `universe_init()` and `AOTCodeCache::init2()`. I added asserts into some initial stubs to check that they are not NULL when used. I ran from hs-tier1 to hs-tier6 + hs-tier10-rt. The only issue I found is that `AOTCodeCache::init_early_stubs()` needs to be called separately after `initial_stubs_init()` instead of from `AOTCodeCache::init2()`. This solved the bootstrap issue.
I also did some cleanup to match `leyden/premain` branch for easy merges. Tested hs-tier1-6, hs-tier1-rt, stress, xcomp ------------- Commit messages: - remove trailing whitespace - 8358690: Some initialization code asks for AOT cache status way too early Changes: https://git.openjdk.org/jdk/pull/25763/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25763&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358690 Stats: 172 lines in 14 files changed: 99 ins; 28 del; 45 mod Patch: https://git.openjdk.org/jdk/pull/25763.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25763/head:pull/25763 PR: https://git.openjdk.org/jdk/pull/25763 From kvn at openjdk.org Wed Jun 11 23:21:28 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 11 Jun 2025 23:21:28 GMT Subject: RFR: 8358690: Some initialization code asks for AOT cache status way too early In-Reply-To: References: Message-ID: <9B18Ef2GIDrFT979xI0nfgdc35DIuRQCObPE9q-ByvA=.278b9d27-04a7-4049-96d8-83fe002a41c4@github.com> On Wed, 11 Jun 2025 23:08:44 GMT, Vladimir Kozlov wrote: > Thanks to @shipilev for catching the issue. > > [JDK-8350209](https://bugs.openjdk.org/browse/JDK-8350209) came with the bootstrapping problem by checking the AOT cache status way too early. Before full AOT cache init sequence runs, these checks would always reply that AOT cache is off. This causes initial stubs to never practically restored/dumped. > > This does not affect JDK 25 because [JDK-8357514](https://github.com/openjdk/jdk/commit/8184ce39a8a732352ee841fed09cae905d27643c) switched off AOT stubs generation. > > We can't resolve bootstrap issue as it is because `initial_stubs_init()` is called before `universe_init()` where AOT code cache is created. I looked why it is required (based on comments) that `initial_stubs_init()` be called before `universe_init()`. 
And I found that we had a special stub during HotSpot development (1997) which was used for Vtable entries population when we run with -Xcomp (or whatever was equivalent back then). We still have reference to it in the comment: [stubRoutines.cpp#L185](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stubRoutines.cpp#L185). > > We don't have that code anymore. I moved `initial_stubs_init()` after `universe_init()` and `AOTCodeCache::init2()`. I added asserts into some initial stubs to check that they are not NULL when used. I ran from hs-tier1 to hs-tier6 + hs-tier10-rt. > > The only issue I found is that `AOTCodeCache::init_early_stubs()` needs to be call separately after `initial_stubs_init()` instead of from `AOTCodeCache::init2()`. This solved bootstrap issue. > > I also did some cleanup to match `leyden/premain` branch for easy merges. > > Tested hs-tier1-6, hs-tier1-rt, stress, xcomp src/hotspot/share/cds/metaspaceShared.cpp line 2017: > 2015: TrainingData::print_archived_training_data_on(tty); > 2016: > 2017: AOTCodeCache::print_on(tty); Move checks inside callee. src/hotspot/share/code/aotCodeCache.cpp line 2: > 1: /* > 2: * Copyright (c) 2023, 2025, Oracle and/or its affiliates. All rights reserved. Match leyden/premain src/hotspot/share/code/aotCodeCache.cpp line 109: > 107: // Next methods determine which action we do with AOT code depending > 108: // on phase of AOT process: assembly or production. > 109: This comment and following new AOT functions are added to match leyden/premain branch. src/hotspot/share/code/aotCodeCache.cpp line 1741: > 1739: // This is called after initialize() but before init2() > 1740: // and _cache is not set yet. > 1741: void AOTCodeCache::print_on(outputStream* st) { Refactored because it could be called early. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25763#discussion_r2141250902 PR Review Comment: https://git.openjdk.org/jdk/pull/25763#discussion_r2141251200 PR Review Comment: https://git.openjdk.org/jdk/pull/25763#discussion_r2141251871 PR Review Comment: https://git.openjdk.org/jdk/pull/25763#discussion_r2141253105 From dlong at openjdk.org Thu Jun 12 01:56:34 2025 From: dlong at openjdk.org (Dean Long) Date: Thu, 12 Jun 2025 01:56:34 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead Message-ID: This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. 
Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. ------------- Commit messages: - cleanup - wip Changes: https://git.openjdk.org/jdk/pull/25764/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25764&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358821 Stats: 251 lines in 19 files changed: 90 ins; 131 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/25764.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25764/head:pull/25764 PR: https://git.openjdk.org/jdk/pull/25764 From dlong at openjdk.org Thu Jun 12 02:01:47 2025 From: dlong at openjdk.org (Dean Long) Date: Thu, 12 Jun 2025 02:01:47 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v2] In-Reply-To: References: Message-ID: <8oM2Y0kAMRN6wxtjAmpXDWTcHDZ6gPrNM-8PPtukwAA=.dde201dd-8c2a-4a16-96e5-ca92604e6edd@github.com> > This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. > > We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. > > The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. 
The alternative would be to change the platform-specific code to do compare-and-swap. > > For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. > > This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. Dean Long has updated the pull request incrementally with two additional commits since the last revision: - ... and stale code - removed stale comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25764/files - new: https://git.openjdk.org/jdk/pull/25764/files/b20fb26f..0780d156 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25764&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25764&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25764.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25764/head:pull/25764 PR: https://git.openjdk.org/jdk/pull/25764 From syan at openjdk.org Thu Jun 12 02:29:30 2025 From: syan at openjdk.org (SendaoYan) Date: Thu, 12 Jun 2025 02:29:30 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v8] In-Reply-To: <5V3KDrOL5tc0ay5zbkGeGhY59491l5USEZjCG1kjjWA=.1e124b44-da11-4af4-a5b6-94c53e7860a8@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> 
<5V3KDrOL5tc0ay5zbkGeGhY59491l5USEZjCG1kjjWA=.1e124b44-da11-4af4-a5b6-94c53e7860a8@github.com> Message-ID: On Wed, 11 Jun 2025 11:42:13 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Add comment Thanks for fixing this. ------------- Marked as reviewed by syan (Committer). PR Review: https://git.openjdk.org/jdk/pull/25575#pullrequestreview-2919282949 From wenanjian at openjdk.org Thu Jun 12 03:22:10 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Thu, 12 Jun 2025 03:22:10 GMT Subject: RFR: 8359105: RISC-V: No need for acquire fence in safepoint poll during JNI calls [v2] In-Reply-To: References: Message-ID: > Acquire fence removal in safepoint poll > > At least in jdk11, when comes to safepoint::end, it will invoke SafepointMechanism::disarm_local_poll to change the polling_word_offset, which may cause a race when thread come to visit polling_word_offset in native_trans state, so we used to use acquire fence. > > Since the disarm_local_poll has been removed from SafepointSynchronize::end, Thread disarm itself in > JavaThread::check_special_condition_for_native_trans when trans from native. it seems that there is no need for acquire fence. 
> > [0] https://github.com/openjdk/jdk/pull/20420 Anjian Wen has updated the pull request incrementally with one additional commit since the last revision: RISC-V: delete the acquire argument in safepoint_poll since there is no use ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25709/files - new: https://git.openjdk.org/jdk/pull/25709/files/5ca62af4..47065b4a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25709&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25709&range=00-01 Stats: 14 lines in 8 files changed: 0 ins; 5 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/25709.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25709/head:pull/25709 PR: https://git.openjdk.org/jdk/pull/25709 From wenanjian at openjdk.org Thu Jun 12 03:22:10 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Thu, 12 Jun 2025 03:22:10 GMT Subject: RFR: 8359105: RISC-V: No need for acquire fence in safepoint poll during JNI calls In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 06:33:34 GMT, Anjian Wen wrote: > Acquire fence removal in safepoint poll > > At least in jdk11, when comes to safepoint::end, it will invoke SafepointMechanism::disarm_local_poll to change the polling_word_offset, which may cause a race when thread come to visit polling_word_offset in native_trans state, so we used to use acquire fence. > > Since the disarm_local_poll has been removed from SafepointSynchronize::end, Thread disarm itself in > JavaThread::check_special_condition_for_native_trans when trans from native. it seems that there is no need for acquire fence. > > [0] https://github.com/openjdk/jdk/pull/20420 Thanks for your review!! I have changed the description. About the safepoint_poll, I think it makes sence that there is no need to keep "bool acquire" argument when we change the last 'true' in downcallLinker_riscv.cpp, and I have updated the patch. > Hey, sure. 
> > The description says "backport of", as you are changing master I think that description is wrong. > > The downcallLinker_riscv.cpp is the same case as the native transition. So I don't see why you would keep that acquire ? > > AFIACT no one should use acquire, thus this should then mean that we can remove "bool acquire" argument from safepoint_poll(). > > That arm still have acquire in their downlinker seems like an oversight? 8337657 only have one reviewer, I think it should have been cought there. (please note https://wiki.openjdk.org/display/HotSpot/Pushing+a+HotSpot+change, two reviewers required) @dchuyko can you open a new issue and look at the acquire in downcall linker for aarch64 ? > > Thanks, Robbin ------------- PR Comment: https://git.openjdk.org/jdk/pull/25709#issuecomment-2964941725 From kvn at openjdk.org Thu Jun 12 04:47:27 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 12 Jun 2025 04:47:27 GMT Subject: RFR: 8358690: Some initialization code asks for AOT cache status way too early In-Reply-To: References: Message-ID: On Wed, 11 Jun 2025 23:08:44 GMT, Vladimir Kozlov wrote: > Thanks to @shipilev for catching the issue. > > [JDK-8350209](https://bugs.openjdk.org/browse/JDK-8350209) came with the bootstrapping problem by checking the AOT cache status way too early. Before full AOT cache init sequence runs, these checks would always reply that AOT cache is off. This causes initial stubs to never practically restored/dumped. > > This does not affect JDK 25 because [JDK-8357514](https://github.com/openjdk/jdk/commit/8184ce39a8a732352ee841fed09cae905d27643c) switched off AOT stubs generation. > > We can't resolve bootstrap issue as it is because `initial_stubs_init()` is called before `universe_init()` where AOT code cache is created. I looked why it is required (based on comments) that `initial_stubs_init()` be called before `universe_init()`. 
And I found that we had a special stub during HotSpot development (1997) which was used for Vtable entries population when we run with -Xcomp (or whatever was equivalent back then). We still have reference to it in the comment: [stubRoutines.cpp#L185](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stubRoutines.cpp#L185). > > We don't have that code anymore. I moved `initial_stubs_init()` after `universe_init()` and `AOTCodeCache::init2()`. I added asserts into some initial stubs to check that they are not NULL when used. I ran from hs-tier1 to hs-tier6 + hs-tier10-rt. > > The only issue I found is that `AOTCodeCache::init_early_stubs()` needs to be call separately after `initial_stubs_init()` instead of from `AOTCodeCache::init2()`. This solved bootstrap issue. > > I also did some cleanup to match `leyden/premain` branch for easy merges. > > Tested hs-tier1-6, hs-tier1-rt, stress, xcomp I found that `StubRoutines::_fence_entry` from the initial stubs is used by [OrderAccess::fence()](https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/windows_x86/orderAccess_windows_x86.hpp#L50), but only on Windows-x64. And `OrderAccess::fence()` is used by GC worker threads, which are started by `universe_init()`. I am working on fixing it. I hope I don't need to move all initial stubs. Maybe `pre-initial` stubs for this?
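The ordering problem described in this thread can be illustrated with a small sketch. Everything below is a hypothetical stand-in (`ToyCache`, `ToyStubs` and the two `run_*` helpers are invented for illustration and are not HotSpot code); it only shows why a status query made before the cache is initialized always answers "off", so the work gated on it is silently skipped.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an AOT-style cache; not HotSpot code.
struct ToyCache {
    bool initialized = false;
    // Before init() runs, this always reports "off".
    bool is_on() const { return initialized; }
    void init() { initialized = true; }
};

// Hypothetical stand-in for stub setup: records whether the stubs were
// restored from the cache or generated from scratch.
struct ToyStubs {
    std::vector<std::string> log;
    void initial_stubs_init(const ToyCache& cache) {
        log.push_back(cache.is_on() ? "restored" : "generated");
    }
};

// Old order: stubs are set up first, so the cache is never consulted.
std::string run_stubs_before_cache() {
    ToyCache cache;
    ToyStubs stubs;
    stubs.initial_stubs_init(cache);
    cache.init();
    return stubs.log.front();
}

// New order: the cache is initialized first, so the status check is meaningful.
std::string run_cache_before_stubs() {
    ToyCache cache;
    ToyStubs stubs;
    cache.init();
    stubs.initial_stubs_init(cache);
    return stubs.log.front();
}
```

In this toy model `run_stubs_before_cache()` yields "generated" and `run_cache_before_stubs()` yields "restored". The sketch does not model the Windows-x64 complication mentioned above, where a stub is needed by threads started during the earlier init step.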
------------- PR Comment: https://git.openjdk.org/jdk/pull/25763#issuecomment-2965092085 From mbaesken at openjdk.org Thu Jun 12 05:44:45 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 12 Jun 2025 05:44:45 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v9] In-Reply-To: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: > There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . > Those fail when the address sanitizer is configured ( --enable-asan ). > The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. > Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . > While at it, also same is also added for ubsan . 
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Add comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25575/files - new: https://git.openjdk.org/jdk/pull/25575/files/9aa58582..62a389d7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25575&range=07-08 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25575.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25575/head:pull/25575 PR: https://git.openjdk.org/jdk/pull/25575 From mbaesken at openjdk.org Thu Jun 12 06:51:35 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 12 Jun 2025 06:51:35 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v9] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Thu, 12 Jun 2025 05:44:45 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Add comment Thanks for the reviews ! Need a re - review now because of the comment change . 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25575#issuecomment-2965331120 From chagedorn at openjdk.org Thu Jun 12 06:53:39 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Thu, 12 Jun 2025 06:53:39 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v7] In-Reply-To: References: Message-ID: <-CuBx8_SYEJ3zHZt4mmvofa65QOBAdSNexfmDXXBFvI=.c767b77a-0d4e-4fe1-9dc9-e796ee9a135b@github.com> On Tue, 10 Jun 2025 14:44:48 GMT, Emanuel Peter wrote: >> **Past Work** >> With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. >> >> **This PR** >> I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. >> >> I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. >> >> My approach was to aggressively bail-out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered. >> >> **Future Work:** >> In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Idea / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. >> >> I filed: >> [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) >> (We can file subtasks for the nodes we want to fix. 
I don't want to file them all now, but we should file them as we are investigating, so that there is no duplicate work.) >> >> Testing passed tier1-3, with extra timeout factor 20. > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > reorder flags for Christian Some more minor comments but otherwise looks good, thanks for the updates and for the offline discussion to share some background! > @chhagedorn I checked with @TobiHartmann : he said he does not have a strong opinion, but if he had to make a decision, he would prefers having everything in the comments. Then let's go with everything in the comments - might be better in the end since we eventually want to replace the individual comments with final verdicts or have the cases fixed anyway at some point :-) src/hotspot/share/opto/phaseX.cpp line 1204: > 1202: switch (n->Opcode()) { > 1203: // RangeCheckNode::Ideal looks up the chain for about 999 nodes > 1204: // see "Range-Check scan limit". So it is possible that something Suggestion: // (see "Range-Check scan limit"). So, it is possible that something src/hotspot/share/opto/phaseX.cpp line 1205: > 1203: // RangeCheckNode::Ideal looks up the chain for about 999 nodes > 1204: // see "Range-Check scan limit". So it is possible that something > 1205: // optimized in that input subgraph, and the RangeCheck was not Suggestion: // is optimized in that input subgraph, and the RangeCheck was not src/hotspot/share/opto/phaseX.cpp line 1256: > 1254: // "Useless" means that there is no code in either branch of the If. > 1255: // I found a case where this was not done yet during IGVN. > 1256: // Why does the Region not get added to IGVN worklist when the If diamond becomes useles? Suggestion: // Why does the Region not get added to IGVN worklist when the If diamond becomes useless? src/hotspot/share/opto/phaseX.cpp line 1316: > 1314: case Op_AddD: > 1315: //case Op_AddI: // Also affected for other reasons. 
> 1316: //case Op_AddL: // Also affected for other reasons. Suggestion: //case Op_AddI: // Also affected for other reasons, see case further down. //case Op_AddL: // Also affected for other reasons, see case further down. src/hotspot/share/opto/phaseX.cpp line 1432: > 1430: // x + (0 - [8424 AddL]) > 1431: // but the AddL was not added to the IGVN worklist. Investigate why. > 1432: // There could be other issues too. For example with "commute", see above. Suggestion: // There could be other issues, too. For example with "commute", see above. src/hotspot/share/opto/phaseX.cpp line 1444: > 1442: // This has the effect that these new nodes end up on the IGVN worklist, > 1443: // but if we now leave verification and IGVN itself, we have nodes on the > 1444: // worklist, and that should not be (there are asserts against this). Sounds like we just need some exception when calling `Ideal` on a `SubTypeCheck` that we can have certain nodes still on the worklist like a `cmp`. Maybe we should add this as a suggestion to the comment? Suggestion: // but if we now leave verification and IGVN itself, we have nodes other // than 'n' still on the worklist. This will fail with an assert in // verify_empty_worklist(). Maybe we just need to add an exception and // check that only certain nodes like 'cmp' are still on the worklist. After // this check, we can clear the worklist such that verify_empty_worklist() // succeeds. src/hotspot/share/opto/phaseX.cpp line 1685: > 1683: // Found in tier1-3. > 1684: case Op_CMoveI: > 1685: return false; Maybe merge them together and add a comment that you have not investigated further (I assume?) since you found them all in tier1-3 without more specific details. ------------- Marked as reviewed by chagedorn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/22970#pullrequestreview-2913983773 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2138142398 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2138143758 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2138146868 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2138150068 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2141804882 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2141810963 PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2141830386 From amitkumar at openjdk.org Thu Jun 12 06:59:32 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Thu, 12 Jun 2025 06:59:32 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v9] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Thu, 12 Jun 2025 05:44:45 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. >> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Add comment Marked as reviewed by amitkumar (Committer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25575#pullrequestreview-2919765059 From stefank at openjdk.org Thu Jun 12 07:05:31 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 12 Jun 2025 07:05:31 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v3] In-Reply-To: <14leaLRIJu4zPu7wBc5tQIDiqJEIkHmesxRy9yRUkUw=.8e4619c9-d164-4e63-99fe-033f5ec32e5b@github.com> References: <14leaLRIJu4zPu7wBc5tQIDiqJEIkHmesxRy9yRUkUw=.8e4619c9-d164-4e63-99fe-033f5ec32e5b@github.com> Message-ID: On Wed, 11 Jun 2025 19:17:24 GMT, Thomas Stuefe wrote: > D) a separate error reporting variable. It feels a bit overengineered, though. I think I disagree with this. The status quo makes it all too easy to miss that you need to check the return value for an error. IMHO, this is a growing ground for bugs when someone starts to use any of these APIs with realizing that we're mixing two values (and two types) into one return value. Just 2c, to reiterate that *I* wouldn't mind a solution like this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2965369908 From sspitsyn at openjdk.org Thu Jun 12 07:08:34 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 12 Jun 2025 07:08:34 GMT Subject: RFR: 8357826: Avoid running some jtreg tests when asan is configured [v9] In-Reply-To: References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Thu, 12 Jun 2025 05:44:45 GMT, Matthias Baesken wrote: >> There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . >> Those fail when the address sanitizer is configured ( --enable-asan ). >> The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. 
>> Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . >> While at it, also same is also added for ubsan . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Add comment Marked as reviewed by sspitsyn (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25575#pullrequestreview-2919792257 From mbaesken at openjdk.org Thu Jun 12 07:11:37 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 12 Jun 2025 07:11:37 GMT Subject: Integrated: 8357826: Avoid running some jtreg tests when asan is configured In-Reply-To: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> References: <2VOsPdnaamydEfe2I-79af90nn9xlaRXULKEzrDHkGk=.7b237cd6-0a12-4ec2-8467-4177084b4468@github.com> Message-ID: On Mon, 2 Jun 2025 07:25:22 GMT, Matthias Baesken wrote: > There are a couple of jtreg tests, especially in the HS area, with very special assumptions about memory layout/sizes . > Those fail when the address sanitizer is configured ( --enable-asan ). > The change adds a way to tag those tests with 'requires' so that they can be avoided easily when running jtreg tests with ASAN enabled. > Adjusting the tests for "pleasing" the sanitizer is not always desired (if possible for some tests it can be done later) . > While at it, also same is also added for ubsan . This pull request has now been integrated. 
Changeset: d7aa3498 Author: Matthias Baesken URL: https://git.openjdk.org/jdk/commit/d7aa34982053bad37b3b726539f1245d054258f4 Stats: 64 lines in 12 files changed: 62 ins; 0 del; 2 mod 8357826: Avoid running some jtreg tests when asan is configured Reviewed-by: sspitsyn, amitkumar, lmesnik, syan, lucy, cjplummer ------------- PR: https://git.openjdk.org/jdk/pull/25575 From amitkumar at openjdk.org Thu Jun 12 07:14:28 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Thu, 12 Jun 2025 07:14:28 GMT Subject: RFR: 8359114: [s390x] Add z17 detection code [v2] In-Reply-To: <_fiM-Nhm3q5S2hCxa3quxpodBRmeIsCIBcA7AB4Hmcc=.2005b23d-fc05-4821-90b4-cf22a8d2442e@github.com> References: <_fiM-Nhm3q5S2hCxa3quxpodBRmeIsCIBcA7AB4Hmcc=.2005b23d-fc05-4821-90b4-cf22a8d2442e@github.com> Message-ID: On Wed, 11 Jun 2025 13:09:45 GMT, Manjunath S Matti. wrote: >> Add support to detect the new generation of Z machine (z17). > > Manjunath S Matti. has updated the pull request incrementally with one additional commit since the last revision: > > Correct the comments for the bits covered in DW[2] and DW[3]. LGTM. @RealLucy would you provide 2nd review ? I crashed JVM manually and in the hs_err file I see that z17 is detected properly; CPU: total 64 (initial active 64) system-z, g11-z17, ldisp_fast, extimm, pcrel_load/store, cmpb, cond_load/store, interlocked_update, txm, vectorinstr, instrext2, venh1, instrext3, venh2,bear_enh, sort_enh, nnpa_assist, storage_key_removal, vpack_decimal_enh, concurrent_function, out-of-support_as_of_tbd, aes128, aes192, aes256, sha1, sha256, sha512, ghash ------------- Marked as reviewed by amitkumar (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/25718#pullrequestreview-2919807992 PR Comment: https://git.openjdk.org/jdk/pull/25718#issuecomment-2965391369 From epeter at openjdk.org Thu Jun 12 07:14:37 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 12 Jun 2025 07:14:37 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v7] In-Reply-To: <-CuBx8_SYEJ3zHZt4mmvofa65QOBAdSNexfmDXXBFvI=.c767b77a-0d4e-4fe1-9dc9-e796ee9a135b@github.com> References: <-CuBx8_SYEJ3zHZt4mmvofa65QOBAdSNexfmDXXBFvI=.c767b77a-0d4e-4fe1-9dc9-e796ee9a135b@github.com> Message-ID: On Thu, 12 Jun 2025 06:38:14 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> reorder flags for Christian > > src/hotspot/share/opto/phaseX.cpp line 1685: > >> 1683: // Found in tier1-3. >> 1684: case Op_CMoveI: >> 1685: return false; > > Maybe merge them together and add a comment that you have not investigated further (I assume?) since you found them all in tier1-3 without more specific details. I would prefer to keep them separate, so it is easier to remove them individually without getting merge conflicts. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2141889139 From aph at openjdk.org Thu Jun 12 07:47:30 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 12 Jun 2025 07:47:30 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v2] In-Reply-To: References: Message-ID: On Wed, 11 Jun 2025 15:37:47 GMT, Mikhail Ablakatov wrote: >> In the A64 ISA, the B (direct branch) instruction can encode a target within a ±128 MB range relative to the instruction. Due to this limitation, when generating static stubs, HotSpot conservatively emits indirect branches for calls to c2i interface stubs.
These indirect branches are implemented using a four-instruction sequence: three instructions to materialize the target address in a register, followed by a BR instruction to perform the jump. >> >> This patch optimizes static stub generation when the code cache is small enough to guarantee that the target entry point of the c2i interface stub lies within the direct branch range. In such cases, a single direct B instruction can be used instead of the indirect sequence, saving 3 instructions (12 bytes) per static stub. >> >> Below is an example of the optimization's impact, measured using the movie-lens benchmark from the Renaissance benchmark suite: >> >> | Metric | Before | After | Difference | >> |-------------|---------------|---------------|------------| >> | totalInHeap | Avg: 1883.875 | Avg: 1871.667 | -0.65% | >> | | Sum: 6653848 | Sum: 6616344 | -0.56% | >> | stubCode | Avg: 103.164 | Avg: 87.285 | -15.38% | >> | | Sum: 364376 | Sum: 308552 | -15.33% | >> >> Full jtreg passed on AArch64. > > Mikhail Ablakatov has updated the pull request incrementally with one additional commit since the last revision: > > address review comments: use pd_patch_instruction directly > > MacroAssembler::pd_patch_instruction can distinguish between the `b` > and `movk movz movz br` sequences. Strictly speaking, the method > patches not a single instruction but a semantically joint sequence of > instructions. Use it directly instead of `NativeJump` and > `NativeGeneralJump` wrapper classes to simplify the implementation and > get rid of an extra icache invalidation. > > Other changes in the patch simply clean up code that became redundant. Looks good. Please fix the copyright date. src/hotspot/cpu/aarch64/nativeInst_aarch64.hpp line 3: > 1: /* > 2: * Copyright (c) 1997, 2025, Oracle and/or its affiliates. All rights reserved. > 3: * Copyright (c) 2014, 2108, Red Hat Inc. All rights reserved. Suggestion: * Copyright (c) 2014, 2025, Red Hat Inc. All rights reserved. 
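The range argument in the quoted description can be sketched as follows. This is a minimal illustration, not the code from the PR: the names `kDirectBranchRange`, `reachable_by_direct_branch` and `code_cache_allows_direct_branches` are hypothetical. It relies only on the A64 fact that `B` encodes a signed 26-bit word offset, which gives a byte offset in [-2^27, 2^27 - 4].

```cpp
#include <cassert>
#include <cstdint>

// Reach of the A64 B instruction in bytes: signed 26-bit immediate, scaled by 4.
constexpr int64_t kDirectBranchRange = 128LL * 1024 * 1024;  // 2^27

// True if a B instruction located at `insn` can encode a branch to `target`.
inline bool reachable_by_direct_branch(uint64_t insn, uint64_t target) {
    const int64_t offset = static_cast<int64_t>(target - insn);
    return offset % 4 == 0 &&
           offset >= -kDirectBranchRange &&
           offset < kDirectBranchRange;
}

// Conservative check: if the whole code cache spans at most 128 MB, any two
// instruction addresses inside it are within direct-branch range of each other.
inline bool code_cache_allows_direct_branches(uint64_t code_cache_size) {
    return code_cache_size <= static_cast<uint64_t>(kDirectBranchRange);
}

// Saving per static stub as stated in the description: the four-instruction
// indirect sequence shrinks to one instruction, and A64 instructions are 4 bytes.
constexpr int kBytesSavedPerStub = (4 - 1) * 4;
static_assert(kBytesSavedPerStub == 12, "3 instructions of 4 bytes each");
```

A per-cache check like `code_cache_allows_direct_branches` is the cheaper design choice, since it is decided once rather than per call site; whether the PR tests the cache size in exactly this way is an assumption here.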
------------- PR Review: https://git.openjdk.org/jdk/pull/25702#pullrequestreview-2919916984 PR Review Comment: https://git.openjdk.org/jdk/pull/25702#discussion_r2141954001 From epeter at openjdk.org Thu Jun 12 07:48:12 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 12 Jun 2025 07:48:12 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v8] In-Reply-To: References: Message-ID: > **Past Work** > With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. > > **This PR** > I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. > > I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. > > My approach was to aggressively bail-out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered. > > **Future Work:** > In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Idea / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. > > I filed: > [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) > (We can file subtasks for the nodes we want to fix. I don't want to file them all now, but we should file them as we are investigating, so that there is no duplicate work.) 
> > Testing passed tier1-3, with extra timeout factor 20. Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: Apply suggestions from code review Co-authored-by: Christian Hagedorn ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22970/files - new: https://git.openjdk.org/jdk/pull/22970/files/ffc54f6e..abfd3a27 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=06-07 Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/22970.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22970/head:pull/22970 PR: https://git.openjdk.org/jdk/pull/22970 From epeter at openjdk.org Thu Jun 12 07:48:12 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 12 Jun 2025 07:48:12 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v7] In-Reply-To: References: <-CuBx8_SYEJ3zHZt4mmvofa65QOBAdSNexfmDXXBFvI=.c767b77a-0d4e-4fe1-9dc9-e796ee9a135b@github.com> Message-ID: On Thu, 12 Jun 2025 07:11:42 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/phaseX.cpp line 1685: >> >>> 1683: // Found in tier1-3. >>> 1684: case Op_CMoveI: >>> 1685: return false; >> >> Maybe merge them together and add a comment that you have not investigated further (I assume?) since you found them all in tier1-3 without more specific details. > > I would prefer to keep them separate, so it is easier to remove them individually without getting merge conflicts. I'll add a comment that I did not investigate further yet. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2141954735 From epeter at openjdk.org Thu Jun 12 07:48:12 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 12 Jun 2025 07:48:12 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v7] In-Reply-To: <-CuBx8_SYEJ3zHZt4mmvofa65QOBAdSNexfmDXXBFvI=.c767b77a-0d4e-4fe1-9dc9-e796ee9a135b@github.com> References: <-CuBx8_SYEJ3zHZt4mmvofa65QOBAdSNexfmDXXBFvI=.c767b77a-0d4e-4fe1-9dc9-e796ee9a135b@github.com> Message-ID: On Thu, 12 Jun 2025 06:24:40 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> reorder flags for Christian > > src/hotspot/share/opto/phaseX.cpp line 1444: > >> 1442: // This has the effect that these new nodes end up on the IGVN worklist, >> 1443: // but if we now leave verification and IGVN itself, we have nodes on the >> 1444: // worklist, and that should not be (there are asserts against this). > > Sounds like we just need some exception when calling `Ideal` on a `SubTypeCheck` that we can have certain nodes still on the worklist like a `cmp`. Maybe we should add this as a suggestion to the comment? > Suggestion: > > // but if we now leave verification and IGVN itself, we have nodes other > // than 'n' still on the worklist. This will fail with an assert in > // verify_empty_worklist(). Maybe we just need to add an exception and > // check that only certain nodes like 'cmp' are still on the worklist. After > // this check, we can clear the worklist such that verify_empty_worklist() > // succeeds. Thanks for the offline discussion, I'll write something new we discussed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22970#discussion_r2141953002 From epeter at openjdk.org Thu Jun 12 07:54:02 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 12 Jun 2025 07:54:02 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v9] In-Reply-To: References: Message-ID: > **Past Work** > With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. > > **This PR** > I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. > > I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. > > My approach was to aggressively bail-out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered. > > **Future Work:** > In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Idea / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. > > I filed: > [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) > (We can file subtasks for the nodes we want to fix. I don't want to file them all now, but we should file them as we are investigating, so that there is no duplicate work.) > > Testing passed tier1-3, with extra timeout factor 20. 
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: update comments for Christian ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22970/files - new: https://git.openjdk.org/jdk/pull/22970/files/abfd3a27..f54d851a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=07-08 Stats: 19 lines in 1 file changed: 13 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/22970.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22970/head:pull/22970 PR: https://git.openjdk.org/jdk/pull/22970 From epeter at openjdk.org Thu Jun 12 07:54:02 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 12 Jun 2025 07:54:02 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v7] In-Reply-To: <-CuBx8_SYEJ3zHZt4mmvofa65QOBAdSNexfmDXXBFvI=.c767b77a-0d4e-4fe1-9dc9-e796ee9a135b@github.com> References: <-CuBx8_SYEJ3zHZt4mmvofa65QOBAdSNexfmDXXBFvI=.c767b77a-0d4e-4fe1-9dc9-e796ee9a135b@github.com> Message-ID: On Thu, 12 Jun 2025 06:50:25 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> reorder flags for Christian > > Some more minor comments but otherwise looks good, thanks for the updates and for the offline discussion to share some background! > >> @chhagedorn I checked with @TobiHartmann : he said he does not have a strong opinion, but if he had to make a decision, he would prefers having everything in the comments. > > Then let's go with everything in the comments - might be better in the end since we eventually want to replace the individual comments with final verdicts or have the cases fixed anyway at some point :-) @chhagedorn Thanks for reviewing and the suggestions! 
I addressed them all :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/22970#issuecomment-2965521510 From duke at openjdk.org Thu Jun 12 08:05:29 2025 From: duke at openjdk.org (Anton Artemov) Date: Thu, 12 Jun 2025 08:05:29 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v4] In-Reply-To: References: <5jpQGssAmQuPcUPgK5wbXQjpBXXCjyY2yiXlNdD-tsI=.886a2462-65f5-4ac8-990e-4a92a2d362d8@github.com> Message-ID: On Wed, 11 Jun 2025 17:04:39 GMT, Stefan Karlsson wrote: > Hmm. Maybe I'm reading this wrong, but to me it looks like you can return -2 via this code: Yes, you are correct that **that** code can return -2 in some cases. But we need to see where this code returns this value and if it is propagated further. I found only two places in os_linux.cpp where this value can be seen: 1) In `os::total_swap_space()`, but there is a check for non-negativity `if (OSContainer::memory_limit_in_bytes() > 0)`, which would be false in case `OSCONTAINER_ERROR` is returned, because under the hood `memory_limit_in_bytes()` uses the same methods as you described. So the value -2 will never be propagated outside of `os::total_swap_space()`. 2) In `os::free_swap_space()`, here both `mem_swap_limit` and `mem_limit` can be set to -2, but then there is a check `if (mem_swap_limit >= 0 && mem_limit >= 0)`, which would be false if -2 is in either of those variables. If so, `host_free_swap_val` is returned, which has values >= -1.
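The guard described in point 2 above can be sketched in isolation as shown below. This is a simplified illustration, not the `os_linux.cpp` code: `free_swap_sketch` and `kContainerError` are hypothetical names, the container-side result is passed in as a plain parameter instead of being computed, and the value -2 for the container error is taken from the discussion above.

```cpp
#include <cstdint>

// Assumed stand-in for the container error value (-2) discussed above.
constexpr int64_t kContainerError = -2;

// Hypothetical sketch: the container-derived value is used only when both
// limits are non-negative; otherwise the host value is returned, so a
// negative error code in either limit is never passed on to the caller.
constexpr int64_t free_swap_sketch(int64_t mem_swap_limit,
                                   int64_t mem_limit,
                                   int64_t container_free_swap,
                                   int64_t host_free_swap) {
  return (mem_swap_limit >= 0 && mem_limit >= 0) ? container_free_swap
                                                 : host_free_swap;
}
```

The point of the sketch is only that the `>= 0` check filters out every negative sentinel at once, which is why -2 stays local to the function.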
------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2965559885 From chagedorn at openjdk.org Thu Jun 12 08:11:32 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Thu, 12 Jun 2025 08:11:32 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v9] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 07:54:02 GMT, Emanuel Peter wrote: >> **Past Work** >> With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. >> >> **This PR** >> I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. >> >> I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. >> >> My approach was to aggressively bail-out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered. >> >> **Future Work:** >> In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Idea / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. >> >> I filed: >> [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) >> (We can file subtasks for the nodes we want to fix. I don't want to file them all now, but we should file them as we are investigating, so that there is no duplicate work.) 
>> >> Testing passed tier1-3, with extra timeout factor 20. > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > update comments for Christian Update looks good, thanks! ------------- Marked as reviewed by chagedorn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22970#pullrequestreview-2920002022 From rehn at openjdk.org Thu Jun 12 08:34:30 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 12 Jun 2025 08:34:30 GMT Subject: RFR: 8359105: RISC-V: No need for acquire fence in safepoint poll during JNI calls In-Reply-To: References: Message-ID: <--QF_NPANf_S-X_P3YVoNQ61qGVLr7NrX_ncuWA4kzc=.35b1d5f3-76cf-46a7-8665-3461b74559fc@github.com> On Wed, 11 Jun 2025 16:49:43 GMT, Dmitry Chuyko wrote: > > @dchuyko can you open a new issue and look at the acquire in downcall linker for aarch64 ? > > Thanks for the reminder, created JDK-8359252. It was intentional to limit the scope of the original change to JNI (due the amount of testing and usages). Thank you! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25709#issuecomment-2965646168 From duke at openjdk.org Thu Jun 12 08:43:47 2025 From: duke at openjdk.org (Anton Artemov) Date: Thu, 12 Jun 2025 08:43:47 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v7] In-Reply-To: References: Message-ID: > Hi, > > in this PR the output value type for functions which return memory are changed, namely: > > > static julong available_memory(); --> static size_t available_memory(); > static julong used_memory(); --> static size_t used_memory(); > static julong free_memory(); --> static size_t free_memory(); > static jlong total_swap_space(); --> static ssize_t total_swap_space(); > static jlong free_swap_space(); --> static ssize_t free_swap_space(); > static julong physical_memory(); --> static size_t physical_memory(); > > > The changes are done so that the other parts of the code have minimal impact. 
> Tested in GHA and Tiers 1-4. Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8357086: Added missed casts. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25450/files - new: https://git.openjdk.org/jdk/pull/25450/files/00d60415..2929f720 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=05-06 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25450.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25450/head:pull/25450 PR: https://git.openjdk.org/jdk/pull/25450 From duke at openjdk.org Thu Jun 12 08:43:47 2025 From: duke at openjdk.org (Anton Artemov) Date: Thu, 12 Jun 2025 08:43:47 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v5] In-Reply-To: References: <12CGSySmAQzzQVoAa8EVIvQo2AxbQIWjuSPcKYOtqDg=.48c3c4f1-a770-455d-bb08-c48ba34c7018@github.com> Message-ID: On Wed, 11 Jun 2025 17:12:28 GMT, Stefan Karlsson wrote: >> And 258 as well. Maybe we don't need the static cast here? > > Hmm. Isn't 257 redundant because we already check for this on line 241 and the code between should never set the `avail_mem` to `-1`. Maybe this code needs some extra scrutiny as well (as a follow-up) Thanks for spotting this. Line 243 definitely needs treatment as well as line 258. Addressed in the latest commit. Line 257 does not look redundant to me. `avail_mem` could be set to a large error value (cast from -1) on line 240, then the lines between 246 and 256 can actually leave it untouched if something went wrong with reading the meminfo file. Then we need to check if we still have an error value in `avail_mem`, and if so, use the one returned by `free_memory()`.
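The control flow argued for above can be sketched as follows. This is one possible reading of the description, not the actual `os_linux.cpp` code: `available_memory_sketch` and `kMemError` are hypothetical names, the early return on a valid first value is an assumption, and reading the meminfo file is replaced by a flag plus a value so that the example stays self-contained.

```cpp
#include <cstddef>

// Assumed error marker: -1 converted to an unsigned size, i.e. the largest
// representable size_t value.
constexpr std::size_t kMemError = static_cast<std::size_t>(-1);

// Hypothetical sketch of the three steps:
//  1. a first source may already yield a valid value or the error marker,
//  2. parsing meminfo may succeed, or fail and leave the marker in place,
//  3. if the marker is still there, fall back to the free-memory value.
constexpr std::size_t available_memory_sketch(std::size_t first_value,
                                              bool meminfo_parsed,
                                              std::size_t meminfo_value,
                                              std::size_t free_memory_value) {
  return first_value != kMemError ? first_value
       : meminfo_parsed           ? meminfo_value
                                  : free_memory_value;
}
```

In this reading the last check is not redundant, because the middle step has a failure path that leaves the error marker untouched.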
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2142071689 From mablakatov at openjdk.org Thu Jun 12 08:45:29 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Thu, 12 Jun 2025 08:45:29 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v2] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 07:43:52 GMT, Andrew Haley wrote: >> Mikhail Ablakatov has updated the pull request incrementally with one additional commit since the last revision: >> >> address review comments: use pd_patch_instruction directly >> >> MacroAssembler::pd_patch_instruction can distinguish between the `b` >> and `movk movz movz br` sequences. Strictly speaking, the method >> patches not a single instruction but a semantically joint sequence of >> instructions. Use it directly instead of `NativeJump` and >> `NativeGeneralJump` wrapper classes to simplify the implementation and >> get rid of an extra icache invalidation. >> >> Other changes in the patch simply clean up code that became redundant. > > src/hotspot/cpu/aarch64/nativeInst_aarch64.hpp line 3: > >> 1: /* >> 2: * Copyright (c) 1997, 2025, Oracle and/or its affiliates. All rights reserved. >> 3: * Copyright (c) 2014, 2108, Red Hat Inc. All rights reserved. > > Suggestion: > > * Copyright (c) 2014, 2025, Red Hat Inc. All rights reserved. JIC, the patch doesn't touch this line. Git blame indicates the last time Red Hat's copyright was updated is 2018. Should I replace it with `2018` instead of `2025`? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25702#discussion_r2142078515 From duke at openjdk.org Thu Jun 12 08:54:30 2025 From: duke at openjdk.org (Anton Artemov) Date: Thu, 12 Jun 2025 08:54:30 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v7] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 08:43:47 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value type for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static size_t available_memory(); >> static julong used_memory(); --> static size_t used_memory(); >> static julong free_memory(); --> static size_t free_memory(); >> static jlong total_swap_space(); --> static ssize_t total_swap_space(); >> static jlong free_swap_space(); --> static ssize_t free_swap_space(); >> static julong physical_memory(); --> static size_t physical_memory(); >> >> >> The changes are done so that the other parts of the code have minimal impact. >> Tested in GHA and Tiers 1-4. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8357086: Added missed casts. I think we could extend the use of a large number, `static_cast<size_t>(-1)`, as an error indicator to other platforms, similarly to how it is done in `os_linux.cpp`.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2965706974 From mablakatov at openjdk.org Thu Jun 12 08:59:29 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Thu, 12 Jun 2025 08:59:29 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v2] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 07:44:24 GMT, Andrew Haley wrote: >> Mikhail Ablakatov has updated the pull request incrementally with one additional commit since the last revision: >> >> address review comments: use pd_patch_instruction directly >> >> MacroAssembler::pd_patch_instruction can distinguish between the `b` >> and `movk movz movz br` sequences. Strictly speaking, the method >> patches not a single instruction but a semantically joint sequence of >> instructions. Use it directly instead of `NativeJump` and >> `NativeGeneralJump` wrapper classes to simplify the implementation and >> get rid of an extra icache invalidation. >> >> Other changes in the patch simply clean up code that became redundant. > > Looks good. Please fix the copyright date. The error in java/lang/Thread/virtual/stress/GetStackTraceALotWhenBlocking.java#id0 looks similar to what has been previously reported here: https://bugs.openjdk.org/browse/JDK-8344577 . @theRealAph , do you think the patch may cause the error? Or should I open a similar JBS ticket to report it? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25702#issuecomment-2965724337 From syan at openjdk.org Thu Jun 12 09:00:38 2025 From: syan at openjdk.org (SendaoYan) Date: Thu, 12 Jun 2025 09:00:38 GMT Subject: RFR: 8358004: Delete applications/scimark/Scimark.java test In-Reply-To: <_o6ne7B1-A7nSOva7Zns8IMv7rw0Ve4xTwqiSajNzcA=.755977c7-4376-415c-9f76-5acae9f12b5a@github.com> References: <_o6ne7B1-A7nSOva7Zns8IMv7rw0Ve4xTwqiSajNzcA=.755977c7-4376-415c-9f76-5acae9f12b5a@github.com> Message-ID: On Tue, 20 May 2025 02:55:44 GMT, Leonid Mesnik wrote: > Test > scimark has a bug, described in the https://bugs.openjdk.org/browse/JDK-8315797 > that causes test failure. > > The Scimark is not maintained. The main goal of test was to provide example of Artifact-based test with 3rd party binary. There are a couple of other tests using Artifactory. So this test is completely useless now. > I am removing it just to avoid spending time for anyone who can run test and observe this failure. LGTM ------------- Marked as reviewed by syan (Committer). PR Review: https://git.openjdk.org/jdk/pull/25316#pullrequestreview-2920150175 From adinn at openjdk.org Thu Jun 12 09:08:28 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 12 Jun 2025 09:08:28 GMT Subject: RFR: 8358690: Some initialization code asks for AOT cache status way too early In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 04:44:24 GMT, Vladimir Kozlov wrote: >> Thanks to @shipilev for catching the issue. >> >> [JDK-8350209](https://bugs.openjdk.org/browse/JDK-8350209) came with the bootstrapping problem by checking the AOT cache status way too early. Before full AOT cache init sequence runs, these checks would always reply that AOT cache is off. This causes initial stubs to never practically restored/dumped. >> >> This does not affect JDK 25 because [JDK-8357514](https://github.com/openjdk/jdk/commit/8184ce39a8a732352ee841fed09cae905d27643c) switched off AOT stubs generation. 
>> >> We can't resolve bootstrap issue as it is because `initial_stubs_init()` is called before `universe_init()` where AOT code cache is created. I looked why it is required (based on comments) that `initial_stubs_init()` be called before `universe_init()`. And I found that we had a special stub during HotSpot development (1997) which was used for Vtable entries population when we run with -Xcomp (or whatever was equivalent back then). We still have reference to it in the comment: [stubRoutines.cpp#L185](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stubRoutines.cpp#L185). >> >> We don't have that code anymore. I moved `initial_stubs_init()` after `universe_init()` and `AOTCodeCache::init2()`. I added asserts into some initial stubs to check that they are not NULL when used. I ran from hs-tier1 to hs-tier6 + hs-tier10-rt. >> >> The only issue I found is that `AOTCodeCache::init_early_stubs()` needs to be call separately after `initial_stubs_init()` instead of from `AOTCodeCache::init2()`. This solved bootstrap issue. >> >> I also did some cleanup to match `leyden/premain` branch for easy merges. >> >> Tested hs-tier1-6, hs-tier1-rt, stress, xcomp > > I found that `StubRoutines::_fence_entry` from initial stubs is used by [OrderAccess::fence()](https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/windows_x86/orderAccess_windows_x86.hpp#L50) on Windows-x64 and only. And `OrderAccess::fence()` is used by GC worked threads which are started by `universe_init()`. > > I am work on fixing it. I hope I don't need to move all initial stubs. May be 'pre-initial` stubs for this? @vnkozlov Are you suggesting moving the fence stubs to a separate StubGen preinitial blob created before the initial blob? That should be relatively easy to do. If you want me to push that change first in a separate PR I will be happy to do so. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25763#issuecomment-2965753322 From epeter at openjdk.org Thu Jun 12 09:12:03 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 12 Jun 2025 09:12:03 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v10] In-Reply-To: References: Message-ID: <1C0ByMoDpDlOmbDQVgBTQg7yKI0UaLtX92Xmf0bta4E=.0c060c5a-d60d-4cbe-84c5-03884116ef34@github.com> > **Past Work** > With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. > > **This PR** > I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. > > I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. > > My approach was to aggressively bail-out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered. > > **Future Work:** > In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Idea / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. > > I filed: > [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) > (We can file subtasks for the nodes we want to fix. I don't want to file them all now, but we should file them as we are investigating, so that there is no duplicate work.) 
> > Testing passed tier1-3, with extra timeout factor 20. Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 79 additional commits since the last revision: - Merge branch 'master' into JDK-8347273-verify-IGVN-Ideal-Identity - update comments for Christian - Apply suggestions from code review Co-authored-by: Christian Hagedorn - reorder flags for Christian - max_modes - use stringStream instead of ttyLocker - assert(false) for Christian - rename for Christian - Update src/hotspot/share/opto/phaseX.cpp Co-authored-by: Manuel Hässig - review suggestions, and handled a few more edge cases - ... and 69 more: https://git.openjdk.org/jdk/compare/84e59324...d9546d87 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22970/files - new: https://git.openjdk.org/jdk/pull/22970/files/f54d851a..d9546d87 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22970&range=08-09 Stats: 6953 lines in 245 files changed: 3281 ins; 3006 del; 666 mod Patch: https://git.openjdk.org/jdk/pull/22970.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22970/head:pull/22970 PR: https://git.openjdk.org/jdk/pull/22970 From jbechberger at openjdk.org Thu Jun 12 09:37:47 2025 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Thu, 12 Jun 2025 09:37:47 GMT Subject: RFR: 8358666: [REDO] Implement JEP 509: JFR CPU-Time Profiling In-Reply-To: <1X9_-qoQz-8rsRd3ru68GLPH3d68f3ySLU6NhzWU0IE=.4e6ef66a-0c3c-4d2c-80e1-71d3a286a208@github.com> References: <1X9_-qoQz-8rsRd3ru68GLPH3d68f3ySLU6NhzWU0IE=.4e6ef66a-0c3c-4d2c-80e1-71d3a286a208@github.com> Message-ID: On Wed, 11 Jun 2025 20:31:35 GMT, Sergey Bylokhov wrote: > It seems the new test TestCPUTimeSampleMultipleRecordings fails most of the time on the systems with 90+G of memory.
I can't reproduce this on a machine with 128G of RAM, can you give me more details? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25654#issuecomment-2965873783 From stefank at openjdk.org Thu Jun 12 09:52:30 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 12 Jun 2025 09:52:30 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v4] In-Reply-To: References: <5jpQGssAmQuPcUPgK5wbXQjpBXXCjyY2yiXlNdD-tsI=.886a2462-65f5-4ac8-990e-4a92a2d362d8@github.com> Message-ID: On Thu, 12 Jun 2025 08:02:55 GMT, Anton Artemov wrote: > > Hmm. Maybe I'm reading this wrong, but to me it looks like you can return -2 via this code: > > Yes, you are correct that **that** code can return -2 in some cases. But we need to see where this code returns this value and if it is propagated further. I followed the paths you indicated and I agree that we don't propagate -2 here. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2965932235 From rehn at openjdk.org Thu Jun 12 10:16:28 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 12 Jun 2025 10:16:28 GMT Subject: RFR: 8359105: RISC-V: No need for acquire fence in safepoint poll during JNI calls [v2] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 03:22:10 GMT, Anjian Wen wrote: >> Acquire fence removal in safepoint_poll >> >> At least in jdk11, when comes to safepoint::end, it will invoke SafepointMechanism::disarm_local_poll to change the polling_word_offset, which may cause a race when thread come to visit polling_word_offset in native_trans state, so we used to use acquire fence. >> >> Since the disarm_local_poll has been removed from SafepointSynchronize::end, Thread disarm itself in >> JavaThread::check_special_condition_for_native_trans when trans from native. it seems that there is no need for acquire fence in safepoint_poll. 
>> >> [0] https://github.com/openjdk/jdk/pull/20420 > > Anjian Wen has updated the pull request incrementally with one additional commit since the last revision: > > RISC-V: delete the acquire argument in safepoint_poll since there is no use Thanks, looks good! ------------- Marked as reviewed by rehn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25709#pullrequestreview-2920420668 From kbarrett at openjdk.org Thu Jun 12 10:30:30 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 12 Jun 2025 10:30:30 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v7] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 08:43:47 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value type for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static size_t available_memory(); >> static julong used_memory(); --> static size_t used_memory(); >> static julong free_memory(); --> static size_t free_memory(); >> static jlong total_swap_space(); --> static ssize_t total_swap_space(); >> static jlong free_swap_space(); --> static ssize_t free_swap_space(); >> static julong physical_memory(); --> static size_t physical_memory(); >> >> >> The changes are done so that the other parts of the code have minimal impact. >> Tested in GHA and Tiers 1-4. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8357086: Added missed casts. > > D) a separate error reporting variable. It feels a bit overengineered, though. > > I think I disagree with this. The status quo makes it all too easy to miss that you need to check the return value for an error. IMHO, this is a growing ground for bugs when someone starts to use any of these APIs with realizing that we're mixing two values (and two types) into one return value. Just 2c, to reiterate that _I_ wouldn't mind a solution like this. I agree with @stefank here. 
Also, these are not functions that would be helped by C++17 `[[nodiscard]]`. Note that we already have a mechanism for propagating errors in HotSpot, via CHECK/TRAPS. Not beautiful but it exists. This discussion makes me kind of wish we had something like [Boost.LEAF](https://boostorg.github.io/leaf/). (Though probably trimmed down a bit, since there are features we don't need. And having anything like that would likely have needed a more modern C++ than we had until relatively recently.) ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2966060885 From wenanjian at openjdk.org Thu Jun 12 10:47:39 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Thu, 12 Jun 2025 10:47:39 GMT Subject: RFR: 8359218: RISC-V: Only enable CRC32 intrinsic when AvoidUnalignedAccess == false In-Reply-To: References: <1Pfa6_yopqSW7HLaFAVY1xMAuMsbzeJMcKSVFOHYE_s=.2c61b131-5288-4dde-ab36-0f7c9ef8ccb3@github.com> Message-ID: On Wed, 11 Jun 2025 10:58:44 GMT, Feilong Jiang wrote: >> When running **SPECjvm2008** on the P550, the compress test gives the results shown below. >> >> >> before patch >> -XX:-UseCompactObjectHeaders >> Warmup (30s) begins: Wed Jun 11 16:10:18 CST 2025 >> Warmup (30s) ends: Wed Jun 11 16:10:53 CST 2025 >> Warmup (30s) result: 68.98 ops/m >> >> Iteration 1 (60s) begins: Wed Jun 11 16:10:53 CST 2025 >> Iteration 1 (60s) ends: Wed Jun 11 16:11:57 CST 2025 >> Iteration 1 (60s) result: 71.25 ops/m >> >> >> -XX:+UseCompactObjectHeaders >> Warmup (30s) begins: Wed Jun 11 16:13:03 CST 2025 >> Warmup (30s) ends: Wed Jun 11 16:13:42 CST 2025 >> Warmup (30s) result: 31.87 ops/m >> >> Iteration 1 (60s) begins: Wed Jun 11 16:13:42 CST 2025 >> Iteration 1 (60s) ends: Wed Jun 11 16:14:56 CST 2025 >> Iteration 1 (60s) result: 29.13 ops/m >> >> >> Flame graphs before the patch >> 1.
With parameter -XX:-UseCompactObjectHeaders >> >> java -XX:-UseCompactObjectHeaders -XX:+UseParallelGC -XX:+AlwaysPreTouch -Xms8g -Xmx8g -jar SPECjvm2008.jar -ikv -ict -coe -crf 0 -bt 4 -wt 30s -it 1m -i 4 compress >> >> ![closeCompactObjectHeader](https://github.com/user-attachments/assets/cea7f230-799a-4e9c-a541-626dd485670f) >> >> >> 2. With parameter -XX:+UseCompactObjectHeaders >> >> java -XX:+UseCompactObjectHeaders -XX:+UseParallelGC -XX:+AlwaysPreTouch -Xms8g -Xmx8g -jar SPECjvm2008.jar -ikv -ict -coe -crf 0 -bt 4 -wt 30s -it 1m -i 4 compress >> >> ![openparm](https://github.com/user-attachments/assets/24e07f17-4cb9-459d-b50d-8d1ad6355ebf) >> >> >> The reason is that when UseCompactObjectHeaders is enabled, the arraylist header becomes 4 bytes, but the copy instruction used in the CRC intrinsic loop handles 8 bytes per iteration, which may reduce performance when the hardware is sensitive to unaligned access. We can disable the intrinsic when AvoidUnalignedAccesses is true. >> >> after patch >> >> -XX:-UseCompactObjectHeaders >> Warmup (30s) begins: Wed Jun 11 16:23:22 CST 2025 >> Warmup (30s) ends: Wed Jun 11 16:23:57 CST 2025 >> Warmup (30s) result: 68.61 ops/m >> >> Iteration 1 (60s) begins: Wed Jun 11 16:23:57 CST 2025 >> Iteration 1 (60s) ends: Wed Jun 11 16:25:00 CST 2025 >> Iteration 1 (60s) result: 71.57 ops/m >> >> >> -XX:+UseCompactObjectHeaders >> Warmup (30s) begins: Wed Jun 11 16:25:28 CST 2025 >> Warmup (30s) ends: Wed Jun 11 16:26:03 CST 2025 >> Warmup (30s) result: 68.36 ops/m >> >> Iteration 1 (60s) begins: Wed Jun 11 16:26:03 CST 2025 >> Iteration 1 (60s) ends: Wed Jun 11 16:27:08 CST 2025 >> Iteration 1 (60s) result: 70.85 ops/m > > Looks fine. @feilongjiang @RealFYang Thanks for your reviews!
------------- PR Comment: https://git.openjdk.org/jdk/pull/25743#issuecomment-2966099409 From wenanjian at openjdk.org Thu Jun 12 10:47:40 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Thu, 12 Jun 2025 10:47:40 GMT Subject: Integrated: 8359218: RISC-V: Only enable CRC32 intrinsic when AvoidUnalignedAccess == false In-Reply-To: <1Pfa6_yopqSW7HLaFAVY1xMAuMsbzeJMcKSVFOHYE_s=.2c61b131-5288-4dde-ab36-0f7c9ef8ccb3@github.com> References: <1Pfa6_yopqSW7HLaFAVY1xMAuMsbzeJMcKSVFOHYE_s=.2c61b131-5288-4dde-ab36-0f7c9ef8ccb3@github.com> Message-ID: On Wed, 11 Jun 2025 08:47:49 GMT, Anjian Wen wrote: > When running **SPECjvm2008** on the P550, the compress test gives the results shown below. > > > before patch > -XX:-UseCompactObjectHeaders > Warmup (30s) begins: Wed Jun 11 16:10:18 CST 2025 > Warmup (30s) ends: Wed Jun 11 16:10:53 CST 2025 > Warmup (30s) result: 68.98 ops/m > > Iteration 1 (60s) begins: Wed Jun 11 16:10:53 CST 2025 > Iteration 1 (60s) ends: Wed Jun 11 16:11:57 CST 2025 > Iteration 1 (60s) result: 71.25 ops/m > > > -XX:+UseCompactObjectHeaders > Warmup (30s) begins: Wed Jun 11 16:13:03 CST 2025 > Warmup (30s) ends: Wed Jun 11 16:13:42 CST 2025 > Warmup (30s) result: 31.87 ops/m > > Iteration 1 (60s) begins: Wed Jun 11 16:13:42 CST 2025 > Iteration 1 (60s) ends: Wed Jun 11 16:14:56 CST 2025 > Iteration 1 (60s) result: 29.13 ops/m > > > Flame graphs before the patch > 1. With parameter -XX:-UseCompactObjectHeaders > > java -XX:-UseCompactObjectHeaders -XX:+UseParallelGC -XX:+AlwaysPreTouch -Xms8g -Xmx8g -jar SPECjvm2008.jar -ikv -ict -coe -crf 0 -bt 4 -wt 30s -it 1m -i 4 compress > > ![closeCompactObjectHeader](https://github.com/user-attachments/assets/cea7f230-799a-4e9c-a541-626dd485670f) > > > 2.
With parameter -XX:+UseCompactObjectHeaders > > java -XX:+UseCompactObjectHeaders -XX:+UseParallelGC -XX:+AlwaysPreTouch -Xms8g -Xmx8g -jar SPECjvm2008.jar -ikv -ict -coe -crf 0 -bt 4 -wt 30s -it 1m -i 4 compress > > ![openparm](https://github.com/user-attachments/assets/24e07f17-4cb9-459d-b50d-8d1ad6355ebf) > > > The reason is that when UseCompactObjectHeaders is enabled, the arraylist header becomes 4 bytes, but the copy instruction used in the CRC intrinsic loop handles 8 bytes per iteration, which may reduce performance when the hardware is sensitive to unaligned access. We can disable the intrinsic when AvoidUnalignedAccesses is true. > > after patch > > -XX:-UseCompactObjectHeaders > Warmup (30s) begins: Wed Jun 11 16:23:22 CST 2025 > Warmup (30s) ends: Wed Jun 11 16:23:57 CST 2025 > Warmup (30s) result: 68.61 ops/m > > Iteration 1 (60s) begins: Wed Jun 11 16:23:57 CST 2025 > Iteration 1 (60s) ends: Wed Jun 11 16:25:00 CST 2025 > Iteration 1 (60s) result: 71.57 ops/m > > > -XX:+UseCompactObjectHeaders > Warmup (30s) begins: Wed Jun 11 16:25:28 CST 2025 > Warmup (30s) ends: Wed Jun 11 16:26:03 CST 2025 > Warmup (30s) result: 68.36 ops/m > > Iteration 1 (60s) begins: Wed Jun 11 16:26:03 CST 2025 > Iteration 1 (60s) ends: Wed Jun 11 16:27:08 CST 2025 > Iteration 1 (60s) result: 70.85 ops/m This pull request has now been integrated. Changeset: 65e63b6a Author: Anjian Wen Committer: Feilong Jiang URL: https://git.openjdk.org/jdk/commit/65e63b6ab4241fc9d683e2ffa5bfe6e1a30059b6 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod 8359218: RISC-V: Only enable CRC32 intrinsic when AvoidUnalignedAccess == false Reviewed-by: fyang, fjiang ------------- PR: https://git.openjdk.org/jdk/pull/25743 From dnsimon at openjdk.org Thu Jun 12 10:52:38 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Thu, 12 Jun 2025 10:52:38 GMT Subject: RFR: 8359293: Make TestNoNULL extensible Message-ID: There are some closed markdown files that contain "NULL". These should be excluded from checking by TestNoNULL.
This PR makes TestNoNULL extensible in terms of exclusions with the following system properties: * `excludedTestExtensions` - comma-separated list of file extensions (e.g. `-DexcludedTestExtensions=.md,.txt`) * `sourceExclusions` - comma-separated list of file paths relative to repo root (e.g. `-DsourceExclusions=src/hotspot/share/prims/jvmti.xml`) * `testExclusions` - comma-separated list of file paths relative to repo root (e.g. `-DtestExclusions=test/hotspot/jtreg/vmTestbase/nsk/share/jvmti/README`) ------------- Commit messages: - made TestNoNULL extensible in terms of exclusions Changes: https://git.openjdk.org/jdk/pull/25777/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25777&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359293 Stats: 19 lines in 1 file changed: 14 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/25777.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25777/head:pull/25777 PR: https://git.openjdk.org/jdk/pull/25777 From fyang at openjdk.org Thu Jun 12 11:09:27 2025 From: fyang at openjdk.org (Fei Yang) Date: Thu, 12 Jun 2025 11:09:27 GMT Subject: RFR: 8359105: RISC-V: No need for acquire fence in safepoint poll during JNI calls [v2] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 03:22:10 GMT, Anjian Wen wrote: >> Acquire fence removal in safepoint_poll >> >> At least in JDK 11, safepoint::end invoked SafepointMechanism::disarm_local_poll to change the word at polling_word_offset, which could race with a thread reading that word in the native_trans state, so an acquire fence was used. >> >> Since disarm_local_poll has been removed from SafepointSynchronize::end, a thread now disarms itself in JavaThread::check_special_condition_for_native_trans when transitioning from native, so it seems that there is no need for an acquire fence in safepoint_poll.
>> >> [0] https://github.com/openjdk/jdk/pull/20420 > > Anjian Wen has updated the pull request incrementally with one additional commit since the last revision: > > RISC-V: delete the acquire argument in safepoint_poll since there is no use LGTM. FYI: I see another pending PR (https://github.com/openjdk/jdk/pull/25211) which handles the rest for aarch64. ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25709#pullrequestreview-2920585479 From duke at openjdk.org Thu Jun 12 11:28:35 2025 From: duke at openjdk.org (Jonas Norlinder) Date: Thu, 12 Jun 2025 11:28:35 GMT Subject: Withdrawn: 8359110: Log accumulated GC and process CPU time upon VM exit In-Reply-To: References: Message-ID: <24ib50kFQIUcwJ1BlMmQPdDDZI2wDSV0aHdR1_ratQE=.7c283605-89a7-491b-802b-86958b78e801@github.com> On Tue, 10 Jun 2025 13:24:37 GMT, Jonas Norlinder wrote: > Add support to log CPU cost for GC during VM exit with `-Xlog:gc`. > > > [1.500s][info ][gc] GC CPU cost: 1.75% > > > Additionally, detailed information may be retrieved with `-Xlog:gc=trace` > > > [1.500s][trace][gc] Process CPU time: 4.945370s > [1.500s][trace][gc] GC CPU time: 0.086382s > [1.500s][info ][gc] GC CPU cost: 1.75% This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/25724 From thartmann at openjdk.org Thu Jun 12 11:41:54 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 12 Jun 2025 11:41:54 GMT Subject: RFR: 8359200: Memory corruption in MStack::push [v2] In-Reply-To: References: Message-ID: > I found this by accident when running testing with non-default `-XX:OptoNodeListSize` (see JBS for details). 
The problem is that `MStack::push` assumes that `Node_Stack::grow` will always grow the stack by at least two (line 69) and it then proceeds to put two items in: > https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/matcher.hpp#L67-L74 > > But after [JDK-8336999](https://bugs.openjdk.org/browse/JDK-8336999), `Node_Stack::grow` will only grow the stack if needed: > https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/node.cpp#L3027-L3031 > > However, if there's **one** empty slot, the stack will not be grown and `MStack::push` then puts **two** items on the stack which leads to memory corruption. > > I refactored the push method to delegate to `Node_Stack::push` which does the right thing and, similar to [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), also added `maybe_grow` methods for all the containers affected by the original change. For additional coverage, I moved the `_nesting.check` calls to before the check that determines if we grow. > > I only ever observed this with a non-default and odd value for `-XX:OptoNodeListSize` but I'm not 100% convinced that it can't happen with the default value, so I'm treating this as P2 and will backport the fix to JDK 25. > > @shipilev Since you worked on [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), could you please take a look at this? 
> > Thanks, > Tobias Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: Improved assert message ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25751/files - new: https://git.openjdk.org/jdk/pull/25751/files/ec817585..8ac8fcd0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25751&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25751&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25751.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25751/head:pull/25751 PR: https://git.openjdk.org/jdk/pull/25751 From thartmann at openjdk.org Thu Jun 12 11:41:54 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 12 Jun 2025 11:41:54 GMT Subject: RFR: 8359200: Memory corruption in MStack::push [v2] In-Reply-To: References: Message-ID: On Wed, 11 Jun 2025 19:18:34 GMT, Aleksey Shipilev wrote: >> src/hotspot/share/opto/block.cpp line 41: >> >>> 39: >>> 40: void Block_Array::grow(uint i) { >>> 41: assert(i >= Max(), "must be an overflow"); >> >> Assert message here is misleading: it is more likely someone had called `grow` when they intended `maybe_grow`. See how it is done elsewhere: >> >> >> void Node_Array::grow(uint i) { >> _nesting.check(_a); // Check if a potential reallocation in the arena is safe >> assert(i >= _max, "Should have been checked before, use maybe_grow?"); > > Speaking of, we should probably move `_nesting.check(_a);` to `Node_Array::maybe_grow` as well. > Assert message here is misleading Yes, good point. I had basically reverted to before [JDK-8336999](https://bugs.openjdk.org/browse/JDK-8336999) but your assert message is better. Fixed. > Speaking of, we should probably move _nesting.check(_a); to Node_Array::maybe_grow as well. Right, I did that already. See changes in `src/hotspot/share/opto/node.hpp`. 
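For readers following the `grow` vs. `maybe_grow` discussion above, here is a minimal, self-contained sketch of the failure mode described in the PR text: a "grow only if needed" check guarantees one free slot, not two. This is not HotSpot code; `ToyStack`, `grow_if_full`, `pair_push_would_overflow` and `push_pair` are hypothetical names chosen for illustration, and `std::vector` stands in for the arena-backed storage.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a stack with explicit growth (hypothetical; not
// HotSpot's Node_Stack or MStack).
struct ToyStack {
  std::vector<int> slots;  // backing storage; slots.size() is the capacity
  std::size_t top = 0;     // number of used slots

  explicit ToyStack(std::size_t capacity) : slots(capacity) {}

  std::size_t free_slots() const { return slots.size() - top; }

  // Grows only when no slot is free, mirroring the "grow only if needed"
  // behaviour described in the thread.
  void grow_if_full() {
    if (free_slots() == 0) {
      slots.resize(slots.size() * 2 + 1);
    }
  }

  // Safe single push: capacity is checked before every write.
  void push(int v) {
    grow_if_full();
    slots[top++] = v;
  }

  // True when "call grow_if_full() once, then write two slots" would write
  // past the end: exactly one slot is free, so no growth happens.
  bool pair_push_would_overflow() const { return free_slots() == 1; }

  // Fixed variant: delegate to the single-element push for each item.
  void push_pair(int a, int b) {
    push(a);
    push(b);
  }
};
```

With a capacity of 3 and two elements pushed, one slot is free, so a single growth check followed by two writes would overflow, while `push_pair` grows before the second write.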
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25751#discussion_r2142481527 From thartmann at openjdk.org Thu Jun 12 11:45:33 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 12 Jun 2025 11:45:33 GMT Subject: RFR: 8359200: Memory corruption in MStack::push In-Reply-To: References: Message-ID: On Wed, 11 Jun 2025 12:57:47 GMT, Marc Chevalier wrote: > So, I suppose one shouldn't use grow directly, and you indeed changed Block_Array::map this way. But what about derived classes? grow is only protected in most (all?) of these classes, so some derived classes could call it. Is it worth making it private? Or maybe calling grow directly is not so wrong? I think it's still completely reasonable to directly call `Node_Array::grow(uint i)`. For example, when you know that you need a specific amount of storage and want to avoid incrementally growing the array when adding elements. So I'd say it's fine to leave `grow` as-is. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25751#issuecomment-2966328874 From thartmann at openjdk.org Thu Jun 12 11:45:37 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 12 Jun 2025 11:45:37 GMT Subject: RFR: 8359200: Memory corruption in MStack::push [v2] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 11:41:54 GMT, Tobias Hartmann wrote: >> I found this by accident when running testing with non-default `-XX:OptoNodeListSize` (see JBS for details). 
The problem is that `MStack::push` assumes that `Node_Stack::grow` will always grow the stack by at least two (line 69) and it then proceeds to put two items in: >> https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/matcher.hpp#L67-L74 >> >> But after [JDK-8336999](https://bugs.openjdk.org/browse/JDK-8336999), `Node_Stack::grow` will only grow the stack if needed: >> https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/node.cpp#L3027-L3031 >> >> However, if there's **one** empty slot, the stack will not be grown and `MStack::push` then puts **two** items on the stack which leads to memory corruption. >> >> I refactored the push method to delegate to `Node_Stack::push` which does the right thing and, similar to [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), also added `maybe_grow` methods for all the containers affected by the original change. For additional coverage, I moved the `_nesting.check` calls to before the check that determines if we grow. >> >> I only ever observed this with a non-default and odd value for `-XX:OptoNodeListSize` but I'm not 100% convinced that it can't happen with the default value, so I'm treating this as P2 and will backport the fix to JDK 25. >> >> @shipilev Since you worked on [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), could you please take a look at this? >> >> Thanks, >> Tobias > > Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: > > Improved assert message Thanks for your reviews, Marc, Vladimir and Aleksey! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25751#issuecomment-2966330055 From shade at openjdk.org Thu Jun 12 12:17:36 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 12 Jun 2025 12:17:36 GMT Subject: RFR: 8359200: Memory corruption in MStack::push [v2] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 11:41:54 GMT, Tobias Hartmann wrote: >> I found this by accident when running testing with non-default `-XX:OptoNodeListSize` (see JBS for details). The problem is that `MStack::push` assumes that `Node_Stack::grow` will always grow the stack by at least two (line 69) and it then proceeds to put two items in: >> https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/matcher.hpp#L67-L74 >> >> But after [JDK-8336999](https://bugs.openjdk.org/browse/JDK-8336999), `Node_Stack::grow` will only grow the stack if needed: >> https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/node.cpp#L3027-L3031 >> >> However, if there's **one** empty slot, the stack will not be grown and `MStack::push` then puts **two** items on the stack which leads to memory corruption. >> >> I refactored the push method to delegate to `Node_Stack::push` which does the right thing and, similar to [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), also added `maybe_grow` methods for all the containers affected by the original change. For additional coverage, I moved the `_nesting.check` calls to before the check that determines if we grow. >> >> I only ever observed this with a non-default and odd value for `-XX:OptoNodeListSize` but I'm not 100% convinced that it can't happen with the default value, so I'm treating this as P2 and will backport the fix to JDK 25. >> >> @shipilev Since you worked on [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), could you please take a look at this? 
>> >> Thanks, >> Tobias > > Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: > > Improved assert message Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25751#pullrequestreview-2920889686 From jsjolen at openjdk.org Thu Jun 12 12:25:43 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 12 Jun 2025 12:25:43 GMT Subject: RFR: 8352075: Perf regression accessing fields [v30] In-Reply-To: <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> Message-ID: <8QrMgCDug6daTsW-P_4wDGKMHHoXxsgs1LHmxX44J5s=.01069d54-cac2-47d2-bfc6-480b295285fd@github.com> On Tue, 10 Jun 2025 19:44:03 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields).
>> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Copyright update Thanks for the good and patient work. ------------- Marked as reviewed by jsjolen (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/24847#pullrequestreview-2920919102 From coleenp at openjdk.org Thu Jun 12 12:25:44 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 12 Jun 2025 12:25:44 GMT Subject: RFR: 8352075: Perf regression accessing fields [v30] In-Reply-To: <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> Message-ID: <9xa9l6ZpE8Pshkn0zMrS9G849EpAiWHqi1A5rSS2g7g=.8abd3caa-ef62-4397-a83a-df0011563647@github.com> On Tue, 10 Jun 2025 19:44:03 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. 
>> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Copyright update Tier1 on Oracle platforms passed, and tier1-7 have passed with yesterday's OpenJDK code and this change.
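The lookup scheme in the quoted PR description (fields sorted by name, plus a side table with one entry for positions 16, 32, ...) can be sketched roughly as below. This is only an illustration under simplifying assumptions: the actual change works on a variable-length-encoded field stream, whereas this sketch uses a plain `std::vector<std::string>` and string ordering. `ToyFieldTable`, `kStride`, `checkpoints` and `find` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy model of "sorted fields + one checkpoint per 16 fields" (hypothetical;
// not the HotSpot field stream implementation).
struct ToyFieldTable {
  static constexpr std::size_t kStride = 16;

  std::vector<std::string> names;        // field names, sorted
  std::vector<std::size_t> checkpoints;  // positions 16, 32, ... into names

  explicit ToyFieldTable(std::vector<std::string> fields)
      : names(std::move(fields)) {
    std::sort(names.begin(), names.end());
    for (std::size_t i = kStride; i < names.size(); i += kStride) {
      checkpoints.push_back(i);
    }
  }

  // Returns the index of `name` in sorted order, or -1 if it is absent.
  long find(const std::string& name) const {
    // Step 1: walk the small checkpoint table to find the last checkpoint
    // whose field name is not greater than the one we are looking for.
    std::size_t start = 0;
    for (std::size_t cp : checkpoints) {
      if (names[cp] <= name) {
        start = cp;
      } else {
        break;
      }
    }
    // Step 2: scan field-by-field from there, touching at most kStride fields.
    std::size_t end = std::min(start + kStride, names.size());
    for (std::size_t i = start; i < end; ++i) {
      if (names[i] == name) {
        return static_cast<long>(i);
      }
    }
    return -1;
  }
};
```

For a class with 40 fields this inspects at most two checkpoints and then at most 16 fields, instead of up to 40 fields in a purely sequential scan.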
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2966487473 From coleenp at openjdk.org Thu Jun 12 12:31:54 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 12 Jun 2025 12:31:54 GMT Subject: RFR: 8352075: Perf regression accessing fields [v30] In-Reply-To: <-XuFmpo5Oj6x-mKlpgSseo5jJUZdY9GXo2Hm17pue0I=.4da01a2c-2c37-4bb3-a038-467746ad1582@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> <-XuFmpo5Oj6x-mKlpgSseo5jJUZdY9GXo2Hm17pue0I=.4da01a2c-2c37-4bb3-a038-467746ad1582@github.com> Message-ID: On Wed, 11 Jun 2025 16:35:01 GMT, Radim Vansa wrote: >> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: >> >> Copyright update > > Thank you! I'll mark this for integration and I would appreciate if I can get your sponsorship. @rvansa thank you for your work on this change, fixing the performance regression in an understandable way and answering all comments and questions. I think it's ready to go. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2966506671 From rvansa at openjdk.org Thu Jun 12 12:31:55 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 12 Jun 2025 12:31:55 GMT Subject: Integrated: 8352075: Perf regression accessing fields In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Thu, 24 Apr 2025 10:37:56 GMT, Radim Vansa wrote: > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . 
Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min … max): 45.1 ms … 53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min … max): 73.8 ms … 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min … max): 37.7 ms …
42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s This pull request has now been integrated. Changeset: e18277b4 Author: Radim Vansa Committer: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/e18277b470a162b9668297e8e286c812c4b0b604 Stats: 924 lines in 18 files changed: 854 ins; 20 del; 50 mod 8352075: Perf regression accessing fields Reviewed-by: coleenp, iklam, jsjolen ------------- PR: https://git.openjdk.org/jdk/pull/24847 From coleenp at openjdk.org Thu Jun 12 12:45:16 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 12 Jun 2025 12:45:16 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory Message-ID: This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about performance going forward. JVMTI uses a lot of jmethodIDs and there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti in a product build with and without this change showed no difference in time. The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory in place just in case JVMTI code still had references to that jmethodID, so that it would get nullptr instead of crashing. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated.
When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. Tested with tier1-4, 5-7. ------------- Commit messages: - 8305626: Kitchensink7D.java failing because of native memory exhausting with ZGC Changes: https://git.openjdk.org/jdk/pull/25267/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8268406 Stats: 596 lines in 14 files changed: 315 ins; 223 del; 58 mod Patch: https://git.openjdk.org/jdk/pull/25267.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25267/head:pull/25267 PR: https://git.openjdk.org/jdk/pull/25267 From mdoerr at openjdk.org Thu Jun 12 12:53:31 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 12 Jun 2025 12:53:31 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v2] In-Reply-To: <8oM2Y0kAMRN6wxtjAmpXDWTcHDZ6gPrNM-8PPtukwAA=.dde201dd-8c2a-4a16-96e5-ca92604e6edd@github.com> References: <8oM2Y0kAMRN6wxtjAmpXDWTcHDZ6gPrNM-8PPtukwAA=.dde201dd-8c2a-4a16-96e5-ca92604e6edd@github.com> Message-ID: On Thu, 12 Jun 2025 02:01:47 GMT, Dean Long wrote: >> This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. >> >> We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. 
The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. >> >> The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. >> >> For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. >> >> This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. > > Dean Long has updated the pull request incrementally with two additional commits since the last revision: > > - ... and stale code > - removed stale comment I appreciate this PR. I like using the nmethod entry barrier and getting rid of not-entrant patching. I have some change requests. In addition, we can get rid of much more code. I've put my proposal in a Commit: https://github.com/TheRealMDoerr/jdk/commit/4aed569dc353c254a2f4de2d387208a0a1323990 Please take a look! 
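Editor's sketch of the "sticky" bit scheme described in the quoted PR text above: one bit of the guard value is reserved for not_entrant, and every update goes through a lock so the read-modify-write is atomic. This is a minimal model under stated assumptions, not the HotSpot implementation: the `Guard` struct, the bit position, and the use of `std::mutex` in place of `NMethodEntryBarrier_lock` are all hypothetical.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the guard word of one nmethod.
struct Guard {
  int value = 0;
};

// Assumed bit position for the sticky bit; the real value is platform code.
constexpr int not_entrant_bit = 1 << 30;

// Plays the role of NMethodEntryBarrier_lock in this sketch.
std::mutex guard_lock;

// Update the part of the guard owned by the GC while preserving the sticky bit.
void set_guard_value(Guard& g, int new_value) {
  std::lock_guard<std::mutex> ml(guard_lock);
  int sticky = g.value & not_entrant_bit;
  int value = (new_value & ~not_entrant_bit) | sticky;
  if (g.value != value) {  // only write when the value actually changes
    g.value = value;
  }
}

// Permanently arm the barrier: once set, the bit is never cleared.
void make_not_entrant(Guard& g) {
  std::lock_guard<std::mutex> ml(guard_lock);
  g.value |= not_entrant_bit;
}

bool is_not_entrant(const Guard& g) {
  std::lock_guard<std::mutex> ml(guard_lock);
  return (g.value & not_entrant_bit) != 0;
}
```

Because the setter masks the incoming value and re-applies the stored sticky bit, a later disarm by the GC cannot make a not_entrant method entrant again, which is the property the PR relies on.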
src/hotspot/share/gc/shared/barrierSetNMethod.cpp line 246: > 244: ConditionalMutexLocker ml(NMethodEntryBarrier_lock, !NMethodEntryBarrier_lock->owned_by_self(), Mutex::_no_safepoint_check_flag); > 245: int value = guard_value(nm) | not_entrant; > 246: set_guard_value(nm, value); Same here. src/hotspot/share/gc/shenandoah/shenandoahCodeRoots.cpp line 153: > 151: // Code cache unloading needs to know about on-stack nmethods. Arm the nmethods to get > 152: // mark_as_maybe_on_stack() callbacks when they are used again. > 153: _bs->arm(nm); This breaks the Shenandoah build. `arm` is private. src/hotspot/share/gc/z/zBarrierSetNMethod.cpp line 116: > 114: value |= not_entrant; > 115: } > 116: set_guard_value(nm, value); We can avoid redundant patching since we have the lock: Only update if the value is not already there. This happens quite often. ------------- Changes requested by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25764#pullrequestreview-2920990811 PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2142655962 PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2142641974 PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2142654627 From adinn at openjdk.org Thu Jun 12 13:03:47 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 12 Jun 2025 13:03:47 GMT Subject: RFR: 8352075: Perf regression accessing fields [v30] In-Reply-To: <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> Message-ID: On Tue, 10 Jun 2025 19:44:03 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . 
Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms …
42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Copyright update Nice work, Radim! Thanks for persevering with this change. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2966624330 From duke at openjdk.org Thu Jun 12 13:50:48 2025 From: duke at openjdk.org (Anton Artemov) Date: Thu, 12 Jun 2025 13:50:48 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v8] In-Reply-To: References: Message-ID: > Hi, > > in this PR the output value type for functions which return memory are changed, namely: > > > static julong available_memory(); --> static size_t available_memory(); > static julong used_memory(); --> static size_t used_memory(); > static julong free_memory(); --> static size_t free_memory(); > static jlong total_swap_space(); --> static ssize_t total_swap_space(); > static jlong free_swap_space(); --> static ssize_t free_swap_space(); > static julong physical_memory(); --> static size_t physical_memory(); > > > The changes are done so that the other parts of the code have minimal impact. > Tested in GHA and Tiers 1-4. Anton Artemov has updated the pull request incrementally with two additional commits since the last revision: - 8357086: Changed returm type to struct. - 8357086: Return size_t from swap mem funcs, added checks. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25450/files - new: https://git.openjdk.org/jdk/pull/25450/files/2929f720..849d5466 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=06-07 Stats: 200 lines in 20 files changed: 88 ins; 1 del; 111 mod Patch: https://git.openjdk.org/jdk/pull/25450.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25450/head:pull/25450 PR: https://git.openjdk.org/jdk/pull/25450 From kvn at openjdk.org Thu Jun 12 13:52:32 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 12 Jun 2025 13:52:32 GMT Subject: RFR: 8358690: Some initialization code asks for AOT cache status way too early In-Reply-To: References: Message-ID: <_16KBkO-TmmTuaWG8l9ekstEdIP9BKsyXip0H0u_NHs=.3b1479e2-a0b7-41b1-a72a-12d217a3d628@github.com> On Thu, 12 Jun 2025 04:44:24 GMT, Vladimir Kozlov wrote: >> Thanks to @shipilev for catching the issue. >> >> [JDK-8350209](https://bugs.openjdk.org/browse/JDK-8350209) came with the bootstrapping problem by checking the AOT cache status way too early. Before full AOT cache init sequence runs, these checks would always reply that AOT cache is off. This causes initial stubs to never practically restored/dumped. >> >> This does not affect JDK 25 because [JDK-8357514](https://github.com/openjdk/jdk/commit/8184ce39a8a732352ee841fed09cae905d27643c) switched off AOT stubs generation. >> >> We can't resolve bootstrap issue as it is because `initial_stubs_init()` is called before `universe_init()` where AOT code cache is created. I looked why it is required (based on comments) that `initial_stubs_init()` be called before `universe_init()`. And I found that we had a special stub during HotSpot development (1997) which was used for Vtable entries population when we run with -Xcomp (or whatever was equivalent back then). 
We still have a reference to it in the comment: [stubRoutines.cpp#L185](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stubRoutines.cpp#L185). >> >> We don't have that code anymore. I moved `initial_stubs_init()` after `universe_init()` and `AOTCodeCache::init2()`. I added asserts into some initial stubs to check that they are not NULL when used. I ran from hs-tier1 to hs-tier6 + hs-tier10-rt. >> >> The only issue I found is that `AOTCodeCache::init_early_stubs()` needs to be called separately after `initial_stubs_init()` instead of from `AOTCodeCache::init2()`. This solved the bootstrap issue. >> >> I also did some cleanup to match `leyden/premain` branch for easy merges. >> >> Tested hs-tier1-6, hs-tier1-rt, stress, xcomp > > I found that `StubRoutines::_fence_entry` from initial stubs is used by [OrderAccess::fence()](https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/windows_x86/orderAccess_windows_x86.hpp#L50) on Windows-x64 only. And `OrderAccess::fence()` is used by GC worker threads which are started by `universe_init()`. > > I am working on fixing it. I hope I don't need to move all initial stubs. Maybe `pre-initial` stubs for this? > @vnkozlov Are you suggesting moving the fence stubs to a separate StubGen preinitial blob created before the initial blob? That should be relatively easy to do. > > If you want me to push that change first in a separate PR I will be happy to do so. Yes, please. There could be other stubs we need very early which we can exclude from AOTing. Maybe call it `pre-universe` to be clear. My only concern is that the fence stub is used only by windows-x86. Introducing a whole new stubs type for that is overkill IMHO. But on the other hand, we may need such a type later for other new stubs. Or we may find later that some initial stubs are still needed before `universe_init`.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25763#issuecomment-2966819886 From duke at openjdk.org Thu Jun 12 14:01:14 2025 From: duke at openjdk.org (Anton Artemov) Date: Thu, 12 Jun 2025 14:01:14 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v9] In-Reply-To: References: Message-ID: > Hi, > > in this PR the output value type for functions which return memory are changed, namely: > > > static julong available_memory(); --> static size_t available_memory(); > static julong used_memory(); --> static size_t used_memory(); > static julong free_memory(); --> static size_t free_memory(); > static jlong total_swap_space(); --> static ssize_t total_swap_space(); > static jlong free_swap_space(); --> static ssize_t free_swap_space(); > static julong physical_memory(); --> static size_t physical_memory(); > > > The changes are done so that the other parts of the code have minimal impact. > Tested in GHA and Tiers 1-4. Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 17 commits: - 8357086: Fixed merge conflict. - 8357086: Changed returm type to struct. - 8357086: Return size_t from swap mem funcs, added checks. - 8357086: Added missed casts. - 8357086: Changed return type for total_swap_space and free_swap_space to ssize_t - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs - 8357086: Fixed spaces in formatting in gc-related code. - 8357086: Fixed formatting. - 8357086: Addressed reviewer's comments. - 8357086: More work. - ... 
and 7 more: https://git.openjdk.org/jdk/compare/e18277b4...dd9275ca ------------- Changes: https://git.openjdk.org/jdk/pull/25450/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=08 Stats: 247 lines in 22 files changed: 88 ins; 1 del; 158 mod Patch: https://git.openjdk.org/jdk/pull/25450.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25450/head:pull/25450 PR: https://git.openjdk.org/jdk/pull/25450 From duke at openjdk.org Thu Jun 12 14:13:31 2025 From: duke at openjdk.org (Anton Artemov) Date: Thu, 12 Jun 2025 14:13:31 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v4] In-Reply-To: References: <5jpQGssAmQuPcUPgK5wbXQjpBXXCjyY2yiXlNdD-tsI=.886a2462-65f5-4ac8-990e-4a92a2d362d8@github.com> Message-ID: <07X7fYvr39DepTDPvK54FK-dyib3stth8r9WmaVponE=.dfbcfc57-0fe1-410c-af20-563147aff7a4@github.com> On Wed, 11 Jun 2025 12:23:13 GMT, Kim Barrett wrote: > I prefer the dedicated struct approach, as it > provides meaningful names. I now refactored the code to return a struct instead. > This may have been more effort than I envisioned when creating this issue, sorry for that. I would love it if we could unify these behaviours. I tried to address all the issues you pointed out. Now all methods return the same type, and on all platforms including Windows there is a check. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2966983553 From epeter at openjdk.org Thu Jun 12 14:21:49 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 12 Jun 2025 14:21:49 GMT Subject: RFR: 8347273: C2: VerifyIterativeGVN for Ideal and Identity [v10] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 08:08:25 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 79 additional commits since the last revision: >> >> - Merge branch 'master' into JDK-8347273-verify-IGVN-Ideal-Identity >> - update comments for Christian >> - Apply suggestions from code review >> >> Co-authored-by: Christian Hagedorn >> - reorder flags for Christian >> - max_modes >> - use stringStream instead of ttyLocker >> - assert(false) for Christian >> - rename for Christian >> - Update src/hotspot/share/opto/phaseX.cpp >> >> Co-authored-by: Manuel H?ssig >> - review suggestions, and handled a few more edge cases >> - ... and 69 more: https://git.openjdk.org/jdk/compare/78bf9941...d9546d87 > > Update looks good, thanks! @chhagedorn @mhaessig Thank you for the reviews and all the helpful suggestions :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/22970#issuecomment-2967020432 From epeter at openjdk.org Thu Jun 12 14:21:49 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 12 Jun 2025 14:21:49 GMT Subject: Integrated: 8347273: C2: VerifyIterativeGVN for Ideal and Identity In-Reply-To: References: Message-ID: <4DpljUxScfOlta-_yDUbhiFPO3FPb7LkYhJcNjazwJ4=.b6ca86ff-551d-4f61-8fcf-322a3bff8464@github.com> On Wed, 8 Jan 2025 14:43:40 GMT, Emanuel Peter wrote: > **Past Work** > With https://github.com/openjdk/jdk/pull/11775 / [JDK-8298952](https://bugs.openjdk.org/browse/JDK-8298952) we added `Node::Value` verification. > > **This PR** > I'm now adding verification for `Ideal` and `Identity`. I'm adding two bits to the flag `VerifyIterativeGVN`. > > I found many many node types that hit my verification assert, i.e. that could still be optimized after IGVN is over, just because these nodes were not put on the worklist any more. > > My approach was to aggressively bail-out for all nodes that had an issue. This way, we can address one by one in follow-up RFEs. For many, I did some initial assessment, and left some comments about what issues I encountered. 
> > **Future Work:** > In many cases, the issue is just a missing notification when inputs of inputs are changed. These would be good starter tasks. But there are probably also more complicated cases. And there are surely cases where verification will be impossible, because it is possible that the Ideal / Identity optimizations traverse longer paths, and we cannot expect that notification makes it down that path. For those cases, we will have to leave the exception and document it well. > > I filed: > [JDK-8359103](https://bugs.openjdk.org/browse/JDK-8359103) C2 VerifyIterativeGVN: Umbrella for extending Ideal and Identity verification (JDK-8347273) > (We can file subtasks for the nodes we want to fix. I don't want to file them all now, but we should file them as we are investigating, so that there is no duplicate work.) > > Testing passed tier1-3, with extra timeout factor 20. This pull request has now been integrated. Changeset: dd688290 Author: Emanuel Peter URL: https://git.openjdk.org/jdk/commit/dd68829017c3adea4068d5311cab3fbef87b9577 Stats: 925 lines in 5 files changed: 900 ins; 0 del; 25 mod 8347273: C2: VerifyIterativeGVN for Ideal and Identity Reviewed-by: chagedorn, mhaessig ------------- PR: https://git.openjdk.org/jdk/pull/22970 From stefank at openjdk.org Thu Jun 12 15:05:30 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 12 Jun 2025 15:05:30 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v5] In-Reply-To: References: <12CGSySmAQzzQVoAa8EVIvQo2AxbQIWjuSPcKYOtqDg=.48c3c4f1-a770-455d-bb08-c48ba34c7018@github.com> Message-ID: <8aJcbmNyySY5Oy-NBttbyyPyzmmbjZmWIoJbl4-7N2k=.43949931-07a9-470a-8881-cb6758068c7d@github.com> On Thu, 12 Jun 2025 08:39:30 GMT, Anton Artemov wrote: >> Hmm. Isn't 257 redundant because we already check for this on line 241 and the code between should never set the `avail_mem` to `-1`? Maybe this code needs some extra scrutiny as well (as a follow-up)
Maybe this code needs some extra scrutiny as well (as a follow-up) > > Thanks for spotting this. Line 243 definitely needs treatment as well as line 258. Addressed in the latest commit. > > Line 257 does not look redundant to me. `avail_mem` could be set to large error value (cast from -1) on line 240, then lines between 246 and 256 can actually left untouched if something went wrong with reading meminfo file. Then we need to check if we still have an error value in `avail_mem`, and if so, use the one returned by `free_memory()`. I agree. (Though I still think this method could be improved to just use one -1 check. Obviously not in this PR). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2143007106 From aph at openjdk.org Thu Jun 12 15:12:29 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 12 Jun 2025 15:12:29 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v2] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 08:43:09 GMT, Mikhail Ablakatov wrote: >> src/hotspot/cpu/aarch64/nativeInst_aarch64.hpp line 3: >> >>> 1: /* >>> 2: * Copyright (c) 1997, 2025, Oracle and/or its affiliates. All rights reserved. >>> 3: * Copyright (c) 2014, 2108, Red Hat Inc. All rights reserved. >> >> Suggestion: >> >> * Copyright (c) 2014, 2025, Red Hat Inc. All rights reserved. > > JIC, the patch doesn't touch this line. Git blame indicates the last time Red Hat's copyright was updated is 2018. Should I replace it with `2018` instead of `2025`? Well, I (sort of) contributed one line to this PR, so it's not totally unjustified to update this Copyright. ;-) It'd be nice to fix the typo, but never mind if you don't want to. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25702#discussion_r2143021717 From stefank at openjdk.org Thu Jun 12 15:18:36 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 12 Jun 2025 15:18:36 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v9] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 14:01:14 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value type for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static MemRes available_memory(); >> static julong used_memory(); --> static MemRes used_memory(); >> static julong free_memory(); --> static MemRes free_memory(); >> static jlong total_swap_space(); --> static MemRes total_swap_space(); >> static jlong free_swap_space(); --> static MemRes free_swap_space(); >> static julong physical_memory(); --> static MemRes physical_memory(); >> >> >> `MemRes` is a struct containing a pair of values, `size_t val` to carry the return value, `int err` to carry the error if any. Currently, in case of error the latter is set to -1. >> >> The changes are done so that the other parts of the code have minimal impact. >> Tested in GHA and Tiers 1-4. > > Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 17 commits: > > - 8357086: Fixed merge conflict. > - 8357086: Changed returm type to struct. > - 8357086: Return size_t from swap mem funcs, added checks. > - 8357086: Added missed casts. > - 8357086: Changed return type for total_swap_space and free_swap_space to ssize_t > - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs > - 8357086: Fixed spaces in formatting in gc-related code. > - 8357086: Fixed formatting. > - 8357086: Addressed reviewer's comments. > - 8357086: More work. > - ... 
and 7 more: https://git.openjdk.org/jdk/compare/e18277b4...dd9275ca FWIW, I think that the `ssize_t` was a good first step and the `MemRes` was an experiment where it would be interesting to see how that panned out, but I didn't expect to see it in this PR. I tried to convey that in an earlier comment. I'm leaving it up to the other involved reviewers to decide if we should go with the MemRes change in this PR. About the MemRes change. I think adding the MemRes instance and then initializing it later makes the code messier: static MemRes host_free_swap() { MemRes res; struct sysinfo si; int ret = sysinfo(&si); if (ret != 0) { res.err = -1; return res; } res.val = static_cast<size_t>(si.freeswap * si.mem_unit); return res; } It would be nice if the code could be something like this: static MemRes host_free_swap() { struct sysinfo si; int ret = sysinfo(&si); if (ret != 0) { return MemRes(0, -1); } return MemRes(static_cast<size_t>(si.freeswap * si.mem_unit), 0); } I think you could even write this if MemRes didn't have any constructors (or maybe it can with constructors?): static MemRes host_free_swap() { struct sysinfo si; int ret = sysinfo(&si); if (ret != 0) { return {0, -1}; } return {static_cast<size_t>(si.freeswap * si.mem_unit), 0}; } ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2967252229 From stuefe at openjdk.org Thu Jun 12 15:18:35 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 12 Jun 2025 15:18:35 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v9] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 14:01:14 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value type for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static MemRes available_memory(); >> static julong used_memory(); --> static MemRes used_memory(); >> static julong free_memory(); --> static MemRes free_memory(); >> static jlong total_swap_space(); --> static MemRes total_swap_space(); >> static jlong free_swap_space(); --> static
MemRes total_swap_space(); >> static jlong free_swap_space(); --> static MemRes free_swap_space(); >> static julong physical_memory(); --> static MemRes physical_memory(); >> >> >> `MemRes` is a struct containing a pair of values, `size_t val` to carry the return value, `int err` to carry the error if any. Currently, in case of error the latter is set to -1. >> >> The changes are done so that the other parts of the code have minimal impact. >> Tested in GHA and Tiers 1-4. > > Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 17 commits: > > - 8357086: Fixed merge conflict. > - 8357086: Changed returm type to struct. > - 8357086: Return size_t from swap mem funcs, added checks. > - 8357086: Added missed casts. > - 8357086: Changed return type for total_swap_space and free_swap_space to ssize_t > - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs > - 8357086: Fixed spaces in formatting in gc-related code. > - 8357086: Fixed formatting. > - 8357086: Addressed reviewer's comments. > - 8357086: More work. > - ... and 7 more: https://git.openjdk.org/jdk/compare/e18277b4...dd9275ca The return of a structure is an improvement over ssize_t. But I fear it does not go far enough in addressing @stefank 's problem, which is that people may find it too easy to ignore error returns. This PR shows just that by exchanging xx with xx.val and and mostly ignoring xx.err. I expect this will be a pattern too easy to copy. An alternative would be this: bool available_memory(size_t* out); Follows hotspot conventions, which is to return `bool` as error flag, allows for a nicer flow at the call site, and forces the caller to think about the error. if (os:: available_memory(&s)) { // ok } else { // need to think } ... 
If we define the function contract as "in case of error, leave the input size alone", the caller can pre-populate it with a reasonable default value: size_t s = reasonable_default; os::available_memory(&s); // use s for whatever ... or maybe simplify the coding, e.g. when printing values: size_t s = 0; // aka don't know os::free_swap_space(&s); st->print_cr("xxx: %zu", s); ... ----- But a different question is which functions need this. In some cases we may be better off just ending the JVM with a fatal error right away in the function itself: - if there is no way to handle the error meaningfully. Arguably os::physical_memory() is such a case: if that does not work, I doubt the JVM will live beyond initialization. - if that function failing is a sign that something is very off. E.g. I think we can require that /proc/meminfo exists and is readable. If not, something with this Linux box is badly broken and an administrator should look at it. Returning errors makes sense for cases that can actually happen at runtime: - if the customer machine misses certain optional capabilities; e.g. RssAnon in /proc/pid/status on Linux, which depends on kernel version - possibly when getting the information can stall or fail due to unknown runtime conditions. Not so sure about this. E.g. for EAGAIN, we could retry, but we never do that today. --- I am torn. I know fixing the error returns is a much larger scope than this PR originally aimed for, and don't want to hold it off. I just feel that adding an error report channel that maybe is not needed (available_memory) is a step sideways, especially if we still mostly ignore the error. But if others want this PR to proceed, I'm fine with it.
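Editor's sketch of the convention proposed in the message above: return `bool` as the error flag, write the result through an out-parameter, and leave the output untouched on failure so that a pre-populated default survives. The function names and the `simulate_failure` switch are hypothetical and exist only to make the example self-contained; they are not the `os::` API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical query following the proposed contract:
// returns true and writes *out on success; returns false and
// leaves *out alone on failure.
bool query_available_memory(bool simulate_failure, std::size_t* out) {
  if (simulate_failure) {
    return false;  // error: do not touch *out
  }
  *out = 4096;     // made-up value standing in for a real OS query
  return true;
}

// Caller that pre-populates a default and deliberately ignores the flag,
// as in the "reasonable_default" example from the message.
std::size_t available_or_default(bool simulate_failure, std::size_t fallback) {
  std::size_t s = fallback;
  query_available_memory(simulate_failure, &s);
  return s;
}
```

A caller that cares about the failure tests the returned flag in an `if`; a caller that only prints or logs can use the default-value form. Either way the decision is visible at the call site, which seems to be the point of the proposal.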
------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2967252113 From mablakatov at openjdk.org Thu Jun 12 15:30:48 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Thu, 12 Jun 2025 15:30:48 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v3] In-Reply-To: References: Message-ID: > In the A64 ISA, the B (direct branch) instruction can encode a target within a ±128MB range relative to the instruction. Due to this limitation, when generating static stubs, HotSpot conservatively emits indirect branches for calls to c2i interface stubs. These indirect branches are implemented using a four-instruction sequence: three instructions to materialize the target address in a register, followed by a BR instruction to perform the jump. > > This patch optimizes static stub generation when the code cache is small enough to guarantee that the target entry point of the c2i interface stub lies within the direct branch range. In such cases, a single direct B instruction can be used instead of the indirect sequence, saving 3 instructions (12 bytes) per static stub. > > Below is an example of the optimization's impact, measured using the movie-lens benchmark from the Renaissance benchmark suite: > > | Metric | Before | After | Difference | > |-------------|---------------|---------------|------------| > | totalInHeap | Avg: 1883.875 | Avg: 1871.667 | -0.65% | > | | Sum: 6653848 | Sum: 6616344 | -0.56% | > | stubCode | Avg: 103.164 | Avg: 87.285 | -15.38% | > | | Sum: 364376 | Sum: 308552 | -15.33% | > > Full jtreg passed on AArch64.
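Editor's sketch of the range argument in the PR description above. An A64 `B` instruction encodes a signed 26-bit word offset, so it reaches byte offsets from -128 MiB up to 128 MiB minus 4; if the whole code cache is no larger than 128 MiB, any branch between two instructions inside it fits. The helper names below are hypothetical and are not the predicates HotSpot uses.

```cpp
#include <cassert>
#include <cstdint>

// Reach of an A64 B instruction: signed 26-bit immediate, scaled by 4 bytes.
constexpr int64_t branch_range = 128LL * 1024 * 1024;

// True if a B instruction located at 'from' can encode a jump to 'to'.
constexpr bool reachable_by_direct_branch(uint64_t from, uint64_t to) {
  int64_t offset = static_cast<int64_t>(to - from);
  return (offset % 4 == 0) && offset >= -branch_range && offset < branch_range;
}

// If the code cache is at most 128 MiB, every intra-cache distance is
// strictly below the branch range, so a direct B always suffices.
constexpr bool code_cache_allows_direct_branches(uint64_t code_cache_size) {
  return code_cache_size <= static_cast<uint64_t>(branch_range);
}
```

Under this model the single `B` replaces the four-instruction indirect sequence, which is where the 12 bytes per static stub quoted in the description come from.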
Mikhail Ablakatov has updated the pull request incrementally with one additional commit since the last revision: cleanup: update a copyright notice Co-authored-by: Andrew Haley ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25702/files - new: https://git.openjdk.org/jdk/pull/25702/files/7ef1c4ae..2eae70ef Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25702&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25702&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25702.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25702/head:pull/25702 PR: https://git.openjdk.org/jdk/pull/25702 From mablakatov at openjdk.org Thu Jun 12 15:30:48 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Thu, 12 Jun 2025 15:30:48 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v2] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 15:10:21 GMT, Andrew Haley wrote: >> JIC, the patch doesn't touch this line. Git blame indicates the last time Red Hat's copyright was updated is 2018. Should I replace it with `2018` instead of `2025`? > > Well, I (sort of) contributed one line to this PR, so it's not totally unjustified to update this Copyright. ;-) It'd be nice to fix the typo, but never mind if you don't want to. I absolutely don't mind to commit the suggestion (and don't see an issue as someone who isn't affiliated with Red Hat since it has your authorship anyway), just wanted to clarify the intent. Done! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25702#discussion_r2143064439 From aph at openjdk.org Thu Jun 12 15:46:29 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 12 Jun 2025 15:46:29 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v3] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 15:30:48 GMT, Mikhail Ablakatov wrote: >> In the A64 ISA, the B (direct branch) instruction can encode a target within a ±128MB range relative to the instruction. Due to this limitation, when generating static stubs, HotSpot conservatively emits indirect branches for calls to c2i interface stubs. These indirect branches are implemented using a four-instruction sequence: three instructions to materialize the target address in a register, followed by a BR instruction to perform the jump. >> >> This patch optimizes static stub generation when the code cache is small enough to guarantee that the target entry point of the c2i interface stub lies within the direct branch range. In such cases, a single direct B instruction can be used instead of the indirect sequence, saving 3 instructions (12 bytes) per static stub. >> >> Below is an example of the optimization's impact, measured using the movie-lens benchmark from the Renaissance benchmark suite: >> >> | Metric | Before | After | Difference | >> |-------------|---------------|---------------|------------| >> | totalInHeap | Avg: 1883.875 | Avg: 1871.667 | -0.65% | >> | | Sum: 6653848 | Sum: 6616344 | -0.56% | >> | stubCode | Avg: 103.164 | Avg: 87.285 | -15.38% | >> | | Sum: 364376 | Sum: 308552 | -15.33% | >> >> Full jtreg passed on AArch64. > > Mikhail Ablakatov has updated the pull request incrementally with one additional commit since the last revision: > > cleanup: update a copyright notice > > Co-authored-by: Andrew Haley Thanks. ------------- Marked as reviewed by aph (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25702#pullrequestreview-2921741473 From aph at openjdk.org Thu Jun 12 15:46:29 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 12 Jun 2025 15:46:29 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v3] In-Reply-To: References: Message-ID: <_mCh8L-9aT5OkbrlrMTWY7cwlzLGfjb1tA310PNC--8=.cf36e768-d129-4519-8f81-1c1660bfef61@github.com> On Thu, 12 Jun 2025 15:41:38 GMT, Andrew Haley wrote: >> Mikhail Ablakatov has updated the pull request incrementally with one additional commit since the last revision: >> >> cleanup: update a copyright notice >> >> Co-authored-by: Andrew Haley > > Thanks. > The error in java/lang/Thread/virtual/stress/GetStackTraceALotWhenBlocking.java#id0 looks similar to what has been previously reported here: https://bugs.openjdk.org/browse/JDK-8344577 . @theRealAph , do you think the patch may cause the error? Or should I open a similar JBS ticket to report it? That bug is macOS/x86. So, is the failure you're seeing repeatable? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25702#issuecomment-2967352818 From coleenp at openjdk.org Thu Jun 12 15:57:34 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 12 Jun 2025 15:57:34 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory In-Reply-To: References: Message-ID: On Fri, 16 May 2025 12:18:42 GMT, Coleen Phillimore wrote: > This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. 
> > The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. > > Tested with tier1-4, 5-7. Also, this was @fisk 's idea. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-2967389860 From sroy at openjdk.org Thu Jun 12 16:07:09 2025 From: sroy at openjdk.org (Suchismith Roy) Date: Thu, 12 Jun 2025 16:07:09 GMT Subject: RFR: JDK-8348574 : Simplify c1/c2_globals inclusions Message-ID: JBS Issue : [JDK-8348574](https://bugs.openjdk.org/browse/JDK-8348574) c1_globals.hpp includes c1_globals_pd.hpp. c1_globals_pd.hpp includes the corresponding CPU_HEADER and OS_HEADER files. All of the c1_globals_.hpp files are essentially identical and basically empty. (They just include globalDefinitions.hpp and macros.hpp, and provide nothing additional.) This could be simplified by having c1_globals.hpp do the CPU_HEADER inclusion directly, and remove c1_globals_pd.hpp and all c1_globals_.hpp files. Even if there are some non-vacuous c1_globals_.hpp files in the future, c1_globals_pd.hpp seems unwarranted; just add the OS_HEADER include directly in c1_globals.hpp. The c1_globals_pd.hpp files really don't seem worth the extra indirection. Similarly for c2_globals.hpp &etc. 
------------- Commit messages: - Update c2_globals.hpp - c1 header - global headers Changes: https://git.openjdk.org/jdk/pull/25773/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25773&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8348574 Stats: 366 lines in 13 files changed: 2 ins; 360 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25773.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25773/head:pull/25773 PR: https://git.openjdk.org/jdk/pull/25773 From kbarrett at openjdk.org Thu Jun 12 16:39:28 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 12 Jun 2025 16:39:28 GMT Subject: RFR: 8359293: Make TestNoNULL extensible In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 09:49:14 GMT, Doug Simon wrote: > There are some closed markdown files that contain "NULL". These should be excluded from checking by TestNoNULL. This PR makes TestNoNull extensible in terms of exclusions with the following system properties: > > * `excludedTestExtensions` - command separated list of file extensions (e.g. `-DexcludedTestExtensions=.md,.txt`) > * `sourceExclusions` - command separated list of file paths relative to repo root (e.g. `-DsourceExclusions=src/hotspot/share/prims/jvmti.xml`) > * `testExclusions` - command separated list of file paths relative to repo root (e.g. `-DtestExclusions=test/hotspot/jtreg/vmTestbase/nsk/share/jvmti/README`) Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25777#pullrequestreview-2921920495 From mhaessig at openjdk.org Thu Jun 12 16:53:30 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Thu, 12 Jun 2025 16:53:30 GMT Subject: RFR: JDK-8348574 : Simplify c1/c2_globals inclusions In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 08:32:56 GMT, Suchismith Roy wrote: > JBS Issue : [JDK-8348574](https://bugs.openjdk.org/browse/JDK-8348574) > > c1_globals.hpp includes c1_globals_pd.hpp. 
c1_globals_pd.hpp includes the corresponding CPU_HEADER and OS_HEADER files. All of the c1_globals_.hpp files are essentially identical and basically empty. (They just include globalDefinitions.hpp and macros.hpp, and provide nothing additional.) > > This could be simplified by having c1_globals.hpp do the CPU_HEADER inclusion directly, and remove c1_globals_pd.hpp and all c1_globals_.hpp files. > > Even if there are some non-vacuous c1_globals_.hpp files in the future, c1_globals_pd.hpp seems unwarranted; just add the OS_HEADER include directly in c1_globals.hpp. The c1_globals_pd.hpp files really don't seem worth the extra indirection. > > Similarly for c2_globals.hpp &etc. Thank you for working on this, @suchismith1993! Good to see the includes cleaned up. This looks good to me, but I nevertheless kicked off some testing on our side. I will let you know how it went when the results are in. ------------- PR Review: https://git.openjdk.org/jdk/pull/25773#pullrequestreview-2921960450 From adinn at openjdk.org Thu Jun 12 17:01:29 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 12 Jun 2025 17:01:29 GMT Subject: RFR: 8358690: Some initialization code asks for AOT cache status way too early In-Reply-To: <_16KBkO-TmmTuaWG8l9ekstEdIP9BKsyXip0H0u_NHs=.3b1479e2-a0b7-41b1-a72a-12d217a3d628@github.com> References: <_16KBkO-TmmTuaWG8l9ekstEdIP9BKsyXip0H0u_NHs=.3b1479e2-a0b7-41b1-a72a-12d217a3d628@github.com> Message-ID: On Thu, 12 Jun 2025 13:49:59 GMT, Vladimir Kozlov wrote: >> I found that `StubRoutines::_fence_entry` from initial stubs is used by [OrderAccess::fence()](https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/windows_x86/orderAccess_windows_x86.hpp#L50) on Windows-x64 and only. And `OrderAccess::fence()` is used by GC worked threads which are started by `universe_init()`. >> >> I am work on fixing it. I hope I don't need to move all initial stubs. May be 'pre-initial` stubs for this? 
> >> @vnkozlov Are you suggesting moving the fence stubs to a separate StubGen preinitial blob created before the initial blob? That should be relatively easy to do. >> >> If you want me to push that change first in a separate PR I will be happy to do so. > > Yes, please. There could be other stubs we need very early which we can exclude from AOTing. May be call it `pre-universe` to be clear. > > My only concern is the fence stub is used only by windows-x86. Introducing whole new stubs type for that is overkill IMHO. But on other hand, we may need such type later for other new stubs. Or we find later that some initial stubs still be needed before `universe_init`. @vnkozlov I have raised [JDK-8359373](https://bugs.openjdk.org/browse/JDK-8359373) and have a PR in progress. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25763#issuecomment-2967586583 From adinn at openjdk.org Thu Jun 12 17:07:19 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 12 Jun 2025 17:07:19 GMT Subject: RFR: 8359373: Split stubgen initial blob into pre and post-universe blobs Message-ID: This PR adds a new preuniverse blob to the stubgen blobs set and relocates the initial fence stub to that blob. 
------------- Commit messages: - 8359373: Split stubgen initial blob into pre and post-universe blobs Changes: https://git.openjdk.org/jdk/pull/25784/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25784&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359373 Stats: 147 lines in 17 files changed: 136 ins; 5 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/25784.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25784/head:pull/25784 PR: https://git.openjdk.org/jdk/pull/25784 From adinn at openjdk.org Thu Jun 12 17:35:43 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 12 Jun 2025 17:35:43 GMT Subject: RFR: 8359373: Split stubgen initial blob into pre and post-universe blobs [v2] In-Reply-To: References: Message-ID: > This PR adds a new preuniverse blob to the stubgen blobs set and relocates the initial fence stub to that blob. Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: locate x86 stub generation methods in class ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25784/files - new: https://git.openjdk.org/jdk/pull/25784/files/d7b59889..8593475b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25784&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25784&range=00-01 Stats: 2 lines in 2 files changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25784.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25784/head:pull/25784 PR: https://git.openjdk.org/jdk/pull/25784 From kvn at openjdk.org Thu Jun 12 17:45:28 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 12 Jun 2025 17:45:28 GMT Subject: RFR: 8359373: Split stubgen initial blob into pre and post-universe blobs [v2] In-Reply-To: References: Message-ID: <3auU0SnJvK6jQhkYUbdesmbV5sK1hG00RRLOdZpaMi8=.b9645715-e7bf-4e4f-98e9-e48bb9a0a452@github.com> On Thu, 12 Jun 2025 17:35:43 GMT, Andrew Dinn wrote: >> This PR adds a new preuniverse blob to the stubgen blobs set and 
relocates the initial fence stub to that blob. > > Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: > > locate x86 stub generation methods in class src/hotspot/cpu/zero/stubGenerator_zero.cpp line 182: > 180: > 181: void generate_preuniverse_stubs() { > 182: // preuniverse stubs are not needed for zero Zero has `_fence_entry` initialization. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25784#discussion_r2143311546 From kvn at openjdk.org Thu Jun 12 17:48:29 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 12 Jun 2025 17:48:29 GMT Subject: RFR: 8359373: Split stubgen initial blob into pre and post-universe blobs [v2] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 17:35:43 GMT, Andrew Dinn wrote: >> This PR adds a new preuniverse blob to the stubgen blobs set and relocates the initial fence stub to that blob. > > Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: > > locate x86 stub generation methods in class Thank you @adinn for this change. I have only one comment. 
------------- PR Review: https://git.openjdk.org/jdk/pull/25784#pullrequestreview-2922124839 From kbarrett at openjdk.org Thu Jun 12 17:53:36 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 12 Jun 2025 17:53:36 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v9] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 14:01:14 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value type for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static MemRes available_memory(); >> static julong used_memory(); --> static MemRes used_memory(); >> static julong free_memory(); --> static MemRes free_memory(); >> static jlong total_swap_space(); --> static MemRes total_swap_space(); >> static jlong free_swap_space(); --> static MemRes free_swap_space(); >> static julong physical_memory(); --> static MemRes physical_memory(); >> >> >> `MemRes` is a struct containing a pair of values, `size_t val` to carry the return value, `int err` to carry the error if any. Currently, in case of error the latter is set to -1. >> >> The changes are done so that the other parts of the code have minimal impact. >> Tested in GHA and Tiers 1-4. > > Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 17 commits: > > - 8357086: Fixed merge conflict. > - 8357086: Changed returm type to struct. > - 8357086: Return size_t from swap mem funcs, added checks. > - 8357086: Added missed casts. > - 8357086: Changed return type for total_swap_space and free_swap_space to ssize_t > - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs > - 8357086: Fixed spaces in formatting in gc-related code. > - 8357086: Fixed formatting. > - 8357086: Addressed reviewer's comments. > - 8357086: More work. > - ... 
and 7 more: https://git.openjdk.org/jdk/compare/e18277b4...dd9275ca > FWIW, I think that the `ssize_t` was a good first step and the `MemRes` was an experiment that would be interesting to see how that panned out, but I didn't expect to see it in this PR. I tried to convey that in an earlier comment. I'm leaving it up to the other involved reviewers to decide if we should go with the MemRes change in this PR. > > About the MemRes change. I think adding the MemRes instance and then initializing it later makes the code messier: > [...] > It would be nice if the code could be something like this: > [...] > I think you could even write this if MemRes didn't have any constructors (or maybe it can with constructors?): > [...] I agree with @stefank. Also, allocating the MemRes and filling it in later depends on NRVO for optimization (though it doesn't matter much here), which is optional and sometimes disabled by even fairly innocuous code, and remains so even with C++17. Constructing in the return will benefit from C++17 guaranteed copy elision. And yes, the brace initialization example works - C++14 6.3.3. It would currently be pretty unusual in HotSpot - I don't know what folks might think about that syntax. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2967724516 From adinn at openjdk.org Thu Jun 12 18:01:09 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 12 Jun 2025 18:01:09 GMT Subject: RFR: 8359373: Split stubgen initial blob into pre and post-universe blobs [v3] In-Reply-To: References: Message-ID: > This PR adds a new preuniverse blob to the stubgen blobs set and relocates the initial fence stub to that blob.
Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: ensure fence stub is set as part of preuniverse init on zero ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25784/files - new: https://git.openjdk.org/jdk/pull/25784/files/8593475b..99085bfc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25784&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25784&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25784.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25784/head:pull/25784 PR: https://git.openjdk.org/jdk/pull/25784 From adinn at openjdk.org Thu Jun 12 18:01:10 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 12 Jun 2025 18:01:10 GMT Subject: RFR: 8359373: Split stubgen initial blob into pre and post-universe blobs [v2] In-Reply-To: <3auU0SnJvK6jQhkYUbdesmbV5sK1hG00RRLOdZpaMi8=.b9645715-e7bf-4e4f-98e9-e48bb9a0a452@github.com> References: <3auU0SnJvK6jQhkYUbdesmbV5sK1hG00RRLOdZpaMi8=.b9645715-e7bf-4e4f-98e9-e48bb9a0a452@github.com> Message-ID: On Thu, 12 Jun 2025 17:43:19 GMT, Vladimir Kozlov wrote: >> Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: >> >> locate x86 stub generation methods in class > > src/hotspot/cpu/zero/stubGenerator_zero.cpp line 182: > >> 180: >> 181: void generate_preuniverse_stubs() { >> 182: // preuniverse stubs are not needed for zero > > Zero has `_fence_entry` initialization. Ah yes, missed that. I pushed a fix n.b. I have still left the blob size as 0 in stubDeclarations_zero.hpp which means no blob will be created. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25784#discussion_r2143335009 From kvn at openjdk.org Thu Jun 12 18:17:29 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 12 Jun 2025 18:17:29 GMT Subject: RFR: 8359373: Split stubgen initial blob into pre and post-universe blobs [v3] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 18:01:09 GMT, Andrew Dinn wrote: >> This PR adds a new preuniverse blob to the stubgen blobs set and relocates the initial fence stub to that blob. > > Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: > > ensure fence stub is set as part of preuniverse init on zero Good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25784#pullrequestreview-2922195313 From kbarrett at openjdk.org Thu Jun 12 18:20:33 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 12 Jun 2025 18:20:33 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v9] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 14:01:14 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value type for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static MemRes available_memory(); >> static julong used_memory(); --> static MemRes used_memory(); >> static julong free_memory(); --> static MemRes free_memory(); >> static jlong total_swap_space(); --> static MemRes total_swap_space(); >> static jlong free_swap_space(); --> static MemRes free_swap_space(); >> static julong physical_memory(); --> static MemRes physical_memory(); >> >> >> `MemRes` is a struct containing a pair of values, `size_t val` to carry the return value, `int err` to carry the error if any. Currently, in case of error the latter is set to -1. >> >> The changes are done so that the other parts of the code have minimal impact. >> Tested in GHA and Tiers 1-4. 
> > Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 17 commits: > > - 8357086: Fixed merge conflict. > - 8357086: Changed returm type to struct. > - 8357086: Return size_t from swap mem funcs, added checks. > - 8357086: Added missed casts. > - 8357086: Changed return type for total_swap_space and free_swap_space to ssize_t > - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs > - 8357086: Fixed spaces in formatting in gc-related code. > - 8357086: Fixed formatting. > - 8357086: Addressed reviewer's comments. > - 8357086: More work. > - ... and 7 more: https://git.openjdk.org/jdk/compare/e18277b4...dd9275ca Tons of MemRes's without examining the error part at all. I guess that's a consequence of the existing code just not bothering to check, which I guess needs to be fixed but not in this PR. src/hotspot/share/runtime/os.cpp line 2217: > 2215: #endif > 2216: res.val = os::physical_memory().val - os::available_memory().val; > 2217: res.err = MIN2(os::physical_memory().err, os::available_memory().err); Repeated calls shouldn't be assumed to return the same error status. And min seems like the wrong way to combine errors. I think this should get the result from `os::physical_memory` and inspect it for an error. If not an error then do the same for `os::available_memory`. If not an error then compute the result accordingly. src/hotspot/share/runtime/os.hpp line 151: > 149: struct MemRes { > 150: size_t val; > 151: int err; I'm not a big fan of abbreviations like this. My preference would be to at least spell out "value". Don't know what others might think. src/hotspot/share/runtime/os.hpp line 152: > 150: size_t val; > 151: int err; > 152: MemRes(size_t v, int e) : val(v), err(e) {} It might be that having the second argument be optional and default to 0 is nicer / more convenient.
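[Editorial sketch, not part of the original message.] The review comments above on `os.cpp` line 2217 and `os.hpp` line 152 could be sketched roughly as follows: call each query once, check its error before going on, and build the result in the return statement. `MemoryResult`, `MemoryQuery` and `used_memory_sketch` are hypothetical names chosen for illustration and are not HotSpot APIs; the underflow check is an extra assumption that the review does not ask for.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the MemRes struct under review (not the HotSpot definition).
// The field name is spelled out and the error argument defaults to 0 (success).
struct MemoryResult {
  std::size_t value;
  int error;  // 0 on success, -1 on failure, following the PR description
  MemoryResult(std::size_t v, int e = 0) : value(v), error(e) {}
};

// Hypothetical query type standing in for os::physical_memory() / os::available_memory().
using MemoryQuery = MemoryResult (*)();

// Each query is called exactly once and its error is inspected before it is used,
// instead of calling twice and combining the error codes with a minimum.
MemoryResult used_memory_sketch(MemoryQuery physical, MemoryQuery available) {
  assert(physical != nullptr && available != nullptr);
  const MemoryResult phys = physical();
  if (phys.error != 0) {
    return MemoryResult(0, phys.error);
  }
  const MemoryResult avail = available();
  if (avail.error != 0) {
    return MemoryResult(0, avail.error);
  }
  if (avail.value > phys.value) {
    // Assumption for this sketch only: treat an inconsistent pair as an error
    // rather than letting the unsigned subtraction wrap around.
    return MemoryResult(0, -1);
  }
  // Constructed directly in the return statement, so no named temporary is filled in later.
  return MemoryResult(phys.value - avail.value);
}
```

Passing the two queries as parameters is only there to keep the sketch self-contained; in the PR the calls would presumably go straight to the `os::` functions.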
src/hotspot/share/runtime/os.hpp line 153: > 151: int err; > 152: MemRes(size_t v, int e) : val(v), err(e) {} > 153: MemRes(): val(0), err(0) {} Do we really need initialization in the default ctor? And if so, is default initialization to not-an-error what we want? ------------- Changes requested by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25450#pullrequestreview-2922179889 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2143353799 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2143364103 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2143362397 PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2143360927 From asmehra at openjdk.org Thu Jun 12 19:33:28 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Thu, 12 Jun 2025 19:33:28 GMT Subject: RFR: 8358690: Some initialization code asks for AOT cache status way too early In-Reply-To: References: Message-ID: On Wed, 11 Jun 2025 23:08:44 GMT, Vladimir Kozlov wrote: > Thanks to @shipilev for catching the issue. > > [JDK-8350209](https://bugs.openjdk.org/browse/JDK-8350209) came with the bootstrapping problem by checking the AOT cache status way too early. Before full AOT cache init sequence runs, these checks would always reply that AOT cache is off. This causes initial stubs to never practically restored/dumped. > > This does not affect JDK 25 because [JDK-8357514](https://github.com/openjdk/jdk/commit/8184ce39a8a732352ee841fed09cae905d27643c) switched off AOT stubs generation. > > We can't resolve bootstrap issue as it is because `initial_stubs_init()` is called before `universe_init()` where AOT code cache is created. I looked why it is required (based on comments) that `initial_stubs_init()` be called before `universe_init()`. 
And I found that we had a special stub during HotSpot development (1997) which was used for Vtable entries population when we run with -Xcomp (or whatever was equivalent back then). We still have reference to it in the comment: [stubRoutines.cpp#L185](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stubRoutines.cpp#L185). > > We don't have that code anymore. I moved `initial_stubs_init()` after `universe_init()` and `AOTCodeCache::init2()`. I added asserts into some initial stubs to check that they are not NULL when used. I ran from hs-tier1 to hs-tier6 + hs-tier10-rt. > > The only issue I found is that `AOTCodeCache::init_early_stubs()` needs to be call separately after `initial_stubs_init()` instead of from `AOTCodeCache::init2()`. This solved bootstrap issue. > > I also did some cleanup to match `leyden/premain` branch for easy merges. > > Tested hs-tier1-6, hs-tier1-rt, stress, xcomp src/hotspot/share/code/aotCodeCache.cpp line 101: > 99: // after compilationPolicy_init() but before codeCache_init(). > 100: // > 101: // 4. AOTCodeCache::initialize() is called during universe_init() @vnkozlov I wonder if we can move `AOTCodeCache::initialize()` after universe_init() and merge `AOTCodeCache::init2()` into `AOTCodeCache::initialize()`. Is there any reason for splitting initialization between `initialize()` and `init2`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25763#discussion_r2143486948 From dlong at openjdk.org Thu Jun 12 19:54:45 2025 From: dlong at openjdk.org (Dean Long) Date: Thu, 12 Jun 2025 19:54:45 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v3] In-Reply-To: References: Message-ID: > This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. > > We used to patch the verified entry point to make sure it was not_entrant. 
The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. > > The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. > > For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. > > This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. 
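[Editorial sketch, not part of the original message.] The "sticky" bit idea described above might look roughly like this: one reserved bit marks the nmethod as not_entrant, ordinary guard updates preserve it, and a lock makes the read-modify-write atomic. The class name, the bit position, and the use of `std::mutex` are assumptions made for illustration; the PR itself uses HotSpot locks (`NMethodEntryBarrier_lock`, or the per-nmethod lock for ZGC) and platform-specific guard storage.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical model of an entry barrier guard value with one reserved sticky bit.
class EntryBarrierGuardSketch {
 public:
  // Assumed bit position, chosen only for this sketch.
  static constexpr std::uint32_t sticky_not_entrant = 0x80000000u;

  // Inner-level update: replaces the ordinary bits and keeps the sticky bit if it is set.
  void set_value(std::uint32_t v) {
    assert((v & sticky_not_entrant) == 0 && "callers must not pass the reserved bit");
    std::lock_guard<std::mutex> lock(_lock);
    _guard = v | (_guard & sticky_not_entrant);
  }

  // Outer-level transition: not_entrant is a final state, so the bit is never cleared.
  void make_not_entrant() {
    std::lock_guard<std::mutex> lock(_lock);
    _guard |= sticky_not_entrant;
  }

  std::uint32_t value() const {
    std::lock_guard<std::mutex> lock(_lock);
    return _guard;
  }

  // Armed whenever the guard differs from the disarmed value. Once the sticky bit is
  // set, no disarmed value without that bit can match, so the barrier stays armed.
  bool is_armed(std::uint32_t disarmed_value) const {
    return value() != disarmed_value;
  }

 private:
  mutable std::mutex _lock;
  std::uint32_t _guard = 0;
};
```

The lock is needed because `set_value` reads the old guard before writing the new one; the message above names compare-and-swap in the platform-specific code as the alternative.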
Dean Long has updated the pull request incrementally with six additional commits since the last revision: - more cleanup - more TheRealMDoerr suggestions - TheRealMDoerr suggestions - remove trailing space - Fix Shenandoah build - more cleanup ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25764/files - new: https://git.openjdk.org/jdk/pull/25764/files/0780d156..0950fc5e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25764&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25764&range=01-02 Stats: 330 lines in 25 files changed: 10 ins; 304 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/25764.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25764/head:pull/25764 PR: https://git.openjdk.org/jdk/pull/25764 From dlong at openjdk.org Thu Jun 12 19:57:28 2025 From: dlong at openjdk.org (Dean Long) Date: Thu, 12 Jun 2025 19:57:28 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v2] In-Reply-To: <8oM2Y0kAMRN6wxtjAmpXDWTcHDZ6gPrNM-8PPtukwAA=.dde201dd-8c2a-4a16-96e5-ca92604e6edd@github.com> References: <8oM2Y0kAMRN6wxtjAmpXDWTcHDZ6gPrNM-8PPtukwAA=.dde201dd-8c2a-4a16-96e5-ca92604e6edd@github.com> Message-ID: On Thu, 12 Jun 2025 02:01:47 GMT, Dean Long wrote: >> This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. >> >> We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. >> >> The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. 
Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. >> >> For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. >> >> This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. > > Dean Long has updated the pull request incrementally with two additional commits since the last revision: > > - ... and stale code > - removed stale comment Thanks @TheRealMDoer for the suggestions, I incorporated all of them, however for shenandoahCodeRoots.cpp I decided to make arm() public, so the caller code doesn't need to know about the magic value 0. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-2967999465 From kvn at openjdk.org Thu Jun 12 20:00:30 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 12 Jun 2025 20:00:30 GMT Subject: RFR: 8358690: Some initialization code asks for AOT cache status way too early In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 19:30:30 GMT, Ashutosh Mehra wrote: >> Thanks to @shipilev for catching the issue. >> >> [JDK-8350209](https://bugs.openjdk.org/browse/JDK-8350209) came with the bootstrapping problem by checking the AOT cache status way too early. 
Before full AOT cache init sequence runs, these checks would always reply that AOT cache is off. This causes initial stubs to never practically restored/dumped. >> >> This does not affect JDK 25 because [JDK-8357514](https://github.com/openjdk/jdk/commit/8184ce39a8a732352ee841fed09cae905d27643c) switched off AOT stubs generation. >> >> We can't resolve bootstrap issue as it is because `initial_stubs_init()` is called before `universe_init()` where AOT code cache is created. I looked why it is required (based on comments) that `initial_stubs_init()` be called before `universe_init()`. And I found that we had a special stub during HotSpot development (1997) which was used for Vtable entries population when we run with -Xcomp (or whatever was equivalent back then). We still have reference to it in the comment: [stubRoutines.cpp#L185](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stubRoutines.cpp#L185). >> >> We don't have that code anymore. I moved `initial_stubs_init()` after `universe_init()` and `AOTCodeCache::init2()`. I added asserts into some initial stubs to check that they are not NULL when used. I ran from hs-tier1 to hs-tier6 + hs-tier10-rt. >> >> The only issue I found is that `AOTCodeCache::init_early_stubs()` needs to be call separately after `initial_stubs_init()` instead of from `AOTCodeCache::init2()`. This solved bootstrap issue. >> >> I also did some cleanup to match `leyden/premain` branch for easy merges. >> >> Tested hs-tier1-6, hs-tier1-rt, stress, xcomp > > src/hotspot/share/code/aotCodeCache.cpp line 101: > >> 99: // after compilationPolicy_init() but before codeCache_init(). >> 100: // >> 101: // 4. AOTCodeCache::initialize() is called during universe_init() > > @vnkozlov I wonder if we can move `AOTCodeCache::initialize()` after universe_init() and merge `AOTCodeCache::init2()` into `AOTCodeCache::initialize()`. Is there any reason for splitting initialization between `initialize()` and `init2`? 
We need to load/map the AOT code region before we close the file: [metaspaceShared.cpp#L2151](https://github.com/openjdk/leyden/blob/premain/src/hotspot/share/cds/metaspaceShared.cpp#L2151) We also need to be able to print AOT code cache information when the `PrintSharedArchiveAndExit` flag is specified. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25763#discussion_r2143554033 From kvn at openjdk.org Thu Jun 12 20:02:28 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 12 Jun 2025 20:02:28 GMT Subject: RFR: 8359373: Split stubgen initial blob into pre and post-universe blobs [v3] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 18:01:09 GMT, Andrew Dinn wrote: >> This PR adds a new preuniverse blob to the stubgen blobs set and relocates the initial fence stub to that blob. > > Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: > > ensure fence stub is set as part of preuniverse init on zero I submitted our testing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25784#issuecomment-2968009109 From asmehra at openjdk.org Thu Jun 12 20:07:28 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Thu, 12 Jun 2025 20:07:28 GMT Subject: RFR: 8358690: Some initialization code asks for AOT cache status way too early In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 19:57:29 GMT, Vladimir Kozlov wrote: > We need to load/map the AOT code region before we close the file Right, I missed that part.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25763#discussion_r2143569134 From mdoerr at openjdk.org Thu Jun 12 20:08:31 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 12 Jun 2025 20:08:31 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v2] In-Reply-To: References: <8oM2Y0kAMRN6wxtjAmpXDWTcHDZ6gPrNM-8PPtukwAA=.dde201dd-8c2a-4a16-96e5-ca92604e6edd@github.com> Message-ID: <15r53QUpAANeu3oucJCrmg3BOqTksEQK24cppfHjQow=.b5848d6f-0bfe-43ce-a4d8-29d57d91e873@github.com> On Thu, 12 Jun 2025 19:54:28 GMT, Dean Long wrote: > Thanks @TheRealMDoer for the suggestions, I incorporated all of them, however for shenandoahCodeRoots.cpp I decided to make arm() public, so the caller code doesn't need to know about the magic value 0. Excellent! Thank you! I'll take another look over it and run tests. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-2968024418 From syan at openjdk.org Fri Jun 13 02:35:38 2025 From: syan at openjdk.org (SendaoYan) Date: Fri, 13 Jun 2025 02:35:38 GMT Subject: RFR: 8359200: Memory corruption in MStack::push [v2] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 11:41:54 GMT, Tobias Hartmann wrote: >> I found this by accident when running testing with non-default `-XX:OptoNodeListSize` (see JBS for details). 
The problem is that `MStack::push` assumes that `Node_Stack::grow` will always grow the stack by at least two (line 69) and it then proceeds to put two items in: >> https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/matcher.hpp#L67-L74 >> >> But after [JDK-8336999](https://bugs.openjdk.org/browse/JDK-8336999), `Node_Stack::grow` will only grow the stack if needed: >> https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/node.cpp#L3027-L3031 >> >> However, if there's **one** empty slot, the stack will not be grown and `MStack::push` then puts **two** items on the stack which leads to memory corruption. >> >> I refactored the push method to delegate to `Node_Stack::push` which does the right thing and, similar to [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), also added `maybe_grow` methods for all the containers affected by the original change. For additional coverage, I moved the `_nesting.check` calls to before the check that determines if we grow. >> >> I only ever observed this with a non-default and odd value for `-XX:OptoNodeListSize` but I'm not 100% convinced that it can't happen with the default value, so I'm treating this as P2 and will backport the fix to JDK 25. >> >> @shipilev Since you worked on [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), could you please take a look at this? >> >> Thanks, >> Tobias > > Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: > > Improved assert message test/hotspot/jtreg/compiler/arguments/TestOptoNodeListSize.java line 29: > 27: * @bug 8359200 > 28: * @key randomness > 29: * @requires vm.flagless & vm.compiler2.enabled & vm.debug == true Why does this need `@requires vm.flagless`? Would other VM flags cause this test to fail?
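For readers following the `MStack::push` discussion quoted above, here is a minimal sketch of the failure mode and of the shape of the fix. It is an illustration only: `ToyStack`, `maybe_grow`, `push`, and `push_pair` are hypothetical names on a toy container, not the HotSpot `Node_Stack`/`MStack` code, and the doubling growth policy is an assumption made for the example. The point it tries to show is that a "grow only if full" check covers one free slot, so a two-item push has to check capacity per slot (here by delegating to the single-slot `push`).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a growable stack; NOT the HotSpot Node_Stack/MStack code.
class ToyStack {
  std::vector<int> _slots;   // backing storage; _slots.size() is the capacity
  std::size_t _top = 0;      // number of slots currently in use

public:
  explicit ToyStack(std::size_t capacity) : _slots(capacity) {}

  std::size_t capacity() const { return _slots.size(); }
  std::size_t size() const { return _top; }
  int at(std::size_t i) const { assert(i < _top); return _slots[i]; }

  // Grow only when no free slot is left (assumed policy: double the capacity).
  void maybe_grow() {
    if (_top >= _slots.size()) {
      _slots.resize(_slots.empty() ? 1 : _slots.size() * 2);
    }
  }

  // Single push: the capacity check covers exactly the one slot written.
  void push(int value) {
    maybe_grow();
    _slots[_top++] = value;
  }

  // Pair push that delegates to push(), so each slot gets its own check.
  // Calling maybe_grow() once and then writing two slots would overrun the
  // storage whenever exactly one free slot remained.
  void push_pair(int first, int second) {
    push(first);
    push(second);
  }
};
```

With an odd capacity such as 3, the second `push_pair` call starts with exactly one free slot, which is the situation the PR description associates with odd `-XX:OptoNodeListSize` values; in this sketch the second item triggers a grow instead of writing past the end.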
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25751#discussion_r2144093172 From syan at openjdk.org Fri Jun 13 02:44:28 2025 From: syan at openjdk.org (SendaoYan) Date: Fri, 13 Jun 2025 02:44:28 GMT Subject: RFR: 8359200: Memory corruption in MStack::push [v2] In-Reply-To: References: Message-ID: <9Knxr341D5X0pHiBThj2NDgQpb1pW09N2ehNfq4gQIg=.38ad1868-bc36-4965-b17d-b44c6783993a@github.com> On Fri, 13 Jun 2025 02:33:19 GMT, SendaoYan wrote: >> Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: >> >> Improved assert message > > test/hotspot/jtreg/compiler/arguments/TestOptoNodeListSize.java line 29: > >> 27: * @bug 8359200 >> 28: * @key randomness >> 29: * @requires vm.flagless & vm.compiler2.enabled & vm.debug == true > > Why does this need `@requires vm.flagless`? Would other VM flags cause this test to fail? Sorry, the documentation for `createLimitedTestJavaProcessBuilder` says this function should be used with `@requires vm.flagless`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25751#discussion_r2144099595 From kbarrett at openjdk.org Fri Jun 13 06:29:28 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 13 Jun 2025 06:29:28 GMT Subject: RFR: JDK-8348574 : Simplify c1/c2_globals inclusions In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 08:32:56 GMT, Suchismith Roy wrote: > JBS Issue : [JDK-8348574](https://bugs.openjdk.org/browse/JDK-8348574) > > c1_globals.hpp includes c1_globals_pd.hpp. c1_globals_pd.hpp includes the corresponding CPU_HEADER and OS_HEADER files. All of the c1_globals_.hpp files are essentially identical and basically empty. (They just include globalDefinitions.hpp and macros.hpp, and provide nothing additional.) > > This could be simplified by having c1_globals.hpp do the CPU_HEADER inclusion directly, and remove c1_globals_pd.hpp and all c1_globals_.hpp files.
> > Even if there are some non-vacuous c1_globals_.hpp files in the future, c1_globals_pd.hpp seems unwarranted; just add the OS_HEADER include directly in c1_globals.hpp. The c1_globals_pd.hpp files really don't seem worth the extra indirection. > > Similarly for c2_globals.hpp &etc. Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25773#pullrequestreview-2923554527 From kbarrett at openjdk.org Fri Jun 13 07:04:46 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 13 Jun 2025 07:04:46 GMT Subject: RFR: 8359227: Code cache/heap size options should be size_t Message-ID: Please review this change that makes the various code cache/heap size options consistently be of type size_t. The shared declarations for these options were all uintx. These options all may have platform-defined values. Some of those platform-specific definitions were uintx, some were size_t, and some were intx(!). This change makes them all consistently size_t. More details in the first comment. 
Testing: mach5 tier1-6 GHA testing in-progress ------------- Commit messages: - fix whitebox access to code cache size configs - VMPageSizeConstraintFunc - CodeCacheMinBlockLength - CodeCacheExpansionSize - various CodeHeapSize options - CodeCacheMinimumUseSpace - Initial/ReservedCodeCacheSize - CodeCacheSegmentSize Changes: https://git.openjdk.org/jdk/pull/25791/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25791&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359227 Stats: 165 lines in 40 files changed: 3 ins; 0 del; 162 mod Patch: https://git.openjdk.org/jdk/pull/25791.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25791/head:pull/25791 PR: https://git.openjdk.org/jdk/pull/25791 From kbarrett at openjdk.org Fri Jun 13 07:04:46 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 13 Jun 2025 07:04:46 GMT Subject: RFR: 8359227: Code cache/heap size options should be size_t In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 06:57:20 GMT, Kim Barrett wrote: > Please review this change that makes the various code cache/heap size options > consistently be of type size_t. > > The shared declarations for these options were all uintx. These options all > may have platform-defined values. Some of those platform-specific definitions > were uintx, some were size_t, and some were intx(!). This change makes them > all consistently size_t. > > More details in the first comment. > > Testing: mach5 tier1-6 > GHA testing in-progress In addition to adjusting the types, there are some code changes to deal with some inconsistencies, benefit from the change (mostly in the form of removal of some casts), or address issues discovered while examining uses of the options. The changes in the PR are broken up into a series of commits, mostly by option or a couple of closely related options. 
It might be easier to review parts of the change by looking at some of those individual commits, in order to more easily see the code changes related to specific options. One change in particular to note is the change in compilationPolicy.cpp. The calculation of max_count, depending on explicit option values and such, could potentially overflow its prior int type, effectively having a random value. There is still a possibility of a nonsense result if ReservedCodeCacheSize and CodeCacheMinimumUseSpace are poorly chosen. I'm leaving that pre-existing issue to the compiler team to deal with. The reason for my looking at these options in the first place is the incorrect format strings for error messages in CompilerConfig::check_args_consistency. They weren't previously noticed because the messages are printed using jio_fprintf. It might be that these should be using UL warnings, but I'm leaving that to the compiler team to decide. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25791#issuecomment-2969303712 From duke at openjdk.org Fri Jun 13 07:36:30 2025 From: duke at openjdk.org (Anton Artemov) Date: Fri, 13 Jun 2025 07:36:30 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v9] In-Reply-To: References: Message-ID: <1mV_rk0qtAWW7cX-uBHPB5l9GCfy9QsNC60WdKcflrA=.a93bf6cd-169c-4464-a5e4-d125fbcf4009@github.com> On Thu, 12 Jun 2025 17:50:58 GMT, Kim Barrett wrote: > FWIW, I think that the `ssize_t` was a good first step and the `MemRes` was an experiment where it would be interesting to see how that panned out, but I didn't expect to see it in this PR. I tried to convey that in an earlier comment. I'm leaving it up to the other involved reviewers to decide if we should go with the MemRes change in this PR. I understand that. Keeping `ssize_t` would not introduce consistency in the codebase, as on Windows `ssize_t` is `int64_t`, which is a signed type and, as pointed out by @tstuefe, it eats half of the possible range.
I am not sure about other platforms, but I suspect that would not be the case there. Therefore I decided to rely on the known things such as `size_t `. But then, if we want to report more than one kind of error, it makes sense to keep the error code separately. > Tons of MemErr's without examining the error part at all. I guess that's a > consequence of the existing code just not bothering to check, which I guess > needs to be fixed but not in this PR. I agree, checking of error codes and making sure it is not easy to ignore them is to be done in a separate PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2969393149 From thartmann at openjdk.org Fri Jun 13 07:47:33 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 13 Jun 2025 07:47:33 GMT Subject: RFR: 8359200: Memory corruption in MStack::push [v2] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 11:41:54 GMT, Tobias Hartmann wrote: >> I found this by accident when running testing with non-default `-XX:OptoNodeListSize` (see JBS for details). The problem is that `MStack::push` assumes that `Node_Stack::grow` will always grow the stack by at least two (line 69) and it then proceeds to put two items in: >> https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/matcher.hpp#L67-L74 >> >> But after [JDK-8336999](https://bugs.openjdk.org/browse/JDK-8336999), `Node_Stack::grow` will only grow the stack if needed: >> https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/node.cpp#L3027-L3031 >> >> However, if there's **one** empty slot, the stack will not be grown and `MStack::push` then puts **two** items on the stack which leads to memory corruption. 
>> >> I refactored the push method to delegate to `Node_Stack::push` which does the right thing and, similar to [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), also added `maybe_grow` methods for all the containers affected by the original change. For additional coverage, I moved the `_nesting.check` calls to before the check that determines if we grow. >> >> I only ever observed this with a non-default and odd value for `-XX:OptoNodeListSize` but I'm not 100% convinced that it can't happen with the default value, so I'm treating this as P2 and will backport the fix to JDK 25. >> >> @shipilev Since you worked on [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), could you please take a look at this? >> >> Thanks, >> Tobias > > Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: > > Improved assert message Thanks again for the reviews. I'll integrate this now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25751#issuecomment-2969416570 From thartmann at openjdk.org Fri Jun 13 07:47:34 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 13 Jun 2025 07:47:34 GMT Subject: RFR: 8359200: Memory corruption in MStack::push [v2] In-Reply-To: <9Knxr341D5X0pHiBThj2NDgQpb1pW09N2ehNfq4gQIg=.38ad1868-bc36-4965-b17d-b44c6783993a@github.com> References: <9Knxr341D5X0pHiBThj2NDgQpb1pW09N2ehNfq4gQIg=.38ad1868-bc36-4965-b17d-b44c6783993a@github.com> Message-ID: On Fri, 13 Jun 2025 02:41:27 GMT, SendaoYan wrote: >> test/hotspot/jtreg/compiler/arguments/TestOptoNodeListSize.java line 29: >> >>> 27: * @bug 8359200 >>> 28: * @key randomness >>> 29: * @requires vm.flagless & vm.compiler2.enabled & vm.debug == true >> >> Why this need `@requires vm.flagless`. Do other VM flags will cause this test fails. 
> > Sorry, the documemt from `createLimitedTestJavaProcessBuilder` say this function use with `@requires vm.flagless` Right and it simply does not make sense to run this test with any other flags because it will spawn it's own VM. Thanks for looking at this! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25751#discussion_r2144408698 From thartmann at openjdk.org Fri Jun 13 07:47:35 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 13 Jun 2025 07:47:35 GMT Subject: Integrated: 8359200: Memory corruption in MStack::push In-Reply-To: References: Message-ID: On Wed, 11 Jun 2025 11:49:08 GMT, Tobias Hartmann wrote: > I found this by accident when running testing with non-default `-XX:OptoNodeListSize` (see JBS for details). The problem is that `MStack::push` assumes that `Node_Stack::grow` will always grow the stack by at least two (line 69) and it then proceeds to put two items in: > https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/matcher.hpp#L67-L74 > > But after [JDK-8336999](https://bugs.openjdk.org/browse/JDK-8336999), `Node_Stack::grow` will only grow the stack if needed: > https://github.com/openjdk/jdk/blob/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a/src/hotspot/share/opto/node.cpp#L3027-L3031 > > However, if there's **one** empty slot, the stack will not be grown and `MStack::push` then puts **two** items on the stack which leads to memory corruption. > > I refactored the push method to delegate to `Node_Stack::push` which does the right thing and, similar to [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), also added `maybe_grow` methods for all the containers affected by the original change. For additional coverage, I moved the `_nesting.check` calls to before the check that determines if we grow. 
> > I only ever observed this with a non-default and odd value for `-XX:OptoNodeListSize` but I'm not 100% convinced that it can't happen with the default value, so I'm treating this as P2 and will backport the fix to JDK 25. > > @shipilev Since you worked on [JDK-8343056](https://bugs.openjdk.org/browse/JDK-8343056), could you please take a look at this? > > Thanks, > Tobias This pull request has now been integrated. Changeset: ed39e17e Author: Tobias Hartmann URL: https://git.openjdk.org/jdk/commit/ed39e17e34a2a3fd08a3e54d8d2c309deb99f61a Stats: 94 lines in 8 files changed: 73 ins; 14 del; 7 mod 8359200: Memory corruption in MStack::push Reviewed-by: shade, kvn ------------- PR: https://git.openjdk.org/jdk/pull/25751 From dnsimon at openjdk.org Fri Jun 13 08:03:33 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Fri, 13 Jun 2025 08:03:33 GMT Subject: Integrated: 8359293: Make TestNoNULL extensible In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 09:49:14 GMT, Doug Simon wrote: > There are some closed markdown files that contain "NULL". These should be excluded from checking by TestNoNULL. This PR makes TestNoNull extensible in terms of exclusions with the following system properties: > > * `excludedTestExtensions` - command separated list of file extensions (e.g. `-DexcludedTestExtensions=.md,.txt`) > * `sourceExclusions` - command separated list of file paths relative to repo root (e.g. `-DsourceExclusions=src/hotspot/share/prims/jvmti.xml`) > * `testExclusions` - command separated list of file paths relative to repo root (e.g. `-DtestExclusions=test/hotspot/jtreg/vmTestbase/nsk/share/jvmti/README`) This pull request has now been integrated. 
Changeset: a8b42848 Author: Doug Simon URL: https://git.openjdk.org/jdk/commit/a8b42848489747f869e33a5067fdda91553eec96 Stats: 19 lines in 1 file changed: 14 ins; 0 del; 5 mod 8359293: Make TestNoNULL extensible Reviewed-by: kbarrett ------------- PR: https://git.openjdk.org/jdk/pull/25777 From dnsimon at openjdk.org Fri Jun 13 08:03:32 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Fri, 13 Jun 2025 08:03:32 GMT Subject: RFR: 8359293: Make TestNoNULL extensible In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 09:49:14 GMT, Doug Simon wrote: > There are some closed markdown files that contain "NULL". These should be excluded from checking by TestNoNULL. This PR makes TestNoNull extensible in terms of exclusions with the following system properties: > > * `excludedTestExtensions` - command separated list of file extensions (e.g. `-DexcludedTestExtensions=.md,.txt`) > * `sourceExclusions` - command separated list of file paths relative to repo root (e.g. `-DsourceExclusions=src/hotspot/share/prims/jvmti.xml`) > * `testExclusions` - command separated list of file paths relative to repo root (e.g. `-DtestExclusions=test/hotspot/jtreg/vmTestbase/nsk/share/jvmti/README`) Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25777#issuecomment-2969454924 From thartmann at openjdk.org Fri Jun 13 08:08:10 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 13 Jun 2025 08:08:10 GMT Subject: [jdk25] RFR: 8359200: Memory corruption in MStack::push Message-ID: Hi all, This pull request contains a backport of commit [ed39e17e](https://github.com/openjdk/jdk/commit/ed39e17e34a2a3fd08a3e54d8d2c309deb99f61a) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by Tobias Hartmann on 13 Jun 2025 and was reviewed by Aleksey Shipilev and Vladimir Kozlov. Thanks! 
------------- Commit messages: - Backport ed39e17e34a2a3fd08a3e54d8d2c309deb99f61a Changes: https://git.openjdk.org/jdk/pull/25792/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25792&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359200 Stats: 94 lines in 8 files changed: 73 ins; 14 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/25792.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25792/head:pull/25792 PR: https://git.openjdk.org/jdk/pull/25792 From epeter at openjdk.org Fri Jun 13 08:14:27 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 13 Jun 2025 08:14:27 GMT Subject: [jdk25] RFR: 8359200: Memory corruption in MStack::push In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 08:02:06 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [ed39e17e](https://github.com/openjdk/jdk/commit/ed39e17e34a2a3fd08a3e54d8d2c309deb99f61a) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Tobias Hartmann on 13 Jun 2025 and was reviewed by Aleksey Shipilev and Vladimir Kozlov. > > Thanks! Thanks for the backport, LGTM :) ------------- Marked as reviewed by epeter (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25792#pullrequestreview-2923802328 From thartmann at openjdk.org Fri Jun 13 08:17:28 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 13 Jun 2025 08:17:28 GMT Subject: [jdk25] RFR: 8359200: Memory corruption in MStack::push In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 08:02:06 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [ed39e17e](https://github.com/openjdk/jdk/commit/ed39e17e34a2a3fd08a3e54d8d2c309deb99f61a) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Tobias Hartmann on 13 Jun 2025 and was reviewed by Aleksey Shipilev and Vladimir Kozlov. > > Thanks! 
Thanks for the quick review Emanuel! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25792#issuecomment-2969490792 From mhaessig at openjdk.org Fri Jun 13 09:01:36 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Fri, 13 Jun 2025 09:01:36 GMT Subject: RFR: JDK-8348574 : Simplify c1/c2_globals inclusions In-Reply-To: References: Message-ID: <7iAduRwCTEhdlLrawIwYXAM4k5l5PRY7090XOYJeheE=.c279c29b-8316-4551-852c-d44bb8fe058a@github.com> On Thu, 12 Jun 2025 08:32:56 GMT, Suchismith Roy wrote: > JBS Issue : [JDK-8348574](https://bugs.openjdk.org/browse/JDK-8348574) > > c1_globals.hpp includes c1_globals_pd.hpp. c1_globals_pd.hpp includes the corresponding CPU_HEADER and OS_HEADER files. All of the c1_globals_.hpp files are essentially identical and basically empty. (They just include globalDefinitions.hpp and macros.hpp, and provide nothing additional.) > > This could be simplified by having c1_globals.hpp do the CPU_HEADER inclusion directly, and remove c1_globals_pd.hpp and all c1_globals_.hpp files. > > Even if there are some non-vacuous c1_globals_.hpp files in the future, c1_globals_pd.hpp seems unwarranted; just add the OS_HEADER include directly in c1_globals.hpp. The c1_globals_pd.hpp files really don't seem worth the extra indirection. > > Similarly for c2_globals.hpp &etc. Testing passed. Ship it! ------------- Marked as reviewed by mhaessig (Author). PR Review: https://git.openjdk.org/jdk/pull/25773#pullrequestreview-2923939477 From duke at openjdk.org Fri Jun 13 09:17:30 2025 From: duke at openjdk.org (duke) Date: Fri, 13 Jun 2025 09:17:30 GMT Subject: RFR: JDK-8348574 : Simplify c1/c2_globals inclusions In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 08:32:56 GMT, Suchismith Roy wrote: > JBS Issue : [JDK-8348574](https://bugs.openjdk.org/browse/JDK-8348574) > > c1_globals.hpp includes c1_globals_pd.hpp. c1_globals_pd.hpp includes the corresponding CPU_HEADER and OS_HEADER files.
All of the c1_globals_.hpp files are essentially identical and basically empty. (They just include globalDefinitions.hpp and macros.hpp, and provide nothing additional.) > > This could be simplified by having c1_globals.hpp do the CPU_HEADER inclusion directly, and remove c1_globals_pd.hpp and all c1_globals_.hpp files. > > Even if there are some non-vacuous c1_globals_.hpp files in the future, c1_globals_pd.hpp seems unwarranted; just add the OS_HEADER include directly in c1_globals.hpp. The c1_globals_pd.hpp files really don't seem worth the extra indirection. > > Similarly for c2_globals.hpp &etc. @suchismith1993 Your change (at version 993ee2632f0e7be6a6eb40d22a98c4e8d5da29e2) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25773#issuecomment-2969667508 From shade at openjdk.org Fri Jun 13 09:25:29 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 13 Jun 2025 09:25:29 GMT Subject: [jdk25] RFR: 8359200: Memory corruption in MStack::push In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 08:02:06 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [ed39e17e](https://github.com/openjdk/jdk/commit/ed39e17e34a2a3fd08a3e54d8d2c309deb99f61a) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Tobias Hartmann on 13 Jun 2025 and was reviewed by Aleksey Shipilev and Vladimir Kozlov. > > Thanks! Marked as reviewed by shade (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25792#pullrequestreview-2924011208 From thartmann at openjdk.org Fri Jun 13 09:37:28 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 13 Jun 2025 09:37:28 GMT Subject: [jdk25] RFR: 8359200: Memory corruption in MStack::push In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 08:02:06 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [ed39e17e](https://github.com/openjdk/jdk/commit/ed39e17e34a2a3fd08a3e54d8d2c309deb99f61a) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Tobias Hartmann on 13 Jun 2025 and was reviewed by Aleksey Shipilev and Vladimir Kozlov. > > Thanks! Thanks Aleksey! I'm integrating this once Github actions testing passed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25792#issuecomment-2969739224 From thartmann at openjdk.org Fri Jun 13 09:50:43 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 13 Jun 2025 09:50:43 GMT Subject: [jdk25] RFR: 8359327: Incorrect AVX3Threshold results into code buffer overflows on APX targets Message-ID: <9WHKq0JzPpVQmht3CYSh-l_PPX40wqCW5enTK-jol00=.ad7f76d7-4ed2-4495-aba1-649f8100349e@github.com> Hi all, This pull request contains a backport of commit [e7f63ba3](https://github.com/openjdk/jdk/commit/e7f63ba3109adf614cee1bc392cfeef85e9ca778) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by Jatin Bhateja on 13 Jun 2025 and was reviewed by Sandhya Viswanathan. Thanks! 
------------- Commit messages: - Backport e7f63ba3109adf614cee1bc392cfeef85e9ca778 Changes: https://git.openjdk.org/jdk/pull/25796/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25796&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359327 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25796.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25796/head:pull/25796 PR: https://git.openjdk.org/jdk/pull/25796 From duke at openjdk.org Fri Jun 13 09:56:49 2025 From: duke at openjdk.org (Anton Artemov) Date: Fri, 13 Jun 2025 09:56:49 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v10] In-Reply-To: References: Message-ID: > Hi, > > in this PR the output value type for functions which return memory are changed, namely: > > > static julong available_memory(); --> static MemRes available_memory(); > static julong used_memory(); --> static MemRes used_memory(); > static julong free_memory(); --> static MemRes free_memory(); > static jlong total_swap_space(); --> static MemRes total_swap_space(); > static jlong free_swap_space(); --> static MemRes free_swap_space(); > static julong physical_memory(); --> static MemRes physical_memory(); > > > `MemRes` is a struct containing a pair of values, `size_t val` to carry the return value, `int err` to carry the error if any. Currently, in case of error the latter is set to -1. > > The changes are done so that the other parts of the code have minimal impact. > Tested in GHA and Tiers 1-4. Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8357086: Addressed the reviewers comments. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25450/files - new: https://git.openjdk.org/jdk/pull/25450/files/dd9275ca..ebf21bce Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=08-09 Stats: 140 lines in 16 files changed: 11 ins; 54 del; 75 mod Patch: https://git.openjdk.org/jdk/pull/25450.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25450/head:pull/25450 PR: https://git.openjdk.org/jdk/pull/25450 From duke at openjdk.org Fri Jun 13 09:56:52 2025 From: duke at openjdk.org (Anton Artemov) Date: Fri, 13 Jun 2025 09:56:52 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v9] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 18:15:33 GMT, Kim Barrett wrote: >> Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 17 commits: >> >> - 8357086: Fixed merge conflict. >> - 8357086: Changed returm type to struct. >> - 8357086: Return size_t from swap mem funcs, added checks. >> - 8357086: Added missed casts. >> - 8357086: Changed return type for total_swap_space and free_swap_space to ssize_t >> - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs >> - 8357086: Fixed spaces in formatting in gc-related code. >> - 8357086: Fixed formatting. >> - 8357086: Addressed reviewer's comments. >> - 8357086: More work. >> - ... and 7 more: https://git.openjdk.org/jdk/compare/e18277b4...dd9275ca > > src/hotspot/share/runtime/os.hpp line 151: > >> 149: struct MemRes { >> 150: size_t val; >> 151: int err; > > I'm not a big fan of abbreviations like this. My preference would be to at least spell out "value". > Don't know what others might think. I changed the variables names to full words for better readability, thanks. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2144660606 From duke at openjdk.org Fri Jun 13 10:14:22 2025 From: duke at openjdk.org (Anton Artemov) Date: Fri, 13 Jun 2025 10:14:22 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v11] In-Reply-To: References: Message-ID: > Hi, > > in this PR the output value type for functions which return memory are changed, namely: > > > static julong available_memory(); --> static MemRes available_memory(); > static julong used_memory(); --> static MemRes used_memory(); > static julong free_memory(); --> static MemRes free_memory(); > static jlong total_swap_space(); --> static MemRes total_swap_space(); > static jlong free_swap_space(); --> static MemRes free_swap_space(); > static julong physical_memory(); --> static MemRes physical_memory(); > > > `MemRes` is a struct containing a pair of values, `size_t val` to carry the return value, `int err` to carry the error if any. Currently, in case of error the latter is set to -1. > > The changes are done so that the other parts of the code have minimal impact. > Tested in GHA and Tiers 1-4. Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8357086: Added default val to constructor of MemRes. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25450/files - new: https://git.openjdk.org/jdk/pull/25450/files/ebf21bce..12055345 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=09-10 Stats: 25 lines in 6 files changed: 0 ins; 0 del; 25 mod Patch: https://git.openjdk.org/jdk/pull/25450.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25450/head:pull/25450 PR: https://git.openjdk.org/jdk/pull/25450 From duke at openjdk.org Fri Jun 13 10:14:27 2025 From: duke at openjdk.org (Anton Artemov) Date: Fri, 13 Jun 2025 10:14:27 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v9] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 18:07:35 GMT, Kim Barrett wrote: >> Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 17 commits: >> >> - 8357086: Fixed merge conflict. >> - 8357086: Changed returm type to struct. >> - 8357086: Return size_t from swap mem funcs, added checks. >> - 8357086: Added missed casts. >> - 8357086: Changed return type for total_swap_space and free_swap_space to ssize_t >> - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs >> - 8357086: Fixed spaces in formatting in gc-related code. >> - 8357086: Fixed formatting. >> - 8357086: Addressed reviewer's comments. >> - 8357086: More work. >> - ... and 7 more: https://git.openjdk.org/jdk/compare/e18277b4...dd9275ca > > src/hotspot/share/runtime/os.cpp line 2217: > >> 2215: #endif >> 2216: res.val = os::physical_memory().val - os::available_memory().val; >> 2217: res.err = MIN2(os::physical_memory().err, os::available_memory().err); > > Repeated calls shouldn't be assumed to return the same error status. > And min seems like the wrong way to combine errors. I think this should get the result from > `os::physical_memory` and inspect it for an error. 
If not an error then do the same for
> `os::available_memory`. If not an error then compute the result accordingly.

Thanks, I did not think about it that way. I addressed the problem by checking the errors sequentially.

> src/hotspot/share/runtime/os.hpp line 152:
>
>> 150: size_t val;
>> 151: int err;
>> 152: MemRes(size_t v, int e) : val(v), err(e) {}
>
> It might be that having the second argument be optional and default to 0 is nicer / more convenient.

Good suggestion; in most cases we don't need the second argument, so I added a default value. Thanks.

> src/hotspot/share/runtime/os.hpp line 153:
>
>> 151: int err;
>> 152: MemRes(size_t v, int e) : val(v), err(e) {}
>> 153: MemRes(): val(0), err(0) {}
>
> Do we really need initialization in the default ctor? And if so, is default initialization to not-an-error
> what we want?

I removed the default constructor, as I decided to construct in the return statement with the given argument(s).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2144698403
PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2144694805
PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2144696837

From duke at openjdk.org Fri Jun 13 10:16:36 2025
From: duke at openjdk.org (Anton Artemov)
Date: Fri, 13 Jun 2025 10:16:36 GMT
Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v9]
In-Reply-To:
References:
Message-ID:

On Thu, 12 Jun 2025 17:50:58 GMT, Kim Barrett wrote:

> About the MemRes change. I think adding the MemRes instance and then initializing it later makes the code messier:

I generally agree that it is more convenient to construct in the return statement. However, to be consistent with the rest of the codebase, I decided to call the constructor (not just the braces version, though it also works).
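
The sequential error checking discussed in this review can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `MemRes` here is a stand-in whose field names (`value`, `error`) are assumed from the "full words" rename, and `used_memory_sketch`, `physical_memory_stub`, and `available_memory_stub` are hypothetical names replacing the real `os::` functions.

```cpp
#include <cstddef>

// Stand-in for the MemRes struct under review. Field names are assumptions.
struct MemRes {
  std::size_t value;
  int error;  // 0 means success; the PR description says -1 is used on error
  // The second argument defaults to 0, as suggested in the review.
  MemRes(std::size_t v, int e = 0) : value(v), error(e) {}
};

// Hypothetical stubs standing in for os::physical_memory() and
// os::available_memory().
static MemRes physical_memory_stub()  { return MemRes(1024); }
static MemRes available_memory_stub() { return MemRes(256); }

// Query each source exactly once, inspect its error before using its value,
// and construct the result in the return statement.
template <typename PhysFn, typename AvailFn>
MemRes used_memory_sketch(PhysFn phys_fn, AvailFn avail_fn) {
  const MemRes phys = phys_fn();
  if (phys.error != 0) {
    return MemRes(0, phys.error);
  }
  const MemRes avail = avail_fn();
  if (avail.error != 0) {
    return MemRes(0, avail.error);
  }
  return MemRes(phys.value - avail.value);
}
```

Calling each function once avoids the reviewer's concern that repeated calls may not report the same error status, and returning at the first error avoids combining error codes with a minimum.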
-------------

PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2969845459

From duke at openjdk.org Fri Jun 13 10:25:40 2025
From: duke at openjdk.org (duke)
Date: Fri, 13 Jun 2025 10:25:40 GMT
Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v3]
In-Reply-To:
References:
Message-ID:

On Thu, 12 Jun 2025 15:30:48 GMT, Mikhail Ablakatov wrote:

>> In the A64 ISA, the B (direct branch) instruction can encode a target within a ±128MB range relative to the instruction. Due to this limitation, when generating static stubs, HotSpot conservatively emits indirect branches for calls to c2i interface stubs. These indirect branches are implemented using a four-instruction sequence: three instructions to materialize the target address in a register, followed by a BR instruction to perform the jump.
>>
>> This patch optimizes static stub generation when the code cache is small enough to guarantee that the target entry point of the c2i interface stub lies within the direct branch range. In such cases, a single direct B instruction can be used instead of the indirect sequence, saving 3 instructions (12 bytes) per static stub.
>>
>> Below is an example of the optimization's impact, measured using the movie-lens benchmark from the Renaissance benchmark suite:
>>
>> | Metric      | Before        | After         | Difference |
>> |-------------|---------------|---------------|------------|
>> | totalInHeap | Avg: 1883.875 | Avg: 1871.667 | -0.65%     |
>> |             | Sum: 6653848  | Sum: 6616344  | -0.56%     |
>> | stubCode    | Avg: 103.164  | Avg: 87.285   | -15.38%    |
>> |             | Sum: 364376   | Sum: 308552   | -15.33%    |
>>
>> Full jtreg passed on AArch64.
>
> Mikhail Ablakatov has updated the pull request incrementally with one additional commit since the last revision:
>
> cleanup: update a copyright notice
>
> Co-authored-by: Andrew Haley

@mikabl-arm Your change (at version 2eae70efc051477d145ef2cceae1aa19fc5b8bf7) is now ready to be sponsored by a Committer.
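
The branch-range reasoning in the quoted description can be sketched as below. This is only an illustration of the arithmetic, assuming a signed, 4-byte-aligned offset of at most 128MB in either direction. The function names are hypothetical and this is not the check used in the patch.

```cpp
#include <cstdint>

// Reach of an A64 B instruction as stated in the PR description: 128MB in
// either direction from the instruction.
constexpr int64_t kBranchRange = int64_t{128} * 1024 * 1024;

// True if a direct branch located at `pc` could encode `target`.
// The offset must be a multiple of 4 because A64 instructions are 4 bytes.
constexpr bool reachable_by_direct_branch(uint64_t pc, uint64_t target) {
  const int64_t offset = static_cast<int64_t>(target - pc);
  return (offset % 4 == 0) && offset >= -kBranchRange && offset < kBranchRange;
}

// If the whole code cache is no larger than the branch range, any two
// instruction addresses inside it are mutually reachable, so a single B can
// replace the four-instruction indirect sequence (saving 3 * 4 = 12 bytes).
constexpr bool code_cache_allows_direct_branches(uint64_t reserved_code_cache_size) {
  return reserved_code_cache_size <= static_cast<uint64_t>(kBranchRange);
}
```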
------------- PR Comment: https://git.openjdk.org/jdk/pull/25702#issuecomment-2969869916 From rvansa at openjdk.org Fri Jun 13 10:28:46 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Fri, 13 Jun 2025 10:28:46 GMT Subject: RFR: 8352075: Perf regression accessing fields [v30] In-Reply-To: <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> <3vrqqpyldDrHFM_usZFBAnRMq00ftXuPaPRqjONzDMc=.f850ebdf-c4e5-44a0-a160-5b52505b05d4@github.com> Message-ID: On Tue, 10 Jun 2025 19:44:03 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. 
>>
>> My measurements on the attached reproducer
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC
>>   Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms]
>>   Range (min … max): 45.1 ms … 53.9 ms 100 runs
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC
>>   Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms]
>>   Range (min … max): 73.8 ms … 79.7 ms 100 runs
>>
>> (the jdk25-master above already contains JDK-8353175)
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC
>>   Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms]
>>   Range (min … max): 37.7 ms … 42.1 ms 100 runs
>>
>> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement:
>>
>> JDK 17: 1.6 s
>> JDK 21 (no patches): 22 s
>> JDK25-master: 12.3 s
>> JDK25-this-pr: 0.5 s
>
> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision:
>
> Copyright update

Thanks to everyone involved!
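
The lookup scheme described in the quoted PR text (fields sorted by name, plus a small table with one entry per 16 fields) can be approximated with the sketch below. It is a simplified model only: the real implementation records variable-length-encoded stream offsets, whereas this sketch stores plain vector positions. All names (`FieldTable`, `build_field_table`, `find_field`) are hypothetical, and field names are assumed to be unique.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

constexpr std::size_t kStride = 16;  // one table entry per 16 fields

struct FieldTable {
  std::vector<std::string> sorted_names;  // fields sorted by name
  std::vector<std::size_t> jump_table;    // positions 16, 32, ...
};

FieldTable build_field_table(std::vector<std::string> names) {
  std::sort(names.begin(), names.end());
  FieldTable table;
  for (std::size_t pos = kStride; pos < names.size(); pos += kStride) {
    table.jump_table.push_back(pos);
  }
  table.sorted_names = std::move(names);
  return table;
}

// Returns the position of `name` in sorted_names, or -1 if it is absent.
long find_field(const FieldTable& table, const std::string& name) {
  // Walk the jump table to find the last recorded position whose name is
  // not greater than the name being looked up.
  std::size_t start = 0;
  for (std::size_t pos : table.jump_table) {
    if (table.sorted_names[pos] > name) {
      break;
    }
    start = pos;
  }
  // Continue field by field from there; at most kStride names are compared.
  const std::size_t end = std::min(start + kStride, table.sorted_names.size());
  for (std::size_t i = start; i < end; ++i) {
    if (table.sorted_names[i] == name) {
      return static_cast<long>(i);
    }
  }
  return -1;
}
```

With 40 fields, the jump table holds two entries (positions 16 and 32), so a lookup compares at most 2 table names plus 16 field names instead of up to 40.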
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2969876715 From mdoerr at openjdk.org Fri Jun 13 11:57:35 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 13 Jun 2025 11:57:35 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v3] In-Reply-To: References: Message-ID: <0jwlIB4VebRh3dORUSPdbmmE4YWS1Z6t_Ax8ni0_tpg=.f9dd6d19-1c97-43d7-acb2-ad575b61d433@github.com> On Thu, 12 Jun 2025 19:54:45 GMT, Dean Long wrote: >> This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. >> >> We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. >> >> The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. >> >> For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. >> >> This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. 
Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels.
>
> Dean Long has updated the pull request incrementally with six additional commits since the last revision:
>
> - more cleanup
> - more TheRealMDoerr suggestions
> - TheRealMDoerr suggestions
> - remove trailing space
> - Fix Shenandoah build
> - more cleanup

LGTM, now. I think `is_sigill_not_entrant` should also be removed completely: https://github.com/TheRealMDoerr/jdk/commit/f61f54a40ad0b952ba8c0a668675b3f315f83a95
I've run tests on some platforms, but will run on more.
@offamitkumar, @RealFYang, @bulasevich: You may also want to check your platforms.
@fisk: You may want to take a look at the nmethod entry barrier changes.

-------------

PR Review: https://git.openjdk.org/jdk/pull/25764#pullrequestreview-2924552667

From mhaessig at openjdk.org Fri Jun 13 12:54:35 2025
From: mhaessig at openjdk.org (Manuel Hässig)
Date: Fri, 13 Jun 2025 12:54:35 GMT
Subject: RFR: 8359227: Code cache/heap size options should be size_t
In-Reply-To:
References:
Message-ID: <3sJTUKOONMnjRHYKl-M6Dcx4ZDrbbOYPD5DTbgp7UuI=.0d870e8c-fe60-48ed-9f54-aab2786fe77f@github.com>

On Fri, 13 Jun 2025 06:57:20 GMT, Kim Barrett wrote:

> Please review this change that makes the various code cache/heap size options
> consistently be of type size_t.
>
> The shared declarations for these options were all uintx. These options all
> may have platform-defined values. Some of those platform-specific definitions
> were uintx, some were size_t, and some were intx(!). This change makes them
> all consistently size_t.
>
> More details in the first comment.
> > Testing: mach5 tier1-6 > GHA testing in-progress Thank you for working on this, @kimbarrett. Good to see more consistent types here. The changes overall look good to me. I only have a question and some minor suggestions. >One change in particular to note is the change in compilationPolicy.cpp. The calculation of max_count, depending on explicit option values and such, could potentially overflow its prior int type, effectively having a random value. There is still a possibility of a nonsense result if ReservedCodeCacheSize and CodeCacheMinimumUseSpace are poorly chosen. I'm leaving that pre-existing issue to the compiler team to deal with. Thank you for pointing this out. I'm currently looking at that part of the code in #25770 and will fix it there. src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp line 61: > 59: > 60: #include > 61: Is this related to this change? I cannot find any usages from this header in the diff. src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp line 550: > 548: #define ADD_SIZE_T_FLAG(name) ADD_FLAG(size_t, name, BOXED_LONG) > 549: #define ADD_INTX_FLAG(name) ADD_FLAG(intx, name, BOXED_LONG) > 550: #define ADD_UINTX_FLAG(name) ADD_FLAG(uintx, name, BOXED_LONG) Suggestion: #define ADD_BOOL_FLAG(name) ADD_FLAG(bool, name, BOXED_BOOLEAN) #define ADD_INT_FLAG(name) ADD_FLAG(int, name, BOXED_LONG) #define ADD_SIZE_T_FLAG(name) ADD_FLAG(size_t, name, BOXED_LONG) #define ADD_INTX_FLAG(name) ADD_FLAG(intx, name, BOXED_LONG) #define ADD_UINTX_FLAG(name) ADD_FLAG(uintx, name, BOXED_LONG) Feel free to ignore, but since you are already touching this, we might as well align it. src/hotspot/share/runtime/arguments.cpp line 2451: > 2449: return JNI_EINVAL; > 2450: } > 2451: if (FLAG_SET_CMDLINE(ReservedCodeCacheSize, (size_t)long_ReservedCodeCacheSize) != JVMFlag::SUCCESS) { Is this cast correct on a 32-bit platform, where `size_t` is not the same as `uint64_t`? 
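
The truncation concern behind the question about the `(size_t)` cast can be illustrated with a small sketch. It assumes the parsed option value is held in a 64-bit unsigned integer before the cast; that assumption, and the helper name `fits_without_truncation`, are not taken from the PR. The destination type is a template parameter so that the 32-bit case can be exercised on a 64-bit host.

```cpp
#include <cstdint>
#include <limits>

// True if `value` can be converted to the unsigned type Dest without
// losing high-order bits.
template <typename Dest>
constexpr bool fits_without_truncation(uint64_t value) {
  return value <= static_cast<uint64_t>(std::numeric_limits<Dest>::max());
}

// With a 32-bit destination (as size_t would be on a 32-bit platform), any
// value of 4GB or more silently wraps when cast, so a range check before
// the cast is one way to reject it instead.
static_assert(fits_without_truncation<uint32_t>(0xFFFFFFFFu), "max fits");
static_assert(!fits_without_truncation<uint32_t>(uint64_t{1} << 32), "4GB does not fit");
```

Whether such a check is needed at the call site depends on whether the value has already been range-checked earlier, which the quoted excerpt does not show.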
------------- PR Review: https://git.openjdk.org/jdk/pull/25791#pullrequestreview-2924643009 PR Review Comment: https://git.openjdk.org/jdk/pull/25791#discussion_r2144994525 PR Review Comment: https://git.openjdk.org/jdk/pull/25791#discussion_r2144997094 PR Review Comment: https://git.openjdk.org/jdk/pull/25791#discussion_r2145007538 From dcubed at openjdk.org Fri Jun 13 16:14:46 2025 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 13 Jun 2025 16:14:46 GMT Subject: RFR: JDK-8348574 : Simplify c1/c2_globals inclusions In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 08:32:56 GMT, Suchismith Roy wrote: > JBS Issue : [JDK-8348574](https://bugs.openjdk.org/browse/JDK-8348574) > > c1_globals.hpp includes c1_globals_pd.hpp. c1_globals_pd.hpp includes the corresponding CPU_HEADER and OS_HEADER files. All of the c1_globals_.hpp files are essentially identical and basically empty. (They just include globalDefinitions.hpp and macros.hpp, and provide nothing additional.) > > This could be simplified by having c1_globals.hpp do the CPU_HEADER inclusion directly, and remove c1_globals_pd.hpp and all c1_globals_.hpp files. > > Even if there are some non-vacuous c1_globals_.hpp files in the future, c1_globals_pd.hpp seems unwarranted; just add the OS_HEADER include directly in c1_globals.hpp. The c1_globals_pd.hpp files really don't seem worth the extra indirection. > > Similarly for c2_globals.hpp &etc. (Gotta love it when you typo a lable^H^H^H^H^H label). ------------- PR Comment: https://git.openjdk.org/jdk/pull/25773#issuecomment-2970862022 From kvn at openjdk.org Fri Jun 13 16:26:38 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 13 Jun 2025 16:26:38 GMT Subject: RFR: 8359373: Split stubgen initial blob into pre and post-universe blobs [v3] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 18:01:09 GMT, Andrew Dinn wrote: >> This PR adds a new preuniverse blob to the stubgen blobs set and relocates the initial fence stub to that blob. 
> > Andrew Dinn has updated the pull request incrementally with one additional commit since the last revision: > > ensure fence stub is set as part of preuniverse init on zero Testing passed clean. You can push. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25784#issuecomment-2970891756 From yzheng at openjdk.org Fri Jun 13 16:29:17 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Fri, 13 Jun 2025 16:29:17 GMT Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for hosted compiled nmethod [v7] In-Reply-To: References: Message-ID: > Hosted Truffle compilations are installed on the OptimizedCallTarget#profiledPERoot method. Any deoptimization contributes to its decompile count, which can easily exceed the PerMethodRecompilationCutoff threshold, permanently preventing highest tier compilation on this method. This PR exempts hosted compilations from this cutoff by ensuring their decompile count is not incremented for hosted compiled nmethods. Yudi Zheng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains eight additional commits since the last revision: - Merge tag 'jdk-26+2' into JDK-8357424 Added tag jdk-26+2 for changeset d7aa3498 - fix compilation error - address comments - Merge remote-tracking branch 'upstream/master' into JDK-8357424 - address comments - address comments - update copyright - [JVMCI] Avoid incrementing decompilation count for hosted compiled nmethod ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25356/files - new: https://git.openjdk.org/jdk/pull/25356/files/9d24428e..f253c0a8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25356&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25356&range=05-06 Stats: 3428 lines in 81 files changed: 1704 ins; 1413 del; 311 mod Patch: https://git.openjdk.org/jdk/pull/25356.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25356/head:pull/25356 PR: https://git.openjdk.org/jdk/pull/25356 From adinn at openjdk.org Fri Jun 13 16:56:32 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Fri, 13 Jun 2025 16:56:32 GMT Subject: Integrated: 8359373: Split stubgen initial blob into pre and post-universe blobs In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 17:01:17 GMT, Andrew Dinn wrote: > This PR adds a new preuniverse blob to the stubgen blobs set and relocates the initial fence stub to that blob. This pull request has now been integrated. 
Changeset: ee35f638 Author: Andrew Dinn URL: https://git.openjdk.org/jdk/commit/ee35f6384fdd0783a7ae62508e837a66683cdd3c Stats: 149 lines in 18 files changed: 137 ins; 6 del; 6 mod 8359373: Split stubgen initial blob into pre and post-universe blobs Reviewed-by: kvn ------------- PR: https://git.openjdk.org/jdk/pull/25784 From kbarrett at openjdk.org Fri Jun 13 18:46:58 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 13 Jun 2025 18:46:58 GMT Subject: RFR: 8359227: Code cache/heap size options should be size_t [v2] In-Reply-To: References: Message-ID: > Please review this change that makes the various code cache/heap size options > consistently be of type size_t. > > The shared declarations for these options were all uintx. These options all > may have platform-defined values. Some of those platform-specific definitions > were uintx, some were size_t, and some were intx(!). This change makes them > all consistently size_t. > > More details in the first comment. > > Testing: mach5 tier1-6 > GHA testing in-progress Kim Barrett has updated the pull request incrementally with two additional commits since the last revision: - update copyrights - remove leftover include ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25791/files - new: https://git.openjdk.org/jdk/pull/25791/files/9b96ebd1..bd1923f6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25791&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25791&range=00-01 Stats: 29 lines in 28 files changed: 0 ins; 2 del; 27 mod Patch: https://git.openjdk.org/jdk/pull/25791.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25791/head:pull/25791 PR: https://git.openjdk.org/jdk/pull/25791 From dlong at openjdk.org Fri Jun 13 19:18:19 2025 From: dlong at openjdk.org (Dean Long) Date: Fri, 13 Jun 2025 19:18:19 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: > This PR removes patching 
of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. > > We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. > > The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. > > For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. > > This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. 
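
The "sticky" bit idea described above can be sketched as follows. This is a minimal model under stated assumptions, not HotSpot code: the bit position, the class name `GuardValue`, and the use of `std::mutex` in place of `NMethodEntryBarrier_lock` are all hypothetical.

```cpp
#include <cstdint>
#include <mutex>

class GuardValue {
  // Hypothetical position of the reserved bit. Once set it must stay set.
  static constexpr uint32_t kNotEntrantBit = 1u << 31;

  std::mutex _lock;      // stands in for NMethodEntryBarrier_lock
  uint32_t _value = 0;

 public:
  // Updating the guard is a read-modify-write of a bitfield, so it is done
  // under a lock: the new value replaces every bit except the sticky one.
  void set(uint32_t new_value) {
    std::lock_guard<std::mutex> guard(_lock);
    _value = (new_value & ~kNotEntrantBit) | (_value & kNotEntrantBit);
  }

  // not_entrant is a final state: the bit is only ever set, never cleared.
  void make_not_entrant() {
    std::lock_guard<std::mutex> guard(_lock);
    _value |= kNotEntrantBit;
  }

  bool is_not_entrant() {
    std::lock_guard<std::mutex> guard(_lock);
    return (_value & kNotEntrantBit) != 0;
  }

  uint32_t value() {
    std::lock_guard<std::mutex> guard(_lock);
    return _value;
  }
};
```

The alternative mentioned in the description, a compare-and-swap loop in platform-specific code, would avoid the lock at the cost of changing each platform's guard update.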
Dean Long has updated the pull request incrementally with one additional commit since the last revision:

  remove is_sigill_not_entrant

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25764/files
  - new: https://git.openjdk.org/jdk/pull/25764/files/0950fc5e..c1ebde09

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25764&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25764&range=02-03

  Stats: 23 lines in 6 files changed: 0 ins; 22 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/25764.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25764/head:pull/25764

PR: https://git.openjdk.org/jdk/pull/25764

From dcubed at openjdk.org Fri Jun 13 19:33:38 2025
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Fri, 13 Jun 2025 19:33:38 GMT
Subject: RFR: 8268406: Deallocate jmethodID native memory
In-Reply-To:
References:
Message-ID:

On Fri, 16 May 2025 12:18:42 GMT, Coleen Phillimore wrote:

> JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about performance going forward.

s/concerned about performance/concerned about JNI performance/

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-2971428640

From dcubed at openjdk.org Fri Jun 13 19:39:56 2025
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Fri, 13 Jun 2025 19:39:56 GMT
Subject: RFR: 8268406: Deallocate jmethodID native memory
In-Reply-To:
References:
Message-ID:

On Fri, 16 May 2025 12:18:42 GMT, Coleen Phillimore wrote:

> In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change.

Hmmm... I'm rusty with redefinition, but I think there are legitimate scenarios where it is "okay" to call an obsolete method. I believe it has to do with an in-progress call to a method being redefined and you might get the obsolete method or the new method depending on timing.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-2971451618 From coleenp at openjdk.org Fri Jun 13 20:39:33 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 13 Jun 2025 20:39:33 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory In-Reply-To: References: Message-ID: <8hFd0ec-RQ2ZmA1hvUg6Oa7XBQtUL6SCcfvVQEFeQLo=.7aded867-efa5-4818-aed1-f6f1c0daa455@github.com> On Fri, 16 May 2025 12:18:42 GMT, Coleen Phillimore wrote: > This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. > > The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. > > Tested with tier1-4, 5-7. There is a way to create a jmethodID from an obsolete method and I thought we had tests that did this. I'm not finding them right now. 
The old and obsolete methods can still be running after a redefinition if they were running during the redefinition. The JIT deoptimizes them but they will be run in the interpreter. New invocations of the method will choose the new version of the method always. Always unless there's a bug that we don't know about. We have fixed a few old method invocations in the past coming from various places in the JVM but we fixed the "last" one fairly recently. But technically, one could create a jmethodID to an obsolete method although it's not easy to do. I was musing above but not committed to breaking compatibility if there is code that requires this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-2971620203 From dcubed at openjdk.org Fri Jun 13 20:44:43 2025 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 13 Jun 2025 20:44:43 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory In-Reply-To: References: Message-ID: On Fri, 16 May 2025 12:18:42 GMT, Coleen Phillimore wrote: > This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. > > The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. 
When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. > > Tested with tier1-4, 5-7. Changes requested by dcubed (Reviewer). src/hotspot/share/classfile/classLoaderData.cpp line 585: > 583: MutexLocker m1(metaspace_lock(), Mutex::_no_safepoint_check_flag); > 584: if (_jmethod_ids == nullptr) { > 585: _jmethod_ids = new (mtClass) GrowableArray(32, mtClass); Do you want the literal `32` to be a tunable value? src/hotspot/share/classfile/classLoaderData.cpp line 590: > 588: } > 589: > 590: // Method::clear_jmethod_ids removes jmethodID entries from the table which Perhaps the name changed during your development? s/clear_jmethod_ids/remove_jmethod_ids/ src/hotspot/share/classfile/classLoaderData.cpp line 592: > 590: // Method::clear_jmethod_ids removes jmethodID entries from the table which > 591: // releases memory. > 592: // Because native code (e.g. JVMTI agent) holding jmethod_ids may access them grammar: s/e.g./e.g.,/ src/hotspot/share/classfile/classLoaderData.cpp line 594: > 592: // Because native code (e.g. JVMTI agent) holding jmethod_ids may access them > 593: // after the associated classes and class loader are unloaded, subsequent lookups > 594: // for these ids will return null since they are no longer found in the table. Perhaps: s/null/nullptr/ src/hotspot/share/classfile/classLoaderData.hpp line 319: > 317: void add_jmethod_id(jmethodID id); > 318: void remove_jmethod_ids(); > 319: GrowableArray* jmethod_ids() { return _jmethod_ids; } Should `jmethod_ids` still be `const`? 
src/hotspot/share/oops/instanceKlass.cpp line 2394:

> 2392: }
> 2393:
> 2394: // Lookup or create a jmethodID.

The comment on L2394 appears wrong for `create_jmethod_id_cache`.
Perhaps move it to L2404 (above get_jmethod_id() function).

src/hotspot/share/oops/instanceKlass.cpp line 2397:

> 2395: static jmethodID* create_jmethod_id_cache(size_t size) {
> 2396: jmethodID* jmeths = NEW_C_HEAP_ARRAY(jmethodID, size+1, mtClass);
> 2397: memset(jmeths, 0, (size+1)*sizeof(jmethodID));

nit spacing: s/size+1/size + 1/ on two lines.

src/hotspot/share/oops/instanceKlass.cpp line 2402:

> 2400: return jmeths;
> 2401: }
> 2402:

nit spacing: delete extra blank line?

src/hotspot/share/oops/instanceKlass.cpp line 2404:

> 2402:
> 2403:
> 2404: jmethodID InstanceKlass::get_jmethod_id(Method* method) {

Should `method` be `const`?

src/hotspot/share/oops/instanceKlass.cpp line 2460:

> 2458: size_t size = idnum_allocated_count();
> 2459: size_t old_size = (size_t)cache[0];
> 2460: if (old_size < size+1) {

nit spacing: s/size+1/size + 1/

src/hotspot/share/oops/instanceKlass.cpp line 2461:

> 2459: size_t old_size = (size_t)cache[0];
> 2460: if (old_size < size+1) {
> 2461: // allocate a larger one and copy entries to the new one.

nit typo: s/allocate/Allocate/

src/hotspot/share/oops/instanceKlass.cpp line 2462:

> 2460: if (old_size < size+1) {
> 2461: // allocate a larger one and copy entries to the new one.
> 2462: // They've already been updated to point to new methods where applicable (ie. not obsolete)

nit typo: s/(ie. not obsolete)/(i.e., not obsolete)./

src/hotspot/share/oops/instanceKlass.cpp line 2495:

> 2493: id == nullptr) {
> 2494: id = Method::make_jmethod_id(class_loader_data(), m);
> 2495: Atomic::release_store(&jmeths[idnum+1], id);

nit spacing: s/idnum+1/idnum + 1/

src/hotspot/share/oops/instanceKlass.hpp line 1057:

> 1055: inline jmethodID* methods_jmethod_ids_acquire() const;
> 1056: inline void release_set_methods_jmethod_ids(jmethodID* jmeths);
> 1057: // This nulls out obsolete jmethodIDs for all methods in 'klass'

nit typo: please add an ending period to the comment.

src/hotspot/share/oops/jmethodIDTable.cpp line 29:

> 27: #include "memory/resourceArea.hpp"
> 28: #include "oops/method.hpp"
> 29: #include "oops/jmethodIDTable.hpp"

Please swap these two #includes into sort order.

src/hotspot/share/oops/jmethodIDTable.cpp line 35:

> 33: #include "utilities/macros.hpp"
> 34:
> 35: // Save (jmethod, Method*) in a hashtable to lookup Method

nit typo: please add an ending period to the comment.

src/hotspot/share/oops/jmethodIDTable.cpp line 36:

> 34:
> 35: // Save (jmethod, Method*) in a hashtable to lookup Method
> 36: // The CHT is for performance because it is has lock free lookup.

typo: s/because it is has lock free/because it has lock free/

src/hotspot/share/oops/jmethodIDTable.cpp line 73:

> 71: // 2^24 is max size
> 72: const size_t end_size = 24;
> 73: // If a chain gets to 32 something might be wrong

nit typo: please add an ending period to the comment.

src/hotspot/share/oops/jmethodIDTable.cpp line 98:

> 96:
> 97: static JmethodEntry* get_jmethod_entry(jmethodID mid) {
> 98: assert(mid != nullptr, "JNI method id should not be null");

Perhaps: s/null/nullptr/
I can't remember if assert failure text output is okay to be `null`.

src/hotspot/share/oops/jmethodIDTable.cpp line 104:

> 102: JmethodEntry* result = nullptr;
> 103: auto get = [&] (JmethodEntry* value) {
> 104: // function called if value is found so is never null

Perhaps: s/null/nullptr/

src/hotspot/share/oops/jmethodIDTable.cpp line 109:

> 107: bool needs_rehashing = false;
> 108: _jmethod_id_table->get(current, lookup, get, &needs_rehashing);
> 109: assert (!needs_rehashing, "should never need rehashing");

nit spacing: s/assert (/assert(/

src/hotspot/share/oops/jmethodIDTable.cpp line 129:

> 127: }
> 128:
> 129: // Add a method id to the jmethod_ids

nit typo: please add an ending period to the comment.

src/hotspot/share/oops/jmethodIDTable.cpp line 131:

> 129: // Add a method id to the jmethod_ids
> 130: jmethodID JmethodIDTable::make_jmethod_id(Method* m) {
> 131: bool grow_hint, clean_hint, created;

nit: sort local variables?

src/hotspot/share/oops/jmethodIDTable.cpp line 144:

> 142: log_debug(jmethod)("Inserted jmethod id " UINT64_FORMAT_X, _jmethodID_counter);
> 143:
> 144: // Resize table if it needs to grow. The _jmethod_id_table has a good distribution

nit typo: please add an ending period to the comment.

src/hotspot/share/oops/jmethodIDTable.cpp line 158:

> 156: JmethodEntry* result;
> 157: auto get = [&] (JmethodEntry* value) {
> 158: // function called if value is found so is never null

Perhaps: s/null/nullptr/

src/hotspot/share/oops/jmethodIDTable.cpp line 162:

> 160: };
> 161: bool removed = _jmethod_id_table->remove(current, lookup, get);
> 162: assert(removed, "should be");

"should be" or "must be"?

src/hotspot/share/oops/jmethodIDTable.cpp line 169:

> 167: assert_locked_or_safepoint(JmethodIdCreation_lock);
> 168: JmethodEntry* result = get_jmethod_entry(jmid);
> 169: // change to table to point to the new method

Perhaps:
// Change to table entry to point to the new method.

src/hotspot/share/oops/jmethodIDTable.cpp line 180:

> 178: // We need to make sure that jmethodID actually resolves to this method
> 179: // - multiple redefined versions may share jmethodID slots and if a method
> 180: // has already been rewired to a newer version we could be removing reference

typo?: s/could be removing reference/could be clearing a reference/

src/hotspot/share/oops/jmethodIDTable.hpp line 31:

> 29: #include "memory/allocation.hpp"
> 30:
> 31: // Class for associating Method with jmethodID

nit typo: please add an ending period to the comment.

src/hotspot/share/oops/jmethodIDTable.hpp line 38:

> 36: static void initialize();
> 37:
> 38: // Given a Method return a jmethodID

nit typo: please add an ending period to the comment.

src/hotspot/share/oops/jmethodIDTable.hpp line 41:

> 39: static jmethodID make_jmethod_id(Method* m);
> 40:
> 41: // Given a jmethodID, return a Method

nit typo: please add an ending period to the comment.

src/hotspot/share/oops/jmethodIDTable.hpp line 45:

> 43:
> 44: // Class unloading support, remove the associations from the tables. Stale jmethodID will
> 45: // not be found and return null

nit typo: please add an ending period to the comment.
Perhaps: s/null/nullptr/

src/hotspot/share/oops/method.cpp line 2063:

> 2061:
> 2062: // jmethodID handling
> 2063: // jmethodIDs are 64-bit integers that will never run out and are mapped in a table

Should we have a `guarantee` or `assert` somewhere that the counter never wraps?

src/hotspot/share/oops/method.cpp line 2065:

> 2063: // jmethodIDs are 64-bit integers that will never run out and are mapped in a table
> 2064: // to their Method and vice versa. If JNI code has access to stale jmethodID, this
> 2065: // wastes no memory but the Method* returned is null

Perhaps: s/null/nullptr/

src/hotspot/share/oops/method.cpp line 2076:

> 2074: assert(jmid != nullptr, "must be created");
> 2075:
> 2076: // Add to growable array in CLD

nit typo: please add an ending period to the comment.

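For readers following the `method.cpp` comments quoted above, here is a minimal sketch of the scheme they describe: IDs are taken from a 64-bit counter, mapped to their `Method*` in a table, and a stale ID resolves to `nullptr` once its entry is removed. This is not HotSpot's implementation. `IdTable`, `make_id`, `resolve`, `change_method` and `remove` are hypothetical names, the `Method` struct is a stand-in, and a mutex-protected `std::unordered_map` is assumed in place of the lock-free CHT mentioned in the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-in for HotSpot's Method; for illustration only.
struct Method {
  const char* name;
};

// Illustrative table (assumption: a plain locked map, not HotSpot's CHT).
// IDs come from a monotonically increasing counter and are never reused,
// so a removed (stale) ID simply fails to resolve.
class IdTable {
 public:
  // Allocate a fresh ID for m. Zero is never handed out.
  uint64_t make_id(Method* m) {
    std::lock_guard<std::mutex> guard(_lock);
    // The kind of check the review asks about: the counter must not wrap.
    assert(_counter != UINT64_MAX && "ID counter must never wrap");
    const uint64_t id = ++_counter;
    _table.emplace(id, m);
    return id;
  }

  // Return the Method for id, or nullptr if the id is unknown or stale.
  Method* resolve(uint64_t id) {
    std::lock_guard<std::mutex> guard(_lock);
    auto it = _table.find(id);
    return it == _table.end() ? nullptr : it->second;
  }

  // Point an existing id at a new Method, as redefinition would.
  bool change_method(uint64_t id, Method* new_method) {
    std::lock_guard<std::mutex> guard(_lock);
    auto it = _table.find(id);
    if (it == _table.end()) {
      return false;
    }
    it->second = new_method;
    return true;
  }

  // Drop the association, as class unloading would.
  bool remove(uint64_t id) {
    std::lock_guard<std::mutex> guard(_lock);
    return _table.erase(id) == 1;
  }

 private:
  std::mutex _lock;
  std::unordered_map<uint64_t, Method*> _table;
  uint64_t _counter = 0;
};
```

Because IDs are never reused in this sketch, native code holding a stale ID can only observe `nullptr`, never a different method.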
src/hotspot/share/oops/method.cpp line 2088:

> 2086: // Get the Method out of the table given the method id.
> 2087: Method* Method::resolve_jmethod_id(jmethodID mid) {
> 2088: assert(mid != nullptr, "JNI method id should not be null");

Perhaps: s/null/nullptr/

src/hotspot/share/oops/method.hpp line 709:

> 707: // Use resolve_jmethod_id() in situations where the caller is expected
> 708: // to provide a valid jmethodID; the only sanity checks are in asserts;
> 709: // result guaranteed not to be null.

Perhaps: s/null/nullptr/

src/hotspot/share/oops/method.hpp line 713:

> 711:
> 712: // Use checked_resolve_jmethod_id() in situations where the caller
> 713: // should provide a valid jmethodID, but might not. Null is returned

Perhaps: s/Null/Nullptr/

src/hotspot/share/prims/jvmtiEnv.cpp line 2773:

> 2771: int skipped = 0; // skip overpass methods
> 2772:
> 2773: // Make jmethodIDs for all non-overpass methods

nit typo: please add an ending period to the comment.

src/hotspot/share/prims/jvmtiEnv.cpp line 2789:

> 2787: }
> 2788: jmethodID id = m->find_jmethod_id_or_null();
> 2789: assert (id != nullptr, "should be created above");

nit spacing: s/assert (/assert(/

src/hotspot/share/runtime/mutexLocker.cpp line 236:

> 234: MUTEX_DEFN(Notification_lock , PaddedMonitor, service); // used for notification thread operations
> 235:
> 236: MUTEX_DEFN(JmethodIdCreation_lock , PaddedMutex , nosafepoint-1); // used for creating jmethodIDs.

Interesting. Why change from `nosafepoint-2` to `nosafepoint-1`?

------------- PR Review: https://git.openjdk.org/jdk/pull/25267#pullrequestreview-2926153300 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2145955175 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2145963559 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2145960811 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2145967055 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2145984911 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2145989746 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2145993097 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2145994055 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146005043 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2145996873 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2145997702 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2145998880 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146002943 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146006426 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146007219 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146007725 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146008289 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146011312 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146013418 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146029756 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146014609 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146018654 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146016370 PR Review Comment: 
https://git.openjdk.org/jdk/pull/25267#discussion_r2146017327 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146029142 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146019950 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146022181 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146024577 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146025893 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146026406 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146026860 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146029258 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146037800 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146031309 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146039166 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146040503 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146045442 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146047075 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146049138 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146050371 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2146052800 From dnsimon at openjdk.org Fri Jun 13 21:01:59 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Fri, 13 Jun 2025 21:01:59 GMT Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for hosted compiled nmethod [v7] In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 16:29:17 GMT, Yudi Zheng wrote: >> Hosted Truffle compilations are installed on the OptimizedCallTarget#profiledPERoot method. 
Any deoptimization contributes to its decompile count, which can easily exceed the PerMethodRecompilationCutoff threshold, permanently preventing highest tier compilation on this method. This PR exempts hosted compilations from this cutoff by ensuring their decompile count is not incremented for hosted compiled nmethods.
>
> Yudi Zheng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision:
>
>  - Merge tag 'jdk-26+2' into JDK-8357424
>
>    Added tag jdk-26+2 for changeset d7aa3498
>  - fix compilation error
>  - address comments
>  - Merge remote-tracking branch 'upstream/master' into JDK-8357424
>  - address comments
>  - address comments
>  - update copyright
>  - [JVMCI] Avoid incrementing decompilation count for hosted compiled nmethod

src/hotspot/share/code/nmethod.cpp line 1951:

> 1949: // Could be gated by ProfileTraps, but do not bother...
> 1950: #if INCLUDE_JVMCI
> 1951: if (is_jvmci_hosted()) {

Someone (like me!) is going to see this code a while from now and try to remember why the decompilation count is not being incremented for JVMCI hosted nmethods. I think it's worth adding a comment.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25356#discussion_r2146084403

From kbarrett at openjdk.org Fri Jun 13 21:14:35 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Fri, 13 Jun 2025 21:14:35 GMT
Subject: RFR: 8359227: Code cache/heap size options should be size_t [v2]
In-Reply-To: <3sJTUKOONMnjRHYKl-M6Dcx4ZDrbbOYPD5DTbgp7UuI=.0d870e8c-fe60-48ed-9f54-aab2786fe77f@github.com>
References: <3sJTUKOONMnjRHYKl-M6Dcx4ZDrbbOYPD5DTbgp7UuI=.0d870e8c-fe60-48ed-9f54-aab2786fe77f@github.com>
Message-ID: <2RxFDNlbqvjX0IGueBW0KIK0J6nRII4D0nIc7TaI1AM=.e2ccb3ec-d0e0-4a05-bac2-9450ff69b539@github.com>

On Fri, 13 Jun 2025 12:29:59 GMT, Manuel Hässig wrote:

>> Kim Barrett has updated the pull request incrementally with two additional commits since the last revision:
>>
>>  - update copyrights
>>  - remove leftover include
>
> src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp line 61:
>
>> 59:
>> 60: #include
>> 61:
>
> Is this related to this change? I cannot find any usages from this header in the diff.

Oops. Leftover from an abandoned idea. Removed now.

> src/hotspot/share/runtime/arguments.cpp line 2451:
>
>> 2449: return JNI_EINVAL;
>> 2450: }
>> 2451: if (FLAG_SET_CMDLINE(ReservedCodeCacheSize, (size_t)long_ReservedCodeCacheSize) != JVMFlag::SUCCESS) {
>
> Is this cast correct on a 32-bit platform, where `size_t` is not the same as `uint64_t`?

This isn't any different from the previous code, as uintx is also a 32/64 bit type on 32/64 bit platforms. And it's fine, as the default for the maximum value for parse_memory_size is max_uintx.

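The point about the cast can be illustrated with a small sketch: a 64-bit parsed value is only narrowed to `size_t` after it has been checked against the platform maximum, so the cast cannot truncate. This is an assumption-laden illustration rather than HotSpot's `parse_memory_size`. The helper name `checked_narrow_to_size_t` is hypothetical, and `std::numeric_limits<size_t>::max()` is assumed to play the role that `max_uintx` plays in the reply above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper, not HotSpot code: accept a parsed 64-bit value only
// if it fits in size_t on the current platform, then narrow it.
inline bool checked_narrow_to_size_t(uint64_t parsed, size_t* out) {
  // Assumption: this limit stands in for max_uintx (32 or 64 bits wide,
  // matching the platform).
  const uint64_t max_size = std::numeric_limits<size_t>::max();
  if (parsed > max_size) {
    return false;  // Would truncate on a 32-bit platform, so reject.
  }
  *out = static_cast<size_t>(parsed);  // Safe: value is known to fit.
  return true;
}
```

On a 64-bit platform every `uint64_t` value passes the check. On a 32-bit platform anything above 4 GiB - 1 is rejected before the cast is reached.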
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25791#discussion_r2146097179 PR Review Comment: https://git.openjdk.org/jdk/pull/25791#discussion_r2146097889 From kvn at openjdk.org Sat Jun 14 00:11:49 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 14 Jun 2025 00:11:49 GMT Subject: RFR: 8358690: Some initialization code asks for AOT cache status way too early [v2] In-Reply-To: References: Message-ID: > Thanks to @shipilev for catching the issue. > > [JDK-8350209](https://bugs.openjdk.org/browse/JDK-8350209) came with the bootstrapping problem by checking the AOT cache status way too early. Before full AOT cache init sequence runs, these checks would always reply that AOT cache is off. This causes initial stubs to never practically restored/dumped. > > This does not affect JDK 25 because [JDK-8357514](https://github.com/openjdk/jdk/commit/8184ce39a8a732352ee841fed09cae905d27643c) switched off AOT stubs generation. > > We can't resolve bootstrap issue as it is because `initial_stubs_init()` is called before `universe_init()` where AOT code cache is created. I looked why it is required (based on comments) that `initial_stubs_init()` be called before `universe_init()`. And I found that we had a special stub during HotSpot development (1997) which was used for Vtable entries population when we run with -Xcomp (or whatever was equivalent back then). We still have reference to it in the comment: [stubRoutines.cpp#L185](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stubRoutines.cpp#L185). > > We don't have that code anymore. I moved `initial_stubs_init()` after `universe_init()` and `AOTCodeCache::init2()`. I added asserts into some initial stubs to check that they are not NULL when used. I ran from hs-tier1 to hs-tier6 + hs-tier10-rt. > > The only issue I found is that `AOTCodeCache::init_early_stubs()` needs to be call separately after `initial_stubs_init()` instead of from `AOTCodeCache::init2()`. 
This solved bootstrap issue. > > I also did some cleanup to match `leyden/premain` branch for easy merges. > > Tested hs-tier1-6, hs-tier1-rt, stress, xcomp Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: - Merge master - remove trailing whitespace - 8358690: Some initialization code asks for AOT cache status way too early ------------- Changes: https://git.openjdk.org/jdk/pull/25763/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25763&range=01 Stats: 172 lines in 14 files changed: 99 ins; 28 del; 45 mod Patch: https://git.openjdk.org/jdk/pull/25763.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25763/head:pull/25763 PR: https://git.openjdk.org/jdk/pull/25763 From kvn at openjdk.org Sat Jun 14 00:11:49 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 14 Jun 2025 00:11:49 GMT Subject: RFR: 8358690: Some initialization code asks for AOT cache status way too early In-Reply-To: References: Message-ID: <8SCnTD5KzGioe8Kpdr2lLGPP9f3eYNWOeZYYzbHBsiE=.824b4e49-4d0d-45e3-a8b9-5e01654a3047@github.com> On Wed, 11 Jun 2025 23:08:44 GMT, Vladimir Kozlov wrote: > Thanks to @shipilev for catching the issue. > > [JDK-8350209](https://bugs.openjdk.org/browse/JDK-8350209) came with the bootstrapping problem by checking the AOT cache status way too early. Before full AOT cache init sequence runs, these checks would always reply that AOT cache is off. This causes initial stubs to never practically restored/dumped. > > This does not affect JDK 25 because [JDK-8357514](https://github.com/openjdk/jdk/commit/8184ce39a8a732352ee841fed09cae905d27643c) switched off AOT stubs generation. > > We can't resolve bootstrap issue as it is because `initial_stubs_init()` is called before `universe_init()` where AOT code cache is created. I looked why it is required (based on comments) that `initial_stubs_init()` be called before `universe_init()`. 
And I found that we had a special stub during HotSpot development (1997) which was used for Vtable entries population when we run with -Xcomp (or whatever was equivalent back then). We still have reference to it in the comment: [stubRoutines.cpp#L185](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stubRoutines.cpp#L185). > > We don't have that code anymore. I moved `initial_stubs_init()` after `universe_init()` and `AOTCodeCache::init2()`. I added asserts into some initial stubs to check that they are not NULL when used. I ran from hs-tier1 to hs-tier6 + hs-tier10-rt. > > The only issue I found is that `AOTCodeCache::init_early_stubs()` needs to be call separately after `initial_stubs_init()` instead of from `AOTCodeCache::init2()`. This solved bootstrap issue. > > I also did some cleanup to match `leyden/premain` branch for easy merges. > > Tested hs-tier1-6, hs-tier1-rt, stress, xcomp I merged master which includes [JDK-8359373](https://bugs.openjdk.org/browse/JDK-8359373) change. I ran the same testing and don't see new failures any more. Good. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25763#issuecomment-2972008430 From syan at openjdk.org Sat Jun 14 00:55:36 2025 From: syan at openjdk.org (SendaoYan) Date: Sat, 14 Jun 2025 00:55:36 GMT Subject: RFR: 8359227: Code cache/heap size options should be size_t [v2] In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 18:46:58 GMT, Kim Barrett wrote: >> Please review this change that makes the various code cache/heap size options >> consistently be of type size_t. >> >> The shared declarations for these options were all uintx. These options all >> may have platform-defined values. Some of those platform-specific definitions >> were uintx, some were size_t, and some were intx(!). This change makes them >> all consistently size_t. >> >> More details in the first comment. 
>> >> Testing: mach5 tier1-6 >> GHA testing in-progress > > Kim Barrett has updated the pull request incrementally with two additional commits since the last revision: > > - update copyrights > - remove leftover include test/lib-test/jdk/test/whitebox/vm_flags/UintxTest.java line 40: > 38: public class UintxTest { > 39: private static final String FLAG_NAME = "VerifyGCStartAt"; > 40: private static final String FLAG_DEBUG_NAME = "StopInterpreterAt"; Hi, Does this change is a mistake. Why do we change the `FLAG_DEBUG_NAME` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25791#discussion_r2146323772 From fjiang at openjdk.org Sat Jun 14 03:30:32 2025 From: fjiang at openjdk.org (Feilong Jiang) Date: Sat, 14 Jun 2025 03:30:32 GMT Subject: RFR: 8359105: RISC-V: No need for acquire fence in safepoint poll during JNI calls [v2] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 03:22:10 GMT, Anjian Wen wrote: >> Acquire fence removal in safepoint_poll >> >> At least in jdk11, when comes to safepoint::end, it will invoke SafepointMechanism::disarm_local_poll to change the polling_word_offset, which may cause a race when thread come to visit polling_word_offset in native_trans state, so we used to use acquire fence. >> >> Since the disarm_local_poll has been removed from SafepointSynchronize::end, Thread disarm itself in >> JavaThread::check_special_condition_for_native_trans when trans from native. it seems that there is no need for acquire fence in safepoint_poll. >> >> [0] https://github.com/openjdk/jdk/pull/20420 > > Anjian Wen has updated the pull request incrementally with one additional commit since the last revision: > > RISC-V: delete the acquire argument in safepoint_poll since there is no use Looks fine. Thanks! ------------- Marked as reviewed by fjiang (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/25709#pullrequestreview-2927057124 From kbarrett at openjdk.org Sat Jun 14 11:10:34 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 14 Jun 2025 11:10:34 GMT Subject: RFR: 8359227: Code cache/heap size options should be size_t [v2] In-Reply-To: References: Message-ID: <3VU2OGqP1oQpC2eKsNRur-VZgalWp9J7lpiUZrk2xB4=.fde060bd-0f96-46ca-b333-c2f463641e8d@github.com> On Sat, 14 Jun 2025 00:53:18 GMT, SendaoYan wrote: >> Kim Barrett has updated the pull request incrementally with two additional commits since the last revision: >> >> - update copyrights >> - remove leftover include > > test/lib-test/jdk/test/whitebox/vm_flags/UintxTest.java line 40: > >> 38: public class UintxTest { >> 39: private static final String FLAG_NAME = "VerifyGCStartAt"; >> 40: private static final String FLAG_DEBUG_NAME = "StopInterpreterAt"; > > Hi, Does this change is a mistake. Why do we change the `FLAG_DEBUG_NAME` Because CodeCacheMinimumUseSpace is no longer uintx-typed, it's now size_t. So the test needs to use some other uintx-typed option. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25791#discussion_r2146426018 From mdoerr at openjdk.org Sat Jun 14 11:10:38 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Sat, 14 Jun 2025 11:10:38 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: <0-Re4fyQGaSyOl-bYm1h9LT5a0TKKrgJCHquooXOIkQ=.d6044248-6188-4705-b564-90fa3d2d7762@github.com> On Fri, 13 Jun 2025 19:18:19 GMT, Dean Long wrote: >> This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. >> >> We used to patch the verified entry point to make sure it was not_entrant. 
The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. >> >> The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. >> >> For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. >> >> This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > remove is_sigill_not_entrant Tests look good on our side. I'm only a bit concerned that the lock may become a bottleneck when many Java threads need to patch all nmethods. Especially with ZGC which does that more often. I think we should check performance. ------------- Marked as reviewed by mdoerr (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25764#pullrequestreview-2928005394 From kvn at openjdk.org Sat Jun 14 18:29:29 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 14 Jun 2025 18:29:29 GMT Subject: RFR: 8358690: Some initialization code asks for AOT cache status way too early In-Reply-To: References: <_16KBkO-TmmTuaWG8l9ekstEdIP9BKsyXip0H0u_NHs=.3b1479e2-a0b7-41b1-a72a-12d217a3d628@github.com> Message-ID: On Thu, 12 Jun 2025 16:58:40 GMT, Andrew Dinn wrote: >>> @vnkozlov Are you suggesting moving the fence stubs to a separate StubGen preinitial blob created before the initial blob? That should be relatively easy to do. >>> >>> If you want me to push that change first in a separate PR I will be happy to do so. >> >> Yes, please. There could be other stubs we need very early which we can exclude from AOTing. May be call it `pre-universe` to be clear. >> >> My only concern is the fence stub is used only by windows-x86. Introducing whole new stubs type for that is overkill IMHO. But on other hand, we may need such type later for other new stubs. Or we find later that some initial stubs still be needed before `universe_init`. > > @vnkozlov I have raised [JDK-8359373](https://bugs.openjdk.org/browse/JDK-8359373) and have a PR in progress. @adinn and @ashu-mehra please review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25763#issuecomment-2972971645 From asmehra at openjdk.org Sat Jun 14 19:56:31 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Sat, 14 Jun 2025 19:56:31 GMT Subject: RFR: 8358690: Some initialization code asks for AOT cache status way too early [v2] In-Reply-To: References: Message-ID: On Sat, 14 Jun 2025 00:11:49 GMT, Vladimir Kozlov wrote: >> Thanks to @shipilev for catching the issue. >> >> [JDK-8350209](https://bugs.openjdk.org/browse/JDK-8350209) came with the bootstrapping problem by checking the AOT cache status way too early. 
Before full AOT cache init sequence runs, these checks would always reply that AOT cache is off. This causes initial stubs to never practically restored/dumped. >> >> This does not affect JDK 25 because [JDK-8357514](https://github.com/openjdk/jdk/commit/8184ce39a8a732352ee841fed09cae905d27643c) switched off AOT stubs generation. >> >> We can't resolve bootstrap issue as it is because `initial_stubs_init()` is called before `universe_init()` where AOT code cache is created. I looked why it is required (based on comments) that `initial_stubs_init()` be called before `universe_init()`. And I found that we had a special stub during HotSpot development (1997) which was used for Vtable entries population when we run with -Xcomp (or whatever was equivalent back then). We still have reference to it in the comment: [stubRoutines.cpp#L185](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stubRoutines.cpp#L185). >> >> We don't have that code anymore. I moved `initial_stubs_init()` after `universe_init()` and `AOTCodeCache::init2()`. I added asserts into some initial stubs to check that they are not NULL when used. I ran from hs-tier1 to hs-tier6 + hs-tier10-rt. >> >> The only issue I found is that `AOTCodeCache::init_early_stubs()` needs to be call separately after `initial_stubs_init()` instead of from `AOTCodeCache::init2()`. This solved bootstrap issue. >> >> I also did some cleanup to match `leyden/premain` branch for easy merges. >> >> Tested hs-tier1-6, hs-tier1-rt, stress, xcomp > > Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - Merge master > - remove trailing whitespace > - 8358690: Some initialization code asks for AOT cache status way too early lgtm ------------- Marked as reviewed by asmehra (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/25763#pullrequestreview-2928751426 From kbarrett at openjdk.org Sun Jun 15 05:21:29 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 15 Jun 2025 05:21:29 GMT Subject: RFR: 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction Message-ID: Please review this change to the HotSpot Style Guide to add discussion of how we prefer to handle initialization and destruction of non-local variables. I propose this is an editorial change, as it just documents current practice rather than suggesting a change to current practice. As such, the normal HotSpot PR process applies. The updated .html file was generated using make update-build-docs. ------------- Commit messages: - static variable inits Changes: https://git.openjdk.org/jdk/pull/25812/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25812&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8319242 Stats: 39 lines in 2 files changed: 39 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25812.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25812/head:pull/25812 PR: https://git.openjdk.org/jdk/pull/25812 From duke at openjdk.org Sun Jun 15 07:14:43 2025 From: duke at openjdk.org (Yuri Gaevsky) Date: Sun, 15 Jun 2025 07:14:43 GMT Subject: RFR: 8324124: RISC-V: implement _vectorizedMismatch intrinsic [v2] In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 17:27:39 GMT, Yuri Gaevsky wrote: >> Hello All, >> >> Please review these changes to enable the __vectorizedMismatch_ intrinsic on RISC-V platform with RVV instructions supported. >> >> Thank you, >> -Yuri Gaevsky >> >> **Correctness checks:** >> hotspot/jtreg/compiler/{intrinsic/c1/c2}/ under QEMU-8.1 with RVV v1.0.0 and -XX:TieredStopAtLevel=1/2/3/4. > > Yuri Gaevsky has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains two commits: > > - Merge master > - 8324124: RISC-V: implement _vectorizedMismatch intrinsic . ------------- PR Comment: https://git.openjdk.org/jdk/pull/17750#issuecomment-2973553140 From thartmann at openjdk.org Sun Jun 15 09:07:42 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Sun, 15 Jun 2025 09:07:42 GMT Subject: [jdk25] Integrated: 8359200: Memory corruption in MStack::push In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 08:02:06 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [ed39e17e](https://github.com/openjdk/jdk/commit/ed39e17e34a2a3fd08a3e54d8d2c309deb99f61a) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Tobias Hartmann on 13 Jun 2025 and was reviewed by Aleksey Shipilev and Vladimir Kozlov. > > Thanks! This pull request has now been integrated. Changeset: 03232d4a Author: Tobias Hartmann URL: https://git.openjdk.org/jdk/commit/03232d4a5d6adc11df3adff8f9b2e9bf5f216b6b Stats: 94 lines in 8 files changed: 73 ins; 14 del; 7 mod 8359200: Memory corruption in MStack::push Reviewed-by: epeter, shade Backport-of: ed39e17e34a2a3fd08a3e54d8d2c309deb99f61a ------------- PR: https://git.openjdk.org/jdk/pull/25792 From kvn at openjdk.org Sun Jun 15 21:10:31 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sun, 15 Jun 2025 21:10:31 GMT Subject: RFR: 8358690: Some initialization code asks for AOT cache status way too early [v2] In-Reply-To: References: Message-ID: On Sat, 14 Jun 2025 19:53:44 GMT, Ashutosh Mehra wrote: >> Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains three commits: >> >> - Merge master >> - remove trailing whitespace >> - 8358690: Some initialization code asks for AOT cache status way too early > > lgtm thank you, @ashu-mehra ------------- PR Comment: https://git.openjdk.org/jdk/pull/25763#issuecomment-2974692921 From eosterlund at openjdk.org Mon Jun 16 00:59:31 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 16 Jun 2025 00:59:31 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 19:18:19 GMT, Dean Long wrote: >> This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. >> >> We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. >> >> The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. >> >> For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. >> >> This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. 
Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > remove is_sigill_not_entrant Thanks for doing this! I have been wanting something like this for a while too and it looks great. I have some comments though... src/hotspot/share/gc/z/zBarrierSetNMethod.cpp line 109: > 107: } > 108: > 109: void ZBarrierSetNMethod::arm_with(nmethod* nm, int value) { I don't usually comment on names, but could we call this guard_with instead? We tried to stop saying "arm" about things used also for disarming and we have (hopefully) been consistent about calling that "guard" instead. src/hotspot/share/gc/z/zBarrierSetNMethod.cpp line 114: > 112: // Preserve the sticky bit > 113: if (is_not_entrant(nm)) { > 114: value |= not_entrant; Is it possible to have a race where another thread sets an nmethod to not entrant and the thread calling this making the nmethod entry barrier not entrant? If this was called to disarm a method and then enter it, it seems a bit sneaky in that case that we pass the nmethod entry barrier even though we under the lock see that it is not entrant. Probably okay but still feels like it might be more robust if the thread setting an nmethod to not entrant is always the one that arms the nmethod entry barrier. 
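The sticky-bit scheme discussed in this review can be illustrated with a small sketch. This is not the HotSpot code: the class name `GuardWord`, the bit position `kNotEntrantBit`, and the use of `std::mutex` are assumptions made purely for illustration. It only shows the idea stated in the PR description, namely that updating the guard value becomes a read-modify-write that has to happen under a lock so the reserved bit is never lost.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for an nmethod's entry barrier guard word.
class GuardWord {
public:
  // Assumed layout: the top bit is reserved as the sticky "not entrant" flag.
  static constexpr std::uint32_t kNotEntrantBit = 0x80000000u;

  // Store a new guard value, preserving the sticky bit if it is already set.
  void guard_with(std::uint32_t value) {
    std::lock_guard<std::mutex> lock(_lock);
    assert((value & kNotEntrantBit) == 0 && "callers must not pass the sticky bit");
    _value = value | (_value & kNotEntrantBit);
  }

  // Set the sticky bit; once set, guard_with() never clears it.
  void make_not_entrant() {
    std::lock_guard<std::mutex> lock(_lock);
    _value |= kNotEntrantBit;
  }

  bool is_not_entrant() const {
    std::lock_guard<std::mutex> lock(_lock);
    return (_value & kNotEntrantBit) != 0;
  }

  std::uint32_t raw() const {
    std::lock_guard<std::mutex> lock(_lock);
    return _value;
  }

private:
  mutable std::mutex _lock;
  std::uint32_t _value = 0;
};
```

In this sketch a thread that "disarms" by calling `guard_with(0)` after another thread called `make_not_entrant()` still leaves the sticky bit set, which is the property the review comment above is probing.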
------------- PR Review: https://git.openjdk.org/jdk/pull/25764#pullrequestreview-2930412450 PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2148888954 PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2148891702 From duke at openjdk.org Mon Jun 16 03:16:33 2025 From: duke at openjdk.org (duke) Date: Mon, 16 Jun 2025 03:16:33 GMT Subject: RFR: 8359105: RISC-V: No need for acquire fence in safepoint poll during JNI calls [v2] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 03:22:10 GMT, Anjian Wen wrote: >> Acquire fence removal in safepoint_poll >> >> At least in jdk11, when comes to safepoint::end, it will invoke SafepointMechanism::disarm_local_poll to change the polling_word_offset, which may cause a race when thread come to visit polling_word_offset in native_trans state, so we used to use acquire fence. >> >> Since the disarm_local_poll has been removed from SafepointSynchronize::end, Thread disarm itself in >> JavaThread::check_special_condition_for_native_trans when trans from native. it seems that there is no need for acquire fence in safepoint_poll. >> >> [0] https://github.com/openjdk/jdk/pull/20420 > > Anjian Wen has updated the pull request incrementally with one additional commit since the last revision: > > RISC-V: delete the acquire argument in safepoint_poll since there is no use @Anjian-Wen Your change (at version 47065b4a3fd1e802aa0738f277fd32f6009aa109) is now ready to be sponsored by a Committer. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25709#issuecomment-2974979012 From wenanjian at openjdk.org Mon Jun 16 03:19:33 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Mon, 16 Jun 2025 03:19:33 GMT Subject: RFR: 8359105: RISC-V: No need for acquire fence in safepoint poll during JNI calls In-Reply-To: <--QF_NPANf_S-X_P3YVoNQ61qGVLr7NrX_ncuWA4kzc=.35b1d5f3-76cf-46a7-8665-3461b74559fc@github.com> References: <--QF_NPANf_S-X_P3YVoNQ61qGVLr7NrX_ncuWA4kzc=.35b1d5f3-76cf-46a7-8665-3461b74559fc@github.com> Message-ID: <9bVW5H5T04rxOdAJ9pmgEFp_KHU7YEexJwOsqYw_K2s=.6a7e1beb-1790-451d-8924-88f6d58e4ea0@github.com> On Thu, 12 Jun 2025 08:31:58 GMT, Robbin Ehn wrote: >>> @dchuyko can you open a new issue and look at the acquire in downcall linker for aarch64 ? >> >> Thanks for the reminder, created JDK-8359252. It was intentional to limit the scope of the original change to JNI (due the amount of testing and usages). > >> > @dchuyko can you open a new issue and look at the acquire in downcall linker for aarch64 ? >> >> Thanks for the reminder, created JDK-8359252. It was intentional to limit the scope of the original change to JNI (due the amount of testing and usages). > > Thank you! @robehn @feilongjiang @RealFYang Thanks for your review and approve ! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25709#issuecomment-2974981694 From iklam at openjdk.org Mon Jun 16 03:34:44 2025 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 16 Jun 2025 03:34:44 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added Message-ID: Background: when writing the string table in the AOT cache, we do this: 1. Find out the number of strings in the interned string table 2. Allocate Java object arrays that are large enough to store these strings. These arrays are used by `StringTable::lookup_shared()` in the production run. 3. Enter safepoint 4. 
Copy the strings into the arrays This bug happened because: - Step 1 is not thread safe, so it may be reading a stale version of `_items_count` - JIT compiler threads may create more interned strings after step 1 This PR attempts to fix both issues. ------------- Commit messages: - 8358680: AOT cache creation fails: no strings should have been added Changes: https://git.openjdk.org/jdk/pull/25816/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25816&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358680 Stats: 66 lines in 8 files changed: 48 ins; 2 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/25816.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25816/head:pull/25816 PR: https://git.openjdk.org/jdk/pull/25816 From wenanjian at openjdk.org Mon Jun 16 03:35:35 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Mon, 16 Jun 2025 03:35:35 GMT Subject: Integrated: 8359105: RISC-V: No need for acquire fence in safepoint poll during JNI calls In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 06:33:34 GMT, Anjian Wen wrote: > Acquire fence removal in safepoint_poll > > At least in jdk11, when comes to safepoint::end, it will invoke SafepointMechanism::disarm_local_poll to change the polling_word_offset, which may cause a race when thread come to visit polling_word_offset in native_trans state, so we used to use acquire fence. > > Since the disarm_local_poll has been removed from SafepointSynchronize::end, Thread disarm itself in > JavaThread::check_special_condition_for_native_trans when trans from native. it seems that there is no need for acquire fence in safepoint_poll. > > [0] https://github.com/openjdk/jdk/pull/20420 This pull request has now been integrated. 
Changeset: 1a01839f Author: Anjian Wen Committer: Feilong Jiang URL: https://git.openjdk.org/jdk/commit/1a01839f8c0522a90710e101cce6ecc479a77529 Stats: 28 lines in 8 files changed: 0 ins; 19 del; 9 mod 8359105: RISC-V: No need for acquire fence in safepoint poll during JNI calls Reviewed-by: rehn, fyang, fjiang ------------- PR: https://git.openjdk.org/jdk/pull/25709 From iklam at openjdk.org Mon Jun 16 03:39:27 2025 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 16 Jun 2025 03:39:27 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added In-Reply-To: References: Message-ID: <_GmbTOsJpAJB566p9d1GBLJexHOh48vi3P2VRiHpEgQ=.ed9a7992-72e9-415f-bb9d-98834b9c45f0@github.com> On Mon, 16 Jun 2025 03:30:40 GMT, Ioi Lam wrote: > Background: when writing the string table in the AOT cache, we do this: > > 1. Find out the number of strings in the interned string table > 2. Allocate Java object arrays that are large enough to store these strings. These arrays are used by `StringTable::lookup_shared()` in the production run. > 3. Enter safepoint > 4. Copy the strings into the arrays > > This bug happened because: > > - Step 1 is not thread safe, so it may be reading a stale version of `_items_count` > - JIT compiler threads may create more interned strings after step 1 > > This PR attempts to fix both issues. @shipilev I copied the part of your [JDK-8352042: [leyden] Parallel precompilation]( https://github.com/openjdk/leyden/commit/92203ef8a1d840a073958abeee3b0494d83a0780) changes from the Leyden repo that implements `CompileBroker::wait_for_no_active_tasks()`. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25816#issuecomment-2975005251 From kvn at openjdk.org Mon Jun 16 05:07:33 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 16 Jun 2025 05:07:33 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 03:30:40 GMT, Ioi Lam wrote: > Background: when writing the string table in the AOT cache, we do this: > > 1. Find out the number of strings in the interned string table > 2. Allocate Java object arrays that are large enough to store these strings. These arrays are used by `StringTable::lookup_shared()` in the production run. > 3. Enter safepoint > 4. Copy the strings into the arrays > > This bug happened because: > > - Step 1 is not thread safe, so it may be reading a stale version of `_items_count` > - JIT compiler threads may create more interned strings after step 1 > > This PR attempts to fix both issues. Can something else add string to table between you get number and safe point? How you avoid that? I assume you can't get number at safepoint because you need to allocate object array not at safepoint. Right? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25816#issuecomment-2975115032 From stuefe at openjdk.org Mon Jun 16 05:09:35 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 16 Jun 2025 05:09:35 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v9] In-Reply-To: <1mV_rk0qtAWW7cX-uBHPB5l9GCfy9QsNC60WdKcflrA=.a93bf6cd-169c-4464-a5e4-d125fbcf4009@github.com> References: <1mV_rk0qtAWW7cX-uBHPB5l9GCfy9QsNC60WdKcflrA=.a93bf6cd-169c-4464-a5e4-d125fbcf4009@github.com> Message-ID: <6exOu6-3FHnBmuakpjSc8XIqEBp85Wo8QshxLYPeH-I=.8474dc51-2956-4b5a-ab9c-125dff55ea18@github.com> On Fri, 13 Jun 2025 07:33:56 GMT, Anton Artemov wrote: > > FWIW, I think that the `ssize_t` was a good first step and the `MemRes` was an experiment that would be interesting to see if how that panned out, but I didn't expect to see it in this PR. It tried to convey that in an earlier comment. I'm leaving it up to the other involved reviewers to decide if we should go with the MemRes change in this PR. > > I understand that. Keeping `ssize_t` would not introduce consistency in the codebase, as on Windows `ssize_t` is `int64_t`, which is a signed type and, as pointed out by @tstuefe, it eats half of the possible range. I am not sure about other platforms, but I suspect that would not be the case there. Therefore I decided to rely on the known things such as `size_t `. But then, if we want to report more than one kind of error, it makes sense to keep the error code separately. What caller really needs multiple error codes? We want to know a memory size, and we may get it or not. If we don't get it, I don't think it matters why we don't get it. I don't see any different actions taken *by the caller of os::xxx()* in reaction to what cog in the machine exactly failed. The end user needs a way to get detail information, but that can be done with UL logging down in the function itself. 
In addition, for very severe errors the function could end the JVM right away itself, with no need to even return to the caller. Example: if /proc is missing. If we don't end the JVM and return false to the caller, all the caller then needs to decide is whether the missing value is important to it. Printing functions would not care beyond writing "no data" or some such. Java heap initialization may want to terminate the JVM early if we cannot e.g. determine the physical_memory() size. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2970579720 From dholmes at openjdk.org Mon Jun 16 05:09:35 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 16 Jun 2025 05:09:35 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v7] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 10:28:02 GMT, Kim Barrett wrote: > Note that we already have a mechanism for propagating errors in HotSpot, via CHECK/TRAPS. @kimbarrett That is not a general error reporting mechanism, it is only for checking for Java exceptions. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2975117148 From iklam at openjdk.org Mon Jun 16 06:11:30 2025 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 16 Jun 2025 06:11:30 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 05:05:12 GMT, Vladimir Kozlov wrote: > Can something else add string to table between you get number and safe point? How you avoid that? I assume you can't get number at safepoint because you need to allocate object array not at safepoint. Right? Not that I am aware of. In testing, I added `precond(THREAD->is_Java_thread())` into `StringTable::intern()` and it never failed. So I assume that this function can only be called by real Java threads or compiler threads.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25816#issuecomment-2975220245 From iklam at openjdk.org Mon Jun 16 06:20:27 2025 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 16 Jun 2025 06:20:27 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 06:08:30 GMT, Ioi Lam wrote: > > Can something else add string to table between you get number and safe point? How you avoid that? I assume you can't get number at safepoint because you need to allocate object array not at safepoint. Right? > > Not that I am aware of. In testing, I added `precond(THREAD->is_Java_thread())` into `StringTable::intern()` and it never failed. So I assume that this function can only be called by real Java threads or compiler threads. Here are all the subclasses of JavaThread:
class AttachListenerThread : public JavaThread {
class CompilerThread : public JavaThread {
class DeoptimizeObjectsALotThread : public JavaThread {
class JvmtiAgentThread : public JavaThread {
class MonitorDeflationThread : public JavaThread {
class NotificationThread : public JavaThread {
class ServiceThread : public JavaThread {
class StringDedupThread : public JavaThread {
class TrainingReplayThread : public JavaThread {
StringDedupThread is disabled when dumping heap objects. The other threads will probably either be inactive or not do anything that results in string interning. I guess someone could use jcmd to connect to the JVM at the wrong time and that might upset the archive assembly process. But that would create far worse problems than the bug that we are trying to fix here.
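The ordering hazard behind this bug (count the strings, size the arrays from that count, then copy later) can be shown with a toy model. The sketch below is single-threaded and entirely hypothetical: `ToyStringTable`, `allocate_archive_array` and `copy_into_archive` are illustrative names, not HotSpot APIs, and it does not reproduce the fix in the PR. It only demonstrates why a count taken before the copy step has to be re-validated or otherwise kept stable.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy table; not HotSpot's StringTable. Not thread-safe:
// the vector is only touched from one thread in this illustration.
struct ToyStringTable {
  std::vector<std::string> items;
  std::atomic<std::size_t> items_count{0};

  void intern(const std::string& s) {
    items.push_back(s);
    items_count.fetch_add(1, std::memory_order_release);
  }
};

// Steps 1 and 2: read the count and size the destination array from it.
inline std::vector<std::string> allocate_archive_array(const ToyStringTable& t) {
  return std::vector<std::string>(t.items_count.load(std::memory_order_acquire));
}

// Step 4: copy, refusing to proceed if strings were added after sizing.
inline bool copy_into_archive(const ToyStringTable& t, std::vector<std::string>& dst) {
  if (t.items.size() != dst.size()) {
    return false;  // corresponds to "no strings should have been added"
  }
  for (std::size_t i = 0; i < dst.size(); ++i) {
    dst[i] = t.items[i];
  }
  return true;
}
```

If `intern()` runs between `allocate_archive_array()` and `copy_into_archive()`, the copy is rejected; in the real VM the question raised in this thread is which threads could still be interning at that point.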
------------- PR Comment: https://git.openjdk.org/jdk/pull/25816#issuecomment-2975238342 From kbarrett at openjdk.org Mon Jun 16 06:29:29 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 16 Jun 2025 06:29:29 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v7] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 05:06:41 GMT, David Holmes wrote: > > Note that we already have a mechanism for propagating errors in HotSpot, via CHECK/TRAPS. > > @kimbarrett That is not a general error reporting mechanism, it is only for checking for Java exceptions. Oh, you are right, I forgot about that. It seems I hardly ever deal with that mechanism, and not at all recently. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2975257136 From mhaessig at openjdk.org Mon Jun 16 07:01:47 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Mon, 16 Jun 2025 07:01:47 GMT Subject: RFR: 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction In-Reply-To: References: Message-ID: <85ChqbwlxUp_QZQrEUsikQsPB0pCeGNsHBQtioMK0ys=.8d420996-041e-4ddb-9304-87178e8ebe35@github.com> On Sun, 15 Jun 2025 05:15:11 GMT, Kim Barrett wrote: > Please review this change to the HotSpot Style Guide to add discussion of how > we prefer to handle initialization and destruction of non-local variables. > > I propose this is an editorial change, as it just documents current practice > rather than suggesting a change to current practice. As such, the normal > HotSpot PR process applies. > > The updated .html file was generated using make update-build-docs. Thank you for working on this. I only have a suggestion that might help people who are less experienced. doc/hotspot-style.html line 802: > 800:

    Avoid variables with static storage duration and non-constant > 801: initialization, or with non-trivial destruction. Such variables can lead > 802: to the so-called "static initialization order fiasco", or its dual on Suggestion: to the so-called "static initialization order fiasco", or its dual on I did not know what this is off the top of my head so a link might be helpful. ------------- PR Review: https://git.openjdk.org/jdk/pull/25812#pullrequestreview-2930824226 PR Review Comment: https://git.openjdk.org/jdk/pull/25812#discussion_r2149168064 From kbarrett at openjdk.org Mon Jun 16 07:09:39 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 16 Jun 2025 07:09:39 GMT Subject: RFR: 8352565: Add native method implementation of Reference.get() [v11] In-Reply-To: References: Message-ID: > Please review this change which adds a native method providing the > implementation of Reference::get. Reference::get is an intrinsic candidate, so > this native method implementation is only used when the intrinsic is not. > > Currently there is intrinsic support by the interpreter, C1, C2, and Graal, > which are always used. With this change we can later remove all the > per-platform interpreter intrinsic implementations, and might also remove the > C1 intrinsic implementation. > > Testing: > (1) mach5 tier1-6 normal (so using all the existing intrinsics). > (2) mach5 tier1-6 with interpreter and C1 Reference::get intrinsics disabled. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains 17 additional commits since the last revision: - Merge branch 'master' into native-reference-get - add pseudo-native entry for Reference.get0 - tidy CallGenerator lookup in Compile ctor - fix comment alignment - Merge branch 'master' into native-reference-get - make private native Reference.get0 the intrinsic - Merge branch 'master' into native-reference-get - Merge branch 'master' into native-reference-get - use new waitForRefProc, some tidying - Merge branch 'master' into native-reference-get - ... and 7 more: https://git.openjdk.org/jdk/compare/594b4b31...877e64ca ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24315/files - new: https://git.openjdk.org/jdk/pull/24315/files/46ba079f..877e64ca Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24315&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24315&range=09-10 Stats: 39280 lines in 872 files changed: 30089 ins; 5915 del; 3276 mod Patch: https://git.openjdk.org/jdk/pull/24315.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24315/head:pull/24315 PR: https://git.openjdk.org/jdk/pull/24315 From duke at openjdk.org Mon Jun 16 07:37:29 2025 From: duke at openjdk.org (Manjunath S Matti.) Date: Mon, 16 Jun 2025 07:37:29 GMT Subject: RFR: 8359114: [s390x] Add z17 detection code [v2] In-Reply-To: <_fiM-Nhm3q5S2hCxa3quxpodBRmeIsCIBcA7AB4Hmcc=.2005b23d-fc05-4821-90b4-cf22a8d2442e@github.com> References: <_fiM-Nhm3q5S2hCxa3quxpodBRmeIsCIBcA7AB4Hmcc=.2005b23d-fc05-4821-90b4-cf22a8d2442e@github.com> Message-ID: <0BqgGbvuiWpEy5_tECYOakpwrVJBCmyOov4OKNpQZis=.3060dfca-de1f-4851-8e22-a691e91e0fda@github.com> On Wed, 11 Jun 2025 13:09:45 GMT, Manjunath S Matti. wrote: >> Add support to detect the new generation of Z machine (z17). > > Manjunath S Matti. has updated the pull request incrementally with one additional commit since the last revision: > > Correct the comments for the bits covered in DW[2] and DW[3]. 
@RealLucy could you please provide the much needed 2nd review? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25718#issuecomment-2975417569 From dholmes at openjdk.org Mon Jun 16 07:54:40 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 16 Jun 2025 07:54:40 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v11] In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 10:14:22 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value type for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static MemRes available_memory(); >> static julong used_memory(); --> static MemRes used_memory(); >> static julong free_memory(); --> static MemRes free_memory(); >> static jlong total_swap_space(); --> static MemRes total_swap_space(); >> static jlong free_swap_space(); --> static MemRes free_swap_space(); >> static julong physical_memory(); --> static MemRes physical_memory(); >> >> >> `MemRes` is a struct containing a pair of values, `size_t value` to carry the return value, `int error` to carry the error if any. Currently, in case of error the latter is set to -1. >> >> The changes are done so that the other parts of the code have minimal impact. >> Tested in GHA and Tiers 1-4. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8357086: Added default val to constructor of MemRes. src/hotspot/share/runtime/os.hpp line 148: > 146: }; > 147: > 148: // struct to return from mem-related functions Can you add some more commentary on this please: what do the fields mean? how are they used? Do we need an assert to check that there is no error if you read the value? It seems most places are not even checking if there may be an error. ?? 
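For readers following the `MemRes` discussion, here is a minimal sketch of what such a result type could look like, reconstructed only from the description quoted above (a `size_t value` plus an `int error`, with -1 on error). The default arguments, the `ok()` and `get()` helpers, and the function `toy_physical_memory` are assumptions added for illustration and are not taken from the PR. The asserting accessor shows one possible answer to the question of whether reading the value should check for an error first.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reconstruction; field names follow the PR description,
// everything else is an assumption.
struct MemRes {
  std::size_t value;  // the memory size, meaningful only when error == 0
  int error;          // 0 on success, -1 on failure (per the PR description)

  explicit MemRes(std::size_t v = 0, int e = 0) : value(v), error(e) {}

  bool ok() const { return error == 0; }

  // Checked accessor: reading the value of a failed result is a bug.
  std::size_t get() const {
    assert(ok() && "MemRes value read without checking for error");
    return value;
  }
};

// Illustrative stand-in for an os:: memory query.
inline MemRes toy_physical_memory(bool simulate_failure) {
  if (simulate_failure) {
    return MemRes(0, -1);
  }
  return MemRes(1024);
}
```

A caller would write `MemRes r = toy_physical_memory(false); if (r.ok()) { use(r.get()); }`. The alternative argued for earlier in the thread, a plain success flag with the size returned through an out-parameter, avoids the extra type at the cost of a less self-describing signature.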
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2149275862 From stefank at openjdk.org Mon Jun 16 08:08:35 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 16 Jun 2025 08:08:35 GMT Subject: RFR: 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction In-Reply-To: References: Message-ID: On Sun, 15 Jun 2025 05:15:11 GMT, Kim Barrett wrote: > Please review this change to the HotSpot Style Guide to add discussion of how > we prefer to handle initialization and destruction of non-local variables. > > I propose this is an editorial change, as it just documents current practice > rather than suggesting a change to current practice. As such, the normal > HotSpot PR process applies. > > The updated .html file was generated using make update-build-docs. doc/hotspot-style.md line 776: > 774: > 775: Avoid variables with static storage duration and non-constant initialization, > 776: or with non-trivial destruction. Such variables can lead to the so-called I wonder about the phrase "non-constant". There are a few places where we set up static objects that are non-constant, but doesn't depend on anything. For example, `ZGC_ONLY(static ZArguments zArguments;) `. Is `non-constant` precise enough to describe what we want to prevent? doc/hotspot-style.md line 777: > 775: Avoid variables with static storage duration and non-constant initialization, > 776: or with non-trivial destruction. Such variables can lead to the so-called > 777: "static initialization order fiasco", or its dual on the destruction size. What does `destruction size` mean here? Or did you intend to write "destruction site"? 
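As background for the "static initialization order fiasco" mentioned in the proposed text, the following sketch contrasts the risky pattern with two common C++ alternatives. It is a generic illustration, not an excerpt from the style guide, and this thread does not say which alternative HotSpot prefers. The names `kDefaultLimit`, `name` and `greeting` are made up for the example.

```cpp
#include <cassert>
#include <string>

// Risky pattern, shown only as a comment: namespace-scope objects with
// dynamic initialization in different translation units have no defined
// relative initialization order.
//   // a.cpp:  std::string g_name = compute_name();
//   // b.cpp:  std::string g_greeting = "hello " + g_name;  // g_name may not exist yet

// Alternative 1: constant initialization, which needs no runtime ordering.
constexpr int kDefaultLimit = 64;

// Alternative 2: construct on first use. The object is created the first
// time name() is called and is intentionally never destroyed, which also
// sidesteps ordering problems on the destruction side.
inline const std::string& name() {
  static const std::string* instance = new std::string("hotspot");
  return *instance;
}

inline std::string greeting() {
  return "hello " + name();
}

// Small usage check for the helpers above.
inline void check_greeting() {
  assert(greeting() == "hello hotspot");
}
```

The deliberately leaked object in `name()` trades a small, bounded allocation for freedom from destruction-order problems; a constant-initialized variable such as `kDefaultLimit` avoids the question entirely.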
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25812#discussion_r2149301958 PR Review Comment: https://git.openjdk.org/jdk/pull/25812#discussion_r2149289222 From kbarrett at openjdk.org Mon Jun 16 08:27:29 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 16 Jun 2025 08:27:29 GMT Subject: RFR: 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 07:58:27 GMT, Stefan Karlsson wrote: >> Please review this change to the HotSpot Style Guide to add discussion of how >> we prefer to handle initialization and destruction of non-local variables. >> >> I propose this is an editorial change, as it just documents current practice >> rather than suggesting a change to current practice. As such, the normal >> HotSpot PR process applies. >> >> The updated .html file was generated using make update-build-docs. > > doc/hotspot-style.md line 777: > >> 775: Avoid variables with static storage duration and non-constant initialization, >> 776: or with non-trivial destruction. Such variables can lead to the so-called >> 777: "static initialization order fiasco", or its dual on the destruction size. > > What does `destruction size` mean here? Or did you intend to write "destruction site"? Oops, I meant s/site/side/. Will fix. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25812#discussion_r2149339481 From amitkumar at openjdk.org Mon Jun 16 08:29:34 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 16 Jun 2025 08:29:34 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 19:18:19 GMT, Dean Long wrote: >> This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. 
>> >> We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. >> >> The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. >> >> For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. >> >> This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. 
> > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > remove is_sigill_not_entrant Just FYI, s390 build is broken with this change: # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (/home/amit/jdk/src/hotspot/share/gc/shared/barrierSetNMethod.cpp:196), pid=1779086, tid=1779117 # assert(!nm->is_osr_method() || may_enter) failed: OSR nmethods should always be entrant after migration # # JRE version: OpenJDK Runtime Environment (26.0) (fastdebug build 26-internal-adhoc.amit.jdk) # Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-adhoc.amit.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-s390x) # Problematic frame: # V [libjvm.so+0x40b196] BarrierSetNMethod::nmethod_stub_entry_barrier(unsigned char**)+0x15e # # Core dump will be written. Default location: Core dumps may be processed with "/lib/systemd/systemd-coredump %P %u %g %s %t 9223372036854775808 %h %d" (or dumping to /home/amit/jdk/make/core.1779086) # # If you would like to submit a bug report, please visit: # https://bugreport.java.com/bugreport/crash.jsp # stack trace: Stack: [0x000003ff9e580000,0x000003ff9e680000], sp=0x000003ff9e67b068, free space=1004k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x40b196] BarrierSetNMethod::nmethod_stub_entry_barrier(unsigned char**)+0x15e (barrierSetNMethod.cpp:196) v ~StubRoutines::method_entry_barrier 0x000003ff9050cd18 J 282% c2 sun.nio.fs.UnixPath.initOffsets()V java.base (189 bytes) @ 0x000003ff90c4f0c8 [0x000003ff90c4f080+0x0000000000000048] j sun.nio.fs.UnixPath.getFileName()Lsun/nio/fs/UnixPath;+1 java.base j sun.nio.fs.UnixFileSystemProvider.isHidden(Ljava/nio/file/Path;)Z+6 java.base j java.nio.file.Files.isHidden(Ljava/nio/file/Path;)Z+5 java.base j jdk.internal.module.ModulePath.isHidden(Ljava/nio/file/Path;)Z+1 java.base j 
jdk.internal.module.ModulePath.lambda$explodedPackages$0(Ljava/nio/file/Path;Ljava/nio/file/attribute/BasicFileAttributes;)Z+11 java.base j jdk.internal.module.ModulePath$$Lambda+0x00000000a105cbe0.test(Ljava/lang/Object;Ljava/lang/Object;)Z+12 java.base j java.nio.file.Files.lambda$find$0(Ljava/util/function/BiPredicate;Ljava/nio/file/FileTreeWalker$Event;)Z+9 java.base j java.nio.file.Files$$Lambda+0x00000000a10646c0.test(Ljava/lang/Object;)Z+8 java.base .... ------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-2975564797 From sroy at openjdk.org Mon Jun 16 08:33:36 2025 From: sroy at openjdk.org (Suchismith Roy) Date: Mon, 16 Jun 2025 08:33:36 GMT Subject: Integrated: JDK-8348574 : Simplify c1/c2_globals inclusions In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 08:32:56 GMT, Suchismith Roy wrote: > JBS Issue : [JDK-8348574](https://bugs.openjdk.org/browse/JDK-8348574) > > c1_globals.hpp includes c1_globals_pd.hpp. c1_globals_pd.hpp includes the corresponding CPU_HEADER and OS_HEADER files. All of the c1_globals_.hpp files are essentially identical and basically empty. (They just include globalDefinitions.hpp and macros.hpp, and provide nothing additional.) > > This could be simplified by having c1_globals.hpp do the CPU_HEADER inclusion directly, and remove c1_globals_pd.hpp and all c1_globals_.hpp files. > > Even if there are some non-vacuous c1_globals_.hpp files in the future, c1_globals_pd.hpp seems unwarranted; just add the OS_HEADER include directly in c1_globals.hpp. The c1_globals_pd.hpp files really don't seem worth the extra indirection. > > Similarly for c2_globals.hpp &etc. This pull request has now been integrated. 
Changeset: 79497ef7 Author: Suchismith Roy Committer: Varada M URL: https://git.openjdk.org/jdk/commit/79497ef7f55ef445b31348ae9d3d6dff6d3b6a54 Stats: 366 lines in 13 files changed: 2 ins; 360 del; 4 mod 8348574: Simplify c1/c2_globals inclusions Reviewed-by: mhaessig, kbarrett ------------- PR: https://git.openjdk.org/jdk/pull/25773 From chagedorn at openjdk.org Mon Jun 16 08:51:35 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Mon, 16 Jun 2025 08:51:35 GMT Subject: [jdk25] RFR: 8359327: Incorrect AVX3Threshold results into code buffer overflows on APX targets In-Reply-To: <9WHKq0JzPpVQmht3CYSh-l_PPX40wqCW5enTK-jol00=.ad7f76d7-4ed2-4495-aba1-649f8100349e@github.com> References: <9WHKq0JzPpVQmht3CYSh-l_PPX40wqCW5enTK-jol00=.ad7f76d7-4ed2-4495-aba1-649f8100349e@github.com> Message-ID: <-Ow93kLjSUWBdZkmd72N0Q1b2RoIM3jIyJC9Vb9MOHw=.fb0d1eba-c454-4b39-9437-3cddad542472@github.com> On Fri, 13 Jun 2025 09:45:40 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [e7f63ba3](https://github.com/openjdk/jdk/commit/e7f63ba3109adf614cee1bc392cfeef85e9ca778) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Jatin Bhateja on 13 Jun 2025 and was reviewed by Sandhya Viswanathan. > > Thanks! Looks good! ------------- Marked as reviewed by chagedorn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25796#pullrequestreview-2931169038 From thartmann at openjdk.org Mon Jun 16 08:51:35 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 16 Jun 2025 08:51:35 GMT Subject: [jdk25] RFR: 8359327: Incorrect AVX3Threshold results into code buffer overflows on APX targets In-Reply-To: <9WHKq0JzPpVQmht3CYSh-l_PPX40wqCW5enTK-jol00=.ad7f76d7-4ed2-4495-aba1-649f8100349e@github.com> References: <9WHKq0JzPpVQmht3CYSh-l_PPX40wqCW5enTK-jol00=.ad7f76d7-4ed2-4495-aba1-649f8100349e@github.com> Message-ID: On Fri, 13 Jun 2025 09:45:40 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [e7f63ba3](https://github.com/openjdk/jdk/commit/e7f63ba3109adf614cee1bc392cfeef85e9ca778) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Jatin Bhateja on 13 Jun 2025 and was reviewed by Sandhya Viswanathan. > > Thanks! Thanks Christian! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25796#issuecomment-2975636913 From thartmann at openjdk.org Mon Jun 16 08:51:36 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 16 Jun 2025 08:51:36 GMT Subject: [jdk25] Integrated: 8359327: Incorrect AVX3Threshold results into code buffer overflows on APX targets In-Reply-To: <9WHKq0JzPpVQmht3CYSh-l_PPX40wqCW5enTK-jol00=.ad7f76d7-4ed2-4495-aba1-649f8100349e@github.com> References: <9WHKq0JzPpVQmht3CYSh-l_PPX40wqCW5enTK-jol00=.ad7f76d7-4ed2-4495-aba1-649f8100349e@github.com> Message-ID: <-B8UrbHUHUMap7HQ5jnqHCMlXk1acRmLz1TerWKSOHM=.aa40e410-70f5-4a1d-8f1a-1e8290e5cbcb@github.com> On Fri, 13 Jun 2025 09:45:40 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [e7f63ba3](https://github.com/openjdk/jdk/commit/e7f63ba3109adf614cee1bc392cfeef85e9ca778) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. 
> > The commit being backported was authored by Jatin Bhateja on 13 Jun 2025 and was reviewed by Sandhya Viswanathan. > > Thanks! This pull request has now been integrated. Changeset: 2a329457 Author: Tobias Hartmann URL: https://git.openjdk.org/jdk/commit/2a3294571a809a783b474cde5d344447e2981109 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod 8359327: Incorrect AVX3Threshold results into code buffer overflows on APX targets Reviewed-by: chagedorn Backport-of: e7f63ba3109adf614cee1bc392cfeef85e9ca778 ------------- PR: https://git.openjdk.org/jdk/pull/25796 From duke at openjdk.org Mon Jun 16 08:58:33 2025 From: duke at openjdk.org (Anton Artemov) Date: Mon, 16 Jun 2025 08:58:33 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v11] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 07:51:28 GMT, David Holmes wrote: > Can you add some more commentary on this please: what do the fields mean? how are they used? Do we need an assert to check that there is no error if you read the value? The `value` field transfers a value returned by a call to memory functions (or computed by some other logic, see for instance the container case). The `error` field is introduced in order to signal an error which happened during a call to a memory function. Currently, the presence of error is encoded as -1 in that field, and the absence of error is 0. One could also do it with a boolean value. If later one wants to distinguish between different types of error, more values can be used. Having these two values separately allows to have the full range of `size_t` in the output of all functions of interest, and be able to signal if something went wrong at the same time. > It seems most places are not even checking if there may be an error. ?? This is correct, the purpose of this PR is not to enhance/improve handling of errors, but to address the issue of memory functions having different j* types where C++ doesn't interface with Java code. 
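A minimal sketch of the two-field result type described above. Only the names `MemRes`, `value`, and `error` come from the description; the helper `ok()`, the stand-in function `query_memory`, and the choice of `0` as the default error value are assumptions made for illustration, and the real definition in the PR may differ.

```cpp
#include <cstddef>

// Hypothetical sketch of the result type; not the PR's actual definition.
struct MemRes {
  std::size_t value;  // memory size in bytes, meaningful only when error == 0
  int         error;  // 0 = no error, -1 = the underlying query failed

  // Defaulting `error` to 0 is an assumption made for this sketch.
  explicit MemRes(std::size_t v, int e = 0) : value(v), error(e) {}

  // Illustrative helper, not part of the described struct.
  bool ok() const { return error == 0; }
};

// Hypothetical stand-in for a query such as os::physical_memory().
inline MemRes query_memory(bool simulate_failure) {
  if (simulate_failure) {
    return MemRes(0, -1);  // signal failure; `value` must not be trusted
  }
  return MemRes(static_cast<std::size_t>(16) * 1024 * 1024);  // 16 MiB
}
```

A caller that cares about failures would test `error` (or a helper like `ok()`) before reading `value`, which is the kind of check the review question about an assert is asking for.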
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25450#discussion_r2149406896 From adinn at openjdk.org Mon Jun 16 09:03:32 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Mon, 16 Jun 2025 09:03:32 GMT Subject: RFR: 8358690: Some initialization code asks for AOT cache status way too early [v2] In-Reply-To: References: Message-ID: On Sat, 14 Jun 2025 00:11:49 GMT, Vladimir Kozlov wrote: >> Thanks to @shipilev for catching the issue. >> >> [JDK-8350209](https://bugs.openjdk.org/browse/JDK-8350209) came with the bootstrapping problem by checking the AOT cache status way too early. Before full AOT cache init sequence runs, these checks would always reply that AOT cache is off. This causes initial stubs to never practically restored/dumped. >> >> This does not affect JDK 25 because [JDK-8357514](https://github.com/openjdk/jdk/commit/8184ce39a8a732352ee841fed09cae905d27643c) switched off AOT stubs generation. >> >> We can't resolve bootstrap issue as it is because `initial_stubs_init()` is called before `universe_init()` where AOT code cache is created. I looked why it is required (based on comments) that `initial_stubs_init()` be called before `universe_init()`. And I found that we had a special stub during HotSpot development (1997) which was used for Vtable entries population when we run with -Xcomp (or whatever was equivalent back then). We still have reference to it in the comment: [stubRoutines.cpp#L185](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stubRoutines.cpp#L185). >> >> We don't have that code anymore. I moved `initial_stubs_init()` after `universe_init()` and `AOTCodeCache::init2()`. I added asserts into some initial stubs to check that they are not NULL when used. I ran from hs-tier1 to hs-tier6 + hs-tier10-rt. >> >> The only issue I found is that `AOTCodeCache::init_early_stubs()` needs to be call separately after `initial_stubs_init()` instead of from `AOTCodeCache::init2()`. This solved bootstrap issue. 
>> >> I also did some cleanup to match `leyden/premain` branch for easy merges. >> >> Tested hs-tier1-6, hs-tier1-rt, stress, xcomp > > Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - Merge master > - remove trailing whitespace > - 8358690: Some initialization code asks for AOT cache status way too early Looks good ------------- Marked as reviewed by adinn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25763#pullrequestreview-2931221435 From duke at openjdk.org Mon Jun 16 09:08:35 2025 From: duke at openjdk.org (Anton Artemov) Date: Mon, 16 Jun 2025 09:08:35 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v11] In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 10:14:22 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value type for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static MemRes available_memory(); >> static julong used_memory(); --> static MemRes used_memory(); >> static julong free_memory(); --> static MemRes free_memory(); >> static jlong total_swap_space(); --> static MemRes total_swap_space(); >> static jlong free_swap_space(); --> static MemRes free_swap_space(); >> static julong physical_memory(); --> static MemRes physical_memory(); >> >> >> `MemRes` is a struct containing a pair of values, `size_t value` to carry the return value, `int error` to carry the error if any. Currently, in case of error the latter is set to -1. >> >> The changes are done so that the other parts of the code have minimal impact. >> Tested in GHA and Tiers 1-4. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8357086: Added default val to constructor of MemRes. > What caller really needs multiple error codes? We want to know a memory size, and we may get it or not. 
If we don't get it, I don't think it matters why we don't get it. I don't see any different actions taken _by the caller of os::xxx()_ in reaction to what cog in the machine exactly failed. The end user needs a way to get detail information, but that can be done with UL logging down in the function itself. In addition, for very severe errors the function could end the JVM right away themselves, no need to even return to the caller. Example: if /proc is missing. If we don't really need multiple error codes, then having a struct is redundant, as the **only** error value can be encoded as something like `std::numeric_limits::max`. I think it is safe to assume that this value will never be returned as a real memory size on any platform. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2975696887 From mli at openjdk.org Mon Jun 16 10:23:39 2025 From: mli at openjdk.org (Hamlin Li) Date: Mon, 16 Jun 2025 10:23:39 GMT Subject: RFR: 8355698: JDK not supporting sleef could cause exception at runtime after JDK-8353786 In-Reply-To: <3qrRGcALYWJvERlVlpAkt0BOaYSmc27VpWLACT2NuBo=.de92478d-5bf7-4c83-8b91-2729ac531134@github.com> References: <3qrRGcALYWJvERlVlpAkt0BOaYSmc27VpWLACT2NuBo=.de92478d-5bf7-4c83-8b91-2729ac531134@github.com> Message-ID: On Mon, 28 Apr 2025 10:34:49 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch? > > Before [JDK-8353786](https://bugs.openjdk.org/browse/JDK-8353786), when a released jdk not supporting sleef (for any reason, e.g. low gcc version, intrinsic not supported, rvv not supported, and so on) runs on machine support vector operation (e.g. on riscv, it supports rvv), it can not call into sleef, but will not fail either, it falls back to java scalar version implementation. > But after [JDK-8353786](https://bugs.openjdk.org/browse/JDK-8353786), it will cause an exception thrown at runtime. > > This changes the behaviour of existing jdk, and it should not throw exception anyway.
> > @iwanowww @RealFYang > > Thanks! I'll close this pr, as it seems to us it's also helpful to push the jdk vendors to support sleef when they release. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24914#issuecomment-2975981848 From mli at openjdk.org Mon Jun 16 10:23:40 2025 From: mli at openjdk.org (Hamlin Li) Date: Mon, 16 Jun 2025 10:23:40 GMT Subject: Withdrawn: 8355698: JDK not supporting sleef could cause exception at runtime after JDK-8353786 In-Reply-To: <3qrRGcALYWJvERlVlpAkt0BOaYSmc27VpWLACT2NuBo=.de92478d-5bf7-4c83-8b91-2729ac531134@github.com> References: <3qrRGcALYWJvERlVlpAkt0BOaYSmc27VpWLACT2NuBo=.de92478d-5bf7-4c83-8b91-2729ac531134@github.com> Message-ID: On Mon, 28 Apr 2025 10:34:49 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch? > > Before [JDK-8353786](https://bugs.openjdk.org/browse/JDK-8353786), when a released jdk not supporting sleef (for any reason, e.g. low gcc version, intrinsic not supported, rvv not supported, and so on) runs on machine support vector operation (e.g. on riscv, it supports rvv), it can not call into sleef, but will not fail either, it falls back to java scalar version implementation. > But after [JDK-8353786](https://bugs.openjdk.org/browse/JDK-8353786), it will cause an exception thrown at runtime. > > This changes the behaviour of existing jdk, and it should not throw exception anyway. > > @iwanowww @RealFYang > > Thanks! This pull request has been closed without being integrated.
------------- PR: https://git.openjdk.org/jdk/pull/24914 From mli at openjdk.org Mon Jun 16 10:59:40 2025 From: mli at openjdk.org (Hamlin Li) Date: Mon, 16 Jun 2025 10:59:40 GMT Subject: [jdk25] RFR: 8358892: RISC-V: jvm crash when running dacapo sunflow after JDK-8352504 Message-ID: 8358892: RISC-V: jvm crash when running dacapo sunflow after JDK-8352504 ------------- Commit messages: - Backport 9d060574e5dbd13e634f00d749d0108ceff1fae8 Changes: https://git.openjdk.org/jdk/pull/25827/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25827&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358892 Stats: 1093 lines in 4 files changed: 1059 ins; 20 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/25827.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25827/head:pull/25827 PR: https://git.openjdk.org/jdk/pull/25827 From shade at openjdk.org Mon Jun 16 11:00:35 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 16 Jun 2025 11:00:35 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added In-Reply-To: <_GmbTOsJpAJB566p9d1GBLJexHOh48vi3P2VRiHpEgQ=.ed9a7992-72e9-415f-bb9d-98834b9c45f0@github.com> References: <_GmbTOsJpAJB566p9d1GBLJexHOh48vi3P2VRiHpEgQ=.ed9a7992-72e9-415f-bb9d-98834b9c45f0@github.com> Message-ID: <0Z4Tau84L3wXX7ZSGPn9WQal0-bf21fWq7RWxwBqlgY=.9d05ba45-5e63-4f84-8063-58f540a81ea6@github.com> On Mon, 16 Jun 2025 03:37:04 GMT, Ioi Lam wrote: > @shipilev I copied the part of your [JDK-8352042: [leyden] Parallel precompilation](https://github.com/openjdk/leyden/commit/92203ef8a1d840a073958abeee3b0494d83a0780) changes from the Leyden repo that implements `CompileBroker::wait_for_no_active_tasks()`. Well, I am rewriting that part in https://github.com/openjdk/leyden/pull/79. I can give you a hunk on top of this PR that would do the same thing: use the `CompileTaskWait_lock` instead. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25816#issuecomment-2976093847 From fyang at openjdk.org Mon Jun 16 11:04:28 2025 From: fyang at openjdk.org (Fei Yang) Date: Mon, 16 Jun 2025 11:04:28 GMT Subject: [jdk25] RFR: 8358892: RISC-V: jvm crash when running dacapo sunflow after JDK-8352504 In-Reply-To: References: Message-ID: <8O4slndcDLw1l3h1XohG3KAYXRmqJl8b2SP4ho_akdQ=.8c29ced4-0095-473c-809e-01361d630ec7@github.com> On Mon, 16 Jun 2025 10:55:07 GMT, Hamlin Li wrote: > 8358892: RISC-V: jvm crash when running dacapo sunflow after JDK-8352504 Backport looks good. Thanks. ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25827#pullrequestreview-2931642506 From mli at openjdk.org Mon Jun 16 11:21:33 2025 From: mli at openjdk.org (Hamlin Li) Date: Mon, 16 Jun 2025 11:21:33 GMT Subject: [jdk25] Integrated: 8358892: RISC-V: jvm crash when running dacapo sunflow after JDK-8352504 In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 10:55:07 GMT, Hamlin Li wrote: > 8358892: RISC-V: jvm crash when running dacapo sunflow after JDK-8352504 This pull request has now been integrated. 
Changeset: d870a488 Author: Hamlin Li URL: https://git.openjdk.org/jdk/commit/d870a4888068238b3bc1fa655aed84d23aa6bb4d Stats: 1093 lines in 4 files changed: 1059 ins; 20 del; 14 mod 8358892: RISC-V: jvm crash when running dacapo sunflow after JDK-8352504 8359045: RISC-V: construct test to verify invocation of C2_MacroAssembler::enc_cmove_cmp_fp => BoolTest::ge/gt Reviewed-by: fyang Backport-of: 9d060574e5dbd13e634f00d749d0108ceff1fae8 ------------- PR: https://git.openjdk.org/jdk/pull/25827 From kbarrett at openjdk.org Mon Jun 16 12:04:12 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 16 Jun 2025 12:04:12 GMT Subject: RFR: 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction [v2] In-Reply-To: References: Message-ID: > Please review this change to the HotSpot Style Guide to add discussion of how > we prefer to handle initialization and destruction of non-local variables. > > I propose this is an editorial change, as it just documents current practice > rather than suggesting a change to current practice. As such, the normal > HotSpot PR process applies. > > The updated .html file was generated using make update-build-docs. 
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: better terminology, merge separate sections ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25812/files - new: https://git.openjdk.org/jdk/pull/25812/files/5e15166a..878cbfbd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25812&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25812&range=00-01 Stats: 39 lines in 2 files changed: 17 ins; 13 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/25812.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25812/head:pull/25812 PR: https://git.openjdk.org/jdk/pull/25812 From kbarrett at openjdk.org Mon Jun 16 12:04:12 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 16 Jun 2025 12:04:12 GMT Subject: RFR: 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction [v2] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 08:05:16 GMT, Stefan Karlsson wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> better terminology, merge separate sections > > doc/hotspot-style.md line 776: > >> 774: >> 775: Avoid variables with static storage duration and non-constant initialization, >> 776: or with non-trivial destruction. Such variables can lead to the so-called > > I wonder about the phrase "non-constant". There are a few places where we set up static objects that are non-constant, but doesn't depend on anything. For example, `ZGC_ONLY(static ZArguments zArguments;) > `. Is `non-constant` precise enough to describe what we want to prevent? The term I was looking for but didn't find earlier is "dynamic initialization" (C++14 3.6.2). (I should have remembered the terminology, since it came up in the context of `thread_local`.) I've revised the text accordingly. 
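To illustrate the terminology, here is a hedged C++14 sketch contrasting constant initialization with dynamic initialization of variables with static storage duration. All names are hypothetical and none are taken from HotSpot; the explicit `limit_init()` function is one possible way to avoid dynamic initialization, not necessarily the wording the guide settles on.

```cpp
#include <cstddef>

// Constant initialization: the initializer is a constant expression, so no
// code runs at program startup and there is no ordering dependency.
static const std::size_t kTableSize = 64;

// A mutable (non-const) object can still be constant-initialized when its
// constructor is constexpr; it is also trivially destructible.
struct Counter {
  int count;
  constexpr Counter() : count(0) {}
};
static Counter g_counter;

// A non-constexpr function: using it directly as an initializer, as in
//   static std::size_t g_limit = compute_limit();
// would require dynamic initialization, the pattern being discouraged.
inline std::size_t compute_limit() { return kTableSize * 2; }

// Alternative used in this sketch: leave the variable zero-initialized and
// assign it from an explicit init function called at a well-defined point.
static std::size_t g_limit;
inline void limit_init() { g_limit = compute_limit(); }
```

The `Counter` case mirrors the reviewer's point: an object does not have to be `const` to avoid dynamic initialization, which is why "non-constant" was not precise enough.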
I also (re)discovered that there was already some discussion of this topic later in the guide, in the "Excluded Features" section. That section also had some terminology problems, and was kind of misplaced. I've merged that into this new section, hopefully overall improving things. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25812#discussion_r2149809405 From azafari at openjdk.org Mon Jun 16 12:06:31 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 16 Jun 2025 12:06:31 GMT Subject: RFR: 8337217: Port VirtualMemoryTracker to use VMATree [v40] In-Reply-To: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> References: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> Message-ID: > - `VMATree` is used instead of `SortedLinkList` in new class `VirtualMemoryTracker`. > - A wrapper/helper `RegionTree` is made around VMATree to make some calls easier. > - `find_reserved_region()` is used in 4 cases, it will be removed in further PRs. > - All tier1 tests pass except this https://bugs.openjdk.org/browse/JDK-8335167. Afshin Zafari has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 86 commits: - changes after merge - Merge remote-tracking branch 'origin/master' into _8337217_nmt_VMT_with_tree - fixes after merge with master. - Merge remote-tracking branch 'origin/master' into _8337217_nmt_VMT_with_tree - Merge remote-tracking branch 'origin/master' into _8337217_nmt_VMT_with_tree - more reviews. - review comments applied - test cases for doing reserve or commit the same region twice. - style, some cleanup, VMT and regionsTree circular dep resolved - removed UseFlagInPlace test. - ... 
and 76 more: https://git.openjdk.org/jdk/compare/3a188726...e303ee7c ------------- Changes: https://git.openjdk.org/jdk/pull/20425/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20425&range=39 Stats: 1426 lines in 26 files changed: 555 ins; 557 del; 314 mod Patch: https://git.openjdk.org/jdk/pull/20425.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20425/head:pull/20425 PR: https://git.openjdk.org/jdk/pull/20425 From stefank at openjdk.org Mon Jun 16 12:19:29 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 16 Jun 2025 12:19:29 GMT Subject: RFR: 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction [v2] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 12:04:12 GMT, Kim Barrett wrote: >> Please review this change to the HotSpot Style Guide to add discussion of how >> we prefer to handle initialization and destruction of non-local variables. >> >> I propose this is an editorial change, as it just documents current practice >> rather than suggesting a change to current practice. As such, the normal >> HotSpot PR process applies. >> >> The updated .html file was generated using make update-build-docs. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > better terminology, merge separate sections Marked as reviewed by stefank (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25812#pullrequestreview-2931910725 From mdoerr at openjdk.org Mon Jun 16 13:59:29 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 16 Jun 2025 13:59:29 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 08:26:38 GMT, Amit Kumar wrote: >> Dean Long has updated the pull request incrementally with one additional commit since the last revision: >> >> remove is_sigill_not_entrant > > Just FYI, s390 build is broken with this change: > > > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/home/amit/jdk/src/hotspot/share/gc/shared/barrierSetNMethod.cpp:196), pid=1779086, tid=1779117 > # assert(!nm->is_osr_method() || may_enter) failed: OSR nmethods should always be entrant after migration > # > # JRE version: OpenJDK Runtime Environment (26.0) (fastdebug build 26-internal-adhoc.amit.jdk) > # Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-adhoc.amit.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-s390x) > # Problematic frame: > # V [libjvm.so+0x40b196] BarrierSetNMethod::nmethod_stub_entry_barrier(unsigned char**)+0x15e > # > # Core dump will be written. 
Default location: Core dumps may be processed with "/lib/systemd/systemd-coredump %P %u %g %s %t 9223372036854775808 %h %d" (or dumping to /home/amit/jdk/make/core.1779086) > # > # If you would like to submit a bug report, please visit: > # https://bugreport.java.com/bugreport/crash.jsp > # > > > stack trace: > > Stack: [0x000003ff9e580000,0x000003ff9e680000], sp=0x000003ff9e67b068, free space=1004k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x40b196] BarrierSetNMethod::nmethod_stub_entry_barrier(unsigned char**)+0x15e (barrierSetNMethod.cpp:196) > v ~StubRoutines::method_entry_barrier 0x000003ff9050cd18 > J 282% c2 sun.nio.fs.UnixPath.initOffsets()V java.base (189 bytes) @ 0x000003ff90c4f0c8 [0x000003ff90c4f080+0x0000000000000048] > j sun.nio.fs.UnixPath.getFileName()Lsun/nio/fs/UnixPath;+1 java.base > j sun.nio.fs.UnixFileSystemProvider.isHidden(Ljava/nio/file/Path;)Z+6 java.base > j java.nio.file.Files.isHidden(Ljava/nio/file/Path;)Z+5 java.base > j jdk.internal.module.ModulePath.isHidden(Ljava/nio/file/Path;)Z+1 java.base > j jdk.internal.module.ModulePath.lambda$explodedPackages$0(Ljava/nio/file/Path;Ljava/nio/file/attribute/BasicFileAttributes;)Z+11 java.base > j jdk.internal.module.ModulePath$$Lambda+0x00000000a105cbe0.test(Ljava/lang/Object;Ljava/lang/Object;)Z+12 java.base > j java.nio.file.Files.lambda$find$0(Ljava/util/function/BiPredicate;Ljava/nio/file/FileTreeWalker$Event;)Z+9 java.base > j java.nio.file.Files$$Lambda+0x00000000a10646c0.test(Ljava/lang/Object;)Z+8 java.base > .... @offamitkumar: The problem is probably the initialization to -1: [`z_cfi(Z_R0_scratch, /* to be patched */ -1);`.](https://github.com/openjdk/jdk/blob/9d060574e5dbd13e634f00d749d0108ceff1fae8/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp#L183) Should be 0. 
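A small sketch of why an initial guard value of -1 is suspicious once one bit of the guard is reserved. The bit position used here (the top bit of a 32-bit word) and all names are assumptions for illustration and are not taken from the PR; the point holds for any reserved bit, because -1 has every bit set.

```cpp
#include <cstdint>

// Hypothetical layout: the top bit of the guard word is the reserved
// "sticky" bit that marks an nmethod as permanently not entrant.
constexpr std::uint32_t kStickyBit = 1u << 31;

// True if the guard word has the reserved bit set.
inline bool has_sticky_bit(std::uint32_t guard) {
  return (guard & kStickyBit) != 0;
}

// Update the guard while preserving an already-set sticky bit, treating the
// guard like a bitfield.
inline std::uint32_t set_guard(std::uint32_t old_guard, std::uint32_t new_value) {
  const std::uint32_t sticky = old_guard & kStickyBit;
  return sticky | (new_value & ~kStickyBit);
}
```

With this layout, a guard emitted as -1 already reads as "not entrant" before anything has armed it, whereas 0 does not.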
------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-2976308103 From mdoerr at openjdk.org Mon Jun 16 13:59:30 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 16 Jun 2025 13:59:30 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 19:18:19 GMT, Dean Long wrote: >> This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. >> >> We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. >> >> The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. >> >> For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. >> >> This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. 
The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > remove is_sigill_not_entrant Seems like arm32 has the same issue: https://github.com/openjdk/jdk/blob/9d060574e5dbd13e634f00d749d0108ceff1fae8/src/hotspot/cpu/arm/gc/shared/barrierSetAssembler_arm.cpp#L199 The init value shouldn't have the sticky bit set. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-2976766766 From stuefe at openjdk.org Mon Jun 16 14:14:30 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 16 Jun 2025 14:14:30 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v11] In-Reply-To: References: Message-ID: <47jQ6A4Y4_fKu1wbkoph3EYsRLfYJZKelzUURiZT7Rc=.73e6dd35-4b2a-450f-8c06-1d6c879626c7@github.com> On Mon, 16 Jun 2025 09:06:07 GMT, Anton Artemov wrote: > > What caller really needs multiple error codes? We want to know a memory size, and we may get it or not. If we don't get it, I don't think it matters why we don't get it. I don't see any different actions taken _by the caller of os::xxx()_ in reaction to what cog in the machine exactly failed. The end user needs a way to get detail information, but that can be done with UL logging down in the function itself. In addition, for very severe errors the function could end the JVM right away themselves, no need to even return to the caller. Example: if /proc is missing. > > If we don't really need multiple error codes, then having a struct is redundant, as the **only** error value can be encoded as something like `std::numeric_limits::max`. I think it is safe to assume that this value will never be returned as a real memory size on any platform. Absolutely.
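The single-sentinel alternative discussed above could look roughly like the following sketch. The names `kMemoryQueryFailed`, `query_free_memory`, and `is_valid_size` are hypothetical; the only assumption carried over from the thread is that the maximum `size_t` value never occurs as a real memory size.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical sentinel: the largest size_t value stands for "query failed".
constexpr std::size_t kMemoryQueryFailed = std::numeric_limits<std::size_t>::max();

// Hypothetical stand-in for an os:: memory query that returns a plain size_t.
inline std::size_t query_free_memory(bool simulate_failure) {
  return simulate_failure ? kMemoryQueryFailed : static_cast<std::size_t>(4096);
}

// Callers compare against the sentinel instead of inspecting a struct field.
inline bool is_valid_size(std::size_t v) {
  return v != kMemoryQueryFailed;
}
```

The trade-off is that the full `size_t` range minus one value stays available and the return type stays a scalar, at the cost of not being able to distinguish different failure causes at the call site.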
------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2976821640 From iwalulya at openjdk.org Mon Jun 16 14:36:05 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Mon, 16 Jun 2025 14:36:05 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 Message-ID: Hi all, Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and Application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio. - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly. - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range. - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants. We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead.
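For orientation, the arithmetic behind the default change can be sketched as follows, assuming the conventional reading of `GCTimeRatio=N` as a target of one part GC time to N parts application time, that is a GC share of 1/(1+N). The function name is hypothetical and this is not the code G1 uses.

```cpp
// Target share of total time spent in GC, in percent, for a given
// GCTimeRatio under the 1 / (1 + N) interpretation.
inline double gc_time_percent(unsigned gc_time_ratio) {
  return 100.0 / (1.0 + gc_time_ratio);
}
```

Under that reading, the old default of 12 tolerates about 7.7% GC time, while 24 tolerates 4%, which is consistent with the statement that the old default allows considerably more GC overhead.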
Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. Testing: Mach5 Tier 1-7 ------------- Commit messages: - clean init Changes: https://git.openjdk.org/jdk/pull/25832/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25832&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8238687 Stats: 596 lines in 16 files changed: 360 ins; 87 del; 149 mod Patch: https://git.openjdk.org/jdk/pull/25832.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25832/head:pull/25832 PR: https://git.openjdk.org/jdk/pull/25832 From duke at openjdk.org Mon Jun 16 14:43:06 2025 From: duke at openjdk.org (Anton Artemov) Date: Mon, 16 Jun 2025 14:43:06 GMT Subject: RFR: 8356556: AArch64: No need for acquire fence in safepoint poll in FFM Message-ID: Hi, please consider the following changes: On AArch64, the `acquire` bool argument of the `safepoint_poll()` method is removed, as the load-acquire `ldar` instruction in the `safepoint_poll()` implementation is not needed completely. It was used for observing the disarmed value when VMThread was used to disarm the Java threads, but currently JavaThreads disarm themselves. Tested in tiers 1 - 4. ------------- Commit messages: - 8356556: Fixed whitespaces. - 8356556: Fixed trailing whitespace. 
- 8356556: Removed acquire argument from safepoint_poll on aarch64 Changes: https://git.openjdk.org/jdk/pull/25829/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25829&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356556 Stats: 17 lines in 8 files changed: 0 ins; 3 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/25829.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25829/head:pull/25829 PR: https://git.openjdk.org/jdk/pull/25829 From bmaillard at openjdk.org Mon Jun 16 15:03:22 2025 From: bmaillard at openjdk.org (=?UTF-8?B?QmVub8OudA==?= Maillard) Date: Mon, 16 Jun 2025 15:03:22 GMT Subject: RFR: 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB Message-ID: This PR adds a range constraint for the `-XX:FastAllocateSizeLimit` debug flag. This prevents undefined behavior caused by shifting the flag value in `GraphKit::new_array`. ### Testing - [ ] [GitHub Actions](https://github.com/benoitmaillard/jdk/actions?query=branch%3AJDK-8356865) - [ ] tier1-3, plus some internal testing - [x] Manual testing with values known to previously cause undefined behavior Thanks! 
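
As a hedged illustration of why a range constraint helps here: left-shifting a signed integer so that the result is not representable is undefined behavior in the C++ dialects before C++20. The sketch below shows one way to compute the largest value that can be shifted safely. The helper names are hypothetical, and the shift amount of 3 in the usage note is only an example; neither is taken from `GraphKit::new_array` or from the flag definition in this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True if (value << shift) is representable as a non-negative int32_t,
// i.e. the shift cannot overflow. Hypothetical helper for illustration.
constexpr bool shift_left_is_safe(std::int32_t value, unsigned shift) {
  return value >= 0 && shift < 31 &&
         value <= (std::numeric_limits<std::int32_t>::max() >> shift);
}

// Largest value that can be shifted left by up to max_shift bits without
// overflow; an upper bound of this kind is what a flag range could enforce.
constexpr std::int32_t max_safely_shiftable(unsigned max_shift) {
  return std::numeric_limits<std::int32_t>::max() >> max_shift;
}

// Shift only after checking that the result is representable.
inline std::int32_t checked_shift_left(std::int32_t value, unsigned shift) {
  assert(shift_left_is_safe(value, shift) && "shift would overflow");
  return value << shift;
}
```

For a maximum shift of 3 bits, `max_safely_shiftable(3)` is 268435455, so any value above that would have to be rejected when the flag is parsed.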
------------- Commit messages: - 8356865: Add test for -XX:FastAllocateSizeLimit - 8356865: Add assert for sanity check - 8356865: Add range for FastAllocateSizeLimit flag Changes: https://git.openjdk.org/jdk/pull/25834/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25834&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356865 Stats: 59 lines in 3 files changed: 59 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25834.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25834/head:pull/25834 PR: https://git.openjdk.org/jdk/pull/25834 From amitkumar at openjdk.org Mon Jun 16 15:07:29 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 16 Jun 2025 15:07:29 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 08:26:38 GMT, Amit Kumar wrote: >> Dean Long has updated the pull request incrementally with one additional commit since the last revision: >> >> remove is_sigill_not_entrant > > Just FYI, s390 build is broken with this change: > > > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/home/amit/jdk/src/hotspot/share/gc/shared/barrierSetNMethod.cpp:196), pid=1779086, tid=1779117 > # assert(!nm->is_osr_method() || may_enter) failed: OSR nmethods should always be entrant after migration > # > # JRE version: OpenJDK Runtime Environment (26.0) (fastdebug build 26-internal-adhoc.amit.jdk) > # Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-adhoc.amit.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-s390x) > # Problematic frame: > # V [libjvm.so+0x40b196] BarrierSetNMethod::nmethod_stub_entry_barrier(unsigned char**)+0x15e > # > # Core dump will be written. 
Default location: Core dumps may be processed with "/lib/systemd/systemd-coredump %P %u %g %s %t 9223372036854775808 %h %d" (or dumping to /home/amit/jdk/make/core.1779086) > # > # If you would like to submit a bug report, please visit: > # https://bugreport.java.com/bugreport/crash.jsp > # > > > stack trace: > > Stack: [0x000003ff9e580000,0x000003ff9e680000], sp=0x000003ff9e67b068, free space=1004k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x40b196] BarrierSetNMethod::nmethod_stub_entry_barrier(unsigned char**)+0x15e (barrierSetNMethod.cpp:196) > v ~StubRoutines::method_entry_barrier 0x000003ff9050cd18 > J 282% c2 sun.nio.fs.UnixPath.initOffsets()V java.base (189 bytes) @ 0x000003ff90c4f0c8 [0x000003ff90c4f080+0x0000000000000048] > j sun.nio.fs.UnixPath.getFileName()Lsun/nio/fs/UnixPath;+1 java.base > j sun.nio.fs.UnixFileSystemProvider.isHidden(Ljava/nio/file/Path;)Z+6 java.base > j java.nio.file.Files.isHidden(Ljava/nio/file/Path;)Z+5 java.base > j jdk.internal.module.ModulePath.isHidden(Ljava/nio/file/Path;)Z+1 java.base > j jdk.internal.module.ModulePath.lambda$explodedPackages$0(Ljava/nio/file/Path;Ljava/nio/file/attribute/BasicFileAttributes;)Z+11 java.base > j jdk.internal.module.ModulePath$$Lambda+0x00000000a105cbe0.test(Ljava/lang/Object;Ljava/lang/Object;)Z+12 java.base > j java.nio.file.Files.lambda$find$0(Ljava/util/function/BiPredicate;Ljava/nio/file/FileTreeWalker$Event;)Z+9 java.base > j java.nio.file.Files$$Lambda+0x00000000a10646c0.test(Ljava/lang/Object;)Z+8 java.base > .... > @offamitkumar: The problem is probably the initialization to -1: [`z_cfi(Z_R0_scratch, /* to be patched */ -1);`.](https://github.com/openjdk/jdk/blob/9d060574e5dbd13e634f00d749d0108ceff1fae8/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp#L183) Should be 0. Thank you Martin for the suggestion. @dean-long would you please add this diff, fixing s390x build. 
I ran the tier1 tests with a fastdebug build; the tests are clean:

diff --git a/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp b/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp
index e78906708af..2d663061aec 100644
--- a/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp
+++ b/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp
@@ -180,7 +180,7 @@ void BarrierSetAssembler::nmethod_entry_barrier(MacroAssembler* masm) {
  __ z_lg(Z_R0_scratch, in_bytes(bs_nm->thread_disarmed_guard_value_offset()), Z_thread); // 6 bytes
  // Compare to current patched value:
- __ z_cfi(Z_R0_scratch, /* to be patched */ -1); // 6 bytes (2 + 4 byte imm val)
+ __ z_cfi(Z_R0_scratch, /* to be patched */ 0); // 6 bytes (2 + 4 byte imm val)
  // Conditional Jump
  __ z_larl(Z_R14, (Assembler::instr_len((unsigned long)LARL_ZOPC) + Assembler::instr_len((unsigned long)BCR_ZOPC)) / 2); // 6 bytes
diff --git a/src/hotspot/cpu/s390/stubGenerator_s390.cpp b/src/hotspot/cpu/s390/stubGenerator_s390.cpp
index d3f6540a3ea..bb1d9ce6037 100644
--- a/src/hotspot/cpu/s390/stubGenerator_s390.cpp
+++ b/src/hotspot/cpu/s390/stubGenerator_s390.cpp
@@ -3197,7 +3197,7 @@ class StubGenerator: public StubCodeGenerator {
  // VM-Call: BarrierSetNMethod::nmethod_stub_entry_barrier(address* return_address_ptr)
  __ call_VM_leaf(CAST_FROM_FN_PTR(address, BarrierSetNMethod::nmethod_stub_entry_barrier));
- __ z_ltr(Z_R0_scratch, Z_RET);
+ __ z_ltr(Z_RET, Z_RET);
  // VM-Call Epilogue
  __ restore_volatile_regs(Z_SP, frame::z_abi_160_size, true, false);

------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-2977015369

From kvn at openjdk.org Mon Jun 16 15:58:41 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Mon, 16 Jun 2025 15:58:41 GMT
Subject: RFR: 8358690: Some initialization code asks for AOT cache status way too early
In-Reply-To:
References: <_16KBkO-TmmTuaWG8l9ekstEdIP9BKsyXip0H0u_NHs=.3b1479e2-a0b7-41b1-a72a-12d217a3d628@github.com>
Message-ID:

On Thu, 12 Jun 2025 16:58:40
GMT, Andrew Dinn wrote: >>> @vnkozlov Are you suggesting moving the fence stubs to a separate StubGen preinitial blob created before the initial blob? That should be relatively easy to do. >>> >>> If you want me to push that change first in a separate PR I will be happy to do so. >> >> Yes, please. There could be other stubs we need very early which we can exclude from AOTing. May be call it `pre-universe` to be clear. >> >> My only concern is the fence stub is used only by windows-x86. Introducing whole new stubs type for that is overkill IMHO. But on other hand, we may need such type later for other new stubs. Or we find later that some initial stubs still be needed before `universe_init`. > > @vnkozlov I have raised [JDK-8359373](https://bugs.openjdk.org/browse/JDK-8359373) and have a PR in progress. Thank you, @adinn ------------- PR Comment: https://git.openjdk.org/jdk/pull/25763#issuecomment-2977179355 From kvn at openjdk.org Mon Jun 16 15:58:42 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 16 Jun 2025 15:58:42 GMT Subject: Integrated: 8358690: Some initialization code asks for AOT cache status way too early In-Reply-To: References: Message-ID: On Wed, 11 Jun 2025 23:08:44 GMT, Vladimir Kozlov wrote: > Thanks to @shipilev for catching the issue. > > [JDK-8350209](https://bugs.openjdk.org/browse/JDK-8350209) came with the bootstrapping problem by checking the AOT cache status way too early. Before full AOT cache init sequence runs, these checks would always reply that AOT cache is off. This causes initial stubs to never practically restored/dumped. > > This does not affect JDK 25 because [JDK-8357514](https://github.com/openjdk/jdk/commit/8184ce39a8a732352ee841fed09cae905d27643c) switched off AOT stubs generation. > > We can't resolve bootstrap issue as it is because `initial_stubs_init()` is called before `universe_init()` where AOT code cache is created. 
I looked why it is required (based on comments) that `initial_stubs_init()` be called before `universe_init()`. And I found that we had a special stub during HotSpot development (1997) which was used for Vtable entries population when we run with -Xcomp (or whatever was equivalent back then). We still have reference to it in the comment: [stubRoutines.cpp#L185](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stubRoutines.cpp#L185). > > We don't have that code anymore. I moved `initial_stubs_init()` after `universe_init()` and `AOTCodeCache::init2()`. I added asserts into some initial stubs to check that they are not NULL when used. I ran from hs-tier1 to hs-tier6 + hs-tier10-rt. > > The only issue I found is that `AOTCodeCache::init_early_stubs()` needs to be call separately after `initial_stubs_init()` instead of from `AOTCodeCache::init2()`. This solved bootstrap issue. > > I also did some cleanup to match `leyden/premain` branch for easy merges. > > Tested hs-tier1-6, hs-tier1-rt, stress, xcomp This pull request has now been integrated. Changeset: 6e390ef1 Author: Vladimir Kozlov URL: https://git.openjdk.org/jdk/commit/6e390ef17cf4b6134d5d53ba4e3ae8281fedb3f3 Stats: 172 lines in 14 files changed: 99 ins; 28 del; 45 mod 8358690: Some initialization code asks for AOT cache status way too early Reviewed-by: asmehra, adinn ------------- PR: https://git.openjdk.org/jdk/pull/25763 From coleenp at openjdk.org Mon Jun 16 17:37:17 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 16 Jun 2025 17:37:17 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v2] In-Reply-To: References: Message-ID: > This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. 
JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. > > The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. > > Tested with tier1-4, 5-7. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix formatting errors. 
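
To make the lookup behavior described above concrete, here is a minimal, hypothetical sketch of an ID-to-method table. It is not the `JmethodIDTable` from this PR: it uses `std::unordered_map` instead of HotSpot's ConcurrentHashTable, it is not thread-safe, and `Method`, `MethodId` and `MethodIdTable` are stand-in names. It only illustrates three points: IDs come from a counter rather than being pointers, a removed ID resolves to null instead of dangling, and an entry can be repointed to a newer method version.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

struct Method { const char* name; };   // stand-in for HotSpot's Method
using MethodId = std::uint64_t;        // stand-in for jmethodID

class MethodIdTable {
  std::unordered_map<MethodId, Method*> _table;
  MethodId _next_id = 0;

public:
  // Create a fresh ID for a method. IDs start at 1, so 0 is never valid.
  MethodId make_id(Method* m) {
    ++_next_id;
    assert(_next_id != 0 && "counter must never wrap back to zero");
    _table.emplace(_next_id, m);
    return _next_id;
  }

  // Resolve an ID; returns nullptr if the ID was removed or never existed.
  Method* resolve(MethodId id) const {
    auto it = _table.find(id);
    return it == _table.end() ? nullptr : it->second;
  }

  // Point an existing ID at a new method version (as after redefinition).
  // Returns false if the ID is not in the table.
  bool change_method(MethodId id, Method* new_method) {
    auto it = _table.find(id);
    if (it == _table.end()) return false;
    it->second = new_method;
    return true;
  }

  // Remove an ID, for example when its class is unloaded. The entry's
  // memory is released and later lookups return nullptr.
  void remove(MethodId id) { _table.erase(id); }
};
```

A stale ID held by native code after `remove` would simply fail to resolve, which matches the "should get nullptr" behavior described in the PR text.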
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25267/files - new: https://git.openjdk.org/jdk/pull/25267/files/57321e75..c2f2f42c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=00-01 Stats: 36 lines in 8 files changed: 2 ins; 1 del; 33 mod Patch: https://git.openjdk.org/jdk/pull/25267.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25267/head:pull/25267 PR: https://git.openjdk.org/jdk/pull/25267 From coleenp at openjdk.org Mon Jun 16 17:37:18 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 16 Jun 2025 17:37:18 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory In-Reply-To: References: Message-ID: On Fri, 16 May 2025 12:18:42 GMT, Coleen Phillimore wrote: > This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. > > The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. 
In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. > > Tested with tier1-4, 5-7. Dan, thank you for your first pass. I've tried to address the things you pointed out. I think I've used 'null' correctly in comments and strings though. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-2977460666 From coleenp at openjdk.org Mon Jun 16 17:37:19 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 16 Jun 2025 17:37:19 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v2] In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 19:39:39 GMT, Daniel D. Daugherty wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix formatting errors. > > src/hotspot/share/classfile/classLoaderData.cpp line 585: > >> 583: MutexLocker m1(metaspace_lock(), Mutex::_no_safepoint_check_flag); >> 584: if (_jmethod_ids == nullptr) { >> 585: _jmethod_ids = new (mtClass) GrowableArray(32, mtClass); > > Do you want the literal `32` to be a tunable value? I don't want it tunable. I think it's fairly unlikely to grow. Even if it does, I don't think it would cause a performance problem. I could name 32 as a constant globally in classLoaderData.cpp in case we ever find we need to change the initial value. > src/hotspot/share/classfile/classLoaderData.cpp line 590: > >> 588: } >> 589: >> 590: // Method::clear_jmethod_ids removes jmethodID entries from the table which > > Perhaps the name changed during your development? > s/clear_jmethod_ids/remove_jmethod_ids/ It did change. Thank you for noticing it. > src/hotspot/share/classfile/classLoaderData.cpp line 592: > >> 590: // Method::clear_jmethod_ids removes jmethodID entries from the table which >> 591: // releases memory. >> 592: // Because native code (e.g. 
JVMTI agent) holding jmethod_ids may access them > > grammar: s/e.g./e.g.,/ ok. > src/hotspot/share/classfile/classLoaderData.cpp line 594: > >> 592: // Because native code (e.g. JVMTI agent) holding jmethod_ids may access them >> 593: // after the associated classes and class loader are unloaded, subsequent lookups >> 594: // for these ids will return null since they are no longer found in the table. > > Perhaps: s/null/nullptr/ I thought the convention was that we were supposed to call it `null` in the comments and `nullptr` in the code. > src/hotspot/share/classfile/classLoaderData.hpp line 319: > >> 317: void add_jmethod_id(jmethodID id); >> 318: void remove_jmethod_ids(); >> 319: GrowableArray* jmethod_ids() { return _jmethod_ids; } > > Should `jmethod_ids` still be `const`? yes. > src/hotspot/share/oops/instanceKlass.cpp line 2394: > >> 2392: } >> 2393: >> 2394: // Lookup or create a jmethodID. > > The comment on L2394 appears wrong for `create_jmethod_id_cache`. > Perhaps move it to L2404 (above get_jmethod_id() function). yes moved. > src/hotspot/share/oops/instanceKlass.cpp line 2397: > >> 2395: static jmethodID* create_jmethod_id_cache(size_t size) { >> 2396: jmethodID* jmeths = NEW_C_HEAP_ARRAY(jmethodID, size+1, mtClass); >> 2397: memset(jmeths, 0, (size+1)*sizeof(jmethodID)); > > nit spacing: s/size+1/size + 1/ > on two lines. fixed. The spaces match the coding style. I fixed the others that you pointed out below. > src/hotspot/share/oops/instanceKlass.cpp line 2402: > >> 2400: return jmeths; >> 2401: } >> 2402: > > nit spacing: delete extra blank line? fixed. > src/hotspot/share/oops/instanceKlass.cpp line 2404: > >> 2402: >> 2403: >> 2404: jmethodID InstanceKlass::get_jmethod_id(Method* method) { > > Should `method` be `const`? It's really unusual in our source code to pass const Metadata pointers because of the history of the code. We should probably start doing that. I'll change this to const and see if there's a fall out. 
> src/hotspot/share/oops/instanceKlass.cpp line 2495: > >> 2493: id == nullptr) { >> 2494: id = Method::make_jmethod_id(class_loader_data(), m); >> 2495: Atomic::release_store(&jmeths[idnum+1], id); > > nit spacing: s/size+1/size + 1/ I fixed the idnum+1 => idnum + 1 in this function. > src/hotspot/share/oops/jmethodIDTable.cpp line 29: > >> 27: #include "memory/resourceArea.hpp" >> 28: #include "oops/method.hpp" >> 29: #include "oops/jmethodIDTable.hpp" > > Please swap these two #includes into sort order. I thought the build checked this. thanks for noticing. > src/hotspot/share/oops/jmethodIDTable.cpp line 98: > >> 96: >> 97: static JmethodEntry* get_jmethod_entry(jmethodID mid) { >> 98: assert(mid != nullptr, "JNI method id should not be null"); > > Perhaps: s/null/nullptr/ > I can't remember if assert failure text output is okay to be `null`. I think the rules are comments and strings say `null` and code is `nullptr`. > src/hotspot/share/oops/jmethodIDTable.cpp line 131: > >> 129: // Add a method id to the jmethod_ids >> 130: jmethodID JmethodIDTable::make_jmethod_id(Method* m) { >> 131: bool grow_hint, clean_hint, created; > > nit: sort local variables? I moved 'created' to where it's used but grow_hint and clean_hint are in the order of the parameter list. > src/hotspot/share/oops/jmethodIDTable.cpp line 169: > >> 167: assert_locked_or_safepoint(JmethodIdCreation_lock); >> 168: JmethodEntry* result = get_jmethod_entry(jmid); >> 169: // change to table to point to the new method > > Perhaps: // Change to table entry to point to the new method. much better > src/hotspot/share/oops/jmethodIDTable.cpp line 180: > >> 178: // We need to make sure that jmethodID actually resolves to this method >> 179: // - multiple redefined versions may share jmethodID slots and if a method >> 180: // has already been rewired to a newer version we could be removing reference > > typo?: s/could be removing reference/could be clearing a reference/ clearing is better. 
> src/hotspot/share/oops/method.cpp line 2063: > >> 2061: >> 2062: // jmethodID handling >> 2063: // jmethodIDs are 64-bit integers that will never run out and are mapped in a table > > Should we have a `guarantee` or `assert` somewhere that the counter never wraps? Okay, I added one when we increment the jmethod_id_counter. // Update jmethodID global counter. _jmethodID_counter++; guarantee(_jmethodID_counter != 0, "must never go back to zero"); I think this will detect wraparound. > src/hotspot/share/runtime/mutexLocker.cpp line 236: > >> 234: MUTEX_DEFN(Notification_lock , PaddedMonitor, service); // used for notification thread operations >> 235: >> 236: MUTEX_DEFN(JmethodIdCreation_lock , PaddedMutex , nosafepoint-1); // used for creating jmethodIDs. > > Interesting. Why change from `nosafepoint-2` to `nosafepoint-1`? I can't remember. There may have been another lock held while this one was (which is why we added MUTEX_DEFL to help with that). I'll check. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150302295 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150302945 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150308058 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150309308 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150312028 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150319536 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150320490 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150321823 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150326543 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150338172 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150340203 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150344720 PR Review Comment: 
https://git.openjdk.org/jdk/pull/25267#discussion_r2150354189
PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150356106
PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150358453
PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150366952
PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150372817

From coleenp at openjdk.org Mon Jun 16 17:37:20 2025
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Mon, 16 Jun 2025 17:37:20 GMT
Subject: RFR: 8268406: Deallocate jmethodID native memory [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 16 Jun 2025 15:43:39 GMT, Coleen Phillimore wrote:

>> src/hotspot/share/oops/instanceKlass.cpp line 2404:
>>
>>> 2402:
>>> 2403:
>>> 2404: jmethodID InstanceKlass::get_jmethod_id(Method* method) {
>>
>> Should `method` be `const`?
>
> It's really unusual in our source code to pass const Metadata pointers because of the history of the code. We should probably start doing that. I'll change this to const and see if there's a fall out.

Many of the metadata parameter declarations aren't const-safe. Adding this one const has a big fall-out leading to some jni calls. This would be a good project and we should have it for new code, which some of this is. I'll see if I can add any consts.

------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150511587

From dcubed at openjdk.org Mon Jun 16 18:21:37 2025
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Mon, 16 Jun 2025 18:21:37 GMT
Subject: RFR: 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 16 Jun 2025 12:04:12 GMT, Kim Barrett wrote:

>> Please review this change to the HotSpot Style Guide to add discussion of how
>> we prefer to handle initialization and destruction of non-local variables.
>>
>> I propose this is an editorial change, as it just documents current practice
>> rather than suggesting a change to current practice. As such, the normal
>> HotSpot PR process applies.
>>
>> The updated .html file was generated using make update-build-docs.
>
> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision:
>
> better terminology, merge separate sections

Marked as reviewed by dcubed (Reviewer).

Thumbs up.

------------- PR Review: https://git.openjdk.org/jdk/pull/25812#pullrequestreview-2933073380
PR Comment: https://git.openjdk.org/jdk/pull/25812#issuecomment-2977588023

From kbarrett at openjdk.org Mon Jun 16 18:52:35 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Mon, 16 Jun 2025 18:52:35 GMT
Subject: RFR: 8359227: Code cache/heap size options should be size_t [v2]
In-Reply-To: <3sJTUKOONMnjRHYKl-M6Dcx4ZDrbbOYPD5DTbgp7UuI=.0d870e8c-fe60-48ed-9f54-aab2786fe77f@github.com>
References: <3sJTUKOONMnjRHYKl-M6Dcx4ZDrbbOYPD5DTbgp7UuI=.0d870e8c-fe60-48ed-9f54-aab2786fe77f@github.com>
Message-ID:

On Fri, 13 Jun 2025 12:31:44 GMT, Manuel Hässig wrote:

>> Kim Barrett has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - update copyrights
>> - remove leftover include
>
> src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp line 550:
>
>> 548: #define ADD_SIZE_T_FLAG(name) ADD_FLAG(size_t, name, BOXED_LONG)
>> 549: #define ADD_INTX_FLAG(name) ADD_FLAG(intx, name, BOXED_LONG)
>> 550: #define ADD_UINTX_FLAG(name) ADD_FLAG(uintx, name, BOXED_LONG)
>
> Suggestion:
>
> #define ADD_BOOL_FLAG(name) ADD_FLAG(bool, name, BOXED_BOOLEAN)
> #define ADD_INT_FLAG(name) ADD_FLAG(int, name, BOXED_LONG)
> #define ADD_SIZE_T_FLAG(name) ADD_FLAG(size_t, name, BOXED_LONG)
> #define ADD_INTX_FLAG(name) ADD_FLAG(intx, name, BOXED_LONG)
> #define ADD_UINTX_FLAG(name) ADD_FLAG(uintx, name, BOXED_LONG)
>
> Feel free to ignore, but since you are already touching this, we might as well
align it. I'd rather not. I'm not a fan of this kind of formatting. I moved the `ADD_FLAG` calls over to maintain the pre-existing formatting after adding the longer than anything else `ADD_SIZE_T_FLAG`, but the `ADD_FLAG` arguments were not lined up and I'd just as soon leave them that way. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25791#discussion_r2150643858 From duke at openjdk.org Mon Jun 16 19:07:30 2025 From: duke at openjdk.org (Larry Cable) Date: Mon, 16 Jun 2025 19:07:30 GMT Subject: RFR: 8359227: Code cache/heap size options should be size_t [v2] In-Reply-To: References: <3sJTUKOONMnjRHYKl-M6Dcx4ZDrbbOYPD5DTbgp7UuI=.0d870e8c-fe60-48ed-9f54-aab2786fe77f@github.com> Message-ID: On Mon, 16 Jun 2025 18:49:37 GMT, Kim Barrett wrote: >> src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp line 550: >> >>> 548: #define ADD_SIZE_T_FLAG(name) ADD_FLAG(size_t, name, BOXED_LONG) >>> 549: #define ADD_INTX_FLAG(name) ADD_FLAG(intx, name, BOXED_LONG) >>> 550: #define ADD_UINTX_FLAG(name) ADD_FLAG(uintx, name, BOXED_LONG) >> >> Suggestion: >> >> #define ADD_BOOL_FLAG(name) ADD_FLAG(bool, name, BOXED_BOOLEAN) >> #define ADD_INT_FLAG(name) ADD_FLAG(int, name, BOXED_LONG) >> #define ADD_SIZE_T_FLAG(name) ADD_FLAG(size_t, name, BOXED_LONG) >> #define ADD_INTX_FLAG(name) ADD_FLAG(intx, name, BOXED_LONG) >> #define ADD_UINTX_FLAG(name) ADD_FLAG(uintx, name, BOXED_LONG) >> >> Feel free to ignore, but since you are already touching this, we might as well align it. > > I'd rather not. I'm not a fan of this kind of formatting. I moved the `ADD_FLAG` calls over to > maintain the pre-existing formatting after adding the longer than anything else `ADD_SIZE_T_FLAG`, > but the `ADD_FLAG` arguments were not lined up and I'd just as soon leave them that way. not sure being a fan or not is sufficient reason... 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25791#discussion_r2150671231 From kbarrett at openjdk.org Mon Jun 16 19:55:28 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 16 Jun 2025 19:55:28 GMT Subject: RFR: 8359227: Code cache/heap size options should be size_t [v2] In-Reply-To: References: <3sJTUKOONMnjRHYKl-M6Dcx4ZDrbbOYPD5DTbgp7UuI=.0d870e8c-fe60-48ed-9f54-aab2786fe77f@github.com> Message-ID: On Mon, 16 Jun 2025 19:05:12 GMT, Larry Cable wrote: >> I'd rather not. I'm not a fan of this kind of formatting. I moved the `ADD_FLAG` calls over to >> maintain the pre-existing formatting after adding the longer than anything else `ADD_SIZE_T_FLAG`, >> but the `ADD_FLAG` arguments were not lined up and I'd just as soon leave them that way. > > not sure being a fan or not is sufficient reason... Leaving it as is maintains the status quo. I'm not proposing to delete the extra whitespace in front of `ADD_FLAG`, which would be my preferred layout. And the comment does say "feel free to ignore". ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25791#discussion_r2150756646 From coleenp at openjdk.org Mon Jun 16 20:36:29 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 16 Jun 2025 20:36:29 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v2] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 16:07:22 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/mutexLocker.cpp line 236: >> >>> 234: MUTEX_DEFN(Notification_lock , PaddedMonitor, service); // used for notification thread operations >>> 235: >>> 236: MUTEX_DEFN(JmethodIdCreation_lock , PaddedMutex , nosafepoint-1); // used for creating jmethodIDs. >> >> Interesting. Why change from `nosafepoint-2` to `nosafepoint-1`? > > I can't remember. There may have been another lock held while this one was (which is why we added MUTEX_DEFL to help with that). I'll check. 
The reason this has to be nosafepoint-1 (actually, it can be nosafepoint) is that it must be above the rank for the ConcurrentHashTable, which is nosafepoint-2. I don't know why it was nosafepoint-2 before this, though; I can't find any lock ordering that requires it. In general, we should use the highest lock ordering possible within the category (no-safepoint, safepoint) to leave room for further locks.

------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2150826433

From dcubed at openjdk.org Mon Jun 16 23:23:29 2025
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Mon, 16 Jun 2025 23:23:29 GMT
Subject: RFR: 8268406: Deallocate jmethodID native memory [v2]
In-Reply-To:
References:
Message-ID: <89TBi0eCodG4d6T1wDUXGpYGwCKRAFIVblVH-D28xsY=.77887e94-3825-4707-8055-9202cdfa0b81@github.com>

On Mon, 16 Jun 2025 17:37:17 GMT, Coleen Phillimore wrote:

>> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time.
>>
>> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this.
In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix formatting errors. Thumbs up with the latest version! Thanks for fixing all the nits. ------------- Marked as reviewed by dcubed (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25267#pullrequestreview-2933747031 From dcubed at openjdk.org Mon Jun 16 23:23:30 2025 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Mon, 16 Jun 2025 23:23:30 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v2] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 17:32:20 GMT, Coleen Phillimore wrote: >> It's really unusual in our source code to pass const Metadata pointers because of the history of the code. We should probably start doing that. I'll change this to const and see if there's a fall out. > > Much of the metadata parameter declarations aren't const-safe. Adding this one const has a big fall-out leading to some jni calls. This would be is a good project and we should have it for new code, which some of this is. I'll see if can add any consts. Thanks for investigating! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2151047748 From dcubed at openjdk.org Mon Jun 16 23:23:30 2025 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Mon, 16 Jun 2025 23:23:30 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v2] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 15:34:55 GMT, Coleen Phillimore wrote: >> src/hotspot/share/classfile/classLoaderData.cpp line 594: >> >>> 592: // Because native code (e.g. 
JVMTI agent) holding jmethod_ids may access them >>> 593: // after the associated classes and class loader are unloaded, subsequent lookups >>> 594: // for these ids will return null since they are no longer found in the table. >> >> Perhaps: s/null/nullptr/ > > I thought the convention was that we were supposed to call it `null` in the comments and `nullptr` in the code. I'm not sure, but your reasoning sounds good to me! >> src/hotspot/share/oops/method.cpp line 2063: >> >>> 2061: >>> 2062: // jmethodID handling >>> 2063: // jmethodIDs are 64-bit integers that will never run out and are mapped in a table >> >> Should we have a `guarantee` or `assert` somewhere that the counter never wraps? > > Okay, I added one when we increment the jmethod_id_counter. > > // Update jmethodID global counter. > _jmethodID_counter++; > guarantee(_jmethodID_counter != 0, "must never go back to zero"); > > I think this will detect wraparound. Thanks! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2151046551 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2151049919 From dlong at openjdk.org Mon Jun 16 23:34:43 2025 From: dlong at openjdk.org (Dean Long) Date: Mon, 16 Jun 2025 23:34:43 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v5] In-Reply-To: References: Message-ID: > This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. > > We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. 
> > The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. > > For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. > > This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. 
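The sticky-bit idea described above can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions, not the HotSpot code: the class `GuardWord`, the bit position chosen for `not_entrant_bit`, and the use of `std::mutex` in place of `NMethodEntryBarrier_lock` are all hypothetical. It shows why a plain store of a new guard value would lose the bit, and how a read-modify-write under a lock keeps it.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for one nmethod's entry barrier guard value.
class GuardWord {
 public:
  // Assumed position of the sticky "not entrant" bit; the real value may differ.
  static constexpr uint32_t not_entrant_bit = 1u << 31;

  // Install a new guard value while preserving the sticky bit.
  // This is a read-modify-write, so it is done under the lock.
  void guard_with(uint32_t value) {
    assert((value & not_entrant_bit) == 0 && "callers pass only ordinary guard bits");
    std::lock_guard<std::mutex> locker(_lock);
    if ((_value & not_entrant_bit) != 0) {
      value |= not_entrant_bit;  // preserve the sticky bit
    }
    _value = value;
  }

  // Set the sticky bit. not_entrant is a final state, so it is never cleared.
  void make_not_entrant() {
    std::lock_guard<std::mutex> locker(_lock);
    _value |= not_entrant_bit;
  }

  bool is_not_entrant() {
    std::lock_guard<std::mutex> locker(_lock);
    return (_value & not_entrant_bit) != 0;
  }

  uint32_t value() {
    std::lock_guard<std::mutex> locker(_lock);
    return _value;
  }

 private:
  std::mutex _lock;     // plays the role of NMethodEntryBarrier_lock in this sketch
  uint32_t _value = 0;
};
```

The compare-and-swap alternative mentioned in the description would replace the lock with a retry loop on the guard value; the sketch uses a lock only because that is the approach the description says was chosen.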
Dean Long has updated the pull request incrementally with one additional commit since the last revision: s390 fix courtesy of Amit Kumar ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25764/files - new: https://git.openjdk.org/jdk/pull/25764/files/c1ebde09..a7d784b2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25764&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25764&range=03-04 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25764.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25764/head:pull/25764 PR: https://git.openjdk.org/jdk/pull/25764 From dlong at openjdk.org Mon Jun 16 23:34:43 2025 From: dlong at openjdk.org (Dean Long) Date: Mon, 16 Jun 2025 23:34:43 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 15:04:23 GMT, Amit Kumar wrote: >> Just FYI, s390 build is broken with this change: >> >> >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (/home/amit/jdk/src/hotspot/share/gc/shared/barrierSetNMethod.cpp:196), pid=1779086, tid=1779117 >> # assert(!nm->is_osr_method() || may_enter) failed: OSR nmethods should always be entrant after migration >> # >> # JRE version: OpenJDK Runtime Environment (26.0) (fastdebug build 26-internal-adhoc.amit.jdk) >> # Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-adhoc.amit.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-s390x) >> # Problematic frame: >> # V [libjvm.so+0x40b196] BarrierSetNMethod::nmethod_stub_entry_barrier(unsigned char**)+0x15e >> # >> # Core dump will be written. 
Default location: Core dumps may be processed with "/lib/systemd/systemd-coredump %P %u %g %s %t 9223372036854775808 %h %d" (or dumping to /home/amit/jdk/make/core.1779086) >> # >> # If you would like to submit a bug report, please visit: >> # https://bugreport.java.com/bugreport/crash.jsp >> # >> >> >> stack trace: >> >> Stack: [0x000003ff9e580000,0x000003ff9e680000], sp=0x000003ff9e67b068, free space=1004k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x40b196] BarrierSetNMethod::nmethod_stub_entry_barrier(unsigned char**)+0x15e (barrierSetNMethod.cpp:196) >> v ~StubRoutines::method_entry_barrier 0x000003ff9050cd18 >> J 282% c2 sun.nio.fs.UnixPath.initOffsets()V java.base (189 bytes) @ 0x000003ff90c4f0c8 [0x000003ff90c4f080+0x0000000000000048] >> j sun.nio.fs.UnixPath.getFileName()Lsun/nio/fs/UnixPath;+1 java.base >> j sun.nio.fs.UnixFileSystemProvider.isHidden(Ljava/nio/file/Path;)Z+6 java.base >> j java.nio.file.Files.isHidden(Ljava/nio/file/Path;)Z+5 java.base >> j jdk.internal.module.ModulePath.isHidden(Ljava/nio/file/Path;)Z+1 java.base >> j jdk.internal.module.ModulePath.lambda$explodedPackages$0(Ljava/nio/file/Path;Ljava/nio/file/attribute/BasicFileAttributes;)Z+11 java.base >> j jdk.internal.module.ModulePath$$Lambda+0x00000000a105cbe0.test(Ljava/lang/Object;Ljava/lang/Object;)Z+12 java.base >> j java.nio.file.Files.lambda$find$0(Ljava/util/function/BiPredicate;Ljava/nio/file/FileTreeWalker$Event;)Z+9 java.base >> j java.nio.file.Files$$Lambda+0x00000000a10646c0.test(Ljava/lang/Object;)Z+8 java.base >> .... > >> @offamitkumar: The problem is probably the initialization to -1: [`z_cfi(Z_R0_scratch, /* to be patched */ -1);`.](https://github.com/openjdk/jdk/blob/9d060574e5dbd13e634f00d749d0108ceff1fae8/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp#L183) Should be 0. > > Thank you Martin for the suggestion. > > @dean-long would you please add this diff, fixing s390x build. 
I ran tier1 test with fastdebug, test are clean; > > > diff --git a/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp b/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp > index e78906708af..2d663061aec 100644 > --- a/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp > +++ b/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp > @@ -180,7 +180,7 @@ void BarrierSetAssembler::nmethod_entry_barrier(MacroAssembler* masm) { > __ z_lg(Z_R0_scratch, in_bytes(bs_nm->thread_disarmed_guard_value_offset()), Z_thread); // 6 bytes > > // Compare to current patched value: > - __ z_cfi(Z_R0_scratch, /* to be patched */ -1); // 6 bytes (2 + 4 byte imm val) > + __ z_cfi(Z_R0_scratch, /* to be patched */ 0); // 6 bytes (2 + 4 byte imm val) > > // Conditional Jump > __ z_larl(Z_R14, (Assembler::instr_len((unsigned long)LARL_ZOPC) + Assembler::instr_len((unsigned long)BCR_ZOPC)) / 2); // 6 bytes > diff --git a/src/hotspot/cpu/s390/stubGenerator_s390.cpp b/src/hotspot/cpu/s390/stubGenerator_s390.cpp > index d3f6540a3ea..bb1d9ce6037 100644 > --- a/src/hotspot/cpu/s390/stubGenerator_s390.cpp > +++ b/src/hotspot/cpu/s390/stubGenerator_s390.cpp > @@ -3197,7 +3197,7 @@ class StubGenerator: public StubCodeGenerator { > > // VM-Call: BarrierSetNMethod::nmethod_stub_entry_barrier(address* return_address_ptr) > __ call_VM_leaf(CAST_FROM_FN_PTR(address, BarrierSetNMethod::nmethod_stub_entry_barrier)); > - __ z_ltr(Z_R0_scratch, Z_RET); > + __ z_ltr(Z_RET, Z_RET); > > // VM-Call Epilogue > __ restore_volatile_regs(Z_SP, frame::z_abi_160_size, true, false); Thanks @offamitkumar. Could you explain the `__ z_ltr(Z_R0_scratch, Z_RET);` change, for my curiosity? Thanks. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-2978470825 From dlong at openjdk.org Mon Jun 16 23:39:08 2025 From: dlong at openjdk.org (Dean Long) Date: Mon, 16 Jun 2025 23:39:08 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v6] In-Reply-To: References: Message-ID: > This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. > > We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. > > The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. > > For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. > > This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. 
The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. Dean Long has updated the pull request incrementally with one additional commit since the last revision: arm32 fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25764/files - new: https://git.openjdk.org/jdk/pull/25764/files/a7d784b2..c98f3864 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25764&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25764&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25764.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25764/head:pull/25764 PR: https://git.openjdk.org/jdk/pull/25764 From dlong at openjdk.org Mon Jun 16 23:39:08 2025 From: dlong at openjdk.org (Dean Long) Date: Mon, 16 Jun 2025 23:39:08 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: <-qfpN8-hyWv-QosNnOUvLaZtsI0Kr1vXsTIV6Tqvd-w=.badb50d8-3f05-41d1-bc90-d5939d6b571f@github.com> On Mon, 16 Jun 2025 13:56:30 GMT, Martin Doerr wrote: > Seems like arm32 has the same issue: > > https://github.com/openjdk/jdk/blob/9d060574e5dbd13e634f00d749d0108ceff1fae8/src/hotspot/cpu/arm/gc/shared/barrierSetAssembler_arm.cpp#L199 > > The init value shouldn't have the sticky bit set. Thanks, I pushed a potential fix for that. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-2978476161 From dlong at openjdk.org Mon Jun 16 23:45:57 2025 From: dlong at openjdk.org (Dean Long) Date: Mon, 16 Jun 2025 23:45:57 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v7] In-Reply-To: References: Message-ID: > This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. > > We used to patch the verified entry point to make sure it was not_entrant. 
The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. > > The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. > > For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. > > This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. 
Dean Long has updated the pull request incrementally with one additional commit since the last revision: rename arm_with to guard_with ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25764/files - new: https://git.openjdk.org/jdk/pull/25764/files/c98f3864..3ac6dec0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25764&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25764&range=05-06 Stats: 9 lines in 6 files changed: 0 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/25764.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25764/head:pull/25764 PR: https://git.openjdk.org/jdk/pull/25764 From dlong at openjdk.org Mon Jun 16 23:45:57 2025 From: dlong at openjdk.org (Dean Long) Date: Mon, 16 Jun 2025 23:45:57 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 00:48:40 GMT, Erik Österlund wrote: >> Dean Long has updated the pull request incrementally with one additional commit since the last revision: >> >> remove is_sigill_not_entrant > > src/hotspot/share/gc/z/zBarrierSetNMethod.cpp line 109: > >> 107: } >> 108: >> 109: void ZBarrierSetNMethod::arm_with(nmethod* nm, int value) { > > I don't usually comment on names, but could we call this guard_with instead? We tried to stop saying "arm" about things used also for disarming and we have (hopefully) been consistent about calling that "guard" instead. Good suggestion. Fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2151070290 From dlong at openjdk.org Tue Jun 17 00:05:30 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 17 Jun 2025 00:05:30 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 00:54:48 GMT, Erik Österlund wrote: >> Dean Long has updated the pull request incrementally with one additional commit since the last revision: >> >> remove is_sigill_not_entrant > > src/hotspot/share/gc/z/zBarrierSetNMethod.cpp line 114: > >> 112: // Preserve the sticky bit >> 113: if (is_not_entrant(nm)) { >> 114: value |= not_entrant; > > Is it possible to have a race where another thread sets an nmethod to not entrant and the thread calling this making the nmethod entry barrier not entrant? > > If this was called to disarm a method and then enter it, it seems a bit sneaky in that case that we pass the nmethod entry barrier even though we under the lock see that it is not entrant. Probably okay but still feels like it might be more robust if the thread setting an nmethod to not entrant is always the one that arms the nmethod entry barrier. If I understand your concern correctly, there is no race. The only caller of BarrierSetNMethod::make_not_entrant() is nmethod::make_not_entrant(), and it is done inside a NMethodState_lock critical section. After a call to nmethod::make_not_entrant(), the nmethod entry barrier is armed and stays that way. And by design, a disarm only disarms at the inner nmethod_entry_barrier level, not the outer nmethod_stub_entry_barrier level. 
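To make the two-level behaviour in this reply concrete, here is a rough standalone C++ model. It is a sketch under assumptions rather than the actual barrier code: `Method`, `needs_barrier`, `stub_entry_barrier`, `disarm`, `EntryResult`, and the bit position of `not_entrant` are hypothetical names and values chosen for illustration. It assumes that a thread's disarmed value never has the sticky bit set, which is consistent with the remark elsewhere in the thread that the initial guard value shouldn't have the sticky bit set.

```cpp
#include <cassert>
#include <cstdint>

// Assumed sticky bit; the real bit position may differ.
constexpr uint32_t not_entrant = 1u << 31;

enum class EntryResult { Enter, Redirect };

// Hypothetical stand-in for the per-nmethod guard value.
struct Method {
  uint32_t guard;
};

// Models the check in the compiled prologue: the slow path is taken whenever
// the guard differs from the thread's disarmed value. With the sticky bit set,
// the guard can never equal a disarmed value, so the barrier always fires.
bool needs_barrier(const Method& m, uint32_t disarmed_value) {
  return m.guard != disarmed_value;
}

// Inner-level disarm: installs the disarmed value but keeps the sticky bit.
void disarm(Method& m, uint32_t disarmed_value) {
  m.guard = disarmed_value | (m.guard & not_entrant);
}

void make_not_entrant(Method& m) {
  m.guard |= not_entrant;
}

// Outer level: a not-entrant method is never entered (in the VM the caller
// would be redirected, e.g. to the wrong-method handler). Otherwise the
// inner-level work runs and the method is disarmed for this thread.
EntryResult stub_entry_barrier(Method& m, uint32_t disarmed_value) {
  if ((m.guard & not_entrant) != 0) {
    return EntryResult::Redirect;
  }
  // ... GC-specific inner barrier work would happen here ...
  disarm(m, disarmed_value);
  return EntryResult::Enter;
}
```

In this model, calling `disarm` on a not-entrant method changes only the low bits, so `needs_barrier` keeps returning true and `stub_entry_barrier` keeps redirecting, whichever thread performs the disarm and in whatever order the two operations happen.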
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2151084557 From dlong at openjdk.org Tue Jun 17 00:13:30 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 17 Jun 2025 00:13:30 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: <0-Re4fyQGaSyOl-bYm1h9LT5a0TKKrgJCHquooXOIkQ=.d6044248-6188-4705-b564-90fa3d2d7762@github.com> References: <0-Re4fyQGaSyOl-bYm1h9LT5a0TKKrgJCHquooXOIkQ=.d6044248-6188-4705-b564-90fa3d2d7762@github.com> Message-ID: On Sat, 14 Jun 2025 09:23:33 GMT, Martin Doerr wrote: > Tests look good on our side. I'm only a bit concerned that the lock may become a bottleneck when many Java threads need to patch all nmethods. Especially with ZGC which does that more often. I think we should check performance. For ZGC I am using a per-nmethod lock: ZLocker locker(ZNMethod::lock_for_nmethod(nm)); I don't know what benchmarks to run to check the performance for functions like Deoptimization::deoptimize_all_marked, so I welcome any help with this. One possible optimization that might help is skipping the lock if the make_not_entrant call is done during a safepoint. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-2978525224 From pchilanomate at openjdk.org Tue Jun 17 00:35:32 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 17 Jun 2025 00:35:32 GMT Subject: RFR: 8356556: AArch64: No need for acquire fence in safepoint poll in FFM In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 13:03:11 GMT, Anton Artemov wrote: > Hi, please consider the following changes: > > On AArch64, the `acquire` bool argument of the `safepoint_poll()` method is removed, as the load-acquire `ldar` instruction in the `safepoint_poll()` implementation is not needed completely. It was used for observing the disarmed value when VMThread was used to disarm the Java threads, but currently JavaThreads disarm themselves. 
> > Tested in tiers 1 - 4. LGTM, thanks. src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 557: > 555: > 556: void MacroAssembler::safepoint_poll(Label& slow_path, bool at_return, bool in_nmethod, Register tmp) { > 557: Nit: extra line not needed. ------------- Marked as reviewed by pchilanomate (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25829#pullrequestreview-2933828617 PR Review Comment: https://git.openjdk.org/jdk/pull/25829#discussion_r2151104532 From dholmes at openjdk.org Tue Jun 17 03:22:30 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 17 Jun 2025 03:22:30 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v2] In-Reply-To: References: Message-ID: <210dnKnGx2ztDlrfGgi1TTu5UmwSuaK86cLk2iTb9eg=.377f2324-a42d-437f-921d-25679704274e@github.com> On Mon, 16 Jun 2025 17:37:17 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. >> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. 
In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix formatting errors. This looks good. I'm still digesting it. Thanks src/hotspot/share/oops/instanceKlass.cpp line 2480: > 2478: void InstanceKlass::make_methods_jmethod_ids() { > 2479: MutexLocker ml(JmethodIdCreation_lock, Mutex::_no_safepoint_check_flag); > 2480: jmethodID* jmeths = methods_jmethod_ids_acquire(); Technically you don't need acquire semantics here as this value is not used to then access other data. But I see this is the only getter API available. ------------- PR Review: https://git.openjdk.org/jdk/pull/25267#pullrequestreview-2933991535 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2151221096 From dholmes at openjdk.org Tue Jun 17 03:22:30 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 17 Jun 2025 03:22:30 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v2] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 15:51:59 GMT, Coleen Phillimore wrote: >> src/hotspot/share/oops/jmethodIDTable.cpp line 98: >> >>> 96: >>> 97: static JmethodEntry* get_jmethod_entry(jmethodID mid) { >>> 98: assert(mid != nullptr, "JNI method id should not be null"); >> >> Perhaps: s/null/nullptr/ >> I can't remember if assert failure text output is okay to be `null`. > > I think the rules are comments and strings say `null` and code is `nullptr`. Yes, that is the general rule. We can talk about null-ness as a concept, e.g. "x must not be null", whereas `nullptr` is a C++ artifact used to check null-ness. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2151225664 From dholmes at openjdk.org Tue Jun 17 03:22:31 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 17 Jun 2025 03:22:31 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v2] In-Reply-To: References: Message-ID: <7xxqpdg2SI6cJkzx8BJArFT9T7aHT1yR1tqskFP0dc8=.998f269d-1fc6-4ea8-806f-78efcb3c2d73@github.com> On Mon, 16 Jun 2025 23:19:32 GMT, Daniel D. Daugherty wrote: >> Okay, I added one when we increment the jmethod_id_counter. >> >> // Update jmethodID global counter. >> _jmethodID_counter++; >> guarantee(_jmethodID_counter != 0, "must never go back to zero"); >> >> I think this will detect wraparound. > > Thanks! Given it will take 584 years to wrap-around at a generation rate of 1 per nanosecond, I think we can just use an assertion here, as the only way this could fire is if we initialize the counter incorrectly. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2151233626 From dholmes at openjdk.org Tue Jun 17 03:33:34 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 17 Jun 2025 03:33:34 GMT Subject: RFR: 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction [v2] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 12:04:12 GMT, Kim Barrett wrote: >> Please review this change to the HotSpot Style Guide to add discussion of how >> we prefer to handle initialization and destruction of non-local variables. >> >> I propose this is an editorial change, as it just documents current practice >> rather than suggesting a change to current practice. As such, the normal >> HotSpot PR process applies. >> >> The updated .html file was generated using make update-build-docs. 
> > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > better terminology, merge separate sections doc/hotspot-style.md line 778: > 776: [C++14 3.6.2](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf)). > 777: should be avoided, unless an implementation is permitted to perform the > 778: initialization as a static initialization. The order in which dynamic What does "an implementation" refer to here? The C++ compiler? If so how could we permit this unless all supported "implementations" are guaranteed to permit it? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25812#discussion_r2151248614 From amitkumar at openjdk.org Tue Jun 17 03:46:33 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Tue, 17 Jun 2025 03:46:33 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: <-0Y5IM8MHOmPZpTHRGKK5hnBLA5TyRV871YLJ1XnSAI=.027d1976-faf5-40f7-a7d0-fa05d6b986b4@github.com> On Mon, 16 Jun 2025 15:04:23 GMT, Amit Kumar wrote: >> Just FYI, s390 build is broken with this change: >> >> >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (/home/amit/jdk/src/hotspot/share/gc/shared/barrierSetNMethod.cpp:196), pid=1779086, tid=1779117 >> # assert(!nm->is_osr_method() || may_enter) failed: OSR nmethods should always be entrant after migration >> # >> # JRE version: OpenJDK Runtime Environment (26.0) (fastdebug build 26-internal-adhoc.amit.jdk) >> # Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-adhoc.amit.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-s390x) >> # Problematic frame: >> # V [libjvm.so+0x40b196] BarrierSetNMethod::nmethod_stub_entry_barrier(unsigned char**)+0x15e >> # >> # Core dump will be written. 
Default location: Core dumps may be processed with "/lib/systemd/systemd-coredump %P %u %g %s %t 9223372036854775808 %h %d" (or dumping to /home/amit/jdk/make/core.1779086) >> # >> # If you would like to submit a bug report, please visit: >> # https://bugreport.java.com/bugreport/crash.jsp >> # >> >> >> stack trace: >> >> Stack: [0x000003ff9e580000,0x000003ff9e680000], sp=0x000003ff9e67b068, free space=1004k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x40b196] BarrierSetNMethod::nmethod_stub_entry_barrier(unsigned char**)+0x15e (barrierSetNMethod.cpp:196) >> v ~StubRoutines::method_entry_barrier 0x000003ff9050cd18 >> J 282% c2 sun.nio.fs.UnixPath.initOffsets()V java.base (189 bytes) @ 0x000003ff90c4f0c8 [0x000003ff90c4f080+0x0000000000000048] >> j sun.nio.fs.UnixPath.getFileName()Lsun/nio/fs/UnixPath;+1 java.base >> j sun.nio.fs.UnixFileSystemProvider.isHidden(Ljava/nio/file/Path;)Z+6 java.base >> j java.nio.file.Files.isHidden(Ljava/nio/file/Path;)Z+5 java.base >> j jdk.internal.module.ModulePath.isHidden(Ljava/nio/file/Path;)Z+1 java.base >> j jdk.internal.module.ModulePath.lambda$explodedPackages$0(Ljava/nio/file/Path;Ljava/nio/file/attribute/BasicFileAttributes;)Z+11 java.base >> j jdk.internal.module.ModulePath$$Lambda+0x00000000a105cbe0.test(Ljava/lang/Object;Ljava/lang/Object;)Z+12 java.base >> j java.nio.file.Files.lambda$find$0(Ljava/util/function/BiPredicate;Ljava/nio/file/FileTreeWalker$Event;)Z+9 java.base >> j java.nio.file.Files$$Lambda+0x00000000a10646c0.test(Ljava/lang/Object;)Z+8 java.base >> .... > >> @offamitkumar: The problem is probably the initialization to -1: [`z_cfi(Z_R0_scratch, /* to be patched */ -1);`.](https://github.com/openjdk/jdk/blob/9d060574e5dbd13e634f00d749d0108ceff1fae8/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp#L183) Should be 0. > > Thank you Martin for the suggestion. > > @dean-long would you please add this diff, fixing s390x build. 
I ran tier1 test with fastdebug, test are clean; > > > diff --git a/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp b/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp > index e78906708af..2d663061aec 100644 > --- a/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp > +++ b/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp > @@ -180,7 +180,7 @@ void BarrierSetAssembler::nmethod_entry_barrier(MacroAssembler* masm) { > __ z_lg(Z_R0_scratch, in_bytes(bs_nm->thread_disarmed_guard_value_offset()), Z_thread); // 6 bytes > > // Compare to current patched value: > - __ z_cfi(Z_R0_scratch, /* to be patched */ -1); // 6 bytes (2 + 4 byte imm val) > + __ z_cfi(Z_R0_scratch, /* to be patched */ 0); // 6 bytes (2 + 4 byte imm val) > > // Conditional Jump > __ z_larl(Z_R14, (Assembler::instr_len((unsigned long)LARL_ZOPC) + Assembler::instr_len((unsigned long)BCR_ZOPC)) / 2); // 6 bytes > diff --git a/src/hotspot/cpu/s390/stubGenerator_s390.cpp b/src/hotspot/cpu/s390/stubGenerator_s390.cpp > index d3f6540a3ea..bb1d9ce6037 100644 > --- a/src/hotspot/cpu/s390/stubGenerator_s390.cpp > +++ b/src/hotspot/cpu/s390/stubGenerator_s390.cpp > @@ -3197,7 +3197,7 @@ class StubGenerator: public StubCodeGenerator { > > // VM-Call: BarrierSetNMethod::nmethod_stub_entry_barrier(address* return_address_ptr) > __ call_VM_leaf(CAST_FROM_FN_PTR(address, BarrierSetNMethod::nmethod_stub_entry_barrier)); > - __ z_ltr(Z_R0_scratch, Z_RET); > + __ z_ltr(Z_RET, Z_RET); > > // VM-Call Epilogue > __ restore_volatile_regs(Z_SP, frame::z_abi_160_size, true, false); > Thanks @offamitkumar. Could you explain the `__ z_ltr(Z_R0_scratch, Z_RET);` change, for my curiosity? Thanks. `ltr` instruction stands for "load and test" (32 bit). Initially we were loading the value from `Z_RET` to `Z_R0_scratch` and then it will be compared against 0. But in this case there is no requirement of loading the value in Z_R0, as it's not being used further. 
So we can load the value back into `Z_RET` itself and then compare it against 0. There is nothing wrong with the previous solution; it just kills Z_R0 for nothing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-2978824615 From dholmes at openjdk.org Tue Jun 17 05:23:31 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 17 Jun 2025 05:23:31 GMT Subject: RFR: 8356556: AArch64: No need for acquire fence in safepoint poll in FFM In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 13:03:11 GMT, Anton Artemov wrote: > Hi, please consider the following changes: > > On AArch64, the `acquire` bool argument of the `safepoint_poll()` method is removed, as the load-acquire `ldar` instruction in the `safepoint_poll()` implementation is not needed completely. It was used for observing the disarmed value when VMThread was used to disarm the Java threads, but currently JavaThreads disarm themselves. > > Tested in tiers 1 - 4. Overall looks good, but one nit. Thanks src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 558: > 556: void MacroAssembler::safepoint_poll(Label& slow_path, bool at_return, bool in_nmethod, Register tmp) { > 557: > 558: // No need for acquire fence as java threads disarm themselves, ldar not needed. The comment explains the difference between the old and new code, but loses that context outside of the PR. I don't think you need to say anything here. ------------- Changes requested by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25829#pullrequestreview-2934153072 PR Review Comment: https://git.openjdk.org/jdk/pull/25829#discussion_r2151333073 From aboldtch at openjdk.org Tue Jun 17 05:28:29 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 17 Jun 2025 05:28:29 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v2] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 17:37:17 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. >> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix formatting errors. 
src/hotspot/share/oops/jmethodIDTable.cpp line 126: > 124: static bool needs_resize(Thread* current) { > 125: return ((_jmethodID_counter > (_resize_load_trigger * table_size(current))) && > 126: !_jmethod_id_table->is_max_size_reached()); Should we not just have a separate jmethodID entry count variable we use here instead, that is incremented and decremented on insert and remove. Rather than using the next jmethodID counter which just grows monotonically regardless of any removals. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2151348302 From dholmes at openjdk.org Tue Jun 17 05:36:31 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 17 Jun 2025 05:36:31 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v8] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: <5Mj_AplGbvIGk1IrSaSCIrtCYPuhVshy3-VE_lFJ_Jw=.31599466-59b4-4203-903b-0c0788738572@github.com> On Tue, 10 Jun 2025 18:34:00 GMT, Ioi Lam wrote: >> This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: >> >> Excerpt from the test case ExceptionsTest.java. 
>> >> >> [0.038s][info][exceptions ] Exception >> [ ] thrown in interpreter method <{method} {0x000074c408400810} 'baz2' '()V' in 'ExceptionsTest$InternalClass'> >> [ ] at bci 9 for thread 0x000074c46402c7b0 (main) >> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.baz2(ExceptionsTest.java:142) >> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar2(ExceptionsTest.java:135) >> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo2(ExceptionsTest.java:127) >> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:110) >> [0.038s][info][exceptions ] Exception >> [ ] thrown in interpreter method <{method} {0x000074c408400748} 'bar2' '()V' in 'ExceptionsTest$InternalClass'> >> [ ] at bci 0 for thread 0x000074c46402c7b0 (main) >> [0.038s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar2" at BCI: 6 >> [0.038s][info][exceptions ] Exception >> [ ] thrown in interpreter method <{method} {0x000074c408400748} 'bar2' '()V' in 'ExceptionsTest$InternalClass'> >> [ ] at bci 8 for thread 0x000074c46402c7b0 (main) >> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar2(ExceptionsTest.java:137) >> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo2(ExceptionsTest.java:127) >> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:110) >> [0.038s][info][exceptions ] Exception >> [ ] thrown in interpreter method <{method} {0x000074c408400670} 'foo2' '()V' in 'ExceptionsTest$InternalClass'> >> [ ] at bci 0 for thread 0x000074c46402c7b0 (m... > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @dholmes-ora comments -- removed printing of output.getStdout() from test I think this is looking "good enough" - just one query on clearing the cache below. 
Thanks src/hotspot/share/gc/serial/serialHeap.cpp line 549: > 547: > 548: // Whenever a GC happens, clear the exception logging cache to avoid stale oop pointers. > 549: Exceptions::clear_logging_cache(); Shouldn't we do this prior to the GC to be extra-safe? ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25522#pullrequestreview-2934183757 PR Review Comment: https://git.openjdk.org/jdk/pull/25522#discussion_r2151353311 From iklam at openjdk.org Tue Jun 17 06:02:30 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 17 Jun 2025 06:02:30 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v8] In-Reply-To: <5Mj_AplGbvIGk1IrSaSCIrtCYPuhVshy3-VE_lFJ_Jw=.31599466-59b4-4203-903b-0c0788738572@github.com> References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> <5Mj_AplGbvIGk1IrSaSCIrtCYPuhVshy3-VE_lFJ_Jw=.31599466-59b4-4203-903b-0c0788738572@github.com> Message-ID: <_JRyKX2WCk9MAPHBjETHZ9i3arcUil9GhYOMgSQrLAU=.1ac7e732-f91d-49b1-ab07-836bc8a73a1f@github.com> On Tue, 17 Jun 2025 05:29:14 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> @dholmes-ora comments -- removed printing of output.getStdout() from test > > src/hotspot/share/gc/serial/serialHeap.cpp line 549: > >> 547: >> 548: // Whenever a GC happens, clear the exception logging cache to avoid stale oop pointers. >> 549: Exceptions::clear_logging_cache(); > > Shouldn't we do this prior to the GC to be extra-safe? I am not sure where I can catch the beginning of every GC. I added the code in this function as it's called whenever the GC is going to trace or relocate all global roots, so it seems to be a good place to clear the cache (which is kind of like a global root). 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25522#discussion_r2151386568 From dbriemann at openjdk.org Tue Jun 17 06:42:03 2025 From: dbriemann at openjdk.org (David Briemann) Date: Tue, 17 Jun 2025 06:42:03 GMT Subject: RFR: 8354650: [PPC64] Try to reduce register definitions Message-ID: Defines the 32 vector registers and removes the 64 vector scalar register definitions. ------------- Commit messages: - 8354650: [PPC64] Try to reduce register definitions Changes: https://git.openjdk.org/jdk/pull/25828/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25828&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8354650 Stats: 608 lines in 7 files changed: 1 ins; 194 del; 413 mod Patch: https://git.openjdk.org/jdk/pull/25828.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25828/head:pull/25828 PR: https://git.openjdk.org/jdk/pull/25828 From dbriemann at openjdk.org Tue Jun 17 06:56:06 2025 From: dbriemann at openjdk.org (David Briemann) Date: Tue, 17 Jun 2025 06:56:06 GMT Subject: RFR: 8354650: [PPC64] Try to reduce register definitions [v2] In-Reply-To: References: Message-ID: > Defines the 32 vector registers and removes the 64 vector scalar register definitions. 
David Briemann has updated the pull request incrementally with one additional commit since the last revision: trigger tests ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25828/files - new: https://git.openjdk.org/jdk/pull/25828/files/0762e1ea..a75f0289 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25828&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25828&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25828.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25828/head:pull/25828 PR: https://git.openjdk.org/jdk/pull/25828 From dholmes at openjdk.org Tue Jun 17 07:00:31 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 17 Jun 2025 07:00:31 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: On Tue, 10 Jun 2025 03:54:31 GMT, Kim Barrett wrote: > The JBS issue also talks about `copysignA` and suggests we should just use `copysign` if we're keeping `scalbnA`. Please either address that here or file a new issue for `copysignA`. It doesn't really suggest that; it simply asks "is there a reason to prefer copysignA over the copysign?". I don't have an answer to that any more than I can answer the scalbnA versus scalbn question. You need a libmath expert to answer those types of questions. All I have tried to do here is address the UB that was spotted. 
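For readers following the thread, the UB being discussed is signed integer overflow in an `int` addition such as `k + n`. The snippet below is only a minimal sketch of the general technique (do the addition in `unsigned`, where wraparound is well defined), not the fdlibm or HotSpot code. The helper names `wrapping_add` and `add_would_overflow` are hypothetical and do not appear in the PR.

```cpp
#include <climits>

// Hypothetical helper (not from the PR): add two ints without UB by doing
// the arithmetic in unsigned, where overflow wraps modulo 2^N.
inline unsigned wrapping_add(int k, int n) {
  return static_cast<unsigned>(k) + static_cast<unsigned>(n);
}

// Hypothetical helper (not from the PR): true if the mathematical sum k + n
// does not fit in an int. The checks are arranged so that no intermediate
// expression can itself overflow.
inline bool add_would_overflow(int k, int n) {
  return (n > 0 && k > INT_MAX - n) || (n < 0 && k < INT_MIN - n);
}
```

With helpers like these, a caller can decide how to treat a wrapped result before converting anything back to `int`, which is roughly the concern the review comments in this thread go back and forth on.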
------------- PR Comment: https://git.openjdk.org/jdk/pull/25656#issuecomment-2979166886 From dholmes at openjdk.org Tue Jun 17 07:19:29 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 17 Jun 2025 07:19:29 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: <2ddvhqT5lC6N6SZ80TDN1HDfHjcrFYL7aLInjMZw9Po=.2c3c7b91-ed39-4420-9b90-33580a5d9a51@github.com> References: <2ddvhqT5lC6N6SZ80TDN1HDfHjcrFYL7aLInjMZw9Po=.2c3c7b91-ed39-4420-9b90-33580a5d9a51@github.com> Message-ID: On Tue, 10 Jun 2025 02:34:04 GMT, Kim Barrett wrote: >> This fixes address a problem with signed integer overflow in the C fdlibm scalbnA function. >> >> Testing this code is extremely difficult. First, the only time this code will get executed is if intrinsics have been disabled by `-XX:-InlineIntrinsics`. Second, finding the math routines and the arguments thereto which actually reach this function is also difficult. I have found 3 tests only that hit the `scalbnA` function at the point where the potential overflow occurs, but beyond that I cannot determine what arguments will cause the different code paths to be taken. Consequently the only testing I could do here was to make a copy of the original `scalbnA` function and then place a check in the callers that the old and new code produced the same result. Again how much coverage this actually gave is not known. That test code still remains in the PR as the initial commit. 
>> >> Due to the testing problem this test relies on detailed code inspection and analysis, so here are the changes and the reasoning for them: >> >> // Convert to unsigned to avoid signed integer overflow >> [1] unsigned u_k = ((unsigned) k) + n; >> >> [2] if (u_k > 0x7fe && u_k <= 0x7fffffff) return hugeX*copysignA(hugeX,x); /* overflow */ >> [3] if (u_k > 0 && u_k <= 0x7fe) { /* normal result */ >> [4] set_high(&x, (hx&0x800fffff)|((k+n)<<20)); >> return x; >> } >> >> [5] if (u_k <= (unsigned)-54) { >> if (n > 50000) /* in case integer overflow in n+k */ >> return hugeX*copysignA(hugeX,x); /*overflow*/ >> else return tiny*copysignA(tiny,x); /*underflow*/ >> } >> [6] k = u_k + 54; /* subnormal result */ >> set_high(&x, (hx&0x800fffff)|(k<<20)); >> return x*twom54; >> >> >> [1] We use an unsigned variable, `u_k`, for the potentially overflowing addition >> >> [2] We check the value of `u_k` adjusting the bounds to emulate a signed-int range >> >> [3] Again we check `u_k` and adjust the range >> >> [4] We know `k+n` is in range so we use that directly. I didn't use `u_k` here because I didn't want to have to reason about whether the use of an unsigned type would change anything in the expression >> >> [5] We check if `u_k` is logically less than what -54 would be >> >> [6] We bring `u_k` back into positive range by adding 54 and then store safely into `k` >> >> Thanks. > > src/hotspot/share/runtime/sharedRuntimeMath.hpp line 118: > >> 116: unsigned u_k = ((unsigned) k) + n; >> 117: >> 118: if (u_k > 0x7fe && u_k <= 0x7fffffff) return hugeX*copysignA(hugeX,x); /* overflow */ > > I think `(unsigned)INT_MAX` would be more explicit about what's going on. > This is also starting to push my limits on sufficiently simple to be a one-line `if`, and even more-so with my > suggested change. > I note that this isn't distinguishing between (1) `n > 0` and `k + n` overflows and wraps around to negative > `int` vs (2) `n < 0` and `k + n` is negative. 
And that makes later code (both pre-existing and changed) > harder to understand. I _think_ better here would be `u_k > 0x7fe && n > 0` => overflow, with some later > adjustments. Then, if the test fails and we're not huge, `k = (int)u_k;` and use `k` as before, dropping > `u_k`, so discarding the remainder of the currently proposed changes. I'm not seeing the full suggestion here. In the original code this line: if (k > 0x7fe) return hugeX*copysignA(hugeX,x); /* overflow */ is defining a logical overflow, not the wrapping overflow that we are trying to deal with. The wrapping overflow results in a negative value, which is the third case that gets handled. So AFAICS we need to use `u_k` all the way through until the end. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2151503322 From dholmes at openjdk.org Tue Jun 17 07:19:30 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 17 Jun 2025 07:19:30 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: <2ddvhqT5lC6N6SZ80TDN1HDfHjcrFYL7aLInjMZw9Po=.2c3c7b91-ed39-4420-9b90-33580a5d9a51@github.com> Message-ID: On Tue, 10 Jun 2025 06:30:15 GMT, Kim Barrett wrote: >> FYI, as an alternative, there is a Java-only implementation of scalb (and supporting functionality) in java.lang.Math that could be ported to C as another way to avoid this issue. > > `java.lang.Math.scalb()` doesn't seem useful here. It just transforms the > scale factor into a double power of 2 and then multiplies. It's not clear that > would result in exactly the same result for all arguments as this. And this is > (mostly) avoiding doing a double multiply for (perceived) performance reasons. > (For all I know, the complexity here could swamp the cost of a double > multiply.) Being certain of keeping the same results (on edge cases) is the > only reason to stay with this (but fixed to remove UB), rather than just > switching to using the C/C++ library `scalbn`. 
(And I don't know that > there *is* any potential difference between `scalbnA` and `scalbn` that would > actually matter. That would require more analysis than I've had time to do.) I don't understand your comment here Kim but I am certainly not going to mess with the existing algorithm. All I am trying to do is avoid the case where a direct `k+n` would wrap to a negative value. I have no idea what the possible range of values for `n` might be or whether we have to account for underflow as well, but that seems a distinct issue to dealing with the potential UB overflow. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2151509071 From duke at openjdk.org Tue Jun 17 07:31:45 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 17 Jun 2025 07:31:45 GMT Subject: RFR: 8356556: AArch64: No need for acquire fence in safepoint poll in FFM [v2] In-Reply-To: References: Message-ID: <5MQqhj3QeTG7mYCcqqz_XgRa7jVmlw4xtHSgE5Umqzo=.0326bd36-543b-4c70-830b-95275ad7949a@github.com> > Hi, please consider the following changes: > > On AArch64, the `acquire` bool argument of the `safepoint_poll()` method is removed, as the load-acquire `ldar` instruction in the `safepoint_poll()` implementation is not needed completely. It was used for observing the disarmed value when VMThread was used to disarm the Java threads, but currently JavaThreads disarm themselves. > > Tested in tiers 1 - 4. Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8356556: Addressed reviewers' comments. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25829/files - new: https://git.openjdk.org/jdk/pull/25829/files/7a639980..6120a1e3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25829&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25829&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25829.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25829/head:pull/25829 PR: https://git.openjdk.org/jdk/pull/25829 From duke at openjdk.org Tue Jun 17 07:31:45 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 17 Jun 2025 07:31:45 GMT Subject: RFR: 8356556: AArch64: No need for acquire fence in safepoint poll in FFM [v2] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 00:29:59 GMT, Patricio Chilano Mateo wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8356556: Addressed reviewers' comments. > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 557: > >> 555: >> 556: void MacroAssembler::safepoint_poll(Label& slow_path, bool at_return, bool in_nmethod, Register tmp) { >> 557: > > Nit: extra line not needed. Thanks, addressed in the latest commit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25829#discussion_r2151528923 From duke at openjdk.org Tue Jun 17 07:31:45 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 17 Jun 2025 07:31:45 GMT Subject: RFR: 8356556: AArch64: No need for acquire fence in safepoint poll in FFM [v2] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 05:10:30 GMT, David Holmes wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8356556: Addressed reviewers' comments. 
> > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 558: > >> 556: void MacroAssembler::safepoint_poll(Label& slow_path, bool at_return, bool in_nmethod, Register tmp) { >> 557: >> 558: // No need for acquire fence as java threads disarm themselves, ldar not needed. > > The comment explains the difference between the old and new code, but loses that context outside of the PR. I don't think you need to say anything here. Agree, addressed in the latest commit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25829#discussion_r2151528412 From dbriemann at openjdk.org Tue Jun 17 08:29:49 2025 From: dbriemann at openjdk.org (David Briemann) Date: Tue, 17 Jun 2025 08:29:49 GMT Subject: RFR: 8354650: [PPC64] Try to reduce register definitions [v3] In-Reply-To: References: Message-ID: > Defines the 32 vector registers and removes the 64 vector scalar register definitions. Reduces Mach registers from 398 to 270. David Briemann has updated the pull request incrementally with one additional commit since the last revision: beautify code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25828/files - new: https://git.openjdk.org/jdk/pull/25828/files/a75f0289..d29e49fc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25828&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25828&range=01-02 Stats: 17 lines in 1 file changed: 0 ins; 0 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/25828.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25828/head:pull/25828 PR: https://git.openjdk.org/jdk/pull/25828 From mdoerr at openjdk.org Tue Jun 17 08:29:50 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 17 Jun 2025 08:29:50 GMT Subject: RFR: 8354650: [PPC64] Try to reduce register definitions [v3] In-Reply-To: References: Message-ID: <0KgCOBpMABa6KaklbtKYj1kG1Gu-8ng_5NbiokkvSG8=.fdffef8b-9d19-4ff4-a7a0-15e92474af8c@github.com> On Tue, 17 Jun 2025 08:26:32 GMT, David Briemann wrote: >> Defines the 32 vector 
registers and removes the 64 vector scalar register definitions. Reduces Mach registers from 398 to 270. > > David Briemann has updated the pull request incrementally with one additional commit since the last revision: > > beautify code LGTM. Thanks for improving this! ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25828#pullrequestreview-2934656765 From mhaessig at openjdk.org Tue Jun 17 08:40:38 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Tue, 17 Jun 2025 08:40:38 GMT Subject: RFR: 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 14:50:46 GMT, Benoît Maillard wrote: > This PR adds a range constraint for the `-XX:FastAllocateSizeLimit` debug flag. This prevents undefined behavior caused by left-shift overflow of the flag value in `GraphKit::new_array`. > > ### Testing > - [x] [GitHub Actions](https://github.com/benoitmaillard/jdk/actions?query=branch%3AJDK-8356865) > - [x] tier1-3, plus some internal testing > - [x] Manual testing with values known to previously cause undefined behavior > > Thanks! Thank you for working on this, @benoitmaillard. This looks good to me. src/hotspot/share/runtime/globals.hpp line 1100: > 1098: /* Note: This value is zero mod 1<<13 for a cheap sparc set. */ \ > 1099: "Inline allocations larger than this in doublewords must go slow")\ > 1100: range(0, (1 << (BitsPerInt - LogBytesPerLong - 1)) - 1) \ It would be good to add a comment as to why this specific upper limit is necessary. ------------- Marked as reviewed by mhaessig (Author). 
PR Review: https://git.openjdk.org/jdk/pull/25834#pullrequestreview-2934700344 PR Review Comment: https://git.openjdk.org/jdk/pull/25834#discussion_r2151674726 From kbarrett at openjdk.org Tue Jun 17 08:40:38 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 17 Jun 2025 08:40:38 GMT Subject: RFR: 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction [v2] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 03:29:58 GMT, David Holmes wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> better terminology, merge separate sections > > doc/hotspot-style.md line 778: > >> 776: [C++14 3.6.2](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf)). >> 777: should be avoided, unless an implementation is permitted to perform the >> 778: initialization as a static initialization. The order in which dynamic > > What does "an implementation" refer to here? The C++ compiler? If so how could we permit this unless all supported "implementations" are guaranteed to permit it? Yes, "an implementation" is the C++ compiler. The restrictions on static initialization are significantly more stringent than we want to require. The restrictions on dynamic initialization => static initialization seem like pretty much what we would want, e.g. (roughly) doesn't affect other initializations by assignments to global variables, and doesn't depend on other initializations that are not required to be ordered before it. So even if the implementation doesn't perform the initialization statically, things should be okay. I won't guarantee there's no possibility for problems, as there may be edge cases or implementation bugs, but this seems pretty safe to me. I think the most likely way to fail is for us to not recognize an unordered dependency in our code. Keeping dynamic initializations fairly simple is probably the best way to avoid that. 
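As an illustration of the distinction discussed above, here is a minimal sketch, assuming C++14 rules, of a namespace-scope variable that is constant-initialized next to one that is formally dynamically initialized but simple enough that an implementation may perform the initialization statically. The names `Limits`, `g_limits`, `initial_capacity`, and `g_capacity` are hypothetical and are not taken from HotSpot or from the style guide.

```cpp
#include <cstddef>

// Hypothetical literal type: the constexpr constructor makes constant
// initialization possible when the argument is a constant expression.
struct Limits {
  std::size_t max_entries;
  constexpr explicit Limits(std::size_t n) : max_entries(n) {}
};

// Constant (static) initialization: no code needs to run at startup and
// there is no ordering concern relative to other translation units.
static Limits g_limits{1024};

// Not constexpr, so the initialization below is formally dynamic.
inline std::size_t initial_capacity() { return 64; }

// Dynamic initialization that neither writes other namespace-scope objects
// nor reads ones lacking an ordering guarantee. An implementation is
// permitted, but not required, to perform it statically.
static std::size_t g_capacity = initial_capacity();
```

The second form is the kind of simple dynamic initialization the comment above argues is fairly safe. An initializer that read another translation unit's dynamically initialized global would not be, since the relative order is unspecified.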
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25812#discussion_r2151674391 From kbarrett at openjdk.org Tue Jun 17 08:42:32 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 17 Jun 2025 08:42:32 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v3] In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: <4KgBWPrV6RvvK4qHcRCaZC3g_awRrHY9Sa0yi-h0AiU=.4c477678-c1e1-4e3e-89fd-77b0e4b7081a@github.com> On Tue, 3 Jun 2025 11:01:06 GMT, Kim Barrett wrote: >> Please review this change to permit the use of `noexcept` under certain >> circumstances in HotSpot code. >> >> http://wg21.link/n3050 >> >> Testing: >> >> JDK-8316930 (HotSpot should use noexcept instead of throw()) showed what the >> conversion would look like. It will need to be brought up to current mainline, >> possibly with modifications. >> >> This is a modification of the Style Guide, so rough consensus among the >> HotSpot Group members is required to make this change. Only Group members >> should vote for approval (via the github PR), though reasoned objections or >> comments from anyone will be considered. A decision on this proposal will not >> be made before Friday 16-June-2025 at 12h00 UTC. >> >> Since we're piggybacking on github PRs here, please use the PR review process >> to approve (click on Review Changes > Approve), rather than sending a "vote: >> yes" email reply that would be normal for a CFV. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > more dholmes Not many votes, but it's kind of a niche topic. @vnkozlov ? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25574#issuecomment-2979467123 From duke at openjdk.org Tue Jun 17 09:18:49 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 17 Jun 2025 09:18:49 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v11] In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 10:14:22 GMT, Anton Artemov wrote: >> Hi, >> >> in this PR the output value type for functions which return memory are changed, namely: >> >> >> static julong available_memory(); --> static MemRes available_memory(); >> static julong used_memory(); --> static MemRes used_memory(); >> static julong free_memory(); --> static MemRes free_memory(); >> static jlong total_swap_space(); --> static MemRes total_swap_space(); >> static jlong free_swap_space(); --> static MemRes free_swap_space(); >> static julong physical_memory(); --> static MemRes physical_memory(); >> >> >> `MemRes` is a struct containing a pair of values, `size_t value` to carry the return value, `int error` to carry the error if any. Currently, in case of error the latter is set to -1. >> >> The changes are done so that the other parts of the code have minimal impact. >> Tested in GHA and Tiers 1-4. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8357086: Added default val to constructor of MemRes. Okay, let me summarize the findings: 1) One needs to be consistent with respect to the type of the return value, of `os::xxx()` and having `size_t` in all methods is something everyone agrees on. 2) There is a consensus that errors should be reported and handled properly, not ignored. 3) There is only one type of error. 4) The usage pattern should make it difficult for the user to ignore the error if it is reported. 
Given the above, I think the optimal solution is to have a boolean as a return type to indicate the error (true for no error, false for error), and the actual memory value as an out-parameter passed by reference or pointer. The usage pattern may be enforced by the `nodiscard` attribute, but it is available from C++17 only. For now, one can just add if/else statements, and add the attribute later after the upgrade to C++17. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2979590415 From sroy at openjdk.org Tue Jun 17 09:20:29 2025 From: sroy at openjdk.org (Suchismith Roy) Date: Tue, 17 Jun 2025 09:20:29 GMT Subject: RFR: 8354650: [PPC64] Try to reduce register definitions [v3] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 08:29:49 GMT, David Briemann wrote: >> Defines the 32 vector registers and removes the 64 vector scalar register definitions. Reduces Mach registers from 398 to 270. > > David Briemann has updated the pull request incrementally with one additional commit since the last revision: > > beautify code Marked as reviewed by sroy (Author). LGTM ------------- PR Review: https://git.openjdk.org/jdk/pull/25828#pullrequestreview-2934849073 PR Comment: https://git.openjdk.org/jdk/pull/25828#issuecomment-2979598146 From dbriemann at openjdk.org Tue Jun 17 10:01:34 2025 From: dbriemann at openjdk.org (David Briemann) Date: Tue, 17 Jun 2025 10:01:34 GMT Subject: RFR: 8354650: [PPC64] Try to reduce register definitions [v3] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 08:29:49 GMT, David Briemann wrote: >> Defines the 32 vector registers and removes the 64 vector scalar register definitions. Reduces Mach registers from 398 to 270. > > David Briemann has updated the pull request incrementally with one additional commit since the last revision: > > beautify code Thanks for the reviews! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25828#issuecomment-2979727305 From duke at openjdk.org Tue Jun 17 10:01:35 2025 From: duke at openjdk.org (duke) Date: Tue, 17 Jun 2025 10:01:35 GMT Subject: RFR: 8354650: [PPC64] Try to reduce register definitions [v3] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 08:29:49 GMT, David Briemann wrote: >> Defines the 32 vector registers and removes the 64 vector scalar register definitions. Reduces Mach registers from 398 to 270. > > David Briemann has updated the pull request incrementally with one additional commit since the last revision: > > beautify code @dbriemann Your change (at version d29e49fcf80345620c0e5c5b1a66689b4bbf32b7) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25828#issuecomment-2979732472 From dbriemann at openjdk.org Tue Jun 17 10:04:34 2025 From: dbriemann at openjdk.org (David Briemann) Date: Tue, 17 Jun 2025 10:04:34 GMT Subject: Integrated: 8354650: [PPC64] Try to reduce register definitions In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 12:35:47 GMT, David Briemann wrote: > Defines the 32 vector registers and removes the 64 vector scalar register definitions. Reduces Mach registers from 398 to 270. This pull request has now been integrated. 
Changeset: a0820828 Author: David Briemann Committer: Martin Doerr URL: https://git.openjdk.org/jdk/commit/a08208283bcfe395c9962c8de3ba19fdd8cab985 Stats: 606 lines in 7 files changed: 1 ins; 194 del; 411 mod 8354650: [PPC64] Try to reduce register definitions Reviewed-by: mdoerr, sroy ------------- PR: https://git.openjdk.org/jdk/pull/25828 From azafari at openjdk.org Tue Jun 17 10:17:06 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 17 Jun 2025 10:17:06 GMT Subject: RFR: 8337217: Port VirtualMemoryTracker to use VMATree [v41] In-Reply-To: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> References: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> Message-ID: <000AGCwbLZRbmxAGfatjpOnEFz_Ym2fMmDBaHgvV96w=.6b3c0c5c-0480-4036-8052-1b7ba8dabfd3@github.com> > - `VMATree` is used instead of `SortedLinkList` in new class `VirtualMemoryTracker`. > - A wrapper/helper `RegionTree` is made around VMATree to make some calls easier. > - `find_reserved_region()` is used in 4 cases, it will be removed in further PRs. > - All tier1 tests pass except this https://bugs.openjdk.org/browse/JDK-8335167. Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: fixes to a few failures. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/20425/files - new: https://git.openjdk.org/jdk/pull/20425/files/e303ee7c..815092d2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20425&range=40 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20425&range=39-40 Stats: 49 lines in 6 files changed: 19 ins; 13 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/20425.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20425/head:pull/20425 PR: https://git.openjdk.org/jdk/pull/20425 From duke at openjdk.org Tue Jun 17 10:19:12 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 17 Jun 2025 10:19:12 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v12] In-Reply-To: References: Message-ID: > Hi, > > in this PR the output value type for functions which return memory are changed, namely: > > > static julong available_memory(); --> static MemRes available_memory(); > static julong used_memory(); --> static MemRes used_memory(); > static julong free_memory(); --> static MemRes free_memory(); > static jlong total_swap_space(); --> static MemRes total_swap_space(); > static jlong free_swap_space(); --> static MemRes free_swap_space(); > static julong physical_memory(); --> static MemRes physical_memory(); > > > `MemRes` is a struct containing a pair of values, `size_t value` to carry the return value, `int error` to carry the error if any. Currently, in case of error the latter is set to -1. > > The changes are done so that the other parts of the code have minimal impact. > Tested in GHA and Tiers 1-4. Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 20 commits: - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs - 8357086: Added default val to constructor of MemRes. - 8357086: Addressed the reviewers comments. - 8357086: Fixed merge conflict. - 8357086: Changed returm type to struct. 
- 8357086: Return size_t from swap mem funcs, added checks. - 8357086: Added missed casts. - 8357086: Changed return type for total_swap_space and free_swap_space to ssize_t - Merge remote-tracking branch 'origin/master' into JDK-8357086_size_t_memfuncs - 8357086: Fixed spaces in formatting in gc-related code. - ... and 10 more: https://git.openjdk.org/jdk/compare/a0820828...cb762465 ------------- Changes: https://git.openjdk.org/jdk/pull/25450/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=11 Stats: 205 lines in 22 files changed: 46 ins; 2 del; 157 mod Patch: https://git.openjdk.org/jdk/pull/25450.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25450/head:pull/25450 PR: https://git.openjdk.org/jdk/pull/25450 From coleenp at openjdk.org Tue Jun 17 12:31:30 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 17 Jun 2025 12:31:30 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v2] In-Reply-To: <210dnKnGx2ztDlrfGgi1TTu5UmwSuaK86cLk2iTb9eg=.377f2324-a42d-437f-921d-25679704274e@github.com> References: <210dnKnGx2ztDlrfGgi1TTu5UmwSuaK86cLk2iTb9eg=.377f2324-a42d-437f-921d-25679704274e@github.com> Message-ID: On Tue, 17 Jun 2025 02:53:12 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix formatting errors. > > src/hotspot/share/oops/instanceKlass.cpp line 2480: > >> 2478: void InstanceKlass::make_methods_jmethod_ids() { >> 2479: MutexLocker ml(JmethodIdCreation_lock, Mutex::_no_safepoint_check_flag); >> 2480: jmethodID* jmeths = methods_jmethod_ids_acquire(); > > Technically you don't need acquire semantics here as this value is not used to then access other data. But I see this is the only getter API available. Yes, this does need an acquire getter outside the lock. It's better to not have a non-acquire version to be possibly used by accident. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2152140447 From coleenp at openjdk.org Tue Jun 17 12:31:31 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 17 Jun 2025 12:31:31 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v2] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 05:24:57 GMT, Axel Boldt-Christmas wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix formatting errors. > > src/hotspot/share/oops/jmethodIDTable.cpp line 126: > >> 124: static bool needs_resize(Thread* current) { >> 125: return ((_jmethodID_counter > (_resize_load_trigger * table_size(current))) && >> 126: !_jmethod_id_table->is_max_size_reached()); > > Should we not just have a separate jmethodID entry count variable we use here instead, that is incremented and decremented on insert and remove. Rather than using the next jmethodID counter which just grows monotonically regardless of any removals. If we remove a jmethodID, we need to keep the number for it in case some JVMTI code still thinks that number is valid. So we can't decrement the entry count. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2152136500 From coleenp at openjdk.org Tue Jun 17 12:49:34 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 17 Jun 2025 12:49:34 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v4] In-Reply-To: References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: On Mon, 9 Jun 2025 13:58:09 GMT, Johan Sjölen wrote: >> Hi, >> >> The constant pool currently has a lot of methods specific to extracting parts of the operands array.
What this array actually is, is a sequence of bootstrap method attribute entries, where each entry has the following components: >>
>> ```c++
>> struct BSMAE {
>>   u2 bootstrap_method_index;
>>   u2 argument_count;
>>   u2 arguments[argument_count];
>> }
>> ```
>>
>> We can remove some of these operands-array-specific methods and instead allow you to extract BSMAttributeEntrys, which you can then use to extract their piecewise components. This makes for a nicer interface, and a bit easier to come into as a reader of the code, as it more closely mirrors the JVMS. >> >> Please consider! >> >> Testing: Currently GHA, running tier1-tier3 > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Move it to public This looks good. Looking forward to future improvements in this code. Thank you! ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25298#pullrequestreview-2935521048 From dholmes at openjdk.org Tue Jun 17 12:51:29 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 17 Jun 2025 12:51:29 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v2] In-Reply-To: References: <210dnKnGx2ztDlrfGgi1TTu5UmwSuaK86cLk2iTb9eg=.377f2324-a42d-437f-921d-25679704274e@github.com> Message-ID: On Tue, 17 Jun 2025 12:29:03 GMT, Coleen Phillimore wrote: >> src/hotspot/share/oops/instanceKlass.cpp line 2480: >> >>> 2478: void InstanceKlass::make_methods_jmethod_ids() { >>> 2479: MutexLocker ml(JmethodIdCreation_lock, Mutex::_no_safepoint_check_flag); >>> 2480: jmethodID* jmeths = methods_jmethod_ids_acquire(); >> >> Technically you don't need acquire semantics here as this value is not used to then access other data. But I see this is the only getter API available. > > Yes, this does need an acquire getter outside the lock. It's better to not have a non-acquire version to be possibly used by accident. Acquire is only ever needed outside the lock.
I don't like there only being acquire/release available generally speaking because it just confuses what memory operations are being synchronized. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2152188477 From mdoerr at openjdk.org Tue Jun 17 12:52:34 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 17 Jun 2025 12:52:34 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: <-qfpN8-hyWv-QosNnOUvLaZtsI0Kr1vXsTIV6Tqvd-w=.badb50d8-3f05-41d1-bc90-d5939d6b571f@github.com> References: <-qfpN8-hyWv-QosNnOUvLaZtsI0Kr1vXsTIV6Tqvd-w=.badb50d8-3f05-41d1-bc90-d5939d6b571f@github.com> Message-ID: On Mon, 16 Jun 2025 23:35:43 GMT, Dean Long wrote: > > Seems like arm32 has the same issue: > > https://github.com/openjdk/jdk/blob/9d060574e5dbd13e634f00d749d0108ceff1fae8/src/hotspot/cpu/arm/gc/shared/barrierSetAssembler_arm.cpp#L199 > > > > The init value shouldn't have the sticky bit set. > > Thanks, I pushed a potential fix for that. Unfortunately, 0xBEAFDEAD also has the MSB set. Shouldn't we better use 0 like on all other platforms? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-2980261201 From dholmes at openjdk.org Tue Jun 17 12:54:30 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 17 Jun 2025 12:54:30 GMT Subject: RFR: 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction [v2] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 08:36:29 GMT, Kim Barrett wrote: >> doc/hotspot-style.md line 778: >> >>> 776: [C++14 3.6.2](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf)). >>> 777: should be avoided, unless an implementation is permitted to perform the >>> 778: initialization as a static initialization. The order in which dynamic >> >> What does "an implementation" refer to here? The C++ compiler? 
If so how could we permit this unless all supported "implementations" are guaranteed to permit it? > > Yes, "an implementation" is the C++ compiler. > > The restrictions on static initialization are significantly more stringent than we want to require. The restrictions on dynamic initialization => static initialization seem like pretty much what we would want, e.g. (roughly) doesn't affect other initializations by assignments to global variables, and doesn't depend on other initializations that are not required to be ordered before it. So even if the implementation doesn't perform the initialization statically, things should be okay. I won't guarantee there's no possibility for problems, as there may be edge cases or implementation bugs, but this seems pretty safe to me. I think the most likely way to fail is for us to not recognize an unordered dependency in our code. Keeping dynamic initializations fairly simple is probably the best way to avoid that. Sorry but I'm having trouble understanding how someone could evaluate whether a dynamic initialization is okay based on the current statement. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25812#discussion_r2152196378 From dholmes at openjdk.org Tue Jun 17 12:58:28 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 17 Jun 2025 12:58:28 GMT Subject: RFR: 8356556: AArch64: No need for acquire fence in safepoint poll in FFM [v2] In-Reply-To: <5MQqhj3QeTG7mYCcqqz_XgRa7jVmlw4xtHSgE5Umqzo=.0326bd36-543b-4c70-830b-95275ad7949a@github.com> References: <5MQqhj3QeTG7mYCcqqz_XgRa7jVmlw4xtHSgE5Umqzo=.0326bd36-543b-4c70-830b-95275ad7949a@github.com> Message-ID: On Tue, 17 Jun 2025 07:31:45 GMT, Anton Artemov wrote: >> Hi, please consider the following changes: >> >> On AArch64, the `acquire` bool argument of the `safepoint_poll()` method is removed, as the load-acquire `ldar` instruction in the `safepoint_poll()` implementation is not needed completely.
It was used for observing the disarmed value when VMThread was used to disarm the Java threads, but currently JavaThreads disarm themselves. >> >> Tested in tiers 1 - 4. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8356556: Addressed reviewers' comments. LGTM. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25829#pullrequestreview-2935556664 From coleenp at openjdk.org Tue Jun 17 12:59:54 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 17 Jun 2025 12:59:54 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v3] In-Reply-To: References: Message-ID: <5RiyOu1SrdvAOEmpnPYQjcvu-faSm7wiDQpq1HJKXzo=.2b66a579-3cc2-4d90-93a0-b33c14496aba@github.com> > This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. > > The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. 
In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. > > Tested with tier1-4, 5-7. Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: - I meant HandshakeState_lock. - Add comment for JmethodIdCreation_lock about why it's rank safepoint-1. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25267/files - new: https://git.openjdk.org/jdk/pull/25267/files/c2f2f42c..067046bf Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25267.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25267/head:pull/25267 PR: https://git.openjdk.org/jdk/pull/25267 From coleenp at openjdk.org Tue Jun 17 12:59:54 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 17 Jun 2025 12:59:54 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v3] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 20:33:08 GMT, Coleen Phillimore wrote: >> I can't remember. There may have been another lock held while this one was (which is why we added MUTEX_DEFL to help with that). I'll check. > > This has to be nosafepoint-1 (actually can be nosafepoint) is that it must be above the rank for the ConcurrentHashTable which is nosafepoint-2. I don't know why it was nosafepoint-2 before this though, I can't find any lock ordering that requires this. > > In general we should use the highest lock ordering within the category (no-safepoint, safepoint) possible to leave room for further locks. Found why it needed to be nosafepoint-1. The DEFL macro is good for the global locks but HandshakeState_lock is not global since there's one per handshake operation. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2152204429 From duke at openjdk.org Tue Jun 17 13:08:30 2025 From: duke at openjdk.org (duke) Date: Tue, 17 Jun 2025 13:08:30 GMT Subject: RFR: 8356556: AArch64: No need for acquire fence in safepoint poll in FFM [v2] In-Reply-To: <5MQqhj3QeTG7mYCcqqz_XgRa7jVmlw4xtHSgE5Umqzo=.0326bd36-543b-4c70-830b-95275ad7949a@github.com> References: <5MQqhj3QeTG7mYCcqqz_XgRa7jVmlw4xtHSgE5Umqzo=.0326bd36-543b-4c70-830b-95275ad7949a@github.com> Message-ID: <2cszvhSFJkwTJ9wOQtnPlGytobXtDHEGeaET4Vi3Gj4=.8e4ca394-2642-410e-9a4c-94c76e4885b5@github.com> On Tue, 17 Jun 2025 07:31:45 GMT, Anton Artemov wrote: >> Hi, please consider the following changes: >> >> On AArch64, the `acquire` bool argument of the `safepoint_poll()` method is removed, as the load-acquire `ldar` instruction in the `safepoint_poll()` implementation is not needed completely. It was used for observing the disarmed value when VMThread was used to disarm the Java threads, but currently JavaThreads disarm themselves. >> >> Tested in tiers 1 - 4. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8356556: Addressed reviewers' comments. @toxaart Your change (at version 6120a1e3178a1480b9feabaa4166335999706e94) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25829#issuecomment-2980315724 From iwalulya at openjdk.org Tue Jun 17 13:43:45 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Tue, 17 Jun 2025 13:43:45 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v2] In-Reply-To: References: Message-ID: > Hi all, > > Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and Application execution time. 
G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. > > The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio. > > - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly. > > - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range. > > - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants. > > We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. > > Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. > > As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio.
While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. > > Testing: Mach5 Tier 1-7 Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: remove unrequired changes - kim ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25832/files - new: https://git.openjdk.org/jdk/pull/25832/files/65730422..81a1e189 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25832&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25832&range=00-01 Stats: 24 lines in 2 files changed: 2 ins; 2 del; 20 mod Patch: https://git.openjdk.org/jdk/pull/25832.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25832/head:pull/25832 PR: https://git.openjdk.org/jdk/pull/25832 From coleenp at openjdk.org Tue Jun 17 13:49:35 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 17 Jun 2025 13:49:35 GMT Subject: RFR: 8356556: AArch64: No need for acquire fence in safepoint poll in FFM [v2] In-Reply-To: <5MQqhj3QeTG7mYCcqqz_XgRa7jVmlw4xtHSgE5Umqzo=.0326bd36-543b-4c70-830b-95275ad7949a@github.com> References: <5MQqhj3QeTG7mYCcqqz_XgRa7jVmlw4xtHSgE5Umqzo=.0326bd36-543b-4c70-830b-95275ad7949a@github.com> Message-ID: On Tue, 17 Jun 2025 07:31:45 GMT, Anton Artemov wrote: >> Hi, please consider the following changes: >> >> On AArch64, the `acquire` bool argument of the `safepoint_poll()` method is removed, as the load-acquire `ldar` instruction in the `safepoint_poll()` implementation is not needed completely. It was used for observing the disarmed value when VMThread was used to disarm the Java threads, but currently JavaThreads disarm themselves. >> >> Tested in tiers 1 - 4. 
> > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8356556: Addressed reviewers' comments. It also looks good. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25829#issuecomment-2980451850 From duke at openjdk.org Tue Jun 17 13:49:36 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 17 Jun 2025 13:49:36 GMT Subject: Integrated: 8356556: AArch64: No need for acquire fence in safepoint poll in FFM In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 13:03:11 GMT, Anton Artemov wrote: > Hi, please consider the following changes: > > On AArch64, the `acquire` bool argument of the `safepoint_poll()` method is removed, as the load-acquire `ldar` instruction in the `safepoint_poll()` implementation is not needed completely. It was used for observing the disarmed value when VMThread was used to disarm the Java threads, but currently JavaThreads disarm themselves. > > Tested in tiers 1 - 4. This pull request has now been integrated. Changeset: c1deb9ee Author: Anton Artemov Committer: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/c1deb9eebf1adecffe5b205486477009ec2f7348 Stats: 17 lines in 8 files changed: 0 ins; 5 del; 12 mod 8356556: AArch64: No need for acquire fence in safepoint poll in FFM Reviewed-by: dholmes, pchilanomate ------------- PR: https://git.openjdk.org/jdk/pull/25829 From coleenp at openjdk.org Tue Jun 17 13:59:54 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 17 Jun 2025 13:59:54 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v4] In-Reply-To: References: Message-ID: > This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. 
JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. > > The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. > > Tested with tier1-4, 5-7. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Use load_acquire only in the places that need it. 
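As a rough illustration of the acquire-versus-plain-load distinction discussed earlier in this thread, the sketch below uses standard C++ atomics rather than HotSpot's own `Atomic` API. The names `IdCache`, `Holder`, `cache_acquire()` and `ensure_cache()` are hypothetical stand-ins and are not taken from the patch. It assumes the usual publication pattern: a lock-free reader needs an acquire load paired with the writer's release store, while a reader that already holds the creation lock can use a relaxed load.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for a lazily created ID cache; not HotSpot code.
struct IdCache {
  std::size_t length;
  int ids[4];
};

class Holder {
  std::mutex _lock;                       // plays the role of a creation lock
  std::atomic<IdCache*> _cache{nullptr};  // pointer published to lock-free readers

 public:
  // Lock-free reader: the acquire load pairs with the release store below,
  // so the cache contents are visible once a non-null pointer is observed.
  IdCache* cache_acquire() const {
    return _cache.load(std::memory_order_acquire);
  }

  // Creates the cache at most once and returns it.
  IdCache* ensure_cache() {
    std::lock_guard<std::mutex> guard(_lock);
    // Under the lock a relaxed load is enough: the mutex already orders this
    // read against any earlier writer that held the same lock.
    IdCache* c = _cache.load(std::memory_order_relaxed);
    if (c == nullptr) {
      c = new IdCache{4, {0, 0, 0, 0}};
      // Release store publishes the fully initialized cache to lock-free readers.
      _cache.store(c, std::memory_order_release);
    }
    return c;
  }

  ~Holder() { delete _cache.load(std::memory_order_relaxed); }
};
```

Whether a relaxed getter should exist at all is the design question raised in the review: offering only the acquire variant is harder to misuse, while offering both makes it explicit which reads actually synchronize.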
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25267/files - new: https://git.openjdk.org/jdk/pull/25267/files/067046bf..41d49607 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=02-03 Stats: 23 lines in 3 files changed: 9 ins; 8 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/25267.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25267/head:pull/25267 PR: https://git.openjdk.org/jdk/pull/25267 From aboldtch at openjdk.org Tue Jun 17 13:59:54 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 17 Jun 2025 13:59:54 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v2] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 12:27:02 GMT, Coleen Phillimore wrote: >> src/hotspot/share/oops/jmethodIDTable.cpp line 126: >> >>> 124: static bool needs_resize(Thread* current) { >>> 125: return ((_jmethodID_counter > (_resize_load_trigger * table_size(current))) && >>> 126: !_jmethod_id_table->is_max_size_reached()); >> >> Should we not just have a separate jmethodID entry count variable we use here instead, that is incremented and decremented on insert and remove. Rather than using the next jmethodID counter which just grows monotonically regardless of any removals. > > If we remove a jmethodID, we need to keep the number for it in case some JVMTI code still thinks that number is valid. So we can't decrement the entry count. That is not what I was trying to propose. What I tried to describe was this:
```c++
// The value of the next jmethodID. This only increments (always unique IDs)
static uint64_t _jmethodID_counter = 0;
// Tracks the number of jmethodID entries in the _jmethod_id_table.
// Incremented on insert, decremented on remove. Used to track if we need to resize the table.
static uint64_t _jmethodID_entry_count = 0;
```
The problem with using `_jmethodID_counter` as a proxy for how many entries there are in the table is that it will diverge over time as we keep calling remove due to class unloading. Using a separate variable lets us resize based on what is actually in the table. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2152350643 From jsjolen at openjdk.org Tue Jun 17 14:04:48 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 17 Jun 2025 14:04:48 GMT Subject: RFR: 8337217: Port VirtualMemoryTracker to use VMATree [v41] In-Reply-To: <000AGCwbLZRbmxAGfatjpOnEFz_Ym2fMmDBaHgvV96w=.6b3c0c5c-0480-4036-8052-1b7ba8dabfd3@github.com> References: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> <000AGCwbLZRbmxAGfatjpOnEFz_Ym2fMmDBaHgvV96w=.6b3c0c5c-0480-4036-8052-1b7ba8dabfd3@github.com> Message-ID: On Tue, 17 Jun 2025 10:17:06 GMT, Afshin Zafari wrote: >> - `VMATree` is used instead of `SortedLinkList` in new class `VirtualMemoryTracker`. >> - A wrapper/helper `RegionTree` is made around VMATree to make some calls easier. >> - `find_reserved_region()` is used in 4 cases, it will be removed in further PRs. >> - All tier1 tests pass except this https://bugs.openjdk.org/browse/JDK-8335167. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > fixes to a few failures. LGTM! ------------- Marked as reviewed by jsjolen (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/20425#pullrequestreview-2935814156 From eastigeevich at openjdk.org Tue Jun 17 14:09:33 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 17 Jun 2025 14:09:33 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v3] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 15:30:48 GMT, Mikhail Ablakatov wrote: >> In the A64 ISA, the B (direct branch) instruction can encode a target within a ±128MB range relative to the instruction. Due to this limitation, when generating static stubs, HotSpot conservatively emits indirect branches for calls to c2i interface stubs. These indirect branches are implemented using a four-instruction sequence: three instructions to materialize the target address in a register, followed by a BR instruction to perform the jump. >> >> This patch optimizes static stub generation when the code cache is small enough to guarantee that the target entry point of the c2i interface stub lies within the direct branch range. In such cases, a single direct B instruction can be used instead of the indirect sequence, saving 3 instructions (12 bytes) per static stub. >> >> Below is an example of the optimization's impact, measured using the movie-lens benchmark from the Renaissance benchmark suite: >>
>> | Metric      | Before        | After         | Difference |
>> |-------------|---------------|---------------|------------|
>> | totalInHeap | Avg: 1883.875 | Avg: 1871.667 | -0.65%     |
>> |             | Sum: 6653848  | Sum: 6616344  | -0.56%     |
>> | stubCode    | Avg: 103.164  | Avg: 87.285   | -15.38%    |
>> |             | Sum: 364376   | Sum: 308552   | -15.33%    |
>>
>> Full jtreg passed on AArch64. > > Mikhail Ablakatov has updated the pull request incrementally with one additional commit since the last revision: > > cleanup: update a copyright notice > > Co-authored-by: Andrew Haley LGTM ------------- Marked as reviewed by eastigeevich (Committer).
PR Review: https://git.openjdk.org/jdk/pull/25702#pullrequestreview-2935834985 From mablakatov at openjdk.org Tue Jun 17 14:09:34 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Tue, 17 Jun 2025 14:09:34 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v3] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 15:30:48 GMT, Mikhail Ablakatov wrote: >> In the A64 ISA, the B (direct branch) instruction can encode a target within a ±128MB range relative to the instruction. Due to this limitation, when generating static stubs, HotSpot conservatively emits indirect branches for calls to c2i interface stubs. These indirect branches are implemented using a four-instruction sequence: three instructions to materialize the target address in a register, followed by a BR instruction to perform the jump. >> >> This patch optimizes static stub generation when the code cache is small enough to guarantee that the target entry point of the c2i interface stub lies within the direct branch range. In such cases, a single direct B instruction can be used instead of the indirect sequence, saving 3 instructions (12 bytes) per static stub. >> >> Below is an example of the optimization's impact, measured using the movie-lens benchmark from the Renaissance benchmark suite: >>
>> | Metric      | Before        | After         | Difference |
>> |-------------|---------------|---------------|------------|
>> | totalInHeap | Avg: 1883.875 | Avg: 1871.667 | -0.65%     |
>> |             | Sum: 6653848  | Sum: 6616344  | -0.56%     |
>> | stubCode    | Avg: 103.164  | Avg: 87.285   | -15.38%    |
>> |             | Sum: 364376   | Sum: 308552   | -15.33%    |
>>
>> Full jtreg passed on AArch64. > > Mikhail Ablakatov has updated the pull request incrementally with one additional commit since the last revision: > > cleanup: update a copyright notice > > Co-authored-by: Andrew Haley Hey @eastig, when you have a moment, could you take a look at this as a second reviewer? I'd appreciate your feedback!
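The range argument in the quoted description can be sketched as a small standalone check. This is only an illustration under stated assumptions: the helper names `reachable_by_direct_branch` and `code_cache_allows_direct_branches` are hypothetical and are not the functions used in the patch. It assumes the A64 `B` encoding of a signed 26-bit word offset, which gives a byte range of -128 MiB to +128 MiB - 4.

```cpp
#include <cassert>
#include <cstdint>

// Byte range of the A64 B instruction: signed 26-bit offset scaled by 4.
constexpr int64_t kBranchRange = int64_t{128} * 1024 * 1024;

// True if a B instruction located at `from` can encode a jump to `to`.
// Both addresses are assumed to be 4-byte-aligned instruction addresses.
inline bool reachable_by_direct_branch(uint64_t from, uint64_t to) {
  const int64_t offset = static_cast<int64_t>(to) - static_cast<int64_t>(from);
  return (offset % 4 == 0) && offset >= -kBranchRange && offset < kBranchRange;
}

// Conservative whole-cache test (hypothetical helper): if the code cache is
// no larger than the branch range, any two instruction addresses inside it
// are at most kBranchRange - 4 bytes apart, so a direct branch always fits.
inline bool code_cache_allows_direct_branches(uint64_t code_cache_size_in_bytes) {
  return code_cache_size_in_bytes <= static_cast<uint64_t>(kBranchRange);
}
```

The whole-cache test trades precision for simplicity: it never needs the actual stub and target addresses, which is why it only pays off when the code cache is configured to be small.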
------------- PR Comment: https://git.openjdk.org/jdk/pull/25702#issuecomment-2980263414 From duke at openjdk.org Tue Jun 17 14:32:35 2025 From: duke at openjdk.org (duke) Date: Tue, 17 Jun 2025 14:32:35 GMT Subject: Withdrawn: 8355013: GrowableArray default constructor should not allocate In-Reply-To: References: Message-ID: <5CJ-KT3LZZhjt8VNhVCcx9YaL2Rx6QZk2M3xqD9eOTQ=.91883aa7-50b0-4339-83bf-7a618f1bbcf4@github.com> On Fri, 18 Apr 2025 05:35:05 GMT, Quan Anh Mai wrote: > Hi, > > This patch changes the default constructors of `GrowableArray` so that it does not allocate. This is helpful because sometimes we create a `GrowableArray` and append another into it immediately, or create a `GrowableArray` to merge the value from several branches. In these cases, the default allocation is not needed. This also aligns the behaviour with that of `std::vector`, which does not allocate for default construction. > > Please take a look and leave your reviews, thanks a lot. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/24748 From shade at openjdk.org Tue Jun 17 15:52:29 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 17 Jun 2025 15:52:29 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 03:30:40 GMT, Ioi Lam wrote: > Background: when writing the string table in the AOT cache, we do this: > > 1. Find out the number of strings in the interned string table > 2. Allocate Java object arrays that are large enough to store these strings. These arrays are used by `StringTable::lookup_shared()` in the production run. > 3. Enter safepoint > 4. Copy the strings into the arrays > > This bug happened because: > > - Step 1 is not thread safe, so it may be reading a stale version of `_items_count` > - JIT compiler threads may create more interned strings after step 1 > > This PR attempts to fix both issues. 
Actually, I need https://github.com/openjdk/jdk/pull/25409 in mainline first :) Would be nice to avoid divergence between JDK 25, JDK 26 and Leyden/premain. Anyhow, I now think this fix is incomplete. In premain, we use this `wait_for_no_active_tasks`, because we _know_ all the compiler tasks were queued, and we just need to run them down. But here, we still have a race: compiler threads may finish current batch of compilations, `wait_for_no_active_tasks` would return, and _then_ we can start compiling again. I think we need to figure out when we dump the shared table. Maybe even shutdown the compiler right before going into CDS dump? See how `CompileBroker::set_should_block` and `VM_Exit::wait_for_threads_in_native_to_block` do it. It would still be pretty awkward for a fix. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25816#issuecomment-2977450172 PR Comment: https://git.openjdk.org/jdk/pull/25816#issuecomment-2980907758 From iklam at openjdk.org Tue Jun 17 15:52:30 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 17 Jun 2025 15:52:30 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 17:29:24 GMT, Aleksey Shipilev wrote: > Actually, I need https://github.com/openjdk/jdk/pull/25409 in mainline first :) I can wait for that for JDK 26. If I want to fix for JDK 25, will the current fix be good enough? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25816#issuecomment-2978126273 From iklam at openjdk.org Tue Jun 17 16:24:30 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 17 Jun 2025 16:24:30 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 15:49:38 GMT, Aleksey Shipilev wrote: > Would be nice to avoid divergence between JDK 25, JDK 26 and Leyden/premain. > > Anyhow, I now think this fix is incomplete.
In premain, we use this `wait_for_no_active_tasks`, because we _know_ all the compiler tasks were queued, and we just need to run them down. But here, we still have a race: compiler threads may finish current batch of compilations, `wait_for_no_active_tasks` would return, and _then_ we can start compiling again. No Java code is executing at this point (we are in the only thread that can run Java code). Is there still a possibility for new compile tasks to be added? > I think we need to figure out when we dump the shared table. Maybe even shutdown the compiler right before going into CDS dump? See how `CompileBroker::set_should_block` and `VM_Exit::wait_for_threads_in_native_to_block` do it. It would still be pretty awkward for a fix. In Leyden, we run the AOT compiler after the CDS dumping has finished, so if we shut down the compiler we would have to restart it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25816#issuecomment-2981014094 From matsaave at openjdk.org Tue Jun 17 16:32:33 2025 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 17 Jun 2025 16:32:33 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v4] In-Reply-To: References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: On Mon, 9 Jun 2025 13:58:09 GMT, Johan Sjölen wrote: >> Hi, >> >> The constant pool currently has a lot of methods specific to extracting parts of the operands array. What this array actually is, is a sequence of bootstrap method attribute entries, where each entry has the following components: >> >> ```c++ >> struct BSMAE { >> u2 bootstrap_method_index; >> u2 argument_count; >> u2 arguments[argument_count]; >> } >> ``` >> >> We can remove some of these operands-array-specific methods and instead allow you to extract BSMAttributeEntrys, which you can then use to extract their piecewise components.
This makes for a nicer interface, and a bit easier to come into as a reader of the code, as it more closely mirrors the JVMS. >> >> Please consider! >> >> Testing: Currently GHA, running tier1-tier3 > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Move it to public Changes look good, thanks for doing this! src/hotspot/share/oops/constantPool.hpp line 85: > 83: u2 _argument_count; > 84: > 85: const u2* argument_indexes() const { I agree with Lois that some comments here would be useful. I think the original comment can stay where it is and you can add some extra details here. ------------- Marked as reviewed by matsaave (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25298#pullrequestreview-2936238823 PR Review Comment: https://git.openjdk.org/jdk/pull/25298#discussion_r2152626054 From eastigeevich at openjdk.org Tue Jun 17 16:42:32 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 17 Jun 2025 16:42:32 GMT Subject: RFR: 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 14:50:46 GMT, Benoît Maillard wrote: > This PR adds a range constraint for the `-XX:FastAllocateSizeLimit` debug flag. This prevents undefined behavior caused by left-shift overflow of the flag value in `GraphKit::new_array`. > > ### Testing > - [x] [GitHub Actions](https://github.com/benoitmaillard/jdk/actions?query=branch%3AJDK-8356865) > - [x] tier1-3, plus some internal testing > - [x] Manual testing with values known to previously cause undefined behavior > > Thanks!
src/hotspot/share/opto/graphKit.cpp line 3807: > 3805: int log2_esize = Klass::layout_helper_log2_element_size(layout_con); > 3806: fast_size_limit <<= (LogBytesPerLong - log2_esize); > 3807: assert (fast_size_limit > 0, "increasing the size limit should not produce negative values"); Prior to C++14, a left shift producing a negative value is undefined behavior: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2161.pdf Do we compile c++ source specifying the C++ standard? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25834#discussion_r2152715500 From gziemski at openjdk.org Tue Jun 17 16:46:48 2025 From: gziemski at openjdk.org (Gerard Ziemski) Date: Tue, 17 Jun 2025 16:46:48 GMT Subject: RFR: 8337217: Port VirtualMemoryTracker to use VMATree [v41] In-Reply-To: <000AGCwbLZRbmxAGfatjpOnEFz_Ym2fMmDBaHgvV96w=.6b3c0c5c-0480-4036-8052-1b7ba8dabfd3@github.com> References: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> <000AGCwbLZRbmxAGfatjpOnEFz_Ym2fMmDBaHgvV96w=.6b3c0c5c-0480-4036-8052-1b7ba8dabfd3@github.com> Message-ID: On Tue, 17 Jun 2025 10:17:06 GMT, Afshin Zafari wrote: >> - `VMATree` is used instead of `SortedLinkList` in new class `VirtualMemoryTracker`. >> - A wrapper/helper `RegionTree` is made around VMATree to make some calls easier. >> - `find_reserved_region()` is used in 4 cases, it will be removed in further PRs. >> - All tier1 tests pass except this https://bugs.openjdk.org/browse/JDK-8335167. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > fixes to a few failures. Small changes (copyright years) and one question, otherwise LGTM. Nice! Marked as reviewed by gziemski (Reviewer). src/hotspot/share/nmt/regionsTree.cpp line 2: > 1: /* > 2: * Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved.
Copyright year src/hotspot/share/nmt/regionsTree.hpp line 2: > 1: /* > 2: * Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved. Copyright year test/hotspot/gtest/runtime/test_virtualMemoryTracker.cpp line 259: > 257: static void test_add_committed_region_overlapping() { > 258: RegionsTree* rtree = VirtualMemoryTracker::Instance::tree(); > 259: rtree->tree().remove_all(); Why are we calling `remove_all()` right after we create the tree? ------------- Marked as reviewed by gziemski (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/20425#pullrequestreview-2936231261 PR Review: https://git.openjdk.org/jdk/pull/20425#pullrequestreview-2936387120 PR Review Comment: https://git.openjdk.org/jdk/pull/20425#discussion_r2152711203 PR Review Comment: https://git.openjdk.org/jdk/pull/20425#discussion_r2152711676 PR Review Comment: https://git.openjdk.org/jdk/pull/20425#discussion_r2152621722 From kbarrett at openjdk.org Tue Jun 17 16:55:29 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 17 Jun 2025 16:55:29 GMT Subject: RFR: 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction [v2] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 12:52:08 GMT, David Holmes wrote: >> Yes, "an implementation" is the C++ compiler. >> >> The restrictions on static initialization are significantly more stringent >> than we want to require. The restrictions on dynamic initialization => static >> initialization seem like pretty much what we would want, e.g. (roughly) >> doesn't affect other initializations by assignments to global variables, and >> doesn't depend on other initializations that are not required to be ordered >> before it. So even if the implementation doesn't perform the initialization >> statically, things should be okay. I won't guarantee there's no possibility >> for problems, as there may be edge cases or implementation bugs, but this >> seems pretty safe to me. 
I think the most likely way to fail is for us to >> not recognize an unordered dependency in our code. Keeping dynamic >> initializations fairly simple is probably the best way to avoid that. > > Sorry but I'm having trouble understanding how someone could evaluate whether a dynamic initialization is okay based on the current statement. Does the initialization modify some other global state? Or does it depend on some other global state that might not be knowable yet? Avoid those. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25812#discussion_r2152737085 From dlong at openjdk.org Tue Jun 17 17:07:35 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 17 Jun 2025 17:07:35 GMT Subject: RFR: 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 16:39:56 GMT, Evgeny Astigeevich wrote: >> This PR adds a range constraint for the `-XX:FastAllocateSizeLimit` debug flag. This prevents undefined behavior caused by left-shift overflow of the flag value in `GraphKit::new_array`. >> >> ### Testing >> - [x] [GitHub Actions](https://github.com/benoitmaillard/jdk/actions?query=branch%3AJDK-8356865) >> - [x] tier1-3, plus some internal testing >> - [x] Manual testing with values known to previously cause undefined behavior >> >> Thanks! > > src/hotspot/share/opto/graphKit.cpp line 3807: > >> 3805: int log2_esize = Klass::layout_helper_log2_element_size(layout_con); >> 3806: fast_size_limit <<= (LogBytesPerLong - log2_esize); >> 3807: assert (fast_size_limit > 0, "increasing the size limit should not produce negative values"); > > Prior to C++14, a left shift producing a negative value is undefined behavior: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2161.pdf > > Do we compile c++ source specifying the C++ standard? Yes, we use -std=c++14, but creating a negative value in this way still feels like a kind of overflow to me.
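
A minimal sketch of the concern discussed here, in standard C++ and not taken from the PR: instead of shifting first and asserting afterwards that the result is positive, one can check beforehand that the shift can neither overflow nor reach the sign bit. The helper name `left_shift_is_safe` is hypothetical, and the check is deliberately conservative (it rejects every shift count of 31 or more for a 32-bit `int`, even for a zero value).

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper, not HotSpot code. Returns true only if
// (value << shift) is representable as a non-negative int, so the
// shift neither overflows nor sets the sign bit.
static bool left_shift_is_safe(int value, int shift) {
  const int value_bits = (int)(sizeof(int) * CHAR_BIT) - 1;  // bits excluding sign
  if (value < 0 || shift < 0 || shift >= value_bits) {
    return false;
  }
  // value << shift fits exactly when value <= INT_MAX >> shift.
  return value <= (INT_MAX >> shift);
}
```

A range constraint on a flag, as the PR proposes, amounts to guaranteeing up front that a check of this kind always holds for the largest shift the code can apply.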
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25834#discussion_r2152762023 From coleenp at openjdk.org Tue Jun 17 17:09:31 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 17 Jun 2025 17:09:31 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 03:30:40 GMT, Ioi Lam wrote: > Background: when writing the string table in the AOT cache, we do this: > > 1. Find out the number of strings in the interned string table > 2. Allocate Java object arrays that are large enough to store these strings. These arrays are used by `StringTable::lookup_shared()` in the production run. > 3. Enter safepoint > 4. Copy the strings into the arrays > > This bug happened because: > > - Step 1 is not thread safe, so it may be reading a stale version of `_items_count` > - JIT compiler threads may create more interned strings after step 1 > > This PR attempts to fix both issues. I have a couple of comments. Overall, the approach seems good. src/hotspot/share/classfile/stringTable.cpp line 351: > 349: } > 350: > 351: size_t StringTable::items_count() { I think there's a convention to make accessor functions that use acquire semantics to be named items_count_acquire(). src/hotspot/share/classfile/stringTable.cpp line 970: > 968: // This flag will be cleared after intern table dumping has completed, so we can run the > 969: // compiler again (for future AOT method compilation, etc). > 970: DEBUG_ONLY(Atomic::release_store(&_disable_interning_during_cds_dump, 1)); I think atomics work with bool or is this a refcount ? src/hotspot/share/compiler/compileTask.cpp line 93: > 91: void CompileTask::wait_for_no_active_tasks() { > 92: MonitorLocker locker(CompileTaskAlloc_lock); > 93: while (_active_tasks > 0) { Doesn't this have to have an Atomic::load() to make it re-read in the loop? Even though it's after we reacquire the lock. 
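
A minimal sketch of the wait-loop pattern in question, written against the standard library rather than HotSpot's `MonitorLocker`, so it says nothing definitive about `CompileTask` itself. The type `ActiveTasks` and its members are hypothetical. It assumes every access to the counter happens while the mutex is held; under that assumption, standard C++ re-reads the plain `int` on each loop iteration because `wait` releases and reacquires the lock. Whether `_active_tasks` is always accessed under `CompileTaskAlloc_lock` is not established in this thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for a task counter guarded by a monitor.
struct ActiveTasks {
  std::mutex lock;
  std::condition_variable cv;
  int active = 0;  // only read or written while holding 'lock'

  void begin() {
    std::lock_guard<std::mutex> guard(lock);
    ++active;
  }

  void end() {
    {
      std::lock_guard<std::mutex> guard(lock);
      --active;
    }
    cv.notify_all();  // wake any thread blocked in wait_for_none()
  }

  void wait_for_none() {
    std::unique_lock<std::mutex> guard(lock);
    // wait() drops the lock while blocked and reacquires it before
    // returning, so 'active' is re-read under the lock each iteration.
    while (active > 0) {
      cv.wait(guard);
    }
  }
};
```

If the counter were also read or written outside the lock, a plain `int` would no longer be enough and an atomic access would be needed.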
------------- Changes requested by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25816#pullrequestreview-2936227969 PR Review Comment: https://git.openjdk.org/jdk/pull/25816#discussion_r2152761032 PR Review Comment: https://git.openjdk.org/jdk/pull/25816#discussion_r2152619475 PR Review Comment: https://git.openjdk.org/jdk/pull/25816#discussion_r2152626132 From cslucas at openjdk.org Tue Jun 17 17:09:33 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 17 Jun 2025 17:09:33 GMT Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for hosted compiled nmethod [v7] In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 16:29:17 GMT, Yudi Zheng wrote: >> Hosted Truffle compilations are installed on the OptimizedCallTarget#profiledPERoot method. Any deoptimization contributes to its decompile count, which can easily exceed the PerMethodRecompilationCutoff threshold, permanently preventing highest tier compilation on this method. This PR exempts hosted compilations from this cutoff by ensuring their decompile count is not incremented for hosted compiled nmethods. > > Yudi Zheng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge tag 'jdk-26+2' into JDK-8357424 > > Added tag jdk-26+2 for changeset d7aa3498 > - fix compilation error > - address comments > - Merge remote-tracking branch 'upstream/master' into JDK-8357424 > - address comments > - address comments > - update copyright > - [JVMCI] Avoid incrementing decompilation count for hosted compiled nmethod src/hotspot/share/runtime/deoptimization.cpp line 2367: > 2365: > 2366: #if INCLUDE_JVMCI > 2367: if (nm->is_jvmci_hosted()) { A comment here will probably be helpful as well. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25356#discussion_r2152765019 From iwalulya at openjdk.org Tue Jun 17 17:28:08 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Tue, 17 Jun 2025 17:28:08 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v3] In-Reply-To: References: Message-ID: > Hi all, > > Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and Application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. > > The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio. > > - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly. > > - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range. > > - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants.
> > We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. > > Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. > > As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. > > Testing: Mach5 Tier 1-7 Ivan Walulya has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains three commits: - Merge remote-tracking branch 'upstream/master' into G1HeapResizePolicyV2 - remove unrequired changes - kim - clean init ------------- Changes: https://git.openjdk.org/jdk/pull/25832/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25832&range=02 Stats: 576 lines in 16 files changed: 360 ins; 81 del; 135 mod Patch: https://git.openjdk.org/jdk/pull/25832.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25832/head:pull/25832 PR: https://git.openjdk.org/jdk/pull/25832 From kbarrett at openjdk.org Tue Jun 17 20:06:28 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 17 Jun 2025 20:06:28 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: <2ddvhqT5lC6N6SZ80TDN1HDfHjcrFYL7aLInjMZw9Po=.2c3c7b91-ed39-4420-9b90-33580a5d9a51@github.com> Message-ID: <-M2-0zU6PnzfWIcFizGXyg9xeVGVrJVRvJVkHDy1-w0=.0702833c-7ed6-41bf-8c49-dc0bb2e1503b@github.com> On Tue, 17 Jun 2025 07:14:24 GMT, David Holmes wrote: >> src/hotspot/share/runtime/sharedRuntimeMath.hpp line 118: >> >>> 116: unsigned u_k = ((unsigned) k) + n; >>> 117: >>> 118: if (u_k > 0x7fe && u_k <= 0x7fffffff) return hugeX*copysignA(hugeX,x); /* overflow */ >> >> I think `(unsigned)INT_MAX` would be more explicit about what's going on. >> This is also starting to push my limits on sufficiently simple to be a one-line `if`, and even more-so with my >> suggested change. >> I note that this isn't distinguishing between (1) `n > 0` and `k + n` overflows and wraps around to negative >> `int` vs (2) `n < 0` and `k + n` is negative. And that makes later code (both pre-existing and changed) >> harder to understand. I _think_ better here would be `u_k > 0x7fe && n > 0` => overflow, with some later >> adjustments. Then, if the test fails and we're not huge, `k = (int)u_k;` and use `k` as before, dropping >> `u_k`, so discarding the remainder of the currently proposed changes. > > I'm not seeing the full suggestion here. 
In the original code this line: > > if (k > 0x7fe) return hugeX*copysignA(hugeX,x); /* overflow */ > > is defining a logical overflow, not the wrapping overflow that we are trying to deal with. The wrapping overflow results in a negative value, which is the third case that gets handled. So AFAICS we need to use `u_k` all the way through until the end. Here is the diff for what I'm suggesting. diff --git a/src/hotspot/share/runtime/sharedRuntimeMath.hpp b/src/hotspot/share/runtime/sharedRuntimeMath.hpp index 91dda2a4fe8..321f3be580a 100644 --- a/src/hotspot/share/runtime/sharedRuntimeMath.hpp +++ b/src/hotspot/share/runtime/sharedRuntimeMath.hpp @@ -111,16 +111,23 @@ static double scalbnA(double x, int n) { if (n< -50000) return tiny*x; /*underflow*/ } if (k==0x7ff) return x+x; /* NaN or Inf */ - k = k+n; - if (k > 0x7fe) return hugeX*copysignA(hugeX,x); /* overflow */ + // If the high (sign) bit of u_k is set, then either + // n is positive and k+n would overflow, or + // n is negative and |n| > k. + unsigned u_k = (unsigned)k + (unsigned)n; + if ((u_k > 0x7fe) && (n > 0)) { + // Either (k+n > 0x7fe && k+n <= INT_MAX) or k+n would overflow. + return hugeX*copysignA(hugeX,x); /* overflow */ + } + // Set k to k+n, now that we know k+n wouldn't overflow. + // We know that k+n <= (int)0x7fe, and might be negative if n is negative. 
+ k = (int)u_k; if (k > 0) { /* normal result */ set_high(&x, (hx&0x800fffff)|(k<<20)); return x; } if (k <= -54) { - if (n > 50000) /* in case integer overflow in n+k */ - return hugeX*copysignA(hugeX,x); /*overflow*/ - else return tiny*copysignA(tiny,x); /*underflow*/ + return tiny*copysignA(tiny,x); /*underflow*/ } k += 54; /* subnormal result */ set_high(&x, (hx&0x800fffff)|(k<<20)); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2153073234 From kbarrett at openjdk.org Tue Jun 17 20:42:30 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 17 Jun 2025 20:42:30 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: <_lJYLf0V0moLk_pQQscqC4IaHdIHMHObEFD13Uu9No8=.4b83e848-ecf7-40a8-a4ff-c12b5a978c16@github.com> On Tue, 17 Jun 2025 06:58:02 GMT, David Holmes wrote: > > The JBS issue also talks about `copysignA` and suggests we should just use `copysign` if we're keeping `scalbnA`. Please either address that here or file a new issue for `copysignA`. > > It doesn't really suggest that it simply says "is there a reason to prefer copysignA over the copysign? ". I don't have an answer to that any more than I can answer the scalbnA versus scalbn question. You need to a libmath expert to answer those types of questions. All I have tried to do here is address the UB that was spotted. OK, I'll be more direct than I was in JBS. We don't need copysignA. Just use copysign from . Rationale: Long ago we had our own copysign, because we couldn't get it from . It's a C99 function. For gcc/clang we were using C++98/03, which only includes C89 library functions. So gcc/clang version restricted it out. And MSVC++ didn't have it at all. Later, MSVC++ added copysign, without any version restriction since they didn't do Standard versions back then. This collided with ours, so we renamed ours. Later still we switched to C++14, which includes C99 library functions, so it's no longer version restricted by gcc/clang. 
So we no longer need our own, and should just use the one from . Whether this is done under this issue or a new one, I don't really care, so long as the issue isn't lost. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25656#issuecomment-2981770338 From dlong at openjdk.org Tue Jun 17 20:55:29 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 17 Jun 2025 20:55:29 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: <-qfpN8-hyWv-QosNnOUvLaZtsI0Kr1vXsTIV6Tqvd-w=.badb50d8-3f05-41d1-bc90-d5939d6b571f@github.com> Message-ID: <87D1GHpnuO66YKnlxRh6JOlp7AZoZxhLBDbCpUG230A=.9119d110-62ba-4083-9a85-9bf78f5b462b@github.com> On Tue, 17 Jun 2025 12:50:18 GMT, Martin Doerr wrote: > > > Seems like arm32 has the same issue: > > > https://github.com/openjdk/jdk/blob/9d060574e5dbd13e634f00d749d0108ceff1fae8/src/hotspot/cpu/arm/gc/shared/barrierSetAssembler_arm.cpp#L199 > > > > > > The init value shouldn't have the sticky bit set. > > > > > > Thanks, I pushed a potential fix for that. > > Unfortunately, 0xBEAFDEAD also has the MSB set. Shouldn't we better use 0 like on all other platforms? Oops, I was tripped up trying to be clever. Yes, I'm fine with using 0. I'll fix it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-2981800170 From dlong at openjdk.org Tue Jun 17 20:59:46 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 17 Jun 2025 20:59:46 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v8] In-Reply-To: References: Message-ID: <2gdPUDg-i40xRoO8LZPWszL2-oa6s7GNZEDRfHfDk8s=.4dbfe74a-070b-46c1-b73d-0292824d02e9@github.com> > This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. > > We used to patch the verified entry point to make sure it was not_entrant. 
The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. > > The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. > > For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. > > This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. 
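
A minimal sketch of the "sticky bit" idea described above, under stated assumptions: the guard is modeled as a 32-bit word, the reserved bit is taken to be the most significant bit, and a `std::mutex` stands in for the new lock. The names `Guard` and `kNotEntrantBit` are hypothetical and do not reflect the actual HotSpot layout or API.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Assumption for illustration: the reserved bit is the MSB.
constexpr uint32_t kNotEntrantBit = 0x80000000u;

// Hypothetical model of an entry barrier guard word.
struct Guard {
  std::mutex lock;     // makes the read-modify-write below atomic
  uint32_t value = 0;

  // Ordinary arm/disarm update: replaces the low bits but must
  // preserve the sticky bit, so it behaves like a bitfield store.
  void set_value(uint32_t v) {
    std::lock_guard<std::mutex> guard(lock);
    value = (value & kNotEntrantBit) | (v & ~kNotEntrantBit);
  }

  // not_entrant is a final state: once set, the bit is never cleared.
  void make_not_entrant() {
    std::lock_guard<std::mutex> guard(lock);
    value |= kNotEntrantBit;
  }

  bool is_not_entrant() {
    std::lock_guard<std::mutex> guard(lock);
    return (value & kNotEntrantBit) != 0;
  }
};
```

The lock is what the description calls making the update to the guard value atomic; the compare-and-swap alternative mentioned there would replace the mutex with a retry loop on the same masked update.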
Dean Long has updated the pull request incrementally with one additional commit since the last revision: 2nd try at arm fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25764/files - new: https://git.openjdk.org/jdk/pull/25764/files/3ac6dec0..3e306dde Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25764&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25764&range=06-07 Stats: 4 lines in 1 file changed: 0 ins; 3 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25764.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25764/head:pull/25764 PR: https://git.openjdk.org/jdk/pull/25764 From kbarrett at openjdk.org Tue Jun 17 22:26:27 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 17 Jun 2025 22:26:27 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 07:48:03 GMT, David Holmes wrote: > Due to the testing problem... Why not write a gtest for scalbnA? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25656#issuecomment-2982000211 From sspitsyn at openjdk.org Tue Jun 17 23:40:31 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 17 Jun 2025 23:40:31 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v4] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 13:59:54 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with a product build with and without this change showed no difference in time.
>> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Use load_acquire only in the places that need it. This is really great optimization and refactoring! Looks pretty good to me but I'd like to make one more pass through the changes. src/hotspot/share/oops/instanceKlass.cpp line 2412: > 2410: } > 2411: > 2412: // Lookup or create a jmethodID Nit: Add dot at the end. src/hotspot/share/oops/instanceKlass.cpp line 2472: > 2470: // Allocate a larger one and copy entries to the new one. > 2471: // They've already been updated to point to new methods where applicable (i.e., not obsolete). > 2472: jmethodID* new_cache = create_jmethod_id_cache(size); Nice micro-refactoring! src/hotspot/share/oops/jmethodIDTable.hpp line 48: > 46: static void remove(jmethodID mid); > 47: > 48: // RedefineClasses support Nit: Add a dot at the end of the comment for consistency with other comments. 
src/hotspot/share/oops/method.hpp line 718: > 716: > 717: static void change_method_associated_with_jmethod_id(jmethodID old_jmid_ptr, Method* new_method); > 718: static bool validate_method_id(jmethodID mid); Nit: I'd suggest to name it `validate_jmethod_id` for consistency. ------------- PR Review: https://git.openjdk.org/jdk/pull/25267#pullrequestreview-2937302206 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2153333668 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2153332839 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2153323790 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2153315538 From sspitsyn at openjdk.org Tue Jun 17 23:40:32 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 17 Jun 2025 23:40:32 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v2] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 13:56:41 GMT, Axel Boldt-Christmas wrote: >> If we remove a jmethodID, we need to keep the number for it in case some JVMTI code still thinks that number is valid. So we can't decrement the entry count. > > That is not what I was trying to propose. What I tried to describe was this: > > ```c++ > // The value of the next jmethodID. This only increments (always unique IDs) > static uint64_t _jmethodID_counter = 0; > // Tracks the number of jmethodID entries in the _jmethod_id_table. > // Incremented on insert, decremented on remove. Used to track if we need to resize the table. > static uint64_t _jmethodID_entry_count = 0; > ``` > > The problem with using `_jmethodID_counter` as a proxy for how many entries there are in the table is that it will diverge over time as we keep calling remove due to class unloading. > > Using a separate variable lets us resize based on what is actually in the table. Interesting suggestion to consider. I'm not sure yet if it is really important.
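
A minimal sketch of the two-counter idea from the quoted suggestion, using `std::unordered_map` in place of HotSpot's `ConcurrentHashTable` and ignoring concurrency entirely. The class `IdTable` and its members are hypothetical. It only illustrates how an always-increasing ID counter and a live-entry count drift apart once entries are removed.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical single-threaded model of an ID-to-method table.
class IdTable {
 public:
  // Hands out a fresh ID; IDs are never reused, even after remove().
  uint64_t insert(const void* method) {
    uint64_t id = ++_id_counter;
    _entries.emplace(id, method);
    ++_entry_count;
    return id;
  }

  // Drops the entry; a later lookup of a stale ID yields nullptr.
  void remove(uint64_t id) {
    if (_entries.erase(id) == 1) {
      --_entry_count;
    }
  }

  const void* lookup(uint64_t id) const {
    auto it = _entries.find(id);
    return it == _entries.end() ? nullptr : it->second;
  }

  uint64_t ids_issued() const { return _id_counter; }     // only increments
  uint64_t live_entries() const { return _entry_count; }  // tracks table contents

 private:
  uint64_t _id_counter = 0;
  uint64_t _entry_count = 0;
  std::unordered_map<uint64_t, const void*> _entries;
};
```

A resize decision based on `live_entries()` reflects what is in the table, whereas `ids_issued()` keeps growing across class unloading.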
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2153327605 From sspitsyn at openjdk.org Wed Jun 18 02:36:32 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 18 Jun 2025 02:36:32 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v4] In-Reply-To: References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: On Mon, 9 Jun 2025 13:58:09 GMT, Johan Sjölen wrote: >> Hi, >> >> The constant pool currently has a lot of methods specific to extracting parts of the operands array. What this array actually is, is a sequence of bootstrap method attribute entries, where each entry has the following components: >> >> ```c++ >> struct BSMAE { >> u2 bootstrap_method_index; >> u2 argument_count; >> u2 arguments[argument_count]; >> } >> ``` >> >> We can remove some of these operands-array-specific methods and instead allow you to extract BSMAttributeEntrys, which you can then use to extract their piecewise components. This makes for a nicer interface, and a bit easier to come into as a reader of the code, as it more closely mirrors the JVMS. >> >> Please consider! >> >> Testing: Currently GHA, running tier1-tier3 > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Move it to public It is a nice update and refactoring in general.
src/hotspot/share/oops/constantPool.cpp line 1953: > 1951: k1 = bsm_attribute_entry(idx1)->argument_index(j); > 1952: k2 = cp2->bsm_attribute_entry(idx2)->argument_index(j); > 1953: match = compare_entry_to(k1, cp2, k2); Nit: I'd suggest to define two locals to simplify the code as below: BSMAttributeEntry* e1 = bsm_attribute_entry(idx1); BSMAttributeEntry* e2 = cp2->bsm_attribute_entry(idx2); int k1 = e1->bootstrap_method_index(); int k2 = e2->bootstrap_method_index(); bool match = compare_entry_to(k1, cp2, k2); if (!match) { return false; } int argc = e1->argument_count(); if (argc == e2->argument_count()) { for (int j = 0; j < argc; j++) { k1 = e1->argument_index(j); k2 = e2->argument_index(j); match = compare_entry_to(k1, cp2, k2); src/hotspot/share/prims/jvmtiClassFileReconstituter.cpp line 29: > 27: #include "memory/universe.hpp" > 28: #include "oops/constantPool.hpp" > 29: #include "oops/constantPool.inline.hpp" The line 28 is not needed as we already have line 29. ------------- PR Review: https://git.openjdk.org/jdk/pull/25298#pullrequestreview-2937554075 PR Review Comment: https://git.openjdk.org/jdk/pull/25298#discussion_r2153510993 PR Review Comment: https://git.openjdk.org/jdk/pull/25298#discussion_r2153513749 From sspitsyn at openjdk.org Wed Jun 18 02:50:36 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 18 Jun 2025 02:50:36 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v4] In-Reply-To: References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: On Mon, 9 Jun 2025 13:58:09 GMT, Johan Sjölen wrote: >> Hi, >> >> The constant pool currently has a lot of methods specific to extracting parts of the operands array.
>> This array is actually a sequence of bootstrap method attribute entries, where each entry has the following components:
>>
>> ```c++
>> struct BSMAE {
>>   u2 bootstrap_method_index;
>>   u2 argument_count;
>>   u2 arguments[argument_count];
>> }
>> ```
>>
>> We can remove some of these operands-array-specific methods and instead let you extract `BSMAttributeEntry`s, which you can then use to extract their piecewise components. This makes for a nicer interface that is a bit easier to approach as a reader of the code, as it more closely mirrors the JVMS.
>>
>> Please consider!
>>
>> Testing: Currently GHA, running tier1-tier3
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>
>   Move it to public

src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 686:

> 684:
> 685:       for (int i = 0; i < argc; i++) {
> 686:         u2 old_arg_ref_i = scratch_cp->bsm_attribute_entry(old_bs_i)->argument_index(i);

Nit: Could you consider the following code simplification?

      BSMAttributeEntry* old_bsme = scratch_cp->bsm_attribute_entry(old_bs_i);
      . . .
  665 u2 old_ref_i = old_bsme->bootstrap_method_index();
      . . .
  679 u2 argc = old_bsme->argument_count();
      . . .
  686 u2 old_arg_ref_i = old_bsme->argument_index(i);

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25298#discussion_r2153527548

From dholmes at openjdk.org Wed Jun 18 04:06:31 2025
From: dholmes at openjdk.org (David Holmes)
Date: Wed, 18 Jun 2025 04:06:31 GMT
Subject: RFR: 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction [v2]
In-Reply-To:
References:
Message-ID: <3RpiIC6hIeP82myY021TUbXBrSaSKrg2co92iYB26_U=.f8e7b72b-b835-47be-abd6-72f66370bf59@github.com>

On Mon, 16 Jun 2025 12:04:12 GMT, Kim Barrett wrote:

>> Please review this change to the HotSpot Style Guide to add discussion of how
>> we prefer to handle initialization and destruction of non-local variables.
>> >> I propose this is an editorial change, as it just documents current practice >> rather than suggesting a change to current practice. As such, the normal >> HotSpot PR process applies. >> >> The updated .html file was generated using make update-build-docs. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > better terminology, merge separate sections Marked as reviewed by dholmes (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25812#pullrequestreview-2937661372 From dholmes at openjdk.org Wed Jun 18 04:06:32 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 18 Jun 2025 04:06:32 GMT Subject: RFR: 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction [v2] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 16:52:54 GMT, Kim Barrett wrote: >> Sorry but I'm having trouble understanding how somehow could evaluate whether a dynamic initialization is okay based on the current statement. > > Does the initialization modify some other global state? Or does it depend on > some other global state that might not be knowable yet? Avoid those. Okay I think I get what you are saying. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25812#discussion_r2153589657 From dholmes at openjdk.org Wed Jun 18 04:19:27 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 18 Jun 2025 04:19:27 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 22:24:09 GMT, Kim Barrett wrote: > > Due to the testing problem... > > Why not write a gtest for scalnbA? Quite simply because I have no idea what the code does, nor how to effectively test it. Even exhaustive testing of inputs comparing the old code and new is not possible given we are dealing with a double input type and double return. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25656#issuecomment-2982616036 From dholmes at openjdk.org Wed Jun 18 04:19:27 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 18 Jun 2025 04:19:27 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: <_lJYLf0V0moLk_pQQscqC4IaHdIHMHObEFD13Uu9No8=.4b83e848-ecf7-40a8-a4ff-c12b5a978c16@github.com> References: <_lJYLf0V0moLk_pQQscqC4IaHdIHMHObEFD13Uu9No8=.4b83e848-ecf7-40a8-a4ff-c12b5a978c16@github.com> Message-ID: On Tue, 17 Jun 2025 20:39:29 GMT, Kim Barrett wrote: > We don't need copysignA. Just use copysign from . > > Rationale: Long ago we had our own copysign, because we couldn't get it from . Okay but do we know that what we have and what the math library provides are exactly the same? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25656#issuecomment-2982617394 From dholmes at openjdk.org Wed Jun 18 04:24:33 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 18 Jun 2025 04:24:33 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: <-M2-0zU6PnzfWIcFizGXyg9xeVGVrJVRvJVkHDy1-w0=.0702833c-7ed6-41bf-8c49-dc0bb2e1503b@github.com> References: <2ddvhqT5lC6N6SZ80TDN1HDfHjcrFYL7aLInjMZw9Po=.2c3c7b91-ed39-4420-9b90-33580a5d9a51@github.com> <-M2-0zU6PnzfWIcFizGXyg9xeVGVrJVRvJVkHDy1-w0=.0702833c-7ed6-41bf-8c49-dc0bb2e1503b@github.com> Message-ID: On Tue, 17 Jun 2025 20:04:18 GMT, Kim Barrett wrote: >> I'm not seeing the full suggestion here. In the original code this line: >> >> if (k > 0x7fe) return hugeX*copysignA(hugeX,x); /* overflow */ >> >> is defining a logical overflow, not the wrapping overflow that we are trying to deal with. The wrapping overflow results in a negative value, which is the third case that gets handled. So AFAICS we need to use `u_k` all the way through until the end. > > Here is the diff for what I'm suggesting. 
>
> diff --git a/src/hotspot/share/runtime/sharedRuntimeMath.hpp b/src/hotspot/share/runtime/sharedRuntimeMath.hpp
> index 91dda2a4fe8..321f3be580a 100644
> --- a/src/hotspot/share/runtime/sharedRuntimeMath.hpp
> +++ b/src/hotspot/share/runtime/sharedRuntimeMath.hpp
> @@ -111,16 +111,23 @@ static double scalbnA(double x, int n) {
>      if (n< -50000) return tiny*x; /*underflow*/
>    }
>    if (k==0x7ff) return x+x; /* NaN or Inf */
> -  k = k+n;
> -  if (k > 0x7fe) return hugeX*copysignA(hugeX,x); /* overflow */
> +  // If the high (sign) bit of u_k is set, then either
> +  // n is positive and k+n would overflow, or
> +  // n is negative and |n| > k.
> +  unsigned u_k = (unsigned)k + (unsigned)n;
> +  if ((u_k > 0x7fe) && (n > 0)) {
> +    // Either (k+n > 0x7fe && k+n <= INT_MAX) or k+n would overflow.
> +    return hugeX*copysignA(hugeX,x); /* overflow */
> +  }
> +  // Set k to k+n, now that we know k+n wouldn't overflow.
> +  // We know that k+n <= (int)0x7fe, and might be negative if n is negative.
> +  k = (int)u_k;
>    if (k > 0) { /* normal result */
>      set_high(&x, (hx&0x800fffff)|(k<<20));
>      return x;
>    }
>    if (k <= -54) {
> -    if (n > 50000) /* in case integer overflow in n+k */
> -      return hugeX*copysignA(hugeX,x); /*overflow*/
> -    else return tiny*copysignA(tiny,x); /*underflow*/
> +    return tiny*copysignA(tiny,x); /*underflow*/
>    }
>    k += 54; /* subnormal result */
>    set_high(&x, (hx&0x800fffff)|(k<<20));

+  if ((u_k > 0x7fe) && (n > 0)) {
+    // Either (k+n > 0x7fe && k+n <= INT_MAX) or k+n would overflow.
+    return hugeX*copysignA(hugeX,x); /* overflow */
+  }

But that is not correct - we only take this "overflow" path for 0x7fe < k+n < INT_MAX

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2153604667

From dholmes at openjdk.org Wed Jun 18 04:27:27 2025
From: dholmes at openjdk.org (David Holmes)
Date: Wed, 18 Jun 2025 04:27:27 GMT
Subject: RFR: 8346914: UB issue in scalbnA
In-Reply-To:
References: <2ddvhqT5lC6N6SZ80TDN1HDfHjcrFYL7aLInjMZw9Po=.2c3c7b91-ed39-4420-9b90-33580a5d9a51@github.com> <-M2-0zU6PnzfWIcFizGXyg9xeVGVrJVRvJVkHDy1-w0=.0702833c-7ed6-41bf-8c49-dc0bb2e1503b@github.com>
Message-ID:

On Wed, 18 Jun 2025 04:22:01 GMT, David Holmes wrote:

>> Here is the diff for what I'm suggesting.
>>
>> diff --git a/src/hotspot/share/runtime/sharedRuntimeMath.hpp b/src/hotspot/share/runtime/sharedRuntimeMath.hpp
>> index 91dda2a4fe8..321f3be580a 100644
>> --- a/src/hotspot/share/runtime/sharedRuntimeMath.hpp
>> +++ b/src/hotspot/share/runtime/sharedRuntimeMath.hpp
>> @@ -111,16 +111,23 @@ static double scalbnA(double x, int n) {
>>      if (n< -50000) return tiny*x; /*underflow*/
>>    }
>>    if (k==0x7ff) return x+x; /* NaN or Inf */
>> -  k = k+n;
>> -  if (k > 0x7fe) return hugeX*copysignA(hugeX,x); /* overflow */
>> +  // If the high (sign) bit of u_k is set, then either
>> +  // n is positive and k+n would overflow, or
>> +  // n is negative and |n| > k.
>> +  unsigned u_k = (unsigned)k + (unsigned)n;
>> +  if ((u_k > 0x7fe) && (n > 0)) {
>> +    // Either (k+n > 0x7fe && k+n <= INT_MAX) or k+n would overflow.
>> +    return hugeX*copysignA(hugeX,x); /* overflow */
>> +  }
>> +  // Set k to k+n, now that we know k+n wouldn't overflow.
>> +  // We know that k+n <= (int)0x7fe, and might be negative if n is negative.
>> +  k = (int)u_k;
>>    if (k > 0) { /* normal result */
>>      set_high(&x, (hx&0x800fffff)|(k<<20));
>>      return x;
>>    }
>>    if (k <= -54) {
>> -    if (n > 50000) /* in case integer overflow in n+k */
>> -      return hugeX*copysignA(hugeX,x); /*overflow*/
>> -    else return tiny*copysignA(tiny,x); /*underflow*/
>> +    return tiny*copysignA(tiny,x); /*underflow*/
>>    }
>>    k += 54; /* subnormal result */
>>    set_high(&x, (hx&0x800fffff)|(k<<20));
>
> +  if ((u_k > 0x7fe) && (n > 0)) {
> +    // Either (k+n > 0x7fe && k+n <= INT_MAX) or k+n would overflow.
> +    return hugeX*copysignA(hugeX,x); /* overflow */
> +  }
>
> But that is not correct - we only take this "overflow" path for 0x7fe < k+n < INT_MAX
>
> `+ // We know that k+n <= (int)0x7fe, and might be negative if n is negative.`

It can be negative if `n` is positive too.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2153606924

From dholmes at openjdk.org Wed Jun 18 05:06:28 2025
From: dholmes at openjdk.org (David Holmes)
Date: Wed, 18 Jun 2025 05:06:28 GMT
Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added
In-Reply-To:
References:
Message-ID:

On Tue, 17 Jun 2025 15:54:25 GMT, Coleen Phillimore wrote:

>> Background: when writing the string table in the AOT cache, we do this:
>>
>> 1. Find out the number of strings in the interned string table
>> 2. Allocate Java object arrays that are large enough to store these strings. These arrays are used by `StringTable::lookup_shared()` in the production run.
>> 3. Enter safepoint
>> 4. Copy the strings into the arrays
>>
>> This bug happened because:
>>
>> - Step 1 is not thread safe, so it may be reading a stale version of `_items_count`
>> - JIT compiler threads may create more interned strings after step 1
>>
>> This PR attempts to fix both issues.
> > src/hotspot/share/compiler/compileTask.cpp line 93: > >> 91: void CompileTask::wait_for_no_active_tasks() { >> 92: MonitorLocker locker(CompileTaskAlloc_lock); >> 93: while (_active_tasks > 0) { > > Doesn't this have to have an Atomic::load() to make it re-read in the loop? Even though it's after we reacquire the lock. Not if `_active_tasks` is only written whilst the same lock is held. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25816#discussion_r2153645613 From tschatzl at openjdk.org Wed Jun 18 06:48:38 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 18 Jun 2025 06:48:38 GMT Subject: RFR: 8352565: Add native method implementation of Reference.get() [v11] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 07:09:39 GMT, Kim Barrett wrote: >> Please review this change which adds a native method providing the >> implementation of Reference::get. Referece::get is an intrinsic candidate, so >> this native method implementation is only used when the intrinsic is not. >> >> Currently there is intrinsic support by the interpreter, C1, C2, and graal, >> which are always used. With this change we can later remove all the >> per-platform interpreter intrinsic implementations, and might also remove the >> C1 intrinsic implementation. >> >> Testing: >> (1) mach5 tier1-6 normal (so using all the existing intrinsics). >> (2) mach5 tier1-6 with interpreter and C1 Reference::get intrinsics disabled. > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 17 additional commits since the last revision: > > - Merge branch 'master' into native-reference-get > - add pseudo-native entry for Reference.get0 > - tidy CallGenerator lookup in Compile ctor > - fix comment alignment > - Merge branch 'master' into native-reference-get > - make private native Reference.get0 the intrinsic > - Merge branch 'master' into native-reference-get > - Merge branch 'master' into native-reference-get > - use new waitForRefProc, some tidying > - Merge branch 'master' into native-reference-get > - ... and 7 more: https://git.openjdk.org/jdk/compare/3be7ee90...877e64ca Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24315#pullrequestreview-2937932097 From bmaillard at openjdk.org Wed Jun 18 07:29:09 2025 From: bmaillard at openjdk.org (=?UTF-8?B?QmVub8OudA==?= Maillard) Date: Wed, 18 Jun 2025 07:29:09 GMT Subject: RFR: 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB [v2] In-Reply-To: References: Message-ID: > This PR adds a range constraint for the `-XX:FastAllocateSizeLimit` debug flag. This prevents undefined behavior caused by left-shift overflow of the flag value in `GraphKit::new_array`. > > ### Testing > - [x] [GitHub Actions](https://github.com/benoitmaillard/jdk/actions?query=branch%3AJDK-8356865) > - [x] tier1-3, plus some internal testing > - [x] Manual testing with values known to previously cause undefined behavior > > Thanks! 

Benoît Maillard has updated the pull request incrementally with one additional commit since the last revision:

  8356865: Add comment to clarify the flag range

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25834/files
  - new: https://git.openjdk.org/jdk/pull/25834/files/486ddcb6..c8904a29

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25834&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25834&range=00-01

  Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/25834.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25834/head:pull/25834

PR: https://git.openjdk.org/jdk/pull/25834

From bmaillard at openjdk.org Wed Jun 18 07:29:09 2025
From: bmaillard at openjdk.org (Benoît Maillard)
Date: Wed, 18 Jun 2025 07:29:09 GMT
Subject: RFR: 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 17 Jun 2025 08:36:39 GMT, Manuel Hässig wrote:

>> Benoît Maillard has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   8356865: Add comment to clarify the flag range
>
> src/hotspot/share/runtime/globals.hpp line 1100:
>
>> 1098:           /* Note: This value is zero mod 1<<13 for a cheap sparc set. */ \
>> 1099:           "Inline allocations larger than this in doublewords must go slow")\
>> 1100:           range(0, (1 << (BitsPerInt - LogBytesPerLong - 1)) - 1) \
>
> It would be good to add a comment as to why this specific upper limit is necessary.

Thanks, done!

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25834#discussion_r2153848744

From bmaillard at openjdk.org Wed Jun 18 07:38:33 2025
From: bmaillard at openjdk.org (Benoît Maillard)
Date: Wed, 18 Jun 2025 07:38:33 GMT
Subject: RFR: 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB [v2]
In-Reply-To:
References:
Message-ID: <8nXpApdLxXidwKfFpcVbKjpYgOn5EfhUvKNQRKvv2o0=.252bc291-3219-4d77-9a4d-8fe75952c2f6@github.com>

On Tue, 17 Jun 2025 17:04:25 GMT, Dean Long wrote:

>> src/hotspot/share/opto/graphKit.cpp line 3807:
>>
>>> 3805:       int log2_esize = Klass::layout_helper_log2_element_size(layout_con);
>>> 3806:       fast_size_limit <<= (LogBytesPerLong - log2_esize);
>>> 3807:       assert (fast_size_limit > 0, "increasing the size limit should not produce negative values");
>>
>> Prior to C++14, a left shift producing a negative value is undefined behavior: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2161.pdf
>>
>> Do we compile C++ source specifying the C++ standard?
>
> Yes, we use -std=c++14, but creating a negative value in this way still feels like a kind of overflow to me.

Thanks for the comments! I added the assert because the JBS issue mentioned a specific case where we ended up with negative values. Should I leave it like this, or rather convert it to a more specific check (i.e. making sure that the `LogBytesPerLong - log2_esize` most significant bits are not used **before** shifting)?
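
The pre-shift check asked about above could look roughly like the sketch below. It is only an illustration and not code from the PR: `shift_is_safe` and `scaled_limit` are hypothetical helpers, `BitsPerInt` and `LogBytesPerLong` are local stand-ins for the HotSpot constants of the same name, and the limit is assumed to be held in a 32-bit `int`.

```cpp
#include <cassert>
#include <climits>

// Local stand-ins for the HotSpot constants named in the thread (assumed values).
constexpr int BitsPerInt = 32;
constexpr int LogBytesPerLong = 3;

// Upper bound taken from the range(...) expression quoted from globals.hpp.
constexpr int MaxFastAllocateSizeLimit =
    (1 << (BitsPerInt - LogBytesPerLong - 1)) - 1;

// Hypothetical helper: true if `limit << shift` fits in a non-negative int,
// i.e. none of the top `shift` value bits of `limit` are in use.
constexpr bool shift_is_safe(int limit, int shift) {
  return limit >= 0 && shift >= 0 && shift < BitsPerInt - 1 &&
         limit <= (INT_MAX >> shift);
}

// Hypothetical helper: scale a limit by the same shift amount as the quoted
// graphKit.cpp line, checking the operand before shifting rather than
// inspecting the (possibly already overflowed) result afterwards.
inline int scaled_limit(int limit, int log2_esize) {
  const int shift = LogBytesPerLong - log2_esize;
  assert(shift_is_safe(limit, shift) && "limit too large to scale");
  return limit << shift;
}
```

Under these assumptions, the largest value accepted by the quoted range is `(1 << 28) - 1` and the largest shift is `LogBytesPerLong` (for `log2_esize == 0`), so the shifted result stays below `2^31`.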

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25834#discussion_r2153869915

From kbarrett at openjdk.org Wed Jun 18 08:19:27 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Wed, 18 Jun 2025 08:19:27 GMT
Subject: RFR: 8346914: UB issue in scalbnA
In-Reply-To:
References: <2ddvhqT5lC6N6SZ80TDN1HDfHjcrFYL7aLInjMZw9Po=.2c3c7b91-ed39-4420-9b90-33580a5d9a51@github.com> <-M2-0zU6PnzfWIcFizGXyg9xeVGVrJVRvJVkHDy1-w0=.0702833c-7ed6-41bf-8c49-dc0bb2e1503b@github.com>
Message-ID: <8_CoRfbkgRom4_8wkrguW6tldAsW1L3EZus3tchTElM=.3bddb374-5d98-4f7b-9d9c-5f413b3c23aa@github.com>

On Wed, 18 Jun 2025 04:24:49 GMT, David Holmes wrote:

>> +  if ((u_k > 0x7fe) && (n > 0)) {
>> +    // Either (k+n > 0x7fe && k+n <= INT_MAX) or k+n would overflow.
>> +    return hugeX*copysignA(hugeX,x); /* overflow */
>> +  }
>>
>> But that is not correct - we should only take this "overflow" path for `0x7fe < k+n <= INT_MAX`. Your suggestion makes us take this path if `k+n` overflows to negative.
>
>> `+ // We know that k+n <= (int)0x7fe, and might be negative if n is negative.`
>
> It can be negative if `n` is positive too.

> But that is not correct - we should only take this "overflow" path for
> `((k+n) > 0x7fe && (k+n) <= INT_MAX)`. Your suggestion makes us take this
> path if `(k+n)` overflows to negative.

It is intentional that the new test is true for the `(k+n)` => overflow case. It fully handles the overflow case, eliminating the need for the later fixup of the case where `((k <= -54) && (n > 50000))` (though `(n > 0)` would work just as well; I don't know why that `50000` was inserted). That fixup returned the same `hugeX`-based result as here.

> It can be negative if n is positive too.

`(k+n)` cannot be negative with `n` positive here, even under wrapping semantics, because we can't get here in that case due to the prior overflow detection.
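
The unsigned overflow test discussed above can be isolated in a small sketch. This is an illustration only, not the HotSpot code: `ExponentSum` and `add_exponent` are hypothetical names, the sketch assumes a 32-bit `int`, and it assumes `0 <= k <= 0x7fe` on entry (a negative `k` is deliberately not covered here).

```cpp
#include <cassert>
#include <climits>

// Result of adding the scale factor n to the biased exponent k.
struct ExponentSum {
  bool overflow;  // true if k+n exceeds the largest finite exponent, 0x7fe
  int  k;         // k+n when overflow is false; unused otherwise
};

// Hypothetical helper. Precondition (an assumption of this sketch):
// 0 <= k <= 0x7fe.
inline ExponentSum add_exponent(int k, int n) {
  assert(k >= 0 && k <= 0x7fe);
  // Unsigned addition wraps instead of invoking undefined behavior.
  const unsigned u_k = static_cast<unsigned>(k) + static_cast<unsigned>(n);
  if (u_k > 0x7feu && n > 0) {
    // With n > 0 and k in range, the unsigned sum equals the mathematical
    // k+n, so this covers both 0x7fe < k+n <= INT_MAX and k+n > INT_MAX.
    return {true, 0};
  }
  // Here either n <= 0, or n > 0 and k+n <= 0x7fe, so under the stated
  // precondition the signed addition cannot overflow.
  return {false, k + n};
}
```

The sketch returns `k + n` directly instead of converting `u_k` back with `(int)u_k` as the quoted diff does; that is a choice made for this example to avoid relying on the unsigned-to-signed conversion.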
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2153961162 From mdoerr at openjdk.org Wed Jun 18 08:33:37 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 18 Jun 2025 08:33:37 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v8] In-Reply-To: <2gdPUDg-i40xRoO8LZPWszL2-oa6s7GNZEDRfHfDk8s=.4dbfe74a-070b-46c1-b73d-0292824d02e9@github.com> References: <2gdPUDg-i40xRoO8LZPWszL2-oa6s7GNZEDRfHfDk8s=.4dbfe74a-070b-46c1-b73d-0292824d02e9@github.com> Message-ID: On Tue, 17 Jun 2025 20:59:46 GMT, Dean Long wrote: >> This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. >> >> We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. >> >> The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. >> >> For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. >> >> This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. 
Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > 2nd try at arm fix Recent changes look good. Thanks! ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25764#pullrequestreview-2938255719 From kbarrett at openjdk.org Wed Jun 18 08:33:38 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 18 Jun 2025 08:33:38 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: <_lJYLf0V0moLk_pQQscqC4IaHdIHMHObEFD13Uu9No8=.4b83e848-ecf7-40a8-a4ff-c12b5a978c16@github.com> Message-ID: On Wed, 18 Jun 2025 04:16:32 GMT, David Holmes wrote: > > We don't need copysignA. Just use copysign from . > > Rationale: Long ago we had our own copysign, because we couldn't get it from > > . > > Okay but do we know that what we have and what the math library provides are exactly the same? Yes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25656#issuecomment-2983219926 From ayang at openjdk.org Wed Jun 18 09:00:58 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 18 Jun 2025 09:00:58 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v13] In-Reply-To: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: <76NgHH-m26Nw2paJmIQvNNqio_iKtdQ_bJ2aejMfKEI=.82ff25aa-f5c5-4146-84b5-1aaaefb5efd1@github.com> > This patch refines Parallel's sizing strategy to improve overall memory management and performance. 
> > The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. > > `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. > > GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. > > ## Performance evaluation > > - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). > - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). > - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. > > PS: I have opportunistically set the obsolete/expired version to ~~25/26~~ 26/27 for now. I will update them accordingly before merging. 
> > Test: tier1-8 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 19 commits: - review - Merge branch 'master' into pgc-size-policy - merge - version - Merge branch 'master' into pgc-size-policy - revert-aliases - Merge branch 'master' into pgc-size-policy - merge - merge-fix - merge - ... and 9 more: https://git.openjdk.org/jdk/compare/2b94b70e...a21e5363 ------------- Changes: https://git.openjdk.org/jdk/pull/25000/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25000&range=12 Stats: 4371 lines in 31 files changed: 520 ins; 3452 del; 399 mod Patch: https://git.openjdk.org/jdk/pull/25000.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25000/head:pull/25000 PR: https://git.openjdk.org/jdk/pull/25000 From ayang at openjdk.org Wed Jun 18 09:01:00 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 18 Jun 2025 09:01:00 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v3] In-Reply-To: References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: On Sun, 18 May 2025 15:36:03 GMT, Guoxiong Li wrote: >> Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: >> >> - review >> - Merge branch 'master' into pgc-size-policy >> - review >> - Merge branch 'master' into pgc-size-policy >> - pgc-size-policy > > src/hotspot/share/gc/parallel/psYoungGen.cpp line 268: > >> 266: size_t original_committed_size = virtual_space()->committed_size(); >> 267: >> 268: while (true) { > > The `while` statement only runs once. May we find a better way to refactor the code? @lgxbslgx I did some restructuring based on some offline discussion -- hopefully, the new style is more readable. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2154053199 From azafari at openjdk.org Wed Jun 18 09:04:36 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Wed, 18 Jun 2025 09:04:36 GMT Subject: RFR: 8337217: Port VirtualMemoryTracker to use VMATree [v42] In-Reply-To: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> References: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> Message-ID: > - `VMATree` is used instead of `SortedLinkList` in new class `VirtualMemoryTracker`. > - A wrapper/helper `RegionTree` is made around VMATree to make some calls easier. > - `find_reserved_region()` is used in 4 cases, it will be removed in further PRs. > - All tier1 tests pass except this https://bugs.openjdk.org/browse/JDK-8335167. Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: copyright years ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20425/files - new: https://git.openjdk.org/jdk/pull/20425/files/815092d2..ac151586 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20425&range=41 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20425&range=40-41 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/20425.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20425/head:pull/20425 PR: https://git.openjdk.org/jdk/pull/20425 From jsjolen at openjdk.org Wed Jun 18 09:10:49 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 18 Jun 2025 09:10:49 GMT Subject: RFR: 8337217: Port VirtualMemoryTracker to use VMATree [v42] In-Reply-To: References: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> Message-ID: On Wed, 18 Jun 2025 09:04:36 GMT, Afshin Zafari wrote: >> - `VMATree` is used instead of `SortedLinkList` in new class `VirtualMemoryTracker`. 
>> - A wrapper/helper `RegionTree` is made around VMATree to make some calls easier. >> - `find_reserved_region()` is used in 4 cases, it will be removed in further PRs. >> - All tier1 tests pass except this https://bugs.openjdk.org/browse/JDK-8335167. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > copyright years Marked as reviewed by jsjolen (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/20425#pullrequestreview-2938381625 From azafari at openjdk.org Wed Jun 18 09:10:51 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Wed, 18 Jun 2025 09:10:51 GMT Subject: RFR: 8337217: Port VirtualMemoryTracker to use VMATree [v41] In-Reply-To: References: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> <000AGCwbLZRbmxAGfatjpOnEFz_Ym2fMmDBaHgvV96w=.6b3c0c5c-0480-4036-8052-1b7ba8dabfd3@github.com> Message-ID: <00eARElwnxkitXuqWa5uDdul1WafVU6fUGcmnIOJIV4=.7d9fc71c-be40-468d-8f14-adb98f4e3eb6@github.com> On Tue, 17 Jun 2025 16:37:22 GMT, Gerard Ziemski wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> fixes to a few failures. > > src/hotspot/share/nmt/regionsTree.cpp line 2: > >> 1: /* >> 2: * Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved. > > Copyright year Done. > src/hotspot/share/nmt/regionsTree.hpp line 2: > >> 1: /* >> 2: * Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved. > > Copyright year Done. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20425#discussion_r2154077488 PR Review Comment: https://git.openjdk.org/jdk/pull/20425#discussion_r2154077012 From azafari at openjdk.org Wed Jun 18 09:14:47 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Wed, 18 Jun 2025 09:14:47 GMT Subject: RFR: 8337217: Port VirtualMemoryTracker to use VMATree [v41] In-Reply-To: References: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> <000AGCwbLZRbmxAGfatjpOnEFz_Ym2fMmDBaHgvV96w=.6b3c0c5c-0480-4036-8052-1b7ba8dabfd3@github.com> Message-ID: On Tue, 17 Jun 2025 15:53:04 GMT, Gerard Ziemski wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> fixes to a few failures. > > test/hotspot/gtest/runtime/test_virtualMemoryTracker.cpp line 259: > >> 257: static void test_add_committed_region_overlapping() { >> 258: RegionsTree* rtree = VirtualMemoryTracker::Instance::tree(); >> 259: rtree->tree().remove_all(); > > Why are we calling `remove_all()` right after we create the tree? We are not creating the tree here. We just retrieve it from the Instance of VMT. Since the tree is also visible to other tests (it is static and not created per test), any changes in other tests will be visible here by this test. This is not as expected in the design of the tests (tests assume that tree is empty at the beginning of the test). 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20425#discussion_r2154087590 From azafari at openjdk.org Wed Jun 18 09:20:09 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Wed, 18 Jun 2025 09:20:09 GMT Subject: RFR: 8337217: Port VirtualMemoryTracker to use VMATree [v43] In-Reply-To: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> References: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> Message-ID: > - `VMATree` is used instead of `SortedLinkList` in new class `VirtualMemoryTracker`. > - A wrapper/helper `RegionTree` is made around VMATree to make some calls easier. > - `find_reserved_region()` is used in 4 cases, it will be removed in further PRs. > - All tier1 tests pass except this https://bugs.openjdk.org/browse/JDK-8335167. Afshin Zafari has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 89 commits: - Merge remote-tracking branch 'origin/master' into _8337217_nmt_VMT_with_tree - copyright years - fixes to a few failures. - changes after merge - Merge remote-tracking branch 'origin/master' into _8337217_nmt_VMT_with_tree - fixes after merge with master. - Merge remote-tracking branch 'origin/master' into _8337217_nmt_VMT_with_tree - Merge remote-tracking branch 'origin/master' into _8337217_nmt_VMT_with_tree - more reviews. - review comments applied - ... 
and 79 more: https://git.openjdk.org/jdk/compare/2b94b70e...66ddc675 ------------- Changes: https://git.openjdk.org/jdk/pull/20425/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20425&range=42 Stats: 1459 lines in 27 files changed: 571 ins; 567 del; 321 mod Patch: https://git.openjdk.org/jdk/pull/20425.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20425/head:pull/20425 PR: https://git.openjdk.org/jdk/pull/20425 From stuefe at openjdk.org Wed Jun 18 09:25:34 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 18 Jun 2025 09:25:34 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v11] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 09:15:51 GMT, Anton Artemov wrote: > Okay, let me summarize the findings: > > > > 1) One needs to be consistent with respect to the type of the return value of `os::xxx()`, and having `size_t` in all methods is something everyone agrees on. > > 2) There is a consensus that errors should be reported and handled properly, not ignored. > > 3) There is only one type of error. > > 4) The usage pattern should make it difficult for the user to ignore the error if it is reported. > > > > Given the above, I think the optimal solution is to have a boolean as a return type to indicate the error (true for no error, false for error), and the actual memory value as an out-parameter passed by reference or pointer. The usage pattern may be enforced by the `[[nodiscard]]` attribute, but it is available from C++17 only. For now, one can just add if/else statements, and add the attribute later after the upgrade to C++17. +1 on all points. Thanks a lot for doing this, I know bikesheddy issues like this can be a real slog. 
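The pattern summarized in the quoted text can be sketched as follows. This is only an illustration under stated assumptions: `query_free_memory`, its `simulate_failure` switch and the value 4096 are made up for the example and do not correspond to any real `os::` function or measurement.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical query in the proposed style: the return value only reports
// success, and the size is delivered through the reference parameter.
// [[nodiscard]] (C++17) asks the compiler to warn when the result is ignored.
[[nodiscard]] bool query_free_memory(std::size_t& value, bool simulate_failure) {
  if (simulate_failure) {
    return false;                 // value is left untouched on error
  }
  value = 4096;                   // placeholder number, not a real measurement
  return true;
}

// Caller pattern: the size is only consumed on the success path.
std::size_t free_memory_or(std::size_t fallback, bool simulate_failure) {
  std::size_t value = 0;
  if (query_free_memory(value, simulate_failure)) {
    return value;
  }
  return fallback;
}

// Small usage example covering both outcomes.
void free_memory_demo() {
  assert(free_memory_or(0, false) == 4096);
  assert(free_memory_or(123, true) == 123);
}
```

Before C++17 the attribute would have to be dropped, which leaves the if/else at the call site as the only guard, as the quoted summary notes.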
------------- PR Comment: https://git.openjdk.org/jdk/pull/25450#issuecomment-2983387064 From mhaessig at openjdk.org Wed Jun 18 10:36:29 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Wed, 18 Jun 2025 10:36:29 GMT Subject: RFR: 8359227: Code cache/heap size options should be size_t [v2] In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 18:46:58 GMT, Kim Barrett wrote: >> Please review this change that makes the various code cache/heap size options >> consistently be of type size_t. >> >> The shared declarations for these options were all uintx. These options all >> may have platform-defined values. Some of those platform-specific definitions >> were uintx, some were size_t, and some were intx(!). This change makes them >> all consistently size_t. >> >> More details in the first comment. >> >> Testing: mach5 tier1-6 >> GHA testing in-progress > > Kim Barrett has updated the pull request incrementally with two additional commits since the last revision: > > - update copyrights > - remove leftover include Thank you for addressing the comments. Looks good to me! ------------- Marked as reviewed by mhaessig (Author). PR Review: https://git.openjdk.org/jdk/pull/25791#pullrequestreview-2938683405 From duke at openjdk.org Wed Jun 18 10:54:31 2025 From: duke at openjdk.org (Manjunath S Matti.) Date: Wed, 18 Jun 2025 10:54:31 GMT Subject: RFR: 8359114: [s390x] Add z17 detection code [v2] In-Reply-To: <_fiM-Nhm3q5S2hCxa3quxpodBRmeIsCIBcA7AB4Hmcc=.2005b23d-fc05-4821-90b4-cf22a8d2442e@github.com> References: <_fiM-Nhm3q5S2hCxa3quxpodBRmeIsCIBcA7AB4Hmcc=.2005b23d-fc05-4821-90b4-cf22a8d2442e@github.com> Message-ID: On Wed, 11 Jun 2025 13:09:45 GMT, Manjunath S Matti. wrote: >> Add support to detect the new generation of Z machine (z17). > > Manjunath S Matti. has updated the pull request incrementally with one additional commit since the last revision: > > Correct the comments for the bits covered in DW[2] and DW[3]. 
@theRealAph could you please provide the much needed 2nd review? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25718#issuecomment-2983706034 From tschatzl at openjdk.org Wed Jun 18 11:13:28 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 18 Jun 2025 11:13:28 GMT Subject: RFR: 8359227: Code cache/heap size options should be size_t [v2] In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 18:46:58 GMT, Kim Barrett wrote: >> Please review this change that makes the various code cache/heap size options >> consistently be of type size_t. >> >> The shared declarations for these options were all uintx. These options all >> may have platform-defined values. Some of those platform-specific definitions >> were uintx, some were size_t, and some were intx(!). This change makes them >> all consistently size_t. >> >> More details in the first comment. >> >> Testing: mach5 tier1-6 >> GHA testing in-progress > > Kim Barrett has updated the pull request incrementally with two additional commits since the last revision: > > - update copyrights > - remove leftover include Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25791#pullrequestreview-2938784260 From eosterlund at openjdk.org Wed Jun 18 11:36:30 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Wed, 18 Jun 2025 11:36:30 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 00:02:29 GMT, Dean Long wrote: >> src/hotspot/share/gc/z/zBarrierSetNMethod.cpp line 114: >> >>> 112: // Preserve the sticky bit >>> 113: if (is_not_entrant(nm)) { >>> 114: value |= not_entrant; >> >> Is it possible to have a race where another thread sets an nmethod to not entrant and the thread calling this making the nmethod entry barrier not entrant? 
>> >> If this was called to disarm a method and then enter it, it seems a bit sneaky in that case that we pass the nmethod entry barrier even though we under the lock see that it is not entrant. Probably okay but still feels like it might be more robust if the thread setting an nmethod to not entrant is always the one that arms the nmethod entry barrier. > > If I understand your concern correctly, there is no race. The only caller of BarrierSetNMethod::make_not_entrant() is nmethod::make_not_entrant(), and it is done inside a NMethodState_lock critical section. After a call to nmethod::make_not_entrant(), the nmethod entry barrier is armed and stays that way. > And by design, a disarm only disarms at the inner nmethod_entry_barrier level, not the outer nmethod_stub_entry_barrier level. My concern is that while thread 1 calls nmethod::make_not_entrant(), thread 2 racingly performs nmethod entry barrier; it makes the is_not_entrant check before it gets updated, but then it gets updated as the per nmethod lock is taken. The GC code "disarms" the GC barrier but in doing so finds that "oh this should be not entrant", but that's sort of not reflected as thread 2 will then proceed with entering the nmethod it just armed as not entrant in the nmethod entry barrier code. Does that make sense? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2154370247 From azafari at openjdk.org Wed Jun 18 11:40:54 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Wed, 18 Jun 2025 11:40:54 GMT Subject: RFR: 8337217: Port VirtualMemoryTracker to use VMATree [v31] In-Reply-To: References: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> Message-ID: On Tue, 25 Feb 2025 15:40:54 GMT, Gerard Ziemski wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> reviews applied. > > How would I go about verifying the performance gain? 
You mentioned previously that you wrote a microbenchmark for testing this? Thanks for reviews @gerard-ziemski and @jdksjolen. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20425#issuecomment-2983833167 From azafari at openjdk.org Wed Jun 18 11:40:56 2025 From: azafari at openjdk.org (Afshin Zafari) Date: Wed, 18 Jun 2025 11:40:56 GMT Subject: Integrated: 8337217: Port VirtualMemoryTracker to use VMATree In-Reply-To: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> References: <_QgAec-LQq4pdC6sP3UAZLHRT30q1mxXohvGDag1a6U=.214e9d81-c627-4f34-af8f-cb71506eeda2@github.com> Message-ID: On Thu, 1 Aug 2024 15:44:32 GMT, Afshin Zafari wrote: > - `VMATree` is used instead of `SortedLinkList` in new class `VirtualMemoryTracker`. > - A wrapper/helper `RegionTree` is made around VMATree to make some calls easier. > - `find_reserved_region()` is used in 4 cases, it will be removed in further PRs. > - All tier1 tests pass except this https://bugs.openjdk.org/browse/JDK-8335167. This pull request has now been integrated. Changeset: 547ce030 Author: Afshin Zafari URL: https://git.openjdk.org/jdk/commit/547ce0301684fdebe95ce2e8e195a019bcefe493 Stats: 1459 lines in 27 files changed: 571 ins; 567 del; 321 mod 8337217: Port VirtualMemoryTracker to use VMATree Reviewed-by: jsjolen, gziemski ------------- PR: https://git.openjdk.org/jdk/pull/20425 From ayang at openjdk.org Wed Jun 18 11:49:36 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 18 Jun 2025 11:49:36 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v3] In-Reply-To: References: Message-ID: <3MBZ11WVojPRr6vcJxxaepOi6mO0GjF_j9gLGkU9jTI=.749e7301-2e8d-4d32-8f4e-887b8997a6b2@github.com> On Tue, 17 Jun 2025 17:28:08 GMT, Ivan Walulya wrote: >> Hi all, >> >> Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. 
The GCTimeRatio is intended to manage the balance between GC time and application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. >> >> The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio. >> >> - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly. >> >> - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range. >> >> - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants. >> >> We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. >> >> Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. 
>> >> As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. >> >> Testing: Mach5 ... > > Ivan Walulya has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - Merge remote-tracking branch 'upstream/master' into G1HeapResizePolicyV2 > - remove unrequired changes - kim > - clean init src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 887: > 885: } else { > 886: shrink(resize_bytes); > 887: uncommit_regions_if_necessary(); I wonder if it's more symmetric if this uncommit logic is inlined to `shrink`. src/hotspot/share/gc/g1/g1_globals.hpp line 170: > 168: product(size_t, G1MinimumPercentOfGCTimeRatio, 25, EXPERIMENTAL, \ > 169: "Percentage of GCTimeRatio G1 will try to avoid going below.") \ > 170: range(0, 100) \ If they are in [0,100], maybe `uint` is enough? src/hotspot/share/utilities/numberSeq.cpp line 151: > 149: AbsSeq(alpha), _length(length), _next(0) { > 150: _sequence = NEW_C_HEAP_ARRAY(double, _length, mtInternal); > 151: TruncatedSeq::reset(); Not sure why a new method (`reset`) is needed inside this PR. src/hotspot/share/utilities/numberSeq.hpp line 50: > 48: > 49: protected: > 50: uint _num; // the number of elements in the sequence All `int` -> `uint` changes in this class/file are good; can they be extracted out to its own PR/ticket? src/hotspot/share/utilities/numberSeq.hpp line 132: > 130: > 131: virtual void reset(); > 132: bool is_full() const { return _length == _num; } Seems unused. 
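For readers following the default change mentioned in the quoted PR description (GCTimeRatio 12 to 24), the small sketch below works out what each setting means as a GC time budget. It assumes the documented relation that the targeted GC share of total time is 1 / (1 + GCTimeRatio); the function name is made up for the example and is not taken from the patch.

```cpp
#include <cassert>
#include <cmath>

// Target share of total time spent in GC, in percent, for a given
// GCTimeRatio, using the relation 1 / (1 + GCTimeRatio).
double gc_time_percent(unsigned gc_time_ratio) {
  return 100.0 / (1.0 + static_cast<double>(gc_time_ratio));
}

// Usage example: the old default allows roughly 7.7% of time in GC,
// the new default exactly 4%.
void gc_time_ratio_demo() {
  assert(std::fabs(gc_time_percent(12) - 100.0 / 13.0) < 1e-9);
  assert(std::fabs(gc_time_percent(24) - 4.0) < 1e-9);
}
```

Doubling the ratio therefore roughly halves the permitted GC overhead, which matches the description's point that the old default tolerated much more GC work before the heap had to grow.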
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2154373368 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2154347417 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2154395621 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2154350585 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2154388484 From mablakatov at openjdk.org Wed Jun 18 11:51:37 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Wed, 18 Jun 2025 11:51:37 GMT Subject: Integrated: 8358329: AArch64: emit direct branches in static stubs for small code caches In-Reply-To: References: Message-ID: On Mon, 9 Jun 2025 19:17:53 GMT, Mikhail Ablakatov wrote: > In the A64 ISA, the B (direct branch) instruction can encode a target within a ±128MB range relative to the instruction. Due to this limitation, when generating static stubs, HotSpot conservatively emits indirect branches for calls to c2i interface stubs. These indirect branches are implemented using a four-instruction sequence: three instructions to materialize the target address in a register, followed by a BR instruction to perform the jump. > > This patch optimizes static stub generation when the code cache is small enough to guarantee that the target entry point of the c2i interface stub lies within the direct branch range. In such cases, a single direct B instruction can be used instead of the indirect sequence, saving 3 instructions (12 bytes) per static stub. 
> > Below is an example of the optimization's impact, measured using the movie-lens benchmark from the Renaissance benchmark suite: > > | Metric | Before | After | Difference | > |-------------|---------------|---------------|------------| > | totalInHeap | Avg: 1883.875 | Avg: 1871.667 | -0.65% | > | | Sum: 6653848 | Sum: 6616344 | -0.56% | > | stubCode | Avg: 103.164 | Avg: 87.285 | -15.38% | > | | Sum: 364376 | Sum: 308552 | -15.33% | > > Full jtreg passed on AArch64. This pull request has now been integrated. Changeset: ba32b78b Author: Mikhail Ablakatov Committer: Evgeny Astigeevich URL: https://git.openjdk.org/jdk/commit/ba32b78bfaf83f69003f83333ab6975b35343fde Stats: 173 lines in 5 files changed: 154 ins; 13 del; 6 mod 8358329: AArch64: emit direct branches in static stubs for small code caches Reviewed-by: aph, eastigeevich ------------- PR: https://git.openjdk.org/jdk/pull/25702 From coleenp at openjdk.org Wed Jun 18 12:06:23 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 18 Jun 2025 12:06:23 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v5] In-Reply-To: References: Message-ID: <004wleJ3yHgoUNrbtM7k6dGPoQ3olVfohABbP_4OvdA=.99779359-8024-4fe0-8563-9172acb1d991@github.com> > This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. > > The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. 
Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. > > Tested with tier1-4, 5-7. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Serguei's comments rename validate_jmethod_id ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25267/files - new: https://git.openjdk.org/jdk/pull/25267/files/41d49607..333d344c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=03-04 Stats: 4 lines in 4 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25267.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25267/head:pull/25267 PR: https://git.openjdk.org/jdk/pull/25267 From coleenp at openjdk.org Wed Jun 18 12:06:26 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 18 Jun 2025 12:06:26 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v4] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 23:34:42 GMT, Serguei Spitsyn wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Use load_acquire only in the places that need it. > > src/hotspot/share/oops/instanceKlass.cpp line 2412: > >> 2410: } >> 2411: >> 2412: // Lookup or create a jmethodID > > Nit: Add dot at the end. fixed. 
> src/hotspot/share/oops/jmethodIDTable.hpp line 48: > >> 46: static void remove(jmethodID mid); >> 47: >> 48: // RedefineClasses support > > Nit: Add a dot at the end of the comment for consistency with other comments. This isn't a sentence and this pattern is in the sources in a lot of places, none of these places have a period at the end. > src/hotspot/share/oops/method.hpp line 718: > >> 716: >> 717: static void change_method_associated_with_jmethod_id(jmethodID old_jmid_ptr, Method* new_method); >> 718: static bool validate_method_id(jmethodID mid); > > Nit: I'd suggest to name it `validate_jmethod_id` for consistency. Good suggestion - these names get confusing without the 'j'. I tried to make them consistent. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2154417361 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2154420659 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2154422068 From coleenp at openjdk.org Wed Jun 18 12:06:26 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 18 Jun 2025 12:06:26 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v2] In-Reply-To: References: Message-ID: <5Uzay5zSlbA4oXBvXrEFMymaxuHo6SFx6U55Pz3Tnoo=.bacb52e9-0913-4346-938c-d511e1b4b47b@github.com> On Tue, 17 Jun 2025 23:28:29 GMT, Serguei Spitsyn wrote: >> That is not was I was trying to propose. What I tried to describe was this: >> >> ```c++ >> // The value of the next jmethodID. This only increments (always unique IDs) >> static uint64_t _jmethodID_counter = 0; >> // Tracks the number of jmethodID entries in the _jmethod_id_table. >> // Incremented on insert, decremented on remove. Use to track if we need to resize the table. >> static uint64_t _jmethodID_entry_count = 0; >> >> >> The problem with using `_jmethodID_counter` as a proxy for how many entries there are in the table is that it will diverge over time as we keep calling remove due to class unloading. 
>> >> Using a separate variable lets us resize based on what is actual in the table. > > Interesting suggestion to consider. I'm not sure yet if it is really important. This makes sense because we want to know how many items are in the table vs. how many items have been in the table. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2154424321 From coleenp at openjdk.org Wed Jun 18 12:06:27 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 18 Jun 2025 12:06:27 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v5] In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 20:18:38 GMT, Daniel D. Daugherty wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Serguei's comments rename validate_jmethod_id > > src/hotspot/share/oops/jmethodIDTable.hpp line 31: > >> 29: #include "memory/allocation.hpp" >> 30: >> 31: // Class for associating Method with jmethodID > > nit typo: please add an ending period to the comment. fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2154419139 From coleenp at openjdk.org Wed Jun 18 12:13:54 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 18 Jun 2025 12:13:54 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v6] In-Reply-To: References: Message-ID: <0nLuoxUlhYAgg2cRGC1EpOp_HqWsg5SH11Uj1RmOMhE=.865999e3-c1a1-4369-8772-b430ee16cd59@github.com> > This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. 
> > The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. > > Tested with tier1-4, 5-7. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Add entry count for jmethodID table. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25267/files - new: https://git.openjdk.org/jdk/pull/25267/files/333d344c..fd9bf502 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=04-05 Stats: 9 lines in 1 file changed: 7 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25267.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25267/head:pull/25267 PR: https://git.openjdk.org/jdk/pull/25267 From coleenp at openjdk.org Wed Jun 18 12:18:49 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 18 Jun 2025 12:18:49 GMT Subject: RFR: 8359920: Use names for frame types in stackmaps Message-ID: This uses names for frame types for stackmaps in the verifier and redefinition. Tested with tier1-7. 
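To give a feel for what naming the frame types buys, here is a small sketch that classifies a stack map frame tag using the ranges from JVMS 4.7.4. The constant and function names are illustrative assumptions; they are not claimed to be the names introduced by the PR.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Upper bounds and single values of the frame_type tag ranges in JVMS 4.7.4.
// Names are hypothetical, chosen for this example only.
enum FrameTypeTag : std::uint8_t {
  SAME_FRAME_END                    = 63,   // 0..63
  SAME_LOCALS_1_STACK_ITEM_END      = 127,  // 64..127
  SAME_LOCALS_1_STACK_ITEM_EXTENDED = 247,  // 128..246 are reserved
  CHOP_FRAME_END                    = 250,  // 248..250
  SAME_FRAME_EXTENDED               = 251,
  APPEND_FRAME_END                  = 254,  // 252..254
  FULL_FRAME                        = 255
};

// Maps a raw tag to the frame kind, reading like the specification
// instead of like a list of magic numbers.
const char* frame_kind(std::uint8_t tag) {
  if (tag <= SAME_FRAME_END)                    return "same_frame";
  if (tag <= SAME_LOCALS_1_STACK_ITEM_END)      return "same_locals_1_stack_item_frame";
  if (tag <  SAME_LOCALS_1_STACK_ITEM_EXTENDED) return "reserved";
  if (tag == SAME_LOCALS_1_STACK_ITEM_EXTENDED) return "same_locals_1_stack_item_frame_extended";
  if (tag <= CHOP_FRAME_END)                    return "chop_frame";
  if (tag == SAME_FRAME_EXTENDED)               return "same_frame_extended";
  if (tag <= APPEND_FRAME_END)                  return "append_frame";
  return "full_frame";
}

// Usage example with one tag from several ranges.
void frame_kind_demo() {
  assert(std::strcmp(frame_kind(0), "same_frame") == 0);
  assert(std::strcmp(frame_kind(249), "chop_frame") == 0);
  assert(std::strcmp(frame_kind(FULL_FRAME), "full_frame") == 0);
}
```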
------------- Commit messages: - 8359920: Use names for frame types in stackmaps Changes: https://git.openjdk.org/jdk/pull/25870/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25870&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359920 Stats: 23 lines in 3 files changed: 8 ins; 0 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/25870.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25870/head:pull/25870 PR: https://git.openjdk.org/jdk/pull/25870 From jsjolen at openjdk.org Wed Jun 18 12:24:31 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 18 Jun 2025 12:24:31 GMT Subject: RFR: 8359920: Use names for frame types in stackmaps In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 12:13:38 GMT, Coleen Phillimore wrote: > This uses names for frame types for stackmaps in the verifier and redefinition. > Tested with tier1-7. Marked as reviewed by jsjolen (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25870#pullrequestreview-2938988022 From jsjolen at openjdk.org Wed Jun 18 12:27:31 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 18 Jun 2025 12:27:31 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v4] In-Reply-To: References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: On Wed, 18 Jun 2025 02:28:32 GMT, Serguei Spitsyn wrote: >> Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: >> >> Move it to public > > src/hotspot/share/oops/constantPool.cpp line 1953: > >> 1951: k1 = bsm_attribute_entry(idx1)->argument_index(j); >> 1952: k2 = cp2->bsm_attribute_entry(idx2)->argument_index(j); >> 1953: match = compare_entry_to(k1, cp2, k2); > > Nit: I'd suggest to define two locals to simplify the code as below: > > BSMAttributeEntry* e1 = bsm_attribute_entry(idx1); > BSMAttributeEntry& e2 = cp2->bsm_attribute_entry(idx12); > > int k1 = 
e1->bootstrap_method_index(); > int k2 = e2->bootstrap_method_index(); > bool match = compare_entry_to(k1, cp2, k2); > > if (!match) { > return false; > } > int argc = e1->argument_count(); > if (argc == e2->argument_count()) { > for (int j = 0; j < argc; j++) { > k1 = e1->argument_index(j); > k2 = e2->argument_index(j); > match = compare_entry_to(k1, cp2, k2); Yeah, that's a good simplification. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25298#discussion_r2154466594 From jsjolen at openjdk.org Wed Jun 18 12:37:07 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 18 Jun 2025 12:37:07 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v5] In-Reply-To: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: <4uEo1C_csdAppvNYFqU1JtgADY-m4as2wBIq1kVq_GA=.6125ab99-8b92-410c-9257-852d8e7eb47e@github.com> > Hi, > > The constant pool currently has a lot of methods specific to extracting parts of the operands array. What this array actually is, is a sequence of bootstrap method attribute entries, where each entry has the following components: > > ```c++ > struct BSMAE { > u2 bootstrap_method_index; > u2 argument_count; > u2 arguments[argument_count]; > } > > > We can removes some of these operands array specific methods, and instead allows you to extract BSMAttributeEntrys which you can then use to extract its piece wise components. This makes for a nicer interface, and a bit easier to come into as a reader of the code, as it more closely mirrors the JVMS. > > Please consider! 
> > Testing: Currently GHA, running tier1-tier3 Johan Sj?len has updated the pull request incrementally with two additional commits since the last revision: - Matias's comments - Apply Sergei's comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25298/files - new: https://git.openjdk.org/jdk/pull/25298/files/1c7484d7..af3caa9b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25298&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25298&range=03-04 Stats: 17 lines in 4 files changed: 6 ins; 1 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/25298.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25298/head:pull/25298 PR: https://git.openjdk.org/jdk/pull/25298 From jsjolen at openjdk.org Wed Jun 18 12:37:07 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 18 Jun 2025 12:37:07 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v4] In-Reply-To: References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: On Mon, 9 Jun 2025 13:58:09 GMT, Johan Sj?len wrote: >> Hi, >> >> The constant pool currently has a lot of methods specific to extracting parts of the operands array. What this array actually is, is a sequence of bootstrap method attribute entries, where each entry has the following components: >> >> ```c++ >> struct BSMAE { >> u2 bootstrap_method_index; >> u2 argument_count; >> u2 arguments[argument_count]; >> } >> >> >> We can removes some of these operands array specific methods, and instead allows you to extract BSMAttributeEntrys which you can then use to extract its piece wise components. This makes for a nicer interface, and a bit easier to come into as a reader of the code, as it more closely mirrors the JVMS. >> >> Please consider! 
>> >> Testing: Currently GHA, running tier1-tier3 > > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > Move it to public Thank you for the reviews! I applied your comments, will await GHA before integration (also need a re-approval). ------------- PR Comment: https://git.openjdk.org/jdk/pull/25298#issuecomment-2984018108 From coleenp at openjdk.org Wed Jun 18 12:46:56 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 18 Jun 2025 12:46:56 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v7] In-Reply-To: References: Message-ID: > This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. > > The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. > > Tested with tier1-4, 5-7. 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Add a basic gtest. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25267/files - new: https://git.openjdk.org/jdk/pull/25267/files/fd9bf502..8339a6b5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=05-06 Stats: 63 lines in 3 files changed: 63 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25267.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25267/head:pull/25267 PR: https://git.openjdk.org/jdk/pull/25267 From mablakatov at openjdk.org Wed Jun 18 13:20:35 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Wed, 18 Jun 2025 13:20:35 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v3] In-Reply-To: <_mCh8L-9aT5OkbrlrMTWY7cwlzLGfjb1tA310PNC--8=.cf36e768-d129-4519-8f81-1c1660bfef61@github.com> References: <_mCh8L-9aT5OkbrlrMTWY7cwlzLGfjb1tA310PNC--8=.cf36e768-d129-4519-8f81-1c1660bfef61@github.com> Message-ID: On Thu, 12 Jun 2025 15:43:43 GMT, Andrew Haley wrote: >> Thanks. > >> The error in java/lang/Thread/virtual/stress/GetStackTraceALotWhenBlocking.java#id0 looks similar to what has been previously reported here: https://bugs.openjdk.org/browse/JDK-8344577 . @theRealAph , do you think the patch may cause the error? Or should I open a similar JBS ticket to report it? > > That bug is macOS/x86. So, is the failure you're seeing repeatable? Sorry @theRealAph , I've re-requested a review by mistake. Please ignore it. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25702#issuecomment-2984183502 From jsikstro at openjdk.org Wed Jun 18 13:34:05 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Wed, 18 Jun 2025 13:34:05 GMT Subject: RFR: 8359923: Const accessors for the Deferred class Message-ID: <_F_xmX17xbS4DlqUOj8L-zrzeXZpsNiPFsN3_bnqy48=.4380549b-ec98-43dd-ad72-4e1b5a64df64@github.com> Hello, This RFE adds const accessors to the `Deferred` class. We plan on using this in a future patch in ZGC. Thanks! Testing: * Currently running tier1-2 * Works in a WIP ZGC patch ------------- Commit messages: - 8359923: Const accessors for the Deferred class Changes: https://git.openjdk.org/jdk/pull/25874/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25874&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359923 Stats: 13 lines in 1 file changed: 13 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25874.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25874/head:pull/25874 PR: https://git.openjdk.org/jdk/pull/25874 From jsjolen at openjdk.org Wed Jun 18 13:57:30 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Wed, 18 Jun 2025 13:57:30 GMT Subject: RFR: 8359923: Const accessors for the Deferred class In-Reply-To: <_F_xmX17xbS4DlqUOj8L-zrzeXZpsNiPFsN3_bnqy48=.4380549b-ec98-43dd-ad72-4e1b5a64df64@github.com> References: <_F_xmX17xbS4DlqUOj8L-zrzeXZpsNiPFsN3_bnqy48=.4380549b-ec98-43dd-ad72-4e1b5a64df64@github.com> Message-ID: On Wed, 18 Jun 2025 13:28:43 GMT, Joel Sikström wrote: > Hello, > > This RFE adds const accessors to the `Deferred` class. We plan on using this in a future patch in ZGC. Thanks! > > Testing: > * Currently running tier1-2 > * Works in a WIP ZGC patch LGTM and trivial ------------- Marked as reviewed by jsjolen (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25874#pullrequestreview-2939345058 From jsikstro at openjdk.org Wed Jun 18 14:10:34 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Wed, 18 Jun 2025 14:10:34 GMT Subject: Integrated: 8359923: Const accessors for the Deferred class In-Reply-To: <_F_xmX17xbS4DlqUOj8L-zrzeXZpsNiPFsN3_bnqy48=.4380549b-ec98-43dd-ad72-4e1b5a64df64@github.com> References: <_F_xmX17xbS4DlqUOj8L-zrzeXZpsNiPFsN3_bnqy48=.4380549b-ec98-43dd-ad72-4e1b5a64df64@github.com> Message-ID: On Wed, 18 Jun 2025 13:28:43 GMT, Joel Sikström wrote: > Hello, > > This RFE adds const accessors to the `Deferred` class. We plan on using this in a future patch in ZGC. Thanks! > > Testing: > * Currently running tier1-2 > * Works in a WIP ZGC patch This pull request has now been integrated. Changeset: 42d3604a Author: Joel Sikström URL: https://git.openjdk.org/jdk/commit/42d3604a31c4e5b5391468ee1d2c88c23c54c1d9 Stats: 13 lines in 1 file changed: 13 ins; 0 del; 0 mod 8359923: Const accessors for the Deferred class Reviewed-by: jsjolen ------------- PR: https://git.openjdk.org/jdk/pull/25874 From ayang at openjdk.org Wed Jun 18 14:11:05 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 18 Jun 2025 14:11:05 GMT Subject: RFR: 8359924: Deprecate and obsolete ParallelRefProcEnabled Message-ID: Deprecating `ParallelRefProcEnabled`, which is used only by Parallel and G1, and both have it enabled by default via: if (FLAG_IS_DEFAULT(ParallelRefProcEnabled) && ParallelGCThreads > 1) { FLAG_SET_DEFAULT(ParallelRefProcEnabled, true); } Disabling it offers little benefit and its presence incurs some implementation complexity in the reference-processor.
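To make the effect of the snippet above concrete, here is a small standalone model. It is only a sketch under stated assumptions: `BoolFlag`, `apply_ref_proc_ergonomics`, and `effective_setting` are hypothetical names, the real FLAG_IS_DEFAULT / FLAG_SET_DEFAULT macros are replaced by a plain struct, and the flag is assumed to start out as false when the user does not set it.

```cpp
#include <cassert>

// Hypothetical model of a VM flag: its value plus whether the user left it alone.
struct BoolFlag {
  bool value;
  bool is_default;
};

// Same shape as the quoted snippet: only change the flag when the user did
// not set it explicitly and more than one GC thread is available.
inline void apply_ref_proc_ergonomics(BoolFlag& parallel_ref_proc,
                                      unsigned parallel_gc_threads) {
  if (parallel_ref_proc.is_default && parallel_gc_threads > 1) {
    parallel_ref_proc.value = true;  // remains a default, not a user choice
  }
}

// Compute the setting that ends up in effect for a given command line.
// Assumption: the built-in default is false.
inline bool effective_setting(bool user_set, bool user_value, unsigned threads) {
  BoolFlag flag{user_set ? user_value : false, !user_set};
  apply_ref_proc_ergonomics(flag, threads);
  return flag.value;
}
```

In this model the flag ends up false only when a single GC thread is used or when the user explicitly disables it, which is the case the message argues offers little benefit.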
------------- Commit messages: - deprecate-flag Changes: https://git.openjdk.org/jdk/pull/25875/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25875&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359924 Stats: 19 lines in 3 files changed: 10 ins; 8 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25875.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25875/head:pull/25875 PR: https://git.openjdk.org/jdk/pull/25875 From jsikstro at openjdk.org Wed Jun 18 14:10:34 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Wed, 18 Jun 2025 14:10:34 GMT Subject: RFR: 8359923: Const accessors for the Deferred class In-Reply-To: References: <_F_xmX17xbS4DlqUOj8L-zrzeXZpsNiPFsN3_bnqy48=.4380549b-ec98-43dd-ad72-4e1b5a64df64@github.com> Message-ID: On Wed, 18 Jun 2025 13:54:44 GMT, Johan Sjölen wrote: >> Hello, >> >> This RFE adds const accessors to the `Deferred` class. We plan on using this in a future patch in ZGC. Thanks! >> >> Testing: >> * Currently running tier1-2 >> * Works in a WIP ZGC patch > > LGTM and trivial Thank you for the quick review @jdksjolen! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25874#issuecomment-2984370090 From tschatzl at openjdk.org Wed Jun 18 14:49:32 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 18 Jun 2025 14:49:32 GMT Subject: RFR: 8359924: Deprecate and obsolete ParallelRefProcEnabled In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 14:04:28 GMT, Albert Mingkun Yang wrote: > Deprecating `ParallelRefProcEnabled`, which is used only by Parallel and G1, and both have it enabled by default via: > > > if (FLAG_IS_DEFAULT(ParallelRefProcEnabled) && ParallelGCThreads > 1) { > FLAG_SET_DEFAULT(ParallelRefProcEnabled, true); > } > > > Disabling it offers little benefit and its presence incurs some implementation complexity in the reference-processor. I'm wondering whether we should make `ParallelRefProcBalancingEnabled` diagnostic at the same time? (Separately).
Same with `ReferencesPerThread`, again separately, but that does not need a CSR. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25875#pullrequestreview-2939558285 From coleenp at openjdk.org Wed Jun 18 14:53:55 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 18 Jun 2025 14:53:55 GMT Subject: [jdk25] RFR: 8352075: Perf regression accessing fields Message-ID: Hi all, This pull request contains a backport of commit [e18277b4](https://github.com/openjdk/jdk/commit/e18277b470a162b9668297e8e286c812c4b0b604) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by Radim Vansa on 12 Jun 2025 and was reviewed by Coleen Phillimore, Ioi Lam and Johan Sjölen. This has been running cleanly in CI for a week now. Thanks! ------------- Commit messages: - Backport e18277b470a162b9668297e8e286c812c4b0b604 Changes: https://git.openjdk.org/jdk/pull/25877/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25877&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8352075 Stats: 924 lines in 18 files changed: 854 ins; 20 del; 50 mod Patch: https://git.openjdk.org/jdk/pull/25877.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25877/head:pull/25877 PR: https://git.openjdk.org/jdk/pull/25877 From rvansa at openjdk.org Wed Jun 18 15:44:28 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Wed, 18 Jun 2025 15:44:28 GMT Subject: [jdk25] RFR: 8352075: Perf regression accessing fields In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 14:44:54 GMT, Coleen Phillimore wrote: > Hi all, > > This pull request contains a backport of commit [e18277b4](https://github.com/openjdk/jdk/commit/e18277b470a162b9668297e8e286c812c4b0b604) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Radim Vansa on 12 Jun 2025 and was reviewed by Coleen Phillimore, Ioi Lam and Johan Sjölen.
> > This has been running cleanly in CI for a week now. > > Thanks! Thank you for the backport! @shipilev indicated that the backport to 21 should wait a bit, could you clarify when should I file that (e.g. end of July, ...)? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25877#issuecomment-2984754129 From kbarrett at openjdk.org Wed Jun 18 15:52:09 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 18 Jun 2025 15:52:09 GMT Subject: RFR: 8359227: Code cache/heap size options should be size_t [v3] In-Reply-To: References: Message-ID: > Please review this change that makes the various code cache/heap size options > consistently be of type size_t. > > The shared declarations for these options were all uintx. These options all > may have platform-defined values. Some of those platform-specific definitions > were uintx, some were size_t, and some were intx(!). This change makes them > all consistently size_t. > > More details in the first comment. > > Testing: mach5 tier1-6 > GHA testing in-progress Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: - Merge branch 'master' into code-cache-sizes - update copyrights - remove leftover include - fix whitebox access to code cache size configs - VMPageSizeConstraintFunc - CodeCacheMinBlockLength - CodeCacheExpansionSize - various CodeHeapSize options - CodeCacheMinimumUseSpace - Initial/ReservedCodeCacheSize - ... 
and 1 more: https://git.openjdk.org/jdk/compare/ba32b78b...78f1f7b9 ------------- Changes: https://git.openjdk.org/jdk/pull/25791/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25791&range=02 Stats: 190 lines in 40 files changed: 1 ins; 0 del; 189 mod Patch: https://git.openjdk.org/jdk/pull/25791.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25791/head:pull/25791 PR: https://git.openjdk.org/jdk/pull/25791 From kbarrett at openjdk.org Wed Jun 18 15:55:34 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 18 Jun 2025 15:55:34 GMT Subject: RFR: 8359227: Code cache/heap size options should be size_t [v3] In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 15:52:09 GMT, Kim Barrett wrote: >> Please review this change that makes the various code cache/heap size options >> consistently be of type size_t. >> >> The shared declarations for these options were all uintx. These options all >> may have platform-defined values. Some of those platform-specific definitions >> were uintx, some were size_t, and some were intx(!). This change makes them >> all consistently size_t. >> >> More details in the first comment. >> >> Testing: mach5 tier1-6 >> GHA testing in-progress > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: > > - Merge branch 'master' into code-cache-sizes > - update copyrights > - remove leftover include > - fix whitebox access to code cache size configs > - VMPageSizeConstraintFunc > - CodeCacheMinBlockLength > - CodeCacheExpansionSize > - various CodeHeapSize options > - CodeCacheMinimumUseSpace > - Initial/ReservedCodeCacheSize > - ... and 1 more: https://git.openjdk.org/jdk/compare/ba32b78b...78f1f7b9 Sigh, there are new merge conflicts after the one I just pushed a fix for. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25791#issuecomment-2984799914 From kbarrett at openjdk.org Wed Jun 18 17:05:48 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 18 Jun 2025 17:05:48 GMT Subject: RFR: 8359227: Code cache/heap size options should be size_t [v4] In-Reply-To: References: Message-ID: <1TwILxUixcdyTvD3KiI_tZwLiFAVHknRWlab7zxntxE=.0c1fad26-0980-43fc-9125-de686a9ad196@github.com> > Please review this change that makes the various code cache/heap size options > consistently be of type size_t. > > The shared declarations for these options were all uintx. These options all > may have platform-defined values. Some of those platform-specific definitions > were uintx, some were size_t, and some were intx(!). This change makes them > all consistently size_t. > > More details in the first comment. > > Testing: mach5 tier1-6 > GHA testing in-progress Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: - Merge branch 'master' into code-cache-sizes - Merge branch 'master' into code-cache-sizes - update copyrights - remove leftover include - fix whitebox access to code cache size configs - VMPageSizeConstraintFunc - CodeCacheMinBlockLength - CodeCacheExpansionSize - various CodeHeapSize options - CodeCacheMinimumUseSpace - ... 
and 2 more: https://git.openjdk.org/jdk/compare/984d7f9c...56471cc9 ------------- Changes: https://git.openjdk.org/jdk/pull/25791/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25791&range=03 Stats: 190 lines in 40 files changed: 1 ins; 0 del; 189 mod Patch: https://git.openjdk.org/jdk/pull/25791.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25791/head:pull/25791 PR: https://git.openjdk.org/jdk/pull/25791 From kbarrett at openjdk.org Wed Jun 18 17:05:49 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 18 Jun 2025 17:05:49 GMT Subject: RFR: 8359227: Code cache/heap size options should be size_t [v3] In-Reply-To: References: Message-ID: <1Xar9J1Key5fMJQYreKsIGnifrHsiXdCRN8p4SyLE60=.8c7accda-34c5-424d-9b75-16f0538e95c5@github.com> On Wed, 18 Jun 2025 15:52:09 GMT, Kim Barrett wrote: >> Please review this change that makes the various code cache/heap size options >> consistently be of type size_t. >> >> The shared declarations for these options were all uintx. These options all >> may have platform-defined values. Some of those platform-specific definitions >> were uintx, some were size_t, and some were intx(!). This change makes them >> all consistently size_t. >> >> More details in the first comment. >> >> Testing: mach5 tier1-6 >> GHA testing in-progress > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: > > - Merge branch 'master' into code-cache-sizes > - update copyrights > - remove leftover include > - fix whitebox access to code cache size configs > - VMPageSizeConstraintFunc > - CodeCacheMinBlockLength > - CodeCacheExpansionSize > - various CodeHeapSize options > - CodeCacheMinimumUseSpace > - Initial/ReservedCodeCacheSize > - ... and 1 more: https://git.openjdk.org/jdk/compare/ba32b78b...78f1f7b9 Sorry to request re-review, but there was a merge conflict, which I fixed, and then another merge conflict appeared while I was testing that one. 
Sigh :( ------------- PR Comment: https://git.openjdk.org/jdk/pull/25791#issuecomment-2985056595 From tschatzl at openjdk.org Wed Jun 18 17:25:32 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 18 Jun 2025 17:25:32 GMT Subject: RFR: 8359227: Code cache/heap size options should be size_t [v4] In-Reply-To: <1TwILxUixcdyTvD3KiI_tZwLiFAVHknRWlab7zxntxE=.0c1fad26-0980-43fc-9125-de686a9ad196@github.com> References: <1TwILxUixcdyTvD3KiI_tZwLiFAVHknRWlab7zxntxE=.0c1fad26-0980-43fc-9125-de686a9ad196@github.com> Message-ID: <73djG5ff1jz5aqEiJ-LuYcpdUrpUJIbG0KEU6bEg4XE=.3816144f-12ca-4ca0-ad63-57dd1de84188@github.com> On Wed, 18 Jun 2025 17:05:48 GMT, Kim Barrett wrote: >> Please review this change that makes the various code cache/heap size options >> consistently be of type size_t. >> >> The shared declarations for these options were all uintx. These options all >> may have platform-defined values. Some of those platform-specific definitions >> were uintx, some were size_t, and some were intx(!). This change makes them >> all consistently size_t. >> >> More details in the first comment. >> >> Testing: mach5 tier1-6 >> GHA testing in-progress > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: > > - Merge branch 'master' into code-cache-sizes > - Merge branch 'master' into code-cache-sizes > - update copyrights > - remove leftover include > - fix whitebox access to code cache size configs > - VMPageSizeConstraintFunc > - CodeCacheMinBlockLength > - CodeCacheExpansionSize > - various CodeHeapSize options > - CodeCacheMinimumUseSpace > - ... and 2 more: https://git.openjdk.org/jdk/compare/984d7f9c...56471cc9 Marked as reviewed by tschatzl (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25791#pullrequestreview-2940105073 From shade at openjdk.org Wed Jun 18 17:30:30 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 18 Jun 2025 17:30:30 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added In-Reply-To: References: Message-ID: <1NT3dmBpabeSJ0HMglupev2ONGKaMH9XuMKZYiBwqZw=.32eb98c0-2ed1-46ef-b4b1-166e7d3f791d@github.com> On Mon, 16 Jun 2025 03:30:40 GMT, Ioi Lam wrote: > Background: when writing the string table in the AOT cache, we do this: > > 1. Find out the number of strings in the interned string table > 2. Allocate Java object arrays that are large enough to store these strings. These arrays are used by `StringTable::lookup_shared()` in the production run. > 3. Enter safepoint > 4. Copy the strings into the arrays > > This bug happened because: > > - Step 1 is not thread safe, so it may be reading a stale version of `_items_count` > - JIT compiler threads may create more interned strings after step 1 > > This PR attempts to fix both issues. I still dislike hooking up to compiler infrastructure to figure out if something is adding interned strings. I really, really dislike the divergence we would introduce with JDK 25 -> JDK 26 once a variant of [JDK-8357473](https://bugs.openjdk.org/browse/JDK-8357473) lands in mainline. I cannot yet think of better solution though, let me think about it some more. At very least we need to get the sequencing of patches right... 
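For readers following the thread, the four steps in the quoted background can be modeled in a few lines. The sketch below is a generic illustration, not the approach taken by the PR: `InternTable`, `dump`, and `stale_count_is_detected` are hypothetical names, a single mutex stands in for the safepoint, and the remedy shown (re-check the size at copy time and retry) is just one common way to handle a stale count.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical interned-string table; one mutex stands in for the safepoint.
class InternTable {
 public:
  void intern(const std::string& s) {
    std::lock_guard<std::mutex> guard(_lock);
    _items.insert(s);
  }

  // Step 1: a snapshot of the size, which may be stale once it is returned.
  std::size_t count() const {
    std::lock_guard<std::mutex> guard(_lock);
    return _items.size();
  }

  // Steps 3 and 4: with additions blocked, check that the planned capacity
  // still fits before copying. Returns false if strings were added meanwhile.
  bool copy_into(std::vector<std::string>& out, std::size_t capacity) const {
    std::lock_guard<std::mutex> guard(_lock);
    if (_items.size() > capacity) {
      return false;
    }
    out.assign(_items.begin(), _items.end());
    return true;
  }

 private:
  mutable std::mutex _lock;
  std::unordered_set<std::string> _items;
};

// Size, allocate, and copy; start over if the count went stale in between.
inline std::vector<std::string> dump(const InternTable& table) {
  std::vector<std::string> out;
  for (;;) {
    std::size_t capacity = table.count();  // step 1
    out.reserve(capacity);                 // step 2
    if (table.copy_into(out, capacity)) {  // steps 3 and 4
      return out;
    }
  }
}

// Reproduces the failure mode: a string is added after the table was sized.
inline bool stale_count_is_detected() {
  InternTable table;
  table.intern("a");
  std::size_t capacity = table.count();  // sized for one string
  table.intern("b");                     // late addition, e.g. from a JIT thread
  std::vector<std::string> out;
  bool fits = table.copy_into(out, capacity);
  return !fits && dump(table).size() == 2;
}
```

The key point is that the size check and the copy happen under the same exclusion, so a count taken earlier is treated as a hint rather than as a guarantee.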
------------- PR Comment: https://git.openjdk.org/jdk/pull/25816#issuecomment-2985136083 From mhaessig at openjdk.org Wed Jun 18 17:31:30 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Wed, 18 Jun 2025 17:31:30 GMT Subject: RFR: 8359227: Code cache/heap size options should be size_t [v4] In-Reply-To: <1TwILxUixcdyTvD3KiI_tZwLiFAVHknRWlab7zxntxE=.0c1fad26-0980-43fc-9125-de686a9ad196@github.com> References: <1TwILxUixcdyTvD3KiI_tZwLiFAVHknRWlab7zxntxE=.0c1fad26-0980-43fc-9125-de686a9ad196@github.com> Message-ID: <2t4FoYTdMjLMiVoacfLjNtgyBdPw7SxPNiDfSakE-OA=.c89fe100-8516-4928-894a-15abdbbea2d0@github.com> On Wed, 18 Jun 2025 17:05:48 GMT, Kim Barrett wrote: >> Please review this change that makes the various code cache/heap size options >> consistently be of type size_t. >> >> The shared declarations for these options were all uintx. These options all >> may have platform-defined values. Some of those platform-specific definitions >> were uintx, some were size_t, and some were intx(!). This change makes them >> all consistently size_t. >> >> More details in the first comment. >> >> Testing: mach5 tier1-6 >> GHA testing in-progress > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: > > - Merge branch 'master' into code-cache-sizes > - Merge branch 'master' into code-cache-sizes > - update copyrights > - remove leftover include > - fix whitebox access to code cache size configs > - VMPageSizeConstraintFunc > - CodeCacheMinBlockLength > - CodeCacheExpansionSize > - various CodeHeapSize options > - CodeCacheMinimumUseSpace > - ... and 2 more: https://git.openjdk.org/jdk/compare/984d7f9c...56471cc9 Marked as reviewed by mhaessig (Author).
------------- PR Review: https://git.openjdk.org/jdk/pull/25791#pullrequestreview-2940118176 From kvn at openjdk.org Wed Jun 18 17:46:29 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 18 Jun 2025 17:46:29 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added In-Reply-To: <1NT3dmBpabeSJ0HMglupev2ONGKaMH9XuMKZYiBwqZw=.32eb98c0-2ed1-46ef-b4b1-166e7d3f791d@github.com> References: <1NT3dmBpabeSJ0HMglupev2ONGKaMH9XuMKZYiBwqZw=.32eb98c0-2ed1-46ef-b4b1-166e7d3f791d@github.com> Message-ID: On Wed, 18 Jun 2025 17:27:29 GMT, Aleksey Shipilev wrote: > I still dislike hooking up to compiler infrastructure to figure out if something is adding interned strings. I really, really dislike the divergence we would introduce with JDK 25 -> JDK 26 once a variant of [JDK-8357473](https://bugs.openjdk.org/browse/JDK-8357473) lands in mainline. I cannot yet think of better solution though, let me think about it some more. At very least we need to get the sequencing of patches right... In the long term I want the VM to record all used C strings in a global table, so that AOT code can use indexes into it instead of pointers to strings: https://bugs.openjdk.org/browse/JDK-8337519 ------------- PR Comment: https://git.openjdk.org/jdk/pull/25816#issuecomment-2985179989 From matsaave at openjdk.org Wed Jun 18 17:49:31 2025 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 18 Jun 2025 17:49:31 GMT Subject: RFR: 8359920: Use names for frame types in stackmaps In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 12:13:38 GMT, Coleen Phillimore wrote: > This uses names for frame types for stackmaps in the verifier and redefinition. > Tested with tier1-7. LGTM, thanks! ------------- Marked as reviewed by matsaave (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25870#pullrequestreview-2940165788 From sspitsyn at openjdk.org Wed Jun 18 17:55:34 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 18 Jun 2025 17:55:34 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v5] In-Reply-To: <4uEo1C_csdAppvNYFqU1JtgADY-m4as2wBIq1kVq_GA=.6125ab99-8b92-410c-9257-852d8e7eb47e@github.com> References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> <4uEo1C_csdAppvNYFqU1JtgADY-m4as2wBIq1kVq_GA=.6125ab99-8b92-410c-9257-852d8e7eb47e@github.com> Message-ID: On Wed, 18 Jun 2025 12:37:07 GMT, Johan Sjölen wrote: >> Hi, >> >> The constant pool currently has a lot of methods specific to extracting parts of the operands array. What this array actually is, is a sequence of bootstrap method attribute entries, where each entry has the following components: >> >> ```c++ >> struct BSMAE { >> u2 bootstrap_method_index; >> u2 argument_count; >> u2 arguments[argument_count]; >> } >> ``` >> >> We can remove some of these operands array specific methods, and instead allow you to extract BSMAttributeEntrys which you can then use to extract their piecewise components. This makes for a nicer interface, and a bit easier to come into as a reader of the code, as it more closely mirrors the JVMS. >> >> Please consider! >> >> Testing: Currently GHA, running tier1-tier3 > > Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: > > - Matias's comments > - Apply Sergei's comments src/hotspot/share/oops/constantPool.cpp line 1944: > 1942: BSMAttributeEntry* e2 = bsm_attribute_entry(idx2); > 1943: int k1 = e1->bootstrap_method_index(); > 1944: int k2 = cp2->e2->bootstrap_method_index(); I'm kind of confused; this does not look right. It is not even going to compile. It is supposed to be as below: BSMAttributeEntry* e2 = cp2->bsm_attribute_entry(idx2); . . . int k2 = e2->bootstrap_method_index(); .
. . if (argc == e2->argument_count()) { . . . k2 = e2->argument_index(j); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25298#discussion_r2155183220 From kvn at openjdk.org Wed Jun 18 17:56:33 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 18 Jun 2025 17:56:33 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added In-Reply-To: References: Message-ID: <7ZkEZYs5j6Wap4EpD0mZ8FUFv3gvMG-89ozhawptCwo=.93285641-9b37-4fad-9f57-632a58257f6d@github.com> On Mon, 16 Jun 2025 03:30:40 GMT, Ioi Lam wrote: > Background: when writing the string table in the AOT cache, we do this: > > 1. Find out the number of strings in the interned string table > 2. Allocate Java object arrays that are large enough to store these strings. These arrays are used by `StringTable::lookup_shared()` in the production run. > 3. Enter safepoint > 4. Copy the strings into the arrays > > This bug happened because: > > - Step 1 is not thread safe, so it may be reading a stale version of `_items_count` > - JIT compiler threads may create more interned strings after step 1 > > This PR attempts to fix both issues. 
I thought your comment was for https://github.com/openjdk/jdk/pull/25841 ------------- PR Comment: https://git.openjdk.org/jdk/pull/25816#issuecomment-2985211955 From kbarrett at openjdk.org Wed Jun 18 18:08:38 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 18 Jun 2025 18:08:38 GMT Subject: RFR: 8359227: Code cache/heap size options should be size_t [v4] In-Reply-To: <2t4FoYTdMjLMiVoacfLjNtgyBdPw7SxPNiDfSakE-OA=.c89fe100-8516-4928-894a-15abdbbea2d0@github.com> References: <1TwILxUixcdyTvD3KiI_tZwLiFAVHknRWlab7zxntxE=.0c1fad26-0980-43fc-9125-de686a9ad196@github.com> <2t4FoYTdMjLMiVoacfLjNtgyBdPw7SxPNiDfSakE-OA=.c89fe100-8516-4928-894a-15abdbbea2d0@github.com> Message-ID: <_xv1ASkK38AMhxbRvFFOUUDMQa_nDUB3_pr8GMBq2Jc=.6ca5ef0f-fd83-4d05-83eb-dd8be2ca41a4@github.com> On Wed, 18 Jun 2025 17:28:31 GMT, Manuel Hässig wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: >> >> - Merge branch 'master' into code-cache-sizes >> - Merge branch 'master' into code-cache-sizes >> - update copyrights >> - remove leftover include >> - fix whitebox access to code cache size configs >> - VMPageSizeConstraintFunc >> - CodeCacheMinBlockLength >> - CodeCacheExpansionSize >> - various CodeHeapSize options >> - CodeCacheMinimumUseSpace >> - ... and 2 more: https://git.openjdk.org/jdk/compare/984d7f9c...56471cc9 > > Marked as reviewed by mhaessig (Author).
Thanks for reviews @mhaessig and @tschatzl ------------- PR Comment: https://git.openjdk.org/jdk/pull/25791#issuecomment-2985241813 From kbarrett at openjdk.org Wed Jun 18 18:08:38 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 18 Jun 2025 18:08:38 GMT Subject: Integrated: 8359227: Code cache/heap size options should be size_t In-Reply-To: References: Message-ID: On Fri, 13 Jun 2025 06:57:20 GMT, Kim Barrett wrote: > Please review this change that makes the various code cache/heap size options > consistently be of type size_t. > > The shared declarations for these options were all uintx. These options all > may have platform-defined values. Some of those platform-specific definitions > were uintx, some were size_t, and some were intx(!). This change makes them > all consistently size_t. > > More details in the first comment. > > Testing: mach5 tier1-6 > GHA testing in-progress This pull request has now been integrated. Changeset: 7bc0d824 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/7bc0d82450e210b14c9f89687582d78a0a50ee54 Stats: 190 lines in 40 files changed: 1 ins; 0 del; 189 mod 8359227: Code cache/heap size options should be size_t Reviewed-by: mhaessig, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/25791 From eastigeevich at openjdk.org Wed Jun 18 19:44:27 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Wed, 18 Jun 2025 19:44:27 GMT Subject: RFR: 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB [v2] In-Reply-To: <8nXpApdLxXidwKfFpcVbKjpYgOn5EfhUvKNQRKvv2o0=.252bc291-3219-4d77-9a4d-8fe75952c2f6@github.com> References: <8nXpApdLxXidwKfFpcVbKjpYgOn5EfhUvKNQRKvv2o0=.252bc291-3219-4d77-9a4d-8fe75952c2f6@github.com> Message-ID: On Wed, 18 Jun 2025 07:35:47 GMT, Benoît Maillard wrote: >> Yes we use -std=c++14, but creating a negative value in this way still feels like a kind of overflow to me. > > Thanks for the comments!
> > I added the assert because the issue in the JBS mentioned a specific case where we ended up with negative values. > > Should I leave it like this, or rather convert it to a more specific check (i.e. making sure that the `LogBytesPerLong - log2_esize` most significant bits are not used **before** shifting)? IMO your assert is obfuscating the overflow problem. I think the assert should be before doing the shift. It can be like: assert((fast_size_limit == 0) || (count_leading_zeros(fast_size_limit) > (LogBytesPerLong - log2_esize)), "fast_size_limit (%d) overflow when shifted left by %d", fast_size_limit, (LogBytesPerLong - log2_esize)); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25834#discussion_r2155369775 From mdoerr at openjdk.org Wed Jun 18 19:59:29 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 18 Jun 2025 19:59:29 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v8] In-Reply-To: <2gdPUDg-i40xRoO8LZPWszL2-oa6s7GNZEDRfHfDk8s=.4dbfe74a-070b-46c1-b73d-0292824d02e9@github.com> References: <2gdPUDg-i40xRoO8LZPWszL2-oa6s7GNZEDRfHfDk8s=.4dbfe74a-070b-46c1-b73d-0292824d02e9@github.com> Message-ID: <7dqmRsrYN1O1zVQSYvFbVpmjSSABHnw7dL8QXSkrOtE=.eceed621-7482-436e-a999-73b6d2611616@github.com> On Tue, 17 Jun 2025 20:59:46 GMT, Dean Long wrote: >> This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. >> >> We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier.
>> >> The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. >> >> For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. >> >> This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer, controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer is controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > 2nd try at arm fix Unfortunately, there's a merge conflict now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-2985526434 From mdoerr at openjdk.org Wed Jun 18 19:59:30 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 18 Jun 2025 19:59:30 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 11:34:02 GMT, Erik Österlund wrote: >> If I understand your concern correctly, there is no race. The only caller of BarrierSetNMethod::make_not_entrant() is nmethod::make_not_entrant(), and it is done inside a NMethodState_lock critical section. After a call to nmethod::make_not_entrant(), the nmethod entry barrier is armed and stays that way.
>> And by design, a disarm only disarms at the inner nmethod_entry_barrier level, not the outer nmethod_stub_entry_barrier level. > > My concern is that while thread 1 calls nmethod::make_not_entrant(), thread 2 racingly performs nmethod entry barrier; it makes the is_not_entrant check before it gets updated, but then it gets updated as the per nmethod lock is taken. The GC code "disarms" the GC barrier but in doing so finds that "oh this should be not entrant", but that's sort of not reflected as thread 2 will then proceed with entering the nmethod it just armed as not entrant in the nmethod entry barrier code. > Does that make sense? Doesn't the old code have the same limitation? If thread 1 patches the entry point after thread 2 has executed the first instruction, thread 2 will be inside the nmethod if GC has disarmed the nmethod entry barrier. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2155390909 From eosterlund at openjdk.org Wed Jun 18 20:28:29 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Wed, 18 Jun 2025 20:28:29 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 19:56:27 GMT, Martin Doerr wrote: >> My concern is that while thread 1 calls nmethod::make_not_entrant(), thread 2 racingly performs nmethod entry barrier; it makes the is_not_entrant check before it gets updated, but then it gets updated as the per nmethod lock is taken. The GC code "disarms" the GC barrier but in doing so finds that "oh this should be not entrant", but that's sort of not reflected as thread 2 will then proceed with entering the nmethod it just armed as not entrant in the nmethod entry barrier code. >> Does that make sense? > > Doesn't the old code have the same limitation?
If thread 1 patches the entry point after thread 2 has executed the first instruction, thread 2 will be inside the nmethod if GC has disarmed the nmethod entry barrier. Well, yeah sort of. And hence the comment that it's probably fine in terms of correctness. They were also a bit more independent systems then though. Just thought that if we now take the step to merge compiler and GC entry trap mechanisms into the nmethod entry barrier, that we could seemingly also make it a bit less slippery here and establish some sort of invariant that if we while holding the lock protecting the entry barrier find that the nmethod entry barrier is not entrant, for whatever reason, we should not enter it. Would make it easier to understand the code I suspect. What do you think? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2155434971 From kbarrett at openjdk.org Wed Jun 18 21:24:28 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 18 Jun 2025 21:24:28 GMT Subject: RFR: 8359924: Deprecate and obsolete ParallelRefProcEnabled In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 14:04:28 GMT, Albert Mingkun Yang wrote: > Deprecating `ParallelRefProcEnabled`, which is used only by Parallel and G1, and both have it enabled by default via: > > > if (FLAG_IS_DEFAULT(ParallelRefProcEnabled) && ParallelGCThreads > 1) { > FLAG_SET_DEFAULT(ParallelRefProcEnabled, true); > } > > > Disabling it offers little benefit and its presence incurs some implementation complexity in the reference-processor. Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25875#pullrequestreview-2940716200 From kbarrett at openjdk.org Wed Jun 18 21:29:27 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 18 Jun 2025 21:29:27 GMT Subject: RFR: 8359924: Deprecate and obsolete ParallelRefProcEnabled In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 14:47:13 GMT, Thomas Schatzl wrote: > I'm wondering whether we should make `ParallelRefProcBalancingEnabled` diagnostic at the same time? (Separately). I think it's too late for that. We can only deprecate a product option in the current release, we can't change it to diagnostic (which is removing it from being available for normal usage). We could deprecate now with the intent of going to diagnostic rather than removal, but I'd rather we just deprecate for removal, and remove in the next release. > Same with `ReferencesPerThread`, again separately, but that does not need a CSR. We could remove ReferencesPerThread immediately, without first deprecating, since it's experimental. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25875#issuecomment-2985742153 From sspitsyn at openjdk.org Wed Jun 18 22:31:32 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 18 Jun 2025 22:31:32 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v7] In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 12:46:56 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. 
>> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add a basic gtest. The update looks good. I've posted several nits though. src/hotspot/share/classfile/classLoaderData.cpp line 605: > 603: JmethodIDTable::remove(mid); > 604: } > 605: delete _jmethod_ids; Nit: I'd null out the `_jmethod_ids` field. src/hotspot/share/memory/universe.cpp line 440: > 438: vmSymbols::initialize(); > 439: > 440: // Initialize table for matching jmethodID, before SystemDictionary Nit: Dot is missed. src/hotspot/share/oops/instanceKlass.cpp line 4280: > 4278: } > 4279: > 4280: // This nulls out jmethodIDs for all obsolete methods in the previous version of the 'klass' Nit: Dot is missed. ------------- Marked as reviewed by sspitsyn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25267#pullrequestreview-2940857934 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2155610947 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2155613281 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2155617141 From sspitsyn at openjdk.org Wed Jun 18 22:31:33 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 18 Jun 2025 22:31:33 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v4] In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 12:00:43 GMT, Coleen Phillimore wrote: >> src/hotspot/share/oops/jmethodIDTable.hpp line 48: >> >>> 46: static void remove(jmethodID mid); >>> 47: >>> 48: // RedefineClasses support >> >> Nit: Add a dot at the end of the comment for consistency with other comments. > > This isn't a sentence and this pattern is in the sources in a lot of places, none of these places have a period at the end. Okay ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2155619399 From dlong at openjdk.org Wed Jun 18 23:08:32 2025 From: dlong at openjdk.org (Dean Long) Date: Wed, 18 Jun 2025 23:08:32 GMT Subject: RFR: 8358329: AArch64: emit direct branches in static stubs for small code caches [v3] In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 15:30:48 GMT, Mikhail Ablakatov wrote: >> In the A64 ISA, the B (direct branch) instruction can encode a target within a ±128MB range relative to the instruction. Due to this limitation, when generating static stubs, HotSpot conservatively emits indirect branches for calls to c2i interface stubs. These indirect branches are implemented using a four-instruction sequence: three instructions to materialize the target address in a register, followed by a BR instruction to perform the jump.
>> >> This patch optimizes static stub generation when the code cache is small enough to guarantee that the target entry point of the c2i interface stub lies within the direct branch range. In such cases, a single direct B instruction can be used instead of the indirect sequence, saving 3 instructions (12 bytes) per static stub. >> >> Below is an example of the optimization's impact, measured using the movie-lens benchmark from the Renaissance benchmark suite: >> >> | Metric | Before | After | Difference | >> |-------------|---------------|---------------|------------| >> | totalInHeap | Avg: 1883.875 | Avg: 1871.667 | -0.65% | >> | | Sum: 6653848 | Sum: 6616344 | -0.56% | >> | stubCode | Avg: 103.164 | Avg: 87.285 | -15.38% | >> | | Sum: 364376 | Sum: 308552 | -15.33% | >> >> Full jtreg passed on AArch64. > > Mikhail Ablakatov has updated the pull request incrementally with one additional commit since the last revision: > > cleanup: update a copyright notice > > Co-authored-by: Andrew Haley This is causing failures in Oracle tier5 testing. See JDK-8359963. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25702#issuecomment-2985983895 From dlong at openjdk.org Thu Jun 19 00:22:30 2025 From: dlong at openjdk.org (Dean Long) Date: Thu, 19 Jun 2025 00:22:30 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 20:25:44 GMT, Erik Österlund wrote: >> Doesn't the old code have the same limitation? If thread 1 patches the entry point after thread 2 has executed the first instruction, thread 2 will be inside the nmethod if GC has disarmed the nmethod entry barrier. > > Well, yeah sort of. And hence the comment that it's probably fine in terms of correctness. They were also a bit more independent systems then though.
Just thought that if we now take the step to merge compiler and GC entry trap mechanisms into the nmethod entry barrier, that we could seemingly also make it a bit less slippery here and establish some sort of invariant that if we while holding the lock protecting the entry barrier find that the nmethod entry barrier is not entrant, for whatever reason, we should not enter it. Would make it easier to understand the code I suspect. What do you think? I think making it less slippery in one place but still leaving other races gives a false sense of security and makes the code harder to understand. Arming the barrier is not guaranteed to be visible until there is a safepoint. Note that AArch64 and RISCV only call increment_patching_epoch() when the guard value is set to the disarmed value, so there is no invalidation of the CPU pipeline or instruction buffer (cross modification fence) when arming. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2155757120 From dholmes at openjdk.org Thu Jun 19 00:44:30 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 19 Jun 2025 00:44:30 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v7] In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 12:46:56 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. >> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. 
With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add a basic gtest. Still working my way through this ... Just to be sure I have the correct mental picture of this, things work as follow: - a `jMethodID` is just a unique integer index to represent a `Method` for use with JNI - when we allocate a new `jMethodID` to a `Method` we - add that mapping to the global table - add an entry to the `ClassLoaderData`'s `jMethodID` "list" - each `instanceklass` also maintains a "cache" of its own `jMethodID` mappings (some of which need updating on method redefinition) - when a class is unloaded we deallocate the cache, deallocate the CLD list, and remove the table entries - when a JNI API is presented with a `jMethodID` by the caller, it validates it by looking it up in the table, to see if the mapping exists The cache is created under a lock, and `jMethodID`s are created and added to the cache (and the table, and the CLD list) under the same lock. But access to the cache is lock-free using acquire/release. The cache is also destroyed under the same lock, and the table entries removed under that same lock, and the CLD list deallocated. Is the above correct? What details have I missed? 
Thanks src/hotspot/share/oops/instanceKlass.cpp line 2395: > 2393: > 2394: // Allocate the jmethodID cache. > 2395: static jmethodID* create_jmethod_id_cache(size_t size) { Why isn't this used at line 2439 to create the (initial?) cache? src/hotspot/share/oops/jmethodIDTable.cpp line 40: > 38: static uint64_t _jmethodID_counter = 0; > 39: // Tracks the number of jmethodID entries in the _jmethod_id_table. > 40: // Incremented on insert, decremented on remove. Use to track if we need to resize the table. Suggestion: // Incremented on insert, decremented on remove. Used to track if we need to resize the table. src/hotspot/share/oops/jmethodIDTable.cpp line 141: > 139: // Update jmethodID global counter. > 140: _jmethodID_counter++; > 141: guarantee(_jmethodID_counter != 0, "must never go back to zero"); Again a guarantee is not needed here as the only possible way this could trigger is if we initialize it incorrectly./ ------------- PR Review: https://git.openjdk.org/jdk/pull/25267#pullrequestreview-2940758855 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2155747747 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2155550987 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2155763677 From dholmes at openjdk.org Thu Jun 19 00:48:37 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 19 Jun 2025 00:48:37 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v7] In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 12:46:56 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. 
>> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add a basic gtest. I wrote above: > when a JNI API is presented with a jMethodID by the caller, it validates it by looking it up in the table, to see if the mapping exists This still seems racy though. What if the lookup succeeds but at the same time the class is to be unloaded? 
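For readers following the mental model above, here is a minimal single-threaded sketch of the lookup-table idea: an ID is an opaque counter value, the table maps it to a method, and resolving an ID whose entry has been removed yields nullptr instead of a dangling pointer. Everything in it is a hypothetical stand-in (`ToyJmethodIDTable`, `make_id`, `resolve`, the one-field `Method`); it is not the HotSpot `JmethodIDTable` or `ConcurrentHashTable`, it takes no locks, and it does not address the lookup-versus-unload race raised in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Stand-in for a VM method; hypothetical, only here to have something to point at.
struct Method {
  int idnum;
};

// Toy table: IDs are never reused, and a removed ID simply stops resolving.
class ToyJmethodIDTable {
  std::unordered_map<uint64_t, Method*> _table;
  uint64_t _counter = 0;  // 0 is never handed out, so it can mean "no ID"

 public:
  // Allocate a fresh ID for m and record the mapping.
  uint64_t make_id(Method* m) {
    assert(m != nullptr);
    uint64_t id = ++_counter;
    _table.emplace(id, m);
    return id;
  }

  // Validate an ID presented by a caller: nullptr if the mapping is gone.
  Method* resolve(uint64_t id) const {
    auto it = _table.find(id);
    return it == _table.end() ? nullptr : it->second;
  }

  // Called when the owning class goes away (class unloading in the real VM).
  void remove(uint64_t id) { _table.erase(id); }

  std::size_t size() const { return _table.size(); }
};
```

In this toy, a caller that holds on to an ID after `remove` gets nullptr back from `resolve`, which is the use-after-unload behaviour the PR description says it wants to preserve while still freeing the storage.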
------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-2986150357 From dholmes at openjdk.org Thu Jun 19 01:23:32 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 19 Jun 2025 01:23:32 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: <8_CoRfbkgRom4_8wkrguW6tldAsW1L3EZus3tchTElM=.3bddb374-5d98-4f7b-9d9c-5f413b3c23aa@github.com> References: <2ddvhqT5lC6N6SZ80TDN1HDfHjcrFYL7aLInjMZw9Po=.2c3c7b91-ed39-4420-9b90-33580a5d9a51@github.com> <-M2-0zU6PnzfWIcFizGXyg9xeVGVrJVRvJVkHDy1-w0=.0702833c-7ed6-41bf-8c49-dc0bb2e1503b@github.com> <8_CoRfbkgRom4_8wkrguW6tldAsW1L3EZus3tchTElM=.3bddb374-5d98-4f7b-9d9c-5f413b3c23aa@github.com> Message-ID: On Wed, 18 Jun 2025 08:16:36 GMT, Kim Barrett wrote: >>> `+ // We know that k+n <= (int)0x7fe, and might be negative if n is negative.` >> >> It can be negative if `n` is positive too. > >> But that is not correct - we should only take this "overflow" path for >> `((k+n) > 0x7fe && (k+n) <= INT_MAX)`. Your suggestion makes us take this >> path if `(k+n)` overflows to negative. ?? > > It is intentional that the new test is true for the `(k+n)` => overflow case. > It fully handles the overflow case, eliminating the need for the later fixup > of the case where `((k <= -54) && (n > 5000))` (though `(n > 0)` would work > just as well; I don't know why that `5000` was inserted). That fixup returned > the same `hugeX`-based result as here. > >> It can be negative if n is positive too. > > `(k+n)` cannot be negative with `n` positive here, even under wrapping > semantics, because we can't get here in that case due to the prior overflow > detection. Okay @kimbarrett I suggest that you take over this issue and PR. I do not have the knowledge of the code that you have and I cannot affirm that your statements are correct. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2155825085 From dholmes at openjdk.org Thu Jun 19 01:23:32 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 19 Jun 2025 01:23:32 GMT Subject: Withdrawn: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 07:48:03 GMT, David Holmes wrote: > This fixes address a problem with signed integer overflow in the C fdlibm scalbnA function. > > Testing this code is extremely difficult. First, the only time this code will get executed is if intrinsics have been disabled by `-XX:-InlineIntrinsics`. Second, finding the math routines and the arguments thereto which actually reach this function is also difficult. I have found 3 tests only that hit the `scalbnA` function at the point where the potential overflow occurs, but beyond that I cannot determine what arguments will cause the different code paths to be taken. Consequently the only testing I could do here was to make a copy of the original `scalbnA` function and then place a check in the callers that the old and new code produced the same result. Again how much coverage this actually gave is not known. That test code still remains in the PR as the initial commit. 
> > Due to the testing problem this test relies on detailed code inspection and analysis, so here are the changes and the reasoning for them: > > // Convert to unsigned to avoid signed integer overflow > [1] unsigned u_k = ((unsigned) k) + n; > > [2] if (u_k > 0x7fe && u_k <= 0x7fffffff) return hugeX*copysignA(hugeX,x); /* overflow */ > [3] if (u_k > 0 && u_k <= 0x7fe) { /* normal result */ > [4] set_high(&x, (hx&0x800fffff)|((k+n)<<20)); > return x; > } > > [5] if (u_k <= (unsigned)-54) { > if (n > 50000) /* in case integer overflow in n+k */ > return hugeX*copysignA(hugeX,x); /*overflow*/ > else return tiny*copysignA(tiny,x); /*underflow*/ > } > [6] k = u_k + 54; /* subnormal result */ > set_high(&x, (hx&0x800fffff)|(k<<20)); > return x*twom54; > > > [1] We use an unsigned variable, `u_k`, for the potentially overflowing addition > > [2] We check the value of `u_k` adjusting the bounds to emulate a signed-int range > > [3] Again we check `u_k` and adjust the range > > [4] We know `k+n` is in range so we use that directly. I didn't use `u_k` here because I didn't want to have to reason about whether the use of an unsigned type would change anything in the expression > > [5] We check if `u_k` is logically less than what -54 would be > > [6] We bring `u_k` back into positive range by adding 54 and then store safely into `k` > > Thanks. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/25656 From sparasa at openjdk.org Thu Jun 19 05:21:13 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Thu, 19 Jun 2025 05:21:13 GMT Subject: RFR: 8359965: Enable paired pushp and popp instruction usage for APX enabled CPUs Message-ID: The goal of this PR is to enhance the existing x86 assembly stubs using PUSH and POP instructions with paired PUSHP/POPP instructions which are part of Intel APX technology. 
In Intel APX, the PUSHP and POPP instructions are modern, compact replacements for the legacy PUSH and POP, designed to work seamlessly with the expanded set of 32 general-purpose registers (R0–R31). Unlike their predecessors, they use the new APX (REX2-based) encoding, enabling more uniform and efficient instruction formats. These instructions improve code density, simplify register access, and are optimized for performance on APX-enabled CPUs. Pairing PUSHP and POPP in Intel APX provides CPU-level benefits such as more efficient instruction decoding, better stack pointer tracking, and improved register dependency management. Their uniform encoding allows for streamlined execution, reduced pipeline stalls, and potential micro-op fusion, all of which enhance performance and power efficiency. This pairing helps the processor optimize speculative execution and register lifetimes, making code faster and more scalable on modern architectures. ------------- Commit messages: - 8359965: Enable paired pushp and popp instruction usage for APX enabled CPUs Changes: https://git.openjdk.org/jdk/pull/25889/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25889&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359965 Stats: 384 lines in 23 files changed: 20 ins; 0 del; 364 mod Patch: https://git.openjdk.org/jdk/pull/25889.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25889/head:pull/25889 PR: https://git.openjdk.org/jdk/pull/25889 From dholmes at openjdk.org Thu Jun 19 06:28:36 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 19 Jun 2025 06:28:36 GMT Subject: RFR: 8359920: Use names for frame types in stackmaps In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 12:13:38 GMT, Coleen Phillimore wrote: > This uses names for frame types for stackmaps in the verifier and redefinition. > Tested with tier1-7.
src/hotspot/share/classfile/stackMapTable.hpp line 154: > 152: SAME_FRAME = 64, > 153: SAME_LOCALS_1_STACK_ITEM_FRAME = 128, > 154: SAME_LOCALS_1_STACK_ITEM_EXTENDED = 247, I find these definitions a little confusing. SAME_FRAME is actually 0-63, with SAME_LOCALS_1_STACK_ITEM_FRAME being 64-127. Given many of these frame types imply tag ranges it may be clearer to define enum's for the start and end of ranges as applicable eg. enum { SAME_FRAME_START = 0, SAME_FRAME_END = 63, SAME_LOCALS_1_STACK_ITEM_FRAME_START = 64, SAME_LOCALS_1_STACK_ITEM_FRAME_END = 127, RESERVED_START = 128, RESERVED_END = 246, SAME_LOCALS_1_STACK_ITEM_EXTENDED = 247, CHOP_FRAME_START = 248, CHOP_FRAME_END = 250, SAME_FRAME_EXTENDED = 251, APPEND_FRAME_START = 252, APPEND_FRAME_END = 254, FULL_FRAME = 255 } and then adjust the code usage as appropriate e.g. if (frame_type <= SAME_FRAME_END) { ... if (frame_type <= SAME_LOCALS_1_STACK_ITEM_FRAME_END) { if (_first) { offset = frame_type - SAME_LOCALS_1_STACK_ITEM_FRAME_START; ... What do you think? 
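As a rough illustration of the range-based naming suggested above, the sketch below spells out the start/end constants and a classifier that dispatches on them. The numeric ranges are the ones listed in the review comment; the names `FrameTypeRange`, `FrameKind`, `classify_frame` and `same_locals_1_offset_delta` are made up for this example and are not taken from the HotSpot sources.

```cpp
#include <cassert>
#include <cstdint>

// Start/end tags for each stack map frame kind, as listed in the comment above.
enum FrameTypeRange : uint8_t {
  SAME_FRAME_START = 0,
  SAME_FRAME_END = 63,
  SAME_LOCALS_1_STACK_ITEM_FRAME_START = 64,
  SAME_LOCALS_1_STACK_ITEM_FRAME_END = 127,
  RESERVED_START = 128,
  RESERVED_END = 246,
  SAME_LOCALS_1_STACK_ITEM_EXTENDED = 247,
  CHOP_FRAME_START = 248,
  CHOP_FRAME_END = 250,
  SAME_FRAME_EXTENDED = 251,
  APPEND_FRAME_START = 252,
  APPEND_FRAME_END = 254,
  FULL_FRAME = 255
};

// Hypothetical result type for the classifier below.
enum class FrameKind {
  Same, SameLocals1StackItem, Reserved, SameLocals1StackItemExtended,
  Chop, SameExtended, Append, Full
};

// Map a raw frame_type tag to its kind by walking the ranges in ascending order.
inline FrameKind classify_frame(uint8_t frame_type) {
  if (frame_type <= SAME_FRAME_END) return FrameKind::Same;
  if (frame_type <= SAME_LOCALS_1_STACK_ITEM_FRAME_END) return FrameKind::SameLocals1StackItem;
  if (frame_type <= RESERVED_END) return FrameKind::Reserved;
  if (frame_type == SAME_LOCALS_1_STACK_ITEM_EXTENDED) return FrameKind::SameLocals1StackItemExtended;
  if (frame_type <= CHOP_FRAME_END) return FrameKind::Chop;
  if (frame_type == SAME_FRAME_EXTENDED) return FrameKind::SameExtended;
  if (frame_type <= APPEND_FRAME_END) return FrameKind::Append;
  return FrameKind::Full;  // only 255 is left
}

// For the 64..127 range the offset delta is encoded in the tag itself.
inline int same_locals_1_offset_delta(uint8_t frame_type) {
  assert(frame_type >= SAME_LOCALS_1_STACK_ITEM_FRAME_START &&
         frame_type <= SAME_LOCALS_1_STACK_ITEM_FRAME_END);
  return frame_type - SAME_LOCALS_1_STACK_ITEM_FRAME_START;
}
```

With explicit `_START`/`_END` names, each comparison in the classifier reads as a range check rather than a comparison against the first tag of the next kind, which appears to be the readability point the comment is making.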
src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 3321: > 3319: // u1 frame_type = APPEND; /* 252-254 */ > 3320: // u2 offset_delta; > 3321: // verification_type_info locals[frame_type - 251]; Suggestion: // verification_type_info locals[frame_type - SAME_EXTENDED]; ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25870#discussion_r2156218466 PR Review Comment: https://git.openjdk.org/jdk/pull/25870#discussion_r2156184106 From dholmes at openjdk.org Thu Jun 19 06:34:32 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 19 Jun 2025 06:34:32 GMT Subject: RFR: 8359924: Deprecate and obsolete ParallelRefProcEnabled In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 14:04:28 GMT, Albert Mingkun Yang wrote: > Deprecating `ParallelRefProcEnabled`, which is used only by Parallel and G1, and both have it enabled by default via: > > > if (FLAG_IS_DEFAULT(ParallelRefProcEnabled) && ParallelGCThreads > 1) { > FLAG_SET_DEFAULT(ParallelRefProcEnabled, true); > } > > > Disabling it offers little benefit and its presence incurs some implementation complexity in the reference-processor. LGTM! Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25875#pullrequestreview-2941745997 From stuefe at openjdk.org Thu Jun 19 06:36:40 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 19 Jun 2025 06:36:40 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v7] In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 12:46:56 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. 
>> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add a basic gtest. I feel apprehensive about this; the solution feels pretty complex and I am not fully convinced this is the simplest solution for this problem. How much space to we lose in real life? Side note: I see the payload of the jmethodID block in NMT is allocated with mtInternal, so we don't see it in NMT. We should add jmethodIDs as an own category to NMT. A pragmatic alternative solution could be to do delete them, but delayed: keep the last N methodblocks undeleted. It is rare that JNI accesses jmethodIDs long after they have been deleted. Typically, the bad access happens close after class unloading, e.g. because of concurrency problems in customer code. We could then make the parameter N configurable, and thus give customers and supporters a tool to check for these kind of errors. 
(I briefly wondered whether we could just mmap these blocks, and uncommit/mprotect them on release, so that we stop paying the memory costs but don't release the address space; but the coarser page size allocation granularity would make this probably forbidding in terms of mem cost per class) ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-2986801229 From dholmes at openjdk.org Thu Jun 19 06:42:53 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 19 Jun 2025 06:42:53 GMT Subject: RFR: 8359965: Enable paired pushp and popp instruction usage for APX enabled CPUs In-Reply-To: References: Message-ID: On Thu, 19 Jun 2025 05:15:40 GMT, Srinivas Vamsi Parasa wrote: > The goal of this PR is to enhance the existing x86 assembly stubs using PUSH and POP instructions with paired PUSHP/POPP instructions which are part of Intel APX technology. > > In Intel APX, the PUSHP and POPP instructions are modern, compact replacements for the legacy PUSH and POP, designed to work seamlessly with the expanded set of 32 general-purpose registers (R0–R31). Unlike their predecessors, they use the new APX (REX2-based) encoding, enabling more uniform and efficient instruction formats. These instructions improve code density, simplify register access, and are optimized for performance on APX-enabled CPUs. > > Pairing PUSHP and POPP in Intel APX provides CPU-level benefits such as more efficient instruction decoding, better stack pointer tracking, and improved register dependency management. Their uniform encoding allows for streamlined execution, reduced pipeline stalls, and potential micro-op fusion, all of which enhance performance and power efficiency. This pairing helps the processor optimize speculative execution and register lifetimes, making code faster and more scalable on modern architectures.
Just a drive-by comment as this isn't code I normally have much to do with but to me it would look a lot cleaner to define `push_paired`/`pop_paired` (maybe abbreviating directly to `pushp`/`popp`?) rather than passing the boolean. ------------- PR Review: https://git.openjdk.org/jdk/pull/25889#pullrequestreview-2941765167 From dholmes at openjdk.org Thu Jun 19 06:53:40 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 19 Jun 2025 06:53:40 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v7] In-Reply-To: References: Message-ID: On Thu, 19 Jun 2025 06:33:08 GMT, Thomas Stuefe wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Add a basic gtest. > > I feel apprehensive about this; the solution feels pretty complex and I am not fully convinced this is the simplest solution for this problem. > > How much space to we lose in real life? Side note: I see the payload of the jmethodID block in NMT is allocated with mtInternal, so we don't see it in NMT. We should add jmethodIDs as an own category to NMT. > > A pragmatic alternative solution could be to do delete them, but delayed: keep the last N methodblocks undeleted. It is rare that JNI accesses jmethodIDs long after they have been deleted. Typically, the bad access happens close after class unloading, e.g. because of concurrency problems in customer code. > > We could then make the parameter N configurable, and thus give customers and supporters a tool to check for these kind of errors. > > (I briefly wondered whether we could just mmap these blocks, and uncommit/mprotect them on release, so that we stop paying the memory costs but don't release the address space; but the coarser page size allocation granularity would make this probably forbidding in terms of mem cost per class) @tstuefe, so at the moment we maintain safety for use-after-unload but at the expense of storage. 
Coleen's proposal maintains the same level of safety but reclaims the storage. You are suggesting a "simpler" technique to reclaim the storage by reducing the level of safety. I'd prefer to not re-open the door to unsafe usage, no matter how unlikely it may be. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-2986841710 From jsjolen at openjdk.org Thu Jun 19 07:12:16 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 19 Jun 2025 07:12:16 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v5] In-Reply-To: References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> <4uEo1C_csdAppvNYFqU1JtgADY-m4as2wBIq1kVq_GA=.6125ab99-8b92-410c-9257-852d8e7eb47e@github.com> Message-ID: <8Lvgf50diQvjHNgQSrZQowphu92FZM3M2Z7Fdi1DY5w=.8f7dca64-8cce-48ae-b739-fb609076b07f@github.com> On Wed, 18 Jun 2025 17:51:14 GMT, Serguei Spitsyn wrote: >> Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: >> >> - Matias's comments >> - Apply Sergei's comments > > src/hotspot/share/oops/constantPool.cpp line 1944: > >> 1942: BSMAttributeEntry* e2 = bsm_attribute_entry(idx2); >> 1943: int k1 = e1->bootstrap_method_index(); >> 1944: int k2 = cp2->e2->bootstrap_method_index(); > > I'm kind of confused, this does not look right. It is even not going to be compiled. > It is supposed to be as below: > > BSMAttributeEntry* e2 = cp2->bsm_attribute_entry(idx2); > . . . > int k2 = e2->bootstrap_method_index(); > . . . > if (argc == e2->argument_count()) { > . . . > k2 = e2->argument_index(j); Gah, I accidentally made a mistake with the refactoring. Of course, when I pushed I thought "this change is so simple, no need to check it before pushing" :-).
Let me fix that (and compile it myself this time) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25298#discussion_r2156295434 From jsjolen at openjdk.org Thu Jun 19 07:35:33 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Thu, 19 Jun 2025 07:35:33 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v6] In-Reply-To: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: > Hi, > > The constant pool currently has a lot of methods specific to extracting parts of the operands array. What this array actually is, is a sequence of bootstrap method attribute entries, where each entry has the following components: > > ```c++ > struct BSMAE { > u2 bootstrap_method_index; > u2 argument_count; > u2 arguments[argument_count]; > } > > > We can removes some of these operands array specific methods, and instead allows you to extract BSMAttributeEntrys which you can then use to extract its piece wise components. This makes for a nicer interface, and a bit easier to come into as a reader of the code, as it more closely mirrors the JVMS. > > Please consider! > > Testing: Currently GHA, running tier1-tier3 Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: Double check your changes next time! 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25298/files - new: https://git.openjdk.org/jdk/pull/25298/files/af3caa9b..d891a3d3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25298&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25298&range=04-05 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25298.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25298/head:pull/25298 PR: https://git.openjdk.org/jdk/pull/25298 From sspitsyn at openjdk.org Thu Jun 19 07:35:34 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 19 Jun 2025 07:35:34 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v5] In-Reply-To: <8Lvgf50diQvjHNgQSrZQowphu92FZM3M2Z7Fdi1DY5w=.8f7dca64-8cce-48ae-b739-fb609076b07f@github.com> References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> <4uEo1C_csdAppvNYFqU1JtgADY-m4as2wBIq1kVq_GA=.6125ab99-8b92-410c-9257-852d8e7eb47e@github.com> <8Lvgf50diQvjHNgQSrZQowphu92FZM3M2Z7Fdi1DY5w=.8f7dca64-8cce-48ae-b739-fb609076b07f@github.com> Message-ID: On Thu, 19 Jun 2025 07:06:54 GMT, Johan Sj?len wrote: >> src/hotspot/share/oops/constantPool.cpp line 1944: >> >>> 1942: BSMAttributeEntry* e2 = bsm_attribute_entry(idx2); >>> 1943: int k1 = e1->bootstrap_method_index(); >>> 1944: int k2 = cp2->e2->bootstrap_method_index(); >> >> I'm kind of confused, this does not look right. It is event not going to be compiled. >> It is supposed to be as below: >> >> BSMAttributeEntry* e2 = cp2->bsm_attribute_entry(idx2); >> . . . >> int k2 = e2->bootstrap_method_index(); >> . . . >> if (argc == e2->argument_count()) { >> . . . >> k2 = e2->argument_index(j); > > Gah, I accidentally made a mistake with the refactoring. Of course, when I pushed I thought "this change is so simple, no need to check it before pushing" :-). Let me fix that (and compile it myself this time) It's okay unless you have not integrated the update. 
:) Submitting at least 3 first mach5 tiers before integration will keep you out of potential trouble. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25298#discussion_r2156332326 From sspitsyn at openjdk.org Thu Jun 19 07:38:23 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 19 Jun 2025 07:38:23 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v6] In-Reply-To: References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: On Thu, 19 Jun 2025 07:35:33 GMT, Johan Sj?len wrote: >> Hi, >> >> The constant pool currently has a lot of methods specific to extracting parts of the operands array. What this array actually is, is a sequence of bootstrap method attribute entries, where each entry has the following components: >> >> ```c++ >> struct BSMAE { >> u2 bootstrap_method_index; >> u2 argument_count; >> u2 arguments[argument_count]; >> } >> >> >> We can removes some of these operands array specific methods, and instead allows you to extract BSMAttributeEntrys which you can then use to extract its piece wise components. This makes for a nicer interface, and a bit easier to come into as a reader of the code, as it more closely mirrors the JVMS. >> >> Please consider! >> >> Testing: Currently GHA, running tier1-tier3 > > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > Double check your changes next time! Thank you for the update. The fix looks good. I've posted one more nit though. src/hotspot/share/prims/jvmtiClassFileReconstituter.cpp line 414: > 412: write_u2(num_bootstrap_arguments); > 413: for (int arg = 0; arg < num_bootstrap_arguments; arg++) { > 414: u2 bootstrap_argument = cpool()->bsm_attribute_entry(n)->argument_index(arg); Nit: This line can also use the `bsme` local: `u2 bootstrap_argument = bsme->argument_index(arg);` Sorry, I missed this before. 
------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25298#pullrequestreview-2941919683 PR Review Comment: https://git.openjdk.org/jdk/pull/25298#discussion_r2156344933 From stefank at openjdk.org Thu Jun 19 07:47:41 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 19 Jun 2025 07:47:41 GMT Subject: RFR: 8354954: Typed static memory for late initialization of static class members in Hotspot [v11] In-Reply-To: <_M6EE5EQPt0LLK3J8sdwfjErJazlzW08JPlAvX9JOp8=.463cf4d8-c9f9-4db4-bf39-57c700275d20@github.com> References: <_M6EE5EQPt0LLK3J8sdwfjErJazlzW08JPlAvX9JOp8=.463cf4d8-c9f9-4db4-bf39-57c700275d20@github.com> Message-ID: On Wed, 16 Apr 2025 17:37:16 GMT, Johan Sj?len wrote: >> src/hotspot/share/nmt/memoryFileTracker.cpp line 129: >> >>> 127: bool MemoryFileTracker::Instance::initialize(NMT_TrackingLevel tracking_level) { >>> 128: if (tracking_level == NMT_TrackingLevel::NMT_off) return true; >>> 129: new (_tracker.as()) MemoryFileTracker(tracking_level == NMT_TrackingLevel::NMT_detail); >> >> Maybe you could add an `init` function that forwards the constructor arguments, with an extra check to see if the memory has already been initialized: >> >> >> template >> void init(As&&...args) { >> assert(is_death_pattern(), "StaticArea already initialized"); >> new (as()) T(std::forward(args)...); >> } >> >> >> Suggestion: >> >> _tracker.init(tracking_level == NMT_TrackingLevel::NMT_detail); > > I think that's a good idea. Unfortunately, move semantics and Rvalue references are currently undecided in the style guide, so we can't write this exact code. We can still do > > ```c++ > template > void init(As&... args) { > assert(is_death_pattern(), "StaticArea already initialized"); > new (as()) T(args...); > } > > > Which is pretty good. FWIW, the current implementation doesn't allow me to do: struct Thing { Thing(int value) {} }; ... Defered _deferred; ... 
_deferred.initialize(1); I have to write last piece as: int temp = 1; _deferred.initialize(temp); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24689#discussion_r2156365079 From sspitsyn at openjdk.org Thu Jun 19 07:56:41 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 19 Jun 2025 07:56:41 GMT Subject: RFR: 8359920: Use names for frame types in stackmaps In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 12:13:38 GMT, Coleen Phillimore wrote: > This uses names for frame types for stackmaps in the verifier and redefinition. > Tested with tier1-7. This looks good. I was also thinking about `frame_type` ranges but no ideas on any improvements. ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25870#pullrequestreview-2941977212 From jsjolen at openjdk.org Thu Jun 19 08:02:24 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Thu, 19 Jun 2025 08:02:24 GMT Subject: RFR: 8354954: Typed static memory for late initialization of static class members in Hotspot [v11] In-Reply-To: References: <_M6EE5EQPt0LLK3J8sdwfjErJazlzW08JPlAvX9JOp8=.463cf4d8-c9f9-4db4-bf39-57c700275d20@github.com> Message-ID: On Thu, 19 Jun 2025 07:44:38 GMT, Stefan Karlsson wrote: >> I think that's a good idea. Unfortunately, move semantics and Rvalue references are currently undecided in the style guide, so we can't write this exact code. We can still do >> >> ```c++ >> template >> void init(As&... args) { >> assert(is_death_pattern(), "StaticArea already initialized"); >> new (as()) T(args...); >> } >> >> >> Which is pretty good. > > FWIW, the current implementation doesn't allow me to do: > > struct Thing { > Thing(int value) {} > }; > ... > Defered _deferred; > ... > _deferred.initialize(1); > > I have to write last piece as: > > int temp = 1; > _deferred.initialize(temp); Yeah, and the current implementation will copy-construct its argumens as far as I understand. So, we gain nothing. 
I'd really like to see us getting in move semantics so that we can have ```c++ template void init(As&&... args) { assert(is_death_pattern(), "StaticArea already initialized"); new (as()) T(args...); } ``` et voilà, it'll work as we want it to. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24689#discussion_r2156389328 From jsjolen at openjdk.org Thu Jun 19 08:03:22 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 19 Jun 2025 08:03:22 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v6] In-Reply-To: References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: On Thu, 19 Jun 2025 07:33:46 GMT, Serguei Spitsyn wrote: >> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: >> >> Double check your changes next time! > > src/hotspot/share/prims/jvmtiClassFileReconstituter.cpp line 414: > >> 412: write_u2(num_bootstrap_arguments); >> 413: for (int arg = 0; arg < num_bootstrap_arguments; arg++) { >> 414: u2 bootstrap_argument = cpool()->bsm_attribute_entry(n)->argument_index(arg); > > Nit: This line can also use the `bsme` local: `u2 bootstrap_argument = bsme->argument_index(arg);` > Sorry, I missed this before. Thanks, fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25298#discussion_r2156382285 From jsjolen at openjdk.org Thu Jun 19 08:03:20 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 19 Jun 2025 08:03:20 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v7] In-Reply-To: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: > Hi, > > The constant pool currently has a lot of methods specific to extracting parts of the operands array.
What this array actually is, is a sequence of bootstrap method attribute entries, where each entry has the following components: > > ```c++ > struct BSMAE { > u2 bootstrap_method_index; > u2 argument_count; > u2 arguments[argument_count]; > } > > > We can removes some of these operands array specific methods, and instead allows you to extract BSMAttributeEntrys which you can then use to extract its piece wise components. This makes for a nicer interface, and a bit easier to come into as a reader of the code, as it more closely mirrors the JVMS. > > Please consider! > > Testing: Currently GHA, running tier1-tier3 Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: Fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25298/files - new: https://git.openjdk.org/jdk/pull/25298/files/d891a3d3..5d7e46ee Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25298&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25298&range=05-06 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25298.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25298/head:pull/25298 PR: https://git.openjdk.org/jdk/pull/25298 From qamai at openjdk.org Thu Jun 19 08:23:56 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 19 Jun 2025 08:23:56 GMT Subject: RFR: 8354954: Typed static memory for late initialization of static class members in Hotspot [v11] In-Reply-To: References: <_M6EE5EQPt0LLK3J8sdwfjErJazlzW08JPlAvX9JOp8=.463cf4d8-c9f9-4db4-bf39-57c700275d20@github.com> Message-ID: On Thu, 19 Jun 2025 07:57:46 GMT, Johan Sj?len wrote: >> FWIW, the current implementation doesn't allow me to do: >> >> struct Thing { >> Thing(int value) {} >> }; >> ... >> Defered _deferred; >> ... 
>> _deferred.initialize(1); >> >> I have to write last piece as: >> >> int temp = 1; >> _deferred.initialize(temp); > > Yeah, > > and the current implementation will copy-construct its argumens as far as I understand. So, we gain nothing. I'd really like to see us getting in move semantics so that we can have > > ```c++ > template > void init(As&&... args) { > assert(is_death_pattern(), "StaticArea already initialized"); > new (as()) T(args...); > } > > > et voil?, it'll work as we want it to. This seems like something we overlooked, should this be: template void initialize(const Ts&... args); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24689#discussion_r2156433500 From sspitsyn at openjdk.org Thu Jun 19 09:03:30 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 19 Jun 2025 09:03:30 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v7] In-Reply-To: References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: On Thu, 19 Jun 2025 08:03:20 GMT, Johan Sj?len wrote: >> Hi, >> >> The constant pool currently has a lot of methods specific to extracting parts of the operands array. What this array actually is, is a sequence of bootstrap method attribute entries, where each entry has the following components: >> >> ```c++ >> struct BSMAE { >> u2 bootstrap_method_index; >> u2 argument_count; >> u2 arguments[argument_count]; >> } >> >> >> We can removes some of these operands array specific methods, and instead allows you to extract BSMAttributeEntrys which you can then use to extract its piece wise components. This makes for a nicer interface, and a bit easier to come into as a reader of the code, as it more closely mirrors the JVMS. >> >> Please consider! >> >> Testing: Currently GHA, running tier1-tier3 > > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > Fix Marked as reviewed by sspitsyn (Reviewer). 
src/hotspot/share/cds/aotConstantPoolResolver.cpp line 517: > 515: > 516: int bsms_attribute_index = cp->bootstrap_methods_attribute_index(cp_index); > 517: int arg_count = cp->bsm_attribute_entry(bsms_attribute_index)->argument_count(); Your fix made it possible to do a bit more simplifications. For instance, each `bsms_attribute_index` parameter can be replaced with a `bsme` parameter. But this does not look that important at the moment. ------------- PR Review: https://git.openjdk.org/jdk/pull/25298#pullrequestreview-2942179152 PR Review Comment: https://git.openjdk.org/jdk/pull/25298#discussion_r2156509715 From shade at openjdk.org Thu Jun 19 10:06:45 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 19 Jun 2025 10:06:45 GMT Subject: [jdk25] RFR: 8352075: Perf regression accessing fields In-Reply-To: References: Message-ID: <4QtV9UGiRdP0LG6rIHH_4fwO2f5wm0evnHnut2nxvjM=.ecf61194-7201-4c7e-a496-3ddc821ea079@github.com> On Wed, 18 Jun 2025 15:42:14 GMT, Radim Vansa wrote: > @shipilev indicated that the backport to 21 should wait a bit, could you clarify when should I file that (e.g. end of July, ...)? I would say for the fairly big change like this, we want to wait until JDK 25 GA (that would pass the all-tests-run). It would be too late for Oct 2025 release, though. So realistically, this would target January 2026 release. You can pull this patch to your downstream JDK 21 to see if there are any troubles ahead of this path, this will also soothe 21u maintainer concerns, I think. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25877#issuecomment-2987501194 From kbarrett at openjdk.org Thu Jun 19 10:24:17 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 19 Jun 2025 10:24:17 GMT Subject: RFR: 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction [v2] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 12:04:12 GMT, Kim Barrett wrote: >> Please review this change to the HotSpot Style Guide to add discussion of how >> we prefer to handle initialization and destruction of non-local variables. >> >> I propose this is an editorial change, as it just documents current practice >> rather than suggesting a change to current practice. As such, the normal >> HotSpot PR process applies. >> >> The updated .html file was generated using make update-build-docs. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > better terminology, merge separate sections Thanks all for reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25812#issuecomment-2987548127 From kbarrett at openjdk.org Thu Jun 19 10:24:18 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 19 Jun 2025 10:24:18 GMT Subject: Integrated: 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction In-Reply-To: References: Message-ID: On Sun, 15 Jun 2025 05:15:11 GMT, Kim Barrett wrote: > Please review this change to the HotSpot Style Guide to add discussion of how > we prefer to handle initialization and destruction of non-local variables. > > I propose this is an editorial change, as it just documents current practice > rather than suggesting a change to current practice. As such, the normal > HotSpot PR process applies. > > The updated .html file was generated using make update-build-docs. This pull request has now been integrated. 
Changeset: 01d4b772 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/01d4b772dee8470188793676ce983d6203c7fefb Stats: 69 lines in 2 files changed: 56 ins; 13 del; 0 mod 8319242: HotSpot Style Guide should discourage non-local variables with non-trivial initialization or destruction Reviewed-by: stefank, dcubed, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/25812 From qamai at openjdk.org Thu Jun 19 11:12:21 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 19 Jun 2025 11:12:21 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot Message-ID: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Hi, This PR adds an implementation of insertion sort to Hotspot. It is an algorithm that is inplace and stable, and it is the ideal algorithm for arrays with small numbers of elements. The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. In addition, I make some improvements to `GrowableArrayIterator`: - Make a non-const variant (our current iterator is const only). - Add various utility operators to align with a typical iterator. Please take a look and share your thoughts. Thanks very much. 
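For readers unfamiliar with why insertion sort is both in place and stable, here is a minimal standalone sketch. It is not the code from the PR: the name `insertion_sort_sketch` is hypothetical, it works on any random-access iterator range rather than on `GrowableArray` specifically, and it uses `std::move`, which HotSpot code itself may not be allowed to use.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper, not the implementation proposed in the PR.
// Sorts [first, last) in place. `less` is a strict "less than" predicate.
// Elements that compare equal keep their original relative order (stable).
template <typename RandomIt, typename Compare>
void insertion_sort_sketch(RandomIt first, RandomIt last, Compare less) {
  assert(!(last < first) && "invalid range");
  if (first == last) return;
  for (RandomIt cur = first + 1; cur != last; ++cur) {
    auto value = std::move(*cur);
    RandomIt hole = cur;
    // Shift only strictly greater elements one slot to the right.
    // Stopping at the first element that is not greater than `value`
    // is what keeps equal elements in their original order.
    while (hole != first && less(value, *(hole - 1))) {
      *hole = std::move(*(hole - 1));
      --hole;
    }
    *hole = std::move(value);
  }
}
```

The only extra storage is the single `value` temporary, and the number of element moves is proportional to the number of inversions, which is why the algorithm suits short or nearly sorted arrays.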
------------- Commit messages: - add insertion sort and modernize GrowableArrayIterator Changes: https://git.openjdk.org/jdk/pull/25895/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25895&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360023 Stats: 231 lines in 3 files changed: 219 ins; 0 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/25895.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25895/head:pull/25895 PR: https://git.openjdk.org/jdk/pull/25895 From qamai at openjdk.org Thu Jun 19 11:39:56 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 19 Jun 2025 11:39:56 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v2] In-Reply-To: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Message-ID: > Hi, > > This PR adds an implementation of insertion sort to Hotspot. It is an algorithm that is inplace and stable, and it is the ideal algorithm for arrays with small numbers of elements. The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. > > In addition, I make some improvements to `GrowableArrayIterator`: > > - Make a non-const variant (our current iterator is const only). > - Add various utility operators to align with a typical iterator. > > Please take a look and share your thoughts. Thanks very much. 
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: fix windows build failures ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25895/files - new: https://git.openjdk.org/jdk/pull/25895/files/1ff7b27a..32f48e21 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25895&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25895&range=00-01 Stats: 9 lines in 1 file changed: 0 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/25895.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25895/head:pull/25895 PR: https://git.openjdk.org/jdk/pull/25895 From iwalulya at openjdk.org Thu Jun 19 12:26:15 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 19 Jun 2025 12:26:15 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v4] In-Reply-To: References: Message-ID: > Hi all, > > Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and Application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. > > The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio. > > - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly.
> > - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range. > > - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants. > > We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. > > Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. > > As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. 
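To make the effect of raising the default from 12 to 24 concrete: `GCTimeRatio=N` expresses a target of 1 part GC time to N parts application time, so the GC share of total time is 1 / (1 + N). The helper below is only an illustrative sketch with a hypothetical name; it is not taken from the G1 sources and says nothing about how the new short-term and long-term heuristics consume the value.

```cpp
// Hypothetical helper, for illustration only.
// Returns the target percentage of total time spent in GC implied by a
// GCTimeRatio value: GC time : application time = 1 : ratio.
double gc_time_target_percent(unsigned gc_time_ratio) {
  return 100.0 / (1.0 + static_cast<double>(gc_time_ratio));
}
```

Under this reading, a ratio of 12 tolerates roughly 7.7% of time in GC, while a ratio of 24 tolerates 4%, which is consistent with the statement above that the old default permits considerably more GC overhead.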
> > Testing: Mach5 Tier 1-7 Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: Albert suggestions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25832/files - new: https://git.openjdk.org/jdk/pull/25832/files/2d5b7cd1..df4f7ce5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25832&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25832&range=02-03 Stats: 31 lines in 3 files changed: 1 ins; 18 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/25832.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25832/head:pull/25832 PR: https://git.openjdk.org/jdk/pull/25832 From eastigeevich at openjdk.org Thu Jun 19 13:40:44 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Thu, 19 Jun 2025 13:40:44 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v2] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Message-ID: On Thu, 19 Jun 2025 11:39:56 GMT, Quan Anh Mai wrote: >> Hi, >> >> This PR adds an implementation of insertion sort to Hotspot. It is an algorithm that is inplace and stable, and it is the ideal algorithm for arrays with small numbers of elements. The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. >> >> In addition, I make some improvements to `GrowableArrayIterator`: >> >> - Make a non-const variant (our current iterator is const only). >> - Add various utility operators to align with a typical iterator. >> >> [JDK-8360032](https://bugs.openjdk.org/browse/JDK-8360032) is a follow-up work that will build a stable merge-insertion sort on top of this PR. >> >> Please take a look and share your thoughts. Thanks very much. 
> > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > fix windows build failures src/hotspot/share/utilities/growableArray.hpp line 198: > 196: } > 197: > 198: GrowableArrayIterator ncend() { I don't think `ncbegin` and `ncend` are good names. Why not to use `begin` and `end`? Also `GrowableArrayIterator` looks confusing because of the second parameter. Why not to use something like `GrowableArrayConstIterator`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25895#discussion_r2157042350 From eastigeevich at openjdk.org Thu Jun 19 14:05:42 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Thu, 19 Jun 2025 14:05:42 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v2] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Message-ID: On Thu, 19 Jun 2025 11:39:56 GMT, Quan Anh Mai wrote: >> Hi, >> >> This PR adds an implementation of insertion sort to Hotspot. It is an algorithm that is inplace and stable, and it is the ideal algorithm for arrays with small numbers of elements. The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. >> >> In addition, I make some improvements to `GrowableArrayIterator`: >> >> - Make a non-const variant (our current iterator is const only). >> - Add various utility operators to align with a typical iterator. >> >> [JDK-8360032](https://bugs.openjdk.org/browse/JDK-8360032) is a follow-up work that will build a stable merge-insertion sort on top of this PR. >> >> Please take a look and share your thoughts. Thanks very much. 
> > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > fix windows build failures > The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. Maybe instead of introducing a generalized version of the insertion sort, you can have a function implementing the insertion sort in the context of JDK-8357186? This specialized function will be much smaller than the PR changes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2988165498 From qamai at openjdk.org Thu Jun 19 14:07:50 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 19 Jun 2025 14:07:50 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v2] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Message-ID: <6OYGIv6jfwZHzsKDS7RINqCq-uUidAq6e94Rz5bI1w8=.7d2a56c6-8a10-44ac-9b13-a77ff6362e5b@github.com> On Thu, 19 Jun 2025 13:45:37 GMT, Evgeny Astigeevich wrote: >> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: >> >> fix windows build failures > >> The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. > > Maybe instead of introducing a generalized version of the insertion sort, you can have a function implementing the insertion sort in the context of JDK-8357186? This specialized function will be much smaller than the PR changes. @eastig Thanks a lot for your reviews. Yes a specialized insertion function could work, but a generalized function would be more useful and easier to test. A large part of this change is to modernize `GrowableArrayIterator`, and the actual insertion sort is pretty small (only about 40 LOC). 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2988227557 From qamai at openjdk.org Thu Jun 19 14:12:39 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 19 Jun 2025 14:12:39 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v2] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Message-ID: On Thu, 19 Jun 2025 13:36:53 GMT, Evgeny Astigeevich wrote: >> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: >> >> fix windows build failures > > src/hotspot/share/utilities/growableArray.hpp line 198: > >> 196: } >> 197: >> 198: GrowableArrayIterator ncend() { > > I don't think `ncbegin` and `ncend` are good names. Why not to use `begin` and `end`? > Also `GrowableArrayIterator` looks confusing because of the second parameter. > Why not to use something like `GrowableArrayConstIterator`? Because it would be an incompatible change, there are places where we do things like GrowableArrayIterator it = array.begin(); I think your latter concern can be addressed by `using GrowableArrayNonConstIterator`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25895#discussion_r2157121474 From qamai at openjdk.org Thu Jun 19 14:27:06 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 19 Jun 2025 14:27:06 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v3] In-Reply-To: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Message-ID: > Hi, > > This PR adds an implementation of insertion sort to Hotspot. It is an algorithm that is inplace and stable, and it is the ideal algorithm for arrays with small numbers of elements. 
The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. > > In addition, I make some improvements to `GrowableArrayIterator`: > > - Make a non-const variant (our current iterator is const only). > - Add various utility operators to align with a typical iterator. > > [JDK-8360032](https://bugs.openjdk.org/browse/JDK-8360032) is a follow-up work that will build a stable merge-insertion sort on top of this PR. > > Please take a look and share your thoughts. Thanks very much. Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: GrowableArrayNonConstIterator ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25895/files - new: https://git.openjdk.org/jdk/pull/25895/files/32f48e21..ef934f6e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25895&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25895&range=01-02 Stats: 5 lines in 1 file changed: 3 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25895.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25895/head:pull/25895 PR: https://git.openjdk.org/jdk/pull/25895 From eastigeevich at openjdk.org Thu Jun 19 14:55:50 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Thu, 19 Jun 2025 14:55:50 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v2] In-Reply-To: <6OYGIv6jfwZHzsKDS7RINqCq-uUidAq6e94Rz5bI1w8=.7d2a56c6-8a10-44ac-9b13-a77ff6362e5b@github.com> References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <6OYGIv6jfwZHzsKDS7RINqCq-uUidAq6e94Rz5bI1w8=.7d2a56c6-8a10-44ac-9b13-a77ff6362e5b@github.com> Message-ID: On Thu, 19 Jun 2025 14:05:46 GMT, Quan Anh Mai wrote: > ... the actual insertion sort is pretty small (only about 40 LOC). This is my point. 
If amount of changes is so small and there is only one use case, why do we need the type independent implementation? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2988400271 From eastigeevich at openjdk.org Thu Jun 19 15:00:49 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Thu, 19 Jun 2025 15:00:49 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v2] In-Reply-To: <6OYGIv6jfwZHzsKDS7RINqCq-uUidAq6e94Rz5bI1w8=.7d2a56c6-8a10-44ac-9b13-a77ff6362e5b@github.com> References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <6OYGIv6jfwZHzsKDS7RINqCq-uUidAq6e94Rz5bI1w8=.7d2a56c6-8a10-44ac-9b13-a77ff6362e5b@github.com> Message-ID: <6Aoq3WPNjp6I4f-A2IGBTN_hU9Z4er_IUyXufAR0b2c=.c2039cd4-b8b3-42ff-b0e8-8dfefc73a467@github.com> On Thu, 19 Jun 2025 14:05:46 GMT, Quan Anh Mai wrote: > a generalized function would be more useful and easier to test. For whom would it be more useful? Usually generalized versions come from a set of specialized versions. Not vice versa. There is no issue to test a specialized version. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2988408679 From qamai at openjdk.org Thu Jun 19 15:10:28 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 19 Jun 2025 15:10:28 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v3] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Message-ID: On Thu, 19 Jun 2025 14:27:06 GMT, Quan Anh Mai wrote: >> Hi, >> >> This PR adds an implementation of insertion sort to Hotspot. It is an algorithm that is inplace and stable, and it is the ideal algorithm for arrays with small numbers of elements. The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. 
>> >> In addition, I make some improvements to `GrowableArrayIterator`: >> >> - Make a non-const variant (our current iterator is const only). >> - Add various utility operators to align with a typical iterator. >> >> [JDK-8360032](https://bugs.openjdk.org/browse/JDK-8360032) is a follow-up work that will build a stable merge-insertion sort on top of this PR. >> >> Please take a look and share your thoughts. Thanks very much. > > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > GrowableArrayNonConstIterator Firstly, since the generalized function is no more complex than the specialized function, so why not go for the generalized function and save us the troubles generalizing the specialized function if the need arises. Secondly, it is much much easier to test the generalized function. I can easily verify that sorting an array of `int` is correct but I cannot verify that the sort of an array of `SigEntry`s is correct, especially when I am also modifying the compare function. Additionally, a stable sort is needed because it is non-trivial to obtain the desired effect with a non-stable sort. At the same time, I can easily make a `TwoInt` that is compared by `val` and `idx` stores the index in the original array. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2988441177 From eastigeevich at openjdk.org Thu Jun 19 15:30:27 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Thu, 19 Jun 2025 15:30:27 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v3] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Message-ID: <8fsL_z6m0doeGofbkTjCQ0elPkNIhShAj57w8Z69iLs=.285d5feb-bf45-4b73-aa2e-af200ae236e1@github.com> On Thu, 19 Jun 2025 15:07:44 GMT, Quan Anh Mai wrote: > ...save us the troubles generalizing the specialized function if the need arises. 
Unfortunately experience teaches us, code written for purpose of future uses is never used as is. These many years of the project, there have not been any needs for a stable sort. Your case is the first one. You cannot predict other cases. If you want the insertion sort, I'd recommend to have it in `GrowableArrayView`: `GrowableArrayView<>::insertion_sort()`. In this case you will not have the issue with iterators names which is a big issue from my point of view. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2988492649 From eastigeevich at openjdk.org Thu Jun 19 15:38:29 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Thu, 19 Jun 2025 15:38:29 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v3] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Message-ID: On Thu, 19 Jun 2025 14:27:06 GMT, Quan Anh Mai wrote: >> Hi, >> >> This PR adds an implementation of insertion sort to Hotspot. It is an algorithm that is inplace and stable, and it is the ideal algorithm for arrays with small numbers of elements. The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. >> >> In addition, I make some improvements to `GrowableArrayIterator`: >> >> - Make a non-const variant (our current iterator is const only). >> - Add various utility operators to align with a typical iterator. >> >> [JDK-8360032](https://bugs.openjdk.org/browse/JDK-8360032) is a follow-up work that will build a stable merge-insertion sort on top of this PR. >> >> Please take a look and share your thoughts. Thanks very much.
> > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > GrowableArrayNonConstIterator Here is the simple implementation which does not require a lot of code:

```c++
void GrowableArrayView<>::insertion_sort(int f(E*, E*)) {
  if (_data == nullptr) return;
  for (int i = 1; i < length(); i++) {
    E key = _data[i];
    int j = i - 1;
    while (j >= 0 && f(_data[j], key)) {
      _data[j + 1] = _data[j];
      j--;
    }
    _data[j + 1] = key;
  }
}
```

------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2988517979 From qamai at openjdk.org Thu Jun 19 15:52:27 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 19 Jun 2025 15:52:27 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v3] In-Reply-To: <8fsL_z6m0doeGofbkTjCQ0elPkNIhShAj57w8Z69iLs=.285d5feb-bf45-4b73-aa2e-af200ae236e1@github.com> References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <8fsL_z6m0doeGofbkTjCQ0elPkNIhShAj57w8Z69iLs=.285d5feb-bf45-4b73-aa2e-af200ae236e1@github.com> Message-ID: On Thu, 19 Jun 2025 15:28:02 GMT, Evgeny Astigeevich wrote: > These many years of the project, there have not been any needs for a stable sort. Just do a `grep -rn ./src/hotspot -e "stable sort"` you can find: https://github.com/openjdk/jdk/blob/c4fb00a7be51c7a05a29d3d57d787feb5c698ddf/src/hotspot/share/classfile/fieldLayoutBuilder.hpp#L106 > If you want the insertion sort, I'd recommend to have it in `GrowableArrayView`: `GrowableArrayView<>::insertion_sort()`. We are programming in C++, I think it would be better to follow the C++ convention. The practical reason is that it prevents users not wanting to sort from having to include the sort functionality. > In this case you will not have the issue with iterators names which is a big issue from my point of view. Now we have `GrowableArrayIterator` and `GrowableArrayNonConstIterator`, what is the issue with them?
------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2988548565 From qamai at openjdk.org Thu Jun 19 16:17:28 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 19 Jun 2025 16:17:28 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v3] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Message-ID: <6CKaFhgY21Nc-1J5gmtvSRUywqPUQ6ByFzn4qVMVluM=.7699ed42-41d3-466c-8537-cea6d42f56a2@github.com> On Thu, 19 Jun 2025 14:27:06 GMT, Quan Anh Mai wrote: >> Hi, >> >> This PR adds an implementation of insertion sort to Hotspot. It is an algorithm that is inplace and stable, and it is the ideal algorithm for arrays with small numbers of elements. The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. Additionally, since insertion sort is the most efficient sorting algorithm for small arrays, it can be used in non-stable sort as well. >> >> In addition, I make some improvements to `GrowableArrayIterator`: >> >> - Make a non-const variant (our current iterator is const only). >> - Add various utility operators to align with a typical iterator. >> >> [JDK-8360032](https://bugs.openjdk.org/browse/JDK-8360032) is a follow-up work that will build a stable merge-insertion sort on top of this PR. >> >> Please take a look and share your thoughts. Thanks very much. > > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > GrowableArrayNonConstIterator Note that insertion sort is the most efficient sorting algorithm for small arrays, so we can use it for non-stable sort as well. 
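To illustrate the point about reusing insertion sort inside a non-stable sort, here is a minimal sketch. It is not the code in this PR. It assumes a three-way comparator that takes references, and `hybrid_sort_sketch`, `insertion_sort_range`, `compare_int` and the cutoff of 16 are hypothetical names and values picked for the example; a real cutoff would have to be measured.

```cpp
// Hypothetical cutoff below which insertion sort takes over (an assumption,
// not a measured value).
static const int SMALL_ARRAY_CUTOFF = 16;

// Three-way comparator on int: negative, zero or positive.
inline int compare_int(const int& a, const int& b) {
  return (a > b) - (a < b);
}

// Plain insertion sort on data[0 .. size-1].
template <typename T, typename C>
void insertion_sort_range(T* data, int size, C comp) {
  for (int i = 1; i < size; i++) {
    T key = data[i];
    int j = i;
    // Shift strictly greater elements one slot to the right.
    while (j > 0 && comp(data[j - 1], key) > 0) {
      data[j] = data[j - 1];
      j--;
    }
    data[j] = key;
  }
}

// Non-stable quicksort that hands small ranges to insertion sort.
template <typename T, typename C>
void hybrid_sort_sketch(T* data, int size, C comp) {
  while (size > SMALL_ARRAY_CUTOFF) {
    T pivot = data[size / 2];
    int lo = 0;
    int hi = size - 1;
    // Partition around the middle element, scanning from both ends.
    while (lo <= hi) {
      while (comp(data[lo], pivot) < 0) lo++;
      while (comp(data[hi], pivot) > 0) hi--;
      if (lo <= hi) {
        T tmp = data[lo];
        data[lo] = data[hi];
        data[hi] = tmp;
        lo++;
        hi--;
      }
    }
    // Recurse into the left part, keep looping on the right part.
    hybrid_sort_sketch(data, hi + 1, comp);
    data += lo;
    size -= lo;
  }
  insertion_sort_range(data, size, comp);
}
```

A call would look like `hybrid_sort_sketch(values, count, compare_int)`. The recursion depth is not bounded in this sketch; a production version would normally recurse into the smaller partition first.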
------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2988617584 From eastigeevich at openjdk.org Thu Jun 19 16:25:28 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Thu, 19 Jun 2025 16:25:28 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v3] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <8fsL_z6m0doeGofbkTjCQ0elPkNIhShAj57w8Z69iLs=.285d5feb-bf45-4b73-aa2e-af200ae236e1@github.com> Message-ID: On Thu, 19 Jun 2025 15:47:51 GMT, Quan Anh Mai wrote: > > These many years of the project, there have not been any needs for a stable sort. > > Just do a `grep -rn ./src/hotspot -e "stable sort"` you can find: > > https://github.com/openjdk/jdk/blob/c4fb00a7be51c7a05a29d3d57d787feb5c698ddf/src/hotspot/share/classfile/fieldLayoutBuilder.hpp#L106 > Ok. Do we need to rewrite this code to use a stable sort? > > If you want the insertion sort, I'd recommend to have it in `GrowableArrayView`: `GrowableArrayView<>::insertion_sort()`. > > We are programming in C++, I think it would be better to follow the C++ convention. The practical reason is that it prevents users not wanting to sort from having to include the sort functionality. Yes, we use C++ but we use subset of it: https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.md IMO 10 LoC is always better than 50+ LoC. We don't write library code which uses most of C++ features. If we can solve a problem with less code which looks like C code, let's use such a solution. We don't have a goal to use as many C++ features as possible, especially approaches used in STL. > > > In this case you will not have the issue with iterators names which is a big issue from my point of view. > > Now we have `GrowableArrayIterator` and `GrowableArrayNonConstIterator`, what is the issue with them? One issue is the pollution of the global namespace by the name which rarely be used. 
Another is that this is opposite to what C++ programmers are familiar with: iterator and const_iterator. It's already confusing: users of `GrowableArrayIterator` might expect it to be non-constant. So instead of fixing this confusion by getting close to C++ standards, we are diverging more from them. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2988632237 From jsjolen at openjdk.org Thu Jun 19 16:34:30 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 19 Jun 2025 16:34:30 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v3] In-Reply-To: <6CKaFhgY21Nc-1J5gmtvSRUywqPUQ6ByFzn4qVMVluM=.7699ed42-41d3-466c-8537-cea6d42f56a2@github.com> References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <6CKaFhgY21Nc-1J5gmtvSRUywqPUQ6ByFzn4qVMVluM=.7699ed42-41d3-466c-8537-cea6d42f56a2@github.com> Message-ID: <3D8sbixmaGgu6de-YosfXDmunMKZJ1MbTeKR4BOv_NY=.e2122224-b4cb-436d-bcd1-9251bbc47f19@github.com> On Thu, 19 Jun 2025 16:15:08 GMT, Quan Anh Mai wrote: >> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: >> >> GrowableArrayNonConstIterator > > Note that insertion sort is the most efficient sorting algorithm for small arrays, so we can use it for non-stable sort as well. Hi @merykitty , Thank you for taking the effort to produce tooling for everyone when you found a need for it yourself. Often, we have useful datatypes hidden away into internals that we'd like to use, or we simply do other solutions because our preferred solution is missing. Unfortunately, I think that the ceremony required to get your insertion sort working for someone else's type will put other devs off from using it. None, AFAIK, of our datatypes are compatible with the STL's interfaces.
If we take this definition:

    void GrowableArrayView<>::insertion_sort(int f(E*, E*)) {
      if (_data == nullptr) return;
      for (int i = 1; i < length(); i++) {
        E key = _data[i];
        int j = i - 1;
        while (j >= 0 && f(_data[j], key)) {
          _data[j + 1] = _data[j];
          j--;
        }
        _data[j + 1] = key;
      }
    }

And change it around a bit:

    template <typename T, typename C>
    void insertion_sort(T* array, size_t length, C comparator) {
      for (int i = 1; i < length; i++) {
        T key = array[i]; // Should it really copy???
        int j = i - 1;
        while (j >= 0 && comparator(array[j], key)) {
          array[j + 1] = array[j];
          j--;
        }
        array[j + 1] = key;
      }
    }

Then I think we have something general-ish. For stable_sort we then do:

    GrowableArray::stable_sort(C comparator) {
      insertion_sort(_data, length(), comparator);
    }

This is going to be general enough, for most of our cases we have a contiguous array with a size of some fixed element type which we want to change in-place. This is sufficient for expressing that. This also fits well with the `QuickSort` class that we have, it's the same type of interface. Maybe the `quickSort.hpp` should be renamed into `sort.hpp` and `InsertionSort` be a class in there? I'm not sure what the style guide thinks about that, but I think it's a good idea :-).
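For reference, the generic form above can be written so that it compiles on its own. This is only a sketch under stated assumptions, not the code that ended up in the PR: the comparator is assumed to be three-way (negative, zero, positive) and to take references, elements are copied rather than moved, and `insertion_sort_sketch`, `TwoInt` and `compare_two_int` are hypothetical names. `TwoInt` follows the test idea mentioned earlier in this thread: it is compared by `val` only, and `idx` records the position in the original array so that stability can be checked.

```cpp
#include <stddef.h>

// Hypothetical element type for checking stability: ordered by val only,
// idx remembers the position in the original array.
struct TwoInt {
  int val;
  int idx;
};

// Three-way comparator that looks at val only.
inline int compare_two_int(const TwoInt& a, const TwoInt& b) {
  return (a.val > b.val) - (a.val < b.val);
}

// In-place insertion sort over a contiguous array.
// An element is shifted only while its predecessor is strictly greater than
// the key, so elements that compare equal keep their relative order.
template <typename T, typename C>
void insertion_sort_sketch(T* array, size_t length, C comparator) {
  for (size_t i = 1; i < length; i++) {
    T key = array[i];  // copy of the element being inserted
    size_t j = i;
    while (j > 0 && comparator(array[j - 1], key) > 0) {
      array[j] = array[j - 1];
      j--;
    }
    array[j] = key;
  }
}
```

Sorting `{2,0} {1,1} {2,2} {1,3} {0,4}` (written as `{val,idx}`) with `insertion_sort_sketch(a, 5, compare_two_int)` should leave the `idx` values ascending within each group of equal `val`, which is the stability property discussed here.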
------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2988650597 From eastigeevich at openjdk.org Thu Jun 19 16:52:29 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Thu, 19 Jun 2025 16:52:29 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v3] In-Reply-To: <3D8sbixmaGgu6de-YosfXDmunMKZJ1MbTeKR4BOv_NY=.e2122224-b4cb-436d-bcd1-9251bbc47f19@github.com> References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <6CKaFhgY21Nc-1J5gmtvSRUywqPUQ6ByFzn4qVMVluM=.7699ed42-41d3-466c-8537-cea6d42f56a2@github.com> <3D8sbixmaGgu6de-YosfXDmunMKZJ1MbTeKR4BOv_NY=.e2122224-b4cb-436d-bcd1-9251bbc47f19@github.com> Message-ID: On Thu, 19 Jun 2025 16:31:52 GMT, Johan Sjölen wrote: >> Note that insertion sort is the most efficient sorting algorithm for small arrays, so we can use it for non-stable sort as well. > > Hi @merykitty , > > Thank you for taking the effort to produce tooling for everyone when you found a need for it yourself. Often, we have useful datatypes hidden away into internals that we'd like to use, or we simply do other solutions because our preferred solution is missing. > > Unfortunately, I think that the ceremony required to get your insertion sort working for someone else's type will put other devs off from using it. None, AFAIK, of our datatypes are compatible with the STL's interfaces. > > If we take this definition: > > > void GrowableArrayView<>::insertion_sort(int f(E*, E*)) { > if (_data == nullptr) return; > for (int i = 1; i < length(); i++) { > E key = _data[i]; > int j = i - 1; > while (j >= 0 && f(_data[j], key)) { > _data[j + 1] = _data[j]; > j--; > } > _data[j + 1] = key; > } > } > > > And change it around a bit: > > > template > void insertion_sort(T* array, size_t length, C comparator) { > for (int i = 1; i < length; i++) { > T key = array[i]; // Should it really copy???
> int j = i - 1; > while (j >= 0 && comparator(array[j], key)) { > array[j + 1] = array[j]; > j--; > } > array[j + 1] = key; > } > } > > > Then I think we have something general-ish. > For stable_sort we then do: > > > GrowableArray::stable_sort(C comparator) { > insertion_sort(_data, length(), comparator); > } > > > This is going to be general enough, for most of our cases we have a contiguous array with a size of some fixed element type which we want to change in-place. This is sufficient for expressing that. > > This also fits well with the `QuickSort` class that we have, it's the same type of interface. Maybe the `quickSort.hpp` should be renamed into `sort.hpp` and `InsertionSort` be a class in there? I'm not sure what the style guide thinks about that, but I think it's a good idea :-). @jdksjolen, GrowableArray::stable_sort(C comparator) { insertion_sort(_data, length(), comparator); } It has a potential problem with O(n^2). ------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2988686419 From qamai at openjdk.org Thu Jun 19 17:15:22 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 19 Jun 2025 17:15:22 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v4] In-Reply-To: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Message-ID: <9rcA-0uAuVwpK8WPTXdFCmcBZhDTNB-KtTa0NecLKZk=.369e3dcc-6f80-400e-887c-837e78a8a19f@github.com> > Hi, > > This PR adds an implementation of insertion sort to Hotspot. It is an algorithm that is inplace and stable, and it is the ideal algorithm for arrays with small numbers of elements. The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. 
Additionally, since insertion sort is the most efficient sorting algorithm for small arrays, it can be used in non-stable sort as well. > > In addition, I make some improvements to `GrowableArrayIterator`: > > - Make a non-const variant (our current iterator is const only). > - Add various utility operators to align with a typical iterator. > > [JDK-8360032](https://bugs.openjdk.org/browse/JDK-8360032) is a follow-up work that will build a stable merge-insertion sort on top of this PR. > > Please take a look and share your thoughts. Thanks very much. Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: give up on RandomIt ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25895/files - new: https://git.openjdk.org/jdk/pull/25895/files/ef934f6e..f13d739e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25895&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25895&range=02-03 Stats: 123 lines in 3 files changed: 17 ins; 71 del; 35 mod Patch: https://git.openjdk.org/jdk/pull/25895.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25895/head:pull/25895 PR: https://git.openjdk.org/jdk/pull/25895 From qamai at openjdk.org Thu Jun 19 17:20:27 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 19 Jun 2025 17:20:27 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v3] In-Reply-To: <3D8sbixmaGgu6de-YosfXDmunMKZJ1MbTeKR4BOv_NY=.e2122224-b4cb-436d-bcd1-9251bbc47f19@github.com> References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <6CKaFhgY21Nc-1J5gmtvSRUywqPUQ6ByFzn4qVMVluM=.7699ed42-41d3-466c-8537-cea6d42f56a2@github.com> <3D8sbixmaGgu6de-YosfXDmunMKZJ1MbTeKR4BOv_NY=.e2122224-b4cb-436d-bcd1-9251bbc47f19@github.com> Message-ID: On Thu, 19 Jun 2025 16:31:52 GMT, Johan Sjölen wrote: >> Note that insertion sort is the most efficient sorting algorithm for small arrays, so we can use it for non-stable sort
as well. > > Hi @merykitty , > > Thank you for taking the effort to produce tooling for everyone when you found a need for it yourself. Often, we have useful datatypes hidden away into internals that we'd like to use, or we simply do other solutions because our preferred solution is missing. > > Unfortunately, I think that the ceremony required to get your insertion sort working for someone else's type will put other devs off from using it. None, AFAIK, of our datatypes are compatible with the STL's interfaces. > > If we take this definition: > > > void GrowableArrayView<>::insertion_sort(int f(E*, E*)) { > if (_data == nullptr) return; > for (int i = 1; i < length(); i++) { > E key = _data[i]; > int j = i - 1; > while (j >= 0 && f(_data[j], key)) { > _data[j + 1] = _data[j]; > j--; > } > _data[j + 1] = key; > } > } > > > And change it around a bit: > > > template > void insertion_sort(T* array, size_t length, C comparator) { > for (int i = 1; i < length; i++) { > T key = array[i]; // Should it really copy??? > int j = i - 1; > while (j >= 0 && comparator(array[j], key)) { > array[j + 1] = array[j]; > j--; > } > array[j + 1] = key; > } > } > > > Then I think we have something general-ish. > For stable_sort we then do: > > > GrowableArray::stable_sort(C comparator) { > insertion_sort(_data, length(), comparator); > } > > > This is going to be general enough, for most of our cases we have a contiguous array with a size of some fixed element type which we want to change in-place. This is sufficient for expressing that. > > This also fits well with the `QuickSort` class that we have, it's the same type of interface. Maybe the `quickSort.hpp` should be renamed into `sort.hpp` and `InsertionSort` be a class in there? I'm not sure what the style guide thinks about that, but I think it's a good idea :-). @jdksjolen Thanks for your suggestion. Actually, I can make it so that a `T*` will satisfy `RandomIt` but there is no need for `RandomIt` right now. 
It is unfortunate because an iterator will give us more safety net, though. I have reverted the `GrowableArrayIterator` changes. Insertion sort is good for small arrays so we can use it for `QuickSort::sort`, too. For a stable sort algorithm, implementing a merge - insertion sort should be the way, there is no need to rush for a stable sort method. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2988752926 From jsjolen at openjdk.org Thu Jun 19 17:37:27 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 19 Jun 2025 17:37:27 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v3] In-Reply-To: <3D8sbixmaGgu6de-YosfXDmunMKZJ1MbTeKR4BOv_NY=.e2122224-b4cb-436d-bcd1-9251bbc47f19@github.com> References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <6CKaFhgY21Nc-1J5gmtvSRUywqPUQ6ByFzn4qVMVluM=.7699ed42-41d3-466c-8537-cea6d42f56a2@github.com> <3D8sbixmaGgu6de-YosfXDmunMKZJ1MbTeKR4BOv_NY=.e2122224-b4cb-436d-bcd1-9251bbc47f19@github.com> Message-ID: On Thu, 19 Jun 2025 16:31:52 GMT, Johan Sjölen wrote: >> Note that insertion sort is the most efficient sorting algorithm for small arrays, so we can use it for non-stable sort as well. > > Hi @merykitty , > > Thank you for taking the effort to produce tooling for everyone when you found a need for it yourself. Often, we have useful datatypes hidden away into internals that we'd like to use, or we simply do other solutions because our preferred solution is missing. > > Unfortunately, I think that the ceremony required to get your insertion sort working for someone else's type will put other devs off from using it. None, AFAIK, of our datatypes are compatible with the STL's interfaces.
> > If we take this definition: > > > void GrowableArrayView<>::insertion_sort(int f(E*, E*)) { > if (_data == nullptr) return; > for (int i = 1; i < length(); i++) { > E key = _data[i]; > int j = i - 1; > while (j >= 0 && f(_data[j], key)) { > _data[j + 1] = _data[j]; > j--; > } > _data[j + 1] = key; > } > } > > > And change it around a bit: > > > template > void insertion_sort(T* array, size_t length, C comparator) { > for (int i = 1; i < length; i++) { > T key = array[i]; // Should it really copy??? > int j = i - 1; > while (j >= 0 && comparator(array[j], key)) { > array[j + 1] = array[j]; > j--; > } > array[j + 1] = key; > } > } > > > Then I think we have something general-ish. > For stable_sort we then do: > > > GrowableArray::stable_sort(C comparator) { > insertion_sort(_data, length(), comparator); > } > > > This is going to be general enough, for most of our cases we have a contiguous array with a size of some fixed element type which we want to change in-place. This is sufficient for expressing that. > > This also fits well with the `QuickSort` class that we have, it's the same type of interface. Maybe the `quickSort.hpp` should be renamed into `sort.hpp` and `InsertionSort` be a class in there? I'm not sure what the style guide thinks about that, but I think it's a good idea :-). > @jdksjolen Thanks for your suggestion. Actually, I can make it so that a `T*` will satisfy `RandomIt` but there is no need for `RandomIt` right now. It is unfortunate because an iterator will give us more safety net, though. > > I have reverted the `GrowableArrayIterator` changes. Insertion sort is good for small arrays so we can use it for `QuickSort::sort`, too. For a stable sort algorithm, implementing a merge - insertion sort should be the way, there is no need to rush for a stable sort method. Cheers! This looks good to me. Let's see what the rest of the community thinks. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2988777575 From cslucas at openjdk.org Thu Jun 19 18:28:29 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 19 Jun 2025 18:28:29 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v4] In-Reply-To: <9rcA-0uAuVwpK8WPTXdFCmcBZhDTNB-KtTa0NecLKZk=.369e3dcc-6f80-400e-887c-837e78a8a19f@github.com> References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <9rcA-0uAuVwpK8WPTXdFCmcBZhDTNB-KtTa0NecLKZk=.369e3dcc-6f80-400e-887c-837e78a8a19f@github.com> Message-ID: On Thu, 19 Jun 2025 17:15:22 GMT, Quan Anh Mai wrote: >> Hi, >> >> This PR adds an implementation of insertion sort to Hotspot. It is an algorithm that is inplace and stable, and it is the ideal algorithm for arrays with small numbers of elements. The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. Additionally, since insertion sort is the most efficient sorting algorithm for small arrays, it can be used in non-stable sort as well. >> >> In addition, I make some improvements to `GrowableArrayIterator`: >> >> - Make a non-const variant (our current iterator is const only). >> - Add various utility operators to align with a typical iterator. >> >> [JDK-8360032](https://bugs.openjdk.org/browse/JDK-8360032) is a follow-up work that will build a stable merge-insertion sort on top of this PR. >> >> Please take a look and share your thoughts. Thanks very much. > > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > give up on RandomIt Just some drive-by comments. src/hotspot/share/utilities/sort.hpp line 37: > 35: public: > 36: template > 37: static void sort(T* data, int size, Compare comp) { `int size` to `size_t size` ? or at least `unsigned int`. 
src/hotspot/share/utilities/sort.hpp line 39: > 37: static void sort(T* data, int size, Compare comp) { > 38: if (size == 0) { > 39: // Empty array NIT: useless comment. src/hotspot/share/utilities/sort.hpp line 57: > 55: // backward) > 56: T* prev = pos - 1; > 57: if (comp(*prev, current_elem) <= 0) { NIT: would be better to pass pointers here? test/hotspot/gtest/utilities/test_sort.cpp line 25: > 23: */ > 24: > 25: #include "runtime/os.hpp" NIT: sort the imports? ------------- PR Review: https://git.openjdk.org/jdk/pull/25895#pullrequestreview-2943777844 PR Review Comment: https://git.openjdk.org/jdk/pull/25895#discussion_r2157486802 PR Review Comment: https://git.openjdk.org/jdk/pull/25895#discussion_r2157487754 PR Review Comment: https://git.openjdk.org/jdk/pull/25895#discussion_r2157490818 PR Review Comment: https://git.openjdk.org/jdk/pull/25895#discussion_r2157491825 From eastigeevich at openjdk.org Thu Jun 19 22:36:30 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Thu, 19 Jun 2025 22:36:30 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v3] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <6CKaFhgY21Nc-1J5gmtvSRUywqPUQ6ByFzn4qVMVluM=.7699ed42-41d3-466c-8537-cea6d42f56a2@github.com> <3D8sbixmaGgu6de-YosfXDmunMKZJ1MbTeKR4BOv_NY=.e2122224-b4cb-436d-bcd1-9251bbc47f19@github.com> Message-ID: On Thu, 19 Jun 2025 17:18:12 GMT, Quan Anh Mai wrote: >> Hi @merykitty , >> >> Thank you for taking the effort to produce tooling for everyone when you found a need for it yourself. Often, we have useful datatypes hidden away into internals that we'd like to use, or we simply do other solutions because our preferred solution is missing. >> >> Unfortunately, I think that the ceremony required to get your insertion sort working for someone else's type will put other devs off from using it. None, AFAIK, of our datatypes are compatible with the STL's interfaces. 
>> >> If we take this definition: >> >> >> void GrowableArrayView<>::insertion_sort(int f(E*, E*)) { >> if (_data == nullptr) return; >> for (int i = 1; i < length(); i++) { >> E key = _data[i]; >> int j = i - 1; >> while (j >= 0 && f(_data[j], key)) { >> _data[j + 1] = _data[j]; >> j--; >> } >> _data[j + 1] = key; >> } >> } >> >> >> And change it around a bit: >> >> >> template >> void insertion_sort(T* array, size_t length, C comparator) { >> for (int i = 1; i < length; i++) { >> T key = array[i]; // Should it really copy??? >> int j = i - 1; >> while (j >= 0 && comparator(array[j], key)) { >> array[j + 1] = array[j]; >> j--; >> } >> array[j + 1] = key; >> } >> } >> >> >> Then I think we have something general-ish. >> For stable_sort we then do: >> >> >> GrowableArray::stable_sort(C comparator) { >> insertion_sort(_data, length(), comparator); >> } >> >> >> This is going to be general enough, for most of our cases we have a contiguous array with a size of some fixed element type which we want to change in-place. This is sufficient for expressing that. >> >> This also fits well with the `QuickSort` class that we have, it's the same type of interface. Maybe the `quickSort.hpp` should be renamed into `sort.hpp` and `InsertionSort` be a class in there? I'm not sure what the style guide thinks about that, but I think it's a good idea :-). > > @jdksjolen Thanks for your suggestion. Actually, I can make it so that a `T*` will satisfy `RandomIt` but there is no need for `RandomIt` right now. It is unfortunate because an iterator will give us more safety net, though. > > I have reverted the `GrowableArrayIterator` changes. Insertion sort is good for small arrays so we can use it for `QuickSort::sort`, too. For a stable sort algorithm, implementing a merge - insertion sort should be the way, there is no need to rush for a stable sort method. @merykitty, Out of curiosity, why not to use the classical implementations provided above? It's so simple and compact. 
It's also self documenting. It also has the minimum number of branches. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2989272978 From qamai at openjdk.org Fri Jun 20 03:05:22 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Fri, 20 Jun 2025 03:05:22 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v5] In-Reply-To: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Message-ID: > Hi, > > This PR adds an implementation of insertion sort to Hotspot. It is an algorithm that is inplace and stable, and it is the ideal algorithm for arrays with small numbers of elements. The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. Additionally, since insertion sort is the most efficient sorting algorithm for small arrays, it can be used in non-stable sort as well. > > In addition, I make some improvements to `GrowableArrayIterator`: > > - Make a non-const variant (our current iterator is const only). > - Add various utility operators to align with a typical iterator. > > [JDK-8360032](https://bugs.openjdk.org/browse/JDK-8360032) is a follow-up work that will build a stable merge-insertion sort on top of this PR. > > Please take a look and share your thoughts. Thanks very much. 
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: cleanup ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25895/files - new: https://git.openjdk.org/jdk/pull/25895/files/f13d739e..7fc72da1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25895&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25895&range=03-04 Stats: 3 lines in 2 files changed: 0 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25895.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25895/head:pull/25895 PR: https://git.openjdk.org/jdk/pull/25895 From qamai at openjdk.org Fri Jun 20 03:05:24 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Fri, 20 Jun 2025 03:05:24 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v4] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <9rcA-0uAuVwpK8WPTXdFCmcBZhDTNB-KtTa0NecLKZk=.369e3dcc-6f80-400e-887c-837e78a8a19f@github.com> Message-ID: <2-6lbSM0y22WVEiOqLJ31lu8LkA-Ik1O4nr6eb1vpoo=.d87f5b56-7113-42a7-962a-94eb3c2ac1c7@github.com> On Thu, 19 Jun 2025 18:18:38 GMT, Cesar Soares Lucas wrote: >> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: >> >> give up on RandomIt > > src/hotspot/share/utilities/sort.hpp line 37: > >> 35: public: >> 36: template >> 37: static void sort(T* data, int size, Compare comp) { > > `int size` to `size_t size` ? or at least `unsigned int`. Hotspot container usually uses signed int for size. So I think `int` here is a sensible choice. > src/hotspot/share/utilities/sort.hpp line 57: > >> 55: // backward) >> 56: T* prev = pos - 1; >> 57: if (comp(*prev, current_elem) <= 0) { > > NIT: would be better to pass pointers here? A `comp` usually receives references. Practically, it is almost the same as receiving pointers. 
> test/hotspot/gtest/utilities/test_sort.cpp line 25: > >> 23: */ >> 24: >> 25: #include "runtime/os.hpp" > > NIT: sort the imports? I have cleaned up the unused import here. What do you mean by sorting the imports? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25895#discussion_r2157947448 PR Review Comment: https://git.openjdk.org/jdk/pull/25895#discussion_r2157948428 PR Review Comment: https://git.openjdk.org/jdk/pull/25895#discussion_r2157948919 From qamai at openjdk.org Fri Jun 20 03:09:00 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Fri, 20 Jun 2025 03:09:00 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v6] In-Reply-To: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Message-ID: > Hi, > > This PR adds an implementation of insertion sort to Hotspot. It is an algorithm that is inplace and stable, and it is the ideal algorithm for arrays with small numbers of elements. The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. Additionally, since insertion sort is the most efficient sorting algorithm for small arrays, it can be used in non-stable sort as well. > > In addition, I make some improvements to `GrowableArrayIterator`: > > - Make a non-const variant (our current iterator is const only). > - Add various utility operators to align with a typical iterator. > > [JDK-8360032](https://bugs.openjdk.org/browse/JDK-8360032) is a follow-up work that will build a stable merge-insertion sort on top of this PR. > > Please take a look and share your thoughts. Thanks very much. 
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25895/files - new: https://git.openjdk.org/jdk/pull/25895/files/7fc72da1..65bd14db Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25895&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25895&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25895.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25895/head:pull/25895 PR: https://git.openjdk.org/jdk/pull/25895 From qamai at openjdk.org Fri Jun 20 03:21:35 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Fri, 20 Jun 2025 03:21:35 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v3] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <6CKaFhgY21Nc-1J5gmtvSRUywqPUQ6ByFzn4qVMVluM=.7699ed42-41d3-466c-8537-cea6d42f56a2@github.com> <3D8sbixmaGgu6de-YosfXDmunMKZJ1MbTeKR4BOv_NY=.e2122224-b4cb-436d-bcd1-9251bbc47f19@github.com> Message-ID: On Thu, 19 Jun 2025 22:31:09 GMT, Evgeny Astigeevich wrote: >> @jdksjolen Thanks for your suggestion. Actually, I can make it so that a `T*` will satisfy `RandomIt` but there is no need for `RandomIt` right now. It is unfortunate because an iterator will give us more safety net, though. >> >> I have reverted the `GrowableArrayIterator` changes. Insertion sort is good for small arrays so we can use it for `QuickSort::sort`, too. For a stable sort algorithm, implementing a merge - insertion sort should be the way, there is no need to rush for a stable sort method. > > @merykitty, > Out of curiosity, why not to use the classical implementations provided above? It's so simple and compact. It's also self documenting. It also has the minimum number of branches. @eastig It is almost the same, isn't it? 
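
For concreteness, the pointer-walking form can be sketched roughly as follows. Only the `sort(T* data, int size, Compare comp)` signature and the `comp(*prev, current_elem) <= 0` test are taken from the review comments in this thread; the struct name `InsertionSortSketch` and the rest of the body are assumptions for illustration, not the code in the PR.

```cpp
#include <cassert>

// Hypothetical stand-in for the class under review. Assumes a three-way
// comparator: negative, zero or positive, like the QuickSort comparators.
struct InsertionSortSketch {
  template <typename T, typename Compare>
  static void sort(T* data, int size, Compare comp) {
    assert(size >= 0 && "size must not be negative");
    T* begin = data;
    T* end = data + size;
    for (T* current = begin + 1; current < end; current++) {
      T current_elem = *current;  // copy out the element being inserted
      T* pos = current;
      while (pos > begin) {
        T* prev = pos - 1;
        // Stop at the first predecessor that is not greater than the
        // element; stopping on "equal" keeps equal elements in order.
        if (comp(*prev, current_elem) <= 0) {
          break;
        }
        *pos = *prev;  // shift the larger element one slot to the right
        pos = prev;
      }
      *pos = current_elem;
    }
  }
};
```

Compared with the index-based loop quoted above, the shape is the same: one outer pass, one inner shift loop, one store. The visible difference is that the comparator here is three-way and the `<= 0` test is what makes the sort stable.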
------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2989684018 From lliu at openjdk.org Fri Jun 20 07:06:31 2025 From: lliu at openjdk.org (Liming Liu) Date: Fri, 20 Jun 2025 07:06:31 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v4] In-Reply-To: <32uuLeizjdx7p5TeOzMvoyj0Smmra-DV4qhPZy7z-bE=.78485dd1-cc6f-4fbf-88a3-a4f78c164b0c@github.com> References: <32uuLeizjdx7p5TeOzMvoyj0Smmra-DV4qhPZy7z-bE=.78485dd1-cc6f-4fbf-88a3-a4f78c164b0c@github.com> Message-ID: On Thu, 5 Jun 2025 07:15:34 GMT, Liming Liu wrote: >> This PR is to enable the use of crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU. There is an option UseCryptoPmullForCRC32 that can enable crypto pmull, but directly enabling it on Ampere CPU will cause the following problems. >> >> 1. There will be regressions (-14% ~ -8%) on Ampere1 when the length is 64. When <= 128, both kernel_crc32_using_crc32 and kernel_crc32_using_crypto_pmull use the loop labeled as CRC_by32_loop, but their implements are a little different, and the loop in kernel_crc32_using_crc32 is better at hiding latency on Ampere1. So this PR takes the loop in kernel_crc32_using_crc32 to kernel_crc32_using_crypto_pmull, and does the same for CRC32C intrinsic. >> >> 2. The intrinsics only use crypto pmull when the length is higher than 383, while the loop in kernel_crc32_common_fold_using_crypto_pmull looks able to handle 256, and if it handles 256 on Ampere1, the improvements can be as high as 110% compared with kernel_crc32_using_crc32/kernel_crc32c_using_crc32c. However, there are regressions (~-6%) on Neoverse V1 when the length is 256. So this PR introduces a new option named CryptoPmullForCRC32LowLimit. It defaults to 256 since the code could handle 256, while it is set to 384 for V1/V2 to keep the old behavior on these platforms. 
>> >> The performance regressions and improvements were measured with the following microbenchmarks: >> org.openjdk.bench.java.util.TestCRC32.testCRC32Update >> org.openjdk.bench.java.util.TestCRC32C.testCRC32CUpdate >> >> Ran the following JTReg tests on Ampere1 and did not find problems: >> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java >> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32C.java > > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Add the message for the assertions Ping. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25609#issuecomment-2990046033 From aph at openjdk.org Fri Jun 20 08:29:30 2025 From: aph at openjdk.org (Andrew Haley) Date: Fri, 20 Jun 2025 08:29:30 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v4] In-Reply-To: References: <32uuLeizjdx7p5TeOzMvoyj0Smmra-DV4qhPZy7z-bE=.78485dd1-cc6f-4fbf-88a3-a4f78c164b0c@github.com> Message-ID: On Fri, 20 Jun 2025 07:04:05 GMT, Liming Liu wrote: > Ping. When you reply to my last point. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25609#issuecomment-2990254076 From aph at openjdk.org Fri Jun 20 08:33:29 2025 From: aph at openjdk.org (Andrew Haley) Date: Fri, 20 Jun 2025 08:33:29 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v6] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Message-ID: On Fri, 20 Jun 2025 03:09:00 GMT, Quan Anh Mai wrote: >> Hi, >> >> This PR adds an implementation of insertion sort to Hotspot. It is an algorithm that is inplace and stable, and it is the ideal algorithm for arrays with small numbers of elements. The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. 
Additionally, since insertion sort is the most efficient sorting algorithm for small arrays, it can be used in non-stable sort as well. >> >> In addition, I make some improvements to `GrowableArrayIterator`: >> >> - Make a non-const variant (our current iterator is const only). >> - Add various utility operators to align with a typical iterator. >> >> [JDK-8360032](https://bugs.openjdk.org/browse/JDK-8360032) is a follow-up work that will build a stable merge-insertion sort on top of this PR. >> >> Please take a look and share your thoughts. Thanks very much. > > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > comment src/hotspot/share/utilities/sort.hpp line 52: > 50: T* pos = current; > 51: while (pos > begin) { > 52: // Since the sort is stable, we must insert the current element at the first location at Suggestion: // Because the sort is stable, we must insert the current element at the first location at ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25895#discussion_r2158346057 From aph at openjdk.org Fri Jun 20 08:36:30 2025 From: aph at openjdk.org (Andrew Haley) Date: Fri, 20 Jun 2025 08:36:30 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v4] In-Reply-To: <2-6lbSM0y22WVEiOqLJ31lu8LkA-Ik1O4nr6eb1vpoo=.d87f5b56-7113-42a7-962a-94eb3c2ac1c7@github.com> References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <9rcA-0uAuVwpK8WPTXdFCmcBZhDTNB-KtTa0NecLKZk=.369e3dcc-6f80-400e-887c-837e78a8a19f@github.com> <2-6lbSM0y22WVEiOqLJ31lu8LkA-Ik1O4nr6eb1vpoo=.d87f5b56-7113-42a7-962a-94eb3c2ac1c7@github.com> Message-ID: On Fri, 20 Jun 2025 03:01:39 GMT, Quan Anh Mai wrote: >> test/hotspot/gtest/utilities/test_sort.cpp line 25: >> >>> 23: */ >>> 24: >>> 25: #include "runtime/os.hpp" >> >> NIT: sort the imports? > > I have cleaned up the unused import here. What do you mean by sorting the imports? 
Sort the "#include" lines alphabetically.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25895#discussion_r2158355607

From aph at openjdk.org Fri Jun 20 08:39:30 2025
From: aph at openjdk.org (Andrew Haley)
Date: Fri, 20 Jun 2025 08:39:30 GMT
Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v3]
In-Reply-To:
References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <6CKaFhgY21Nc-1J5gmtvSRUywqPUQ6ByFzn4qVMVluM=.7699ed42-41d3-466c-8537-cea6d42f56a2@github.com> <3D8sbixmaGgu6de-YosfXDmunMKZJ1MbTeKR4BOv_NY=.e2122224-b4cb-436d-bcd1-9251bbc47f19@github.com>
Message-ID:

On Thu, 19 Jun 2025 17:34:35 GMT, Johan Sjölen wrote:

> Cheers! This looks good to me. Let's see what the rest of the community thinks.

I'm happy to approve it with a few minor changes.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2990286664

From lliu at openjdk.org Fri Jun 20 08:45:30 2025
From: lliu at openjdk.org (Liming Liu)
Date: Fri, 20 Jun 2025 08:45:30 GMT
Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v4]
In-Reply-To:
References: <32uuLeizjdx7p5TeOzMvoyj0Smmra-DV4qhPZy7z-bE=.78485dd1-cc6f-4fbf-88a3-a4f78c164b0c@github.com>
Message-ID:

On Fri, 20 Jun 2025 08:26:40 GMT, Andrew Haley wrote:

> > Ping.
>
> When you reply to my last point.

Does this mean changing the title? But I don't get how does the patch help Apple, since the patch does not effect the default behavior on Apple. There would be changes when enabling UseCryptoPmullForCRC32 for 383 bytes or smaller. So, are the improvements from this?
------------- PR Comment: https://git.openjdk.org/jdk/pull/25609#issuecomment-2990300514 From aph at openjdk.org Fri Jun 20 08:53:31 2025 From: aph at openjdk.org (Andrew Haley) Date: Fri, 20 Jun 2025 08:53:31 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU [v4] In-Reply-To: References: <32uuLeizjdx7p5TeOzMvoyj0Smmra-DV4qhPZy7z-bE=.78485dd1-cc6f-4fbf-88a3-a4f78c164b0c@github.com> Message-ID: On Fri, 20 Jun 2025 08:41:56 GMT, Liming Liu wrote: > > > Ping. > > > > > > When you reply to my last point. > > Does this mean changing the title? But I don't get how does the patch help Apple, since the patch does not effect the default behavior on Apple. There would be changes when enabling UseCryptoPmullForCRC32 for 383 bytes or smaller. So, are the improvements from this? This PR does not only enable crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU, it also improves the algorithm for some cases of short arrays. I made a suggestion for a title that accurately describes this PR. Feel free to write your own accurate title, or use mine. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25609#issuecomment-2990325561 From mdoerr at openjdk.org Fri Jun 20 09:01:29 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 20 Jun 2025 09:01:29 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v8] In-Reply-To: <2gdPUDg-i40xRoO8LZPWszL2-oa6s7GNZEDRfHfDk8s=.4dbfe74a-070b-46c1-b73d-0292824d02e9@github.com> References: <2gdPUDg-i40xRoO8LZPWszL2-oa6s7GNZEDRfHfDk8s=.4dbfe74a-070b-46c1-b73d-0292824d02e9@github.com> Message-ID: On Tue, 17 Jun 2025 20:59:46 GMT, Dean Long wrote: >> This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. >> >> We used to patch the verified entry point to make sure it was not_entrant. 
The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. >> >> The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. >> >> For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. >> >> This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > 2nd try at arm fix > > Tests look good on our side. I'm only a bit concerned that the lock may become a bottleneck when many Java threads need to patch all nmethods. Especially with ZGC which does that more often. I think we should check performance. > > For ZGC I am using a per-nmethod lock: ZLocker locker(ZNMethod::lock_for_nmethod(nm)); Ah, right. So, ZGC should be fine. 
> I don't know what benchmarks to run to check the performance for functions like Deoptimization::deoptimize_all_marked, so I welcome any help with this. I have tried some SPEC benchmarks with G1 on PPC64, but couldn't observe a regression. (If there is one, it was below noise.) > One possible optimization that might help is skipping the lock if the make_not_entrant call is done during a safepoint. I guess the most critical scenario is when many Java threads need to disarm a large number of nmethod entry barriers. That doesn't happen at a safepoint. Not sure if other scenarios are worth optimizing by this idea. I guess this PR is ok as it is. Maybe other reviewers have more comments. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-2990348628 From kbarrett at openjdk.org Fri Jun 20 09:08:35 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 20 Jun 2025 09:08:35 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: <2ddvhqT5lC6N6SZ80TDN1HDfHjcrFYL7aLInjMZw9Po=.2c3c7b91-ed39-4420-9b90-33580a5d9a51@github.com> <-M2-0zU6PnzfWIcFizGXyg9xeVGVrJVRvJVkHDy1-w0=.0702833c-7ed6-41bf-8c49-dc0bb2e1503b@github.com> <8_CoRfbkgRom4_8wkrguW6tldAsW1L3EZus3tchTElM=.3bddb374-5d98-4f7b-9d9c-5f413b3c23aa@github.com> Message-ID: On Thu, 19 Jun 2025 01:20:42 GMT, David Holmes wrote: >>> But that is not correct - we should only take this "overflow" path for >>> `((k+n) > 0x7fe && (k+n) <= INT_MAX)`. Your suggestion makes us take this >>> path if `(k+n)` overflows to negative. ?? >> >> It is intentional that the new test is true for the `(k+n)` => overflow case. >> It fully handles the overflow case, eliminating the need for the later fixup >> of the case where `((k <= -54) && (n > 5000))` (though `(n > 0)` would work >> just as well; I don't know why that `5000` was inserted). That fixup returned >> the same `hugeX`-based result as here. >> >>> It can be negative if n is positive too. 
>> >> `(k+n)` cannot be negative with `n` positive here, even under wrapping >> semantics, because we can't get here in that case due to the prior overflow >> detection. > > Okay @kimbarrett I suggest that you take over this issue and PR. I do not have the knowledge of the code that you have and I cannot affirm that your statements are correct. For the record, my suggestion above isn't quite right. It doesn't properly handle subnormal x. I forgot that k (before adding n) can be negative in the subnormal case. So the correct test of u_k is not ((u_k > 0x7fe) && (n > 0)) but rather ((u_k > 0x7fe) && ((k|n) > 0)) I'll be putting more details in JBS. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25656#discussion_r2158422212 From qamai at openjdk.org Fri Jun 20 11:02:25 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Fri, 20 Jun 2025 11:02:25 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v7] In-Reply-To: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Message-ID: > Hi, > > This PR adds an implementation of insertion sort to Hotspot. It is an algorithm that is inplace and stable, and it is the ideal algorithm for arrays with small numbers of elements. The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. Additionally, since insertion sort is the most efficient sorting algorithm for small arrays, it can be used in non-stable sort as well. > > In addition, I make some improvements to `GrowableArrayIterator`: > > - Make a non-const variant (our current iterator is const only). > - Add various utility operators to align with a typical iterator. 
> > [JDK-8360032](https://bugs.openjdk.org/browse/JDK-8360032) is a follow-up work that will build a stable merge-insertion sort on top of this PR. > > Please take a look and share your thoughts. Thanks very much. Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: small changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25895/files - new: https://git.openjdk.org/jdk/pull/25895/files/65bd14db..17f30c0d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25895&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25895&range=05-06 Stats: 3 lines in 2 files changed: 1 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25895.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25895/head:pull/25895 PR: https://git.openjdk.org/jdk/pull/25895 From qamai at openjdk.org Fri Jun 20 11:02:26 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Fri, 20 Jun 2025 11:02:26 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v4] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <9rcA-0uAuVwpK8WPTXdFCmcBZhDTNB-KtTa0NecLKZk=.369e3dcc-6f80-400e-887c-837e78a8a19f@github.com> <2-6lbSM0y22WVEiOqLJ31lu8LkA-Ik1O4nr6eb1vpoo=.d87f5b56-7113-42a7-962a-94eb3c2ac1c7@github.com> Message-ID: On Fri, 20 Jun 2025 08:33:36 GMT, Andrew Haley wrote: >> I have cleaned up the unused import here. What do you mean by sorting the imports? > > Sort the "#include" lines alphabetically. I assume you want to have `unittest.hpp` above the `utilities` files. I have done that. I was confused because the convention in this area is pretty blurry as many files have the `unittest.hpp` as their last include. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25895#discussion_r2158685642 From qamai at openjdk.org Fri Jun 20 11:02:26 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Fri, 20 Jun 2025 11:02:26 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v3] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <6CKaFhgY21Nc-1J5gmtvSRUywqPUQ6ByFzn4qVMVluM=.7699ed42-41d3-466c-8537-cea6d42f56a2@github.com> <3D8sbixmaGgu6de-YosfXDmunMKZJ1MbTeKR4BOv_NY=.e2122224-b4cb-436d-bcd1-9251bbc47f19@github.com> Message-ID: On Fri, 20 Jun 2025 08:36:31 GMT, Andrew Haley wrote: >>> @jdksjolen Thanks for your suggestion. Actually, I can make it so that a `T*` will satisfy `RandomIt` but there is no need for `RandomIt` right now. It is unfortunate because an iterator will give us more safety net, though. >>> >>> I have reverted the `GrowableArrayIterator` changes. Insertion sort is good for small arrays so we can use it for `QuickSort::sort`, too. For a stable sort algorithm, implementing a merge - insertion sort should be the way, there is no need to rush for a stable sort method. >> >> Cheers! This looks good to me. Let's see what the rest of the community thinks. > >> Cheers! This looks good to me. Let's see what the rest of the community thinks. > > I'm happy to approve it with a few minor changes. Thanks for the reviews @theRealAph , I have addressed them. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25895#issuecomment-2990984665 From coleenp at openjdk.org Fri Jun 20 12:00:15 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 20 Jun 2025 12:00:15 GMT Subject: RFR: 8359920: Use names for frame types in stackmaps [v2] In-Reply-To: References: Message-ID: > This uses names for frame types for stackmaps in the verifier and redefinition. > Tested with tier1-7. 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Update src/hotspot/share/prims/jvmtiRedefineClasses.cpp Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25870/files - new: https://git.openjdk.org/jdk/pull/25870/files/37b5c7ee..c3d8d0fc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25870&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25870&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25870.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25870/head:pull/25870 PR: https://git.openjdk.org/jdk/pull/25870 From coleenp at openjdk.org Fri Jun 20 12:00:16 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 20 Jun 2025 12:00:16 GMT Subject: RFR: 8359920: Use names for frame types in stackmaps [v2] In-Reply-To: References: Message-ID: On Thu, 19 Jun 2025 06:24:03 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Update src/hotspot/share/prims/jvmtiRedefineClasses.cpp >> >> Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > > src/hotspot/share/classfile/stackMapTable.hpp line 154: > >> 152: SAME_FRAME = 64, >> 153: SAME_LOCALS_1_STACK_ITEM_FRAME = 128, >> 154: SAME_LOCALS_1_STACK_ITEM_EXTENDED = 247, > > I find these definitions a little confusing. SAME_FRAME is actually 0-63, with SAME_LOCALS_1_STACK_ITEM_FRAME being 64-127. Given many of these frame types imply tag ranges it may be clearer to define enum's for the start and end of ranges as applicable eg. 
>
> enum {
>   SAME_FRAME_START = 0,
>   SAME_FRAME_END = 63,
>   SAME_LOCALS_1_STACK_ITEM_FRAME_START = 64,
>   SAME_LOCALS_1_STACK_ITEM_FRAME_END = 127,
>   RESERVED_START = 128,
>   RESERVED_END = 246,
>   SAME_LOCALS_1_STACK_ITEM_EXTENDED = 247,
>   CHOP_FRAME_START = 248,
>   CHOP_FRAME_END = 250,
>   SAME_FRAME_EXTENDED = 251,
>   APPEND_FRAME_START = 252,
>   APPEND_FRAME_END = 254,
>   FULL_FRAME = 255
> }
>
> and then adjust the code usage as appropriate e.g.
>
> if (frame_type <= SAME_FRAME_END) {
>   ...
> if (frame_type <= SAME_LOCALS_1_STACK_ITEM_FRAME_END) {
>   if (_first) {
>     offset = frame_type - SAME_LOCALS_1_STACK_ITEM_FRAME_START;
>     ...
>
> What do you think?

I wasn't really up for a big rewrite but having the complete set of names would be really good.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25870#discussion_r2158789990

From coleenp at openjdk.org Fri Jun 20 12:49:12 2025
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Fri, 20 Jun 2025 12:49:12 GMT
Subject: RFR: 8359920: Use names for frame types in stackmaps [v3]
In-Reply-To:
References:
Message-ID:

> This uses names for frame types for stackmaps in the verifier and redefinition.
> Tested with tier1-7.

Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:

  Apply David's suggested more complete and accurate frame type names. Being careful with >=s.
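
As a rough illustration of how the range names read at a use site, here is a minimal standalone sketch. The tag values are the ones listed in the suggestion above; `FrameKind`, `classify_frame` and `implicit_offset_delta` are hypothetical names used only for illustration and are not taken from the patch.

```cpp
#include <cstdint>

// Tag ranges as listed in the review suggestion.
enum {
  SAME_FRAME_START = 0,
  SAME_FRAME_END = 63,
  SAME_LOCALS_1_STACK_ITEM_FRAME_START = 64,
  SAME_LOCALS_1_STACK_ITEM_FRAME_END = 127,
  RESERVED_START = 128,
  RESERVED_END = 246,
  SAME_LOCALS_1_STACK_ITEM_EXTENDED = 247,
  CHOP_FRAME_START = 248,
  CHOP_FRAME_END = 250,
  SAME_FRAME_EXTENDED = 251,
  APPEND_FRAME_START = 252,
  APPEND_FRAME_END = 254,
  FULL_FRAME = 255
};

// Hypothetical classification type, for illustration only.
enum class FrameKind {
  Same, SameLocals1StackItem, Reserved, SameLocals1StackItemExtended,
  Chop, SameExtended, Append, Full
};

// Walk the ranges in ascending order, so each test only needs the upper
// bound of its range; the lower bound is implied by the earlier tests.
inline FrameKind classify_frame(uint8_t frame_type) {
  if (frame_type <= SAME_FRAME_END) return FrameKind::Same;
  if (frame_type <= SAME_LOCALS_1_STACK_ITEM_FRAME_END) return FrameKind::SameLocals1StackItem;
  if (frame_type <= RESERVED_END) return FrameKind::Reserved;
  if (frame_type == SAME_LOCALS_1_STACK_ITEM_EXTENDED) return FrameKind::SameLocals1StackItemExtended;
  if (frame_type <= CHOP_FRAME_END) return FrameKind::Chop;
  if (frame_type == SAME_FRAME_EXTENDED) return FrameKind::SameExtended;
  if (frame_type <= APPEND_FRAME_END) return FrameKind::Append;
  return FrameKind::Full;
}

// For the two compact forms the offset delta is encoded in the tag itself.
// Returns -1 when the delta is stored explicitly after the tag instead.
inline int implicit_offset_delta(uint8_t frame_type) {
  if (frame_type <= SAME_FRAME_END) {
    return frame_type - SAME_FRAME_START;
  }
  if (frame_type <= SAME_LOCALS_1_STACK_ITEM_FRAME_END) {
    return frame_type - SAME_LOCALS_1_STACK_ITEM_FRAME_START;
  }
  return -1;
}
```

With explicit start and end names, every comparison can use `<=` against an `_END` constant, which avoids mixing `<` and `>=` against boundary values that belong to the neighbouring range.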
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25870/files - new: https://git.openjdk.org/jdk/pull/25870/files/c3d8d0fc..107382e4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25870&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25870&range=01-02 Stats: 35 lines in 3 files changed: 10 ins; 0 del; 25 mod Patch: https://git.openjdk.org/jdk/pull/25870.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25870/head:pull/25870 PR: https://git.openjdk.org/jdk/pull/25870 From coleenp at openjdk.org Fri Jun 20 13:35:29 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 20 Jun 2025 13:35:29 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v8] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Tue, 10 Jun 2025 18:34:00 GMT, Ioi Lam wrote: >> This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: >> >> Excerpt from the test case ExceptionsTest.java. 
>>
>> [0.038s][info][exceptions ] Exception
>> [ ] thrown in interpreter method <{method} {0x000074c408400810} 'baz2' '()V' in 'ExceptionsTest$InternalClass'>
>> [ ] at bci 9 for thread 0x000074c46402c7b0 (main)
>> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.baz2(ExceptionsTest.java:142)
>> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar2(ExceptionsTest.java:135)
>> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo2(ExceptionsTest.java:127)
>> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:110)
>> [0.038s][info][exceptions ] Exception
>> [ ] thrown in interpreter method <{method} {0x000074c408400748} 'bar2' '()V' in 'ExceptionsTest$InternalClass'>
>> [ ] at bci 0 for thread 0x000074c46402c7b0 (main)
>> [0.038s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar2" at BCI: 6
>> [0.038s][info][exceptions ] Exception
>> [ ] thrown in interpreter method <{method} {0x000074c408400748} 'bar2' '()V' in 'ExceptionsTest$InternalClass'>
>> [ ] at bci 8 for thread 0x000074c46402c7b0 (main)
>> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar2(ExceptionsTest.java:137)
>> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo2(ExceptionsTest.java:127)
>> [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:110)
>> [0.038s][info][exceptions ] Exception
>> [ ] thrown in interpreter method <{method} {0x000074c408400670} 'foo2' '()V' in 'ExceptionsTest$InternalClass'>
>> [ ] at bci 0 for thread 0x000074c46402c7b0 (m...
>
> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
>
>   @dholmes-ora comments -- removed printing of output.getStdout() from test

Changes requested by coleenp (Reviewer).

src/hotspot/share/utilities/exceptions.cpp line 619: > 617: // We don't want to use an OopHandle, or else we may prevent this object from being collected. > 618: // Whenever a GC happens, this will be cleared by Exceptions::clear_logging_cache(). > 619: static oop _last_logged_exception; oh gosh I don't like this at all. Save the exception string if anything. ------------- PR Review: https://git.openjdk.org/jdk/pull/25522#pullrequestreview-2946209210 PR Review Comment: https://git.openjdk.org/jdk/pull/25522#discussion_r2159014991 From eastigeevich at openjdk.org Fri Jun 20 13:44:29 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Fri, 20 Jun 2025 13:44:29 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v7] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> Message-ID: On Fri, 20 Jun 2025 11:02:25 GMT, Quan Anh Mai wrote: >> Hi, >> >> This PR adds an implementation of insertion sort to Hotspot. It is an algorithm that is inplace and stable, and it is the ideal algorithm for arrays with small numbers of elements. The motivation for this is [JDK-8357186](https://bugs.openjdk.org/browse/JDK-8357186) in which a stable sort is desired and the number of elements is small. Additionally, since insertion sort is the most efficient sorting algorithm for small arrays, it can be used in non-stable sort as well. >> >> In addition, I make some improvements to `GrowableArrayIterator`: >> >> - Make a non-const variant (our current iterator is const only). >> - Add various utility operators to align with a typical iterator. >> >> [JDK-8360032](https://bugs.openjdk.org/browse/JDK-8360032) is a follow-up work that will build a stable merge-insertion sort on top of this PR. >> >> Please take a look and share your thoughts. Thanks very much. 
> > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > small changes LGTM ------------- Marked as reviewed by eastigeevich (Committer). PR Review: https://git.openjdk.org/jdk/pull/25895#pullrequestreview-2946237927 From kvn at openjdk.org Fri Jun 20 14:15:30 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 20 Jun 2025 14:15:30 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v3] In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Tue, 3 Jun 2025 11:01:06 GMT, Kim Barrett wrote: >> Please review this change to permit the use of `noexcept` under certain >> circumstances in HotSpot code. >> >> http://wg21.link/n3050 >> >> Testing: >> >> JDK-8316930 (HotSpot should use noexcept instead of throw()) showed what the >> conversion would look like. It will need to be brought up to current mainline, >> possibly with modifications. >> >> This is a modification of the Style Guide, so rough consensus among the >> HotSpot Group members is required to make this change. Only Group members >> should vote for approval (via the github PR), though reasoned objections or >> comments from anyone will be considered. A decision on this proposal will not >> be made before Friday 16-June-2025 at 12h00 UTC. >> >> Since we're piggybacking on github PRs here, please use the PR review process >> to approve (click on Review Changes > Approve), rather than sending a "vote: >> yes" email reply that would be normal for a CFV. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > more dholmes Hi @kimbarrett Could you please answer my questions? Some statements are confusing for me. Do we need to update `Guide` about `throw()`? Its usage seems confusing based on bugs you pointed. doc/hotspot-style.md line 1118: > 1116: failure must be declared `noexcept`. 
> 1117: * All other uses of `noexcept` exception specifications are forbidden. > 1118: * `noexcept` expressions are forbidden. "argument-less form of `noexcept` are permitted" vs "`noexcept` expressions are forbidden". So what we should use? `noexcept()`? The example would be nice. doc/hotspot-style.md line 1140: > 1138: result. If an allocation function is not declared `noexcept` then the compiler > 1139: may elide that checking and handling for a `new` expression calling that > 1140: function. This implies that compiler may generate a `nullptr` check if `noexcept` is specified. Is it true? Is it static (during compilation) check or it can generate runtime check? We usually have explicit checks in such places to catch allocation failure. We are missing check in some places which may lead to crashes (reference through `nullptr`). Can compiler helps here? doc/hotspot-style.md line 1153: > 1151: HotSpot code can assume no exceptions will ever be thrown, even from functions > 1152: not declared `noexcept`. So HotSpot code doesn't ever need to check, either > 1153: with conditional exception specifications or with `noexcept` expressions. "doesn't ever need to check" - what check? We still need to have nullptr checks. Right? ------------- PR Review: https://git.openjdk.org/jdk/pull/25574#pullrequestreview-2946301888 PR Comment: https://git.openjdk.org/jdk/pull/25574#issuecomment-2991789891 PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2159073407 PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2159079821 PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2159083710 From coleenp at openjdk.org Fri Jun 20 15:05:03 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 20 Jun 2025 15:05:03 GMT Subject: RFR: 8359920: Use names for frame types in stackmaps [v4] In-Reply-To: References: Message-ID: > This uses names for frame types for stackmaps in the verifier and redefinition. > Tested with tier1-7. 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix tags (running more tests) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25870/files - new: https://git.openjdk.org/jdk/pull/25870/files/107382e4..7d2e60c2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25870&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25870&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25870.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25870/head:pull/25870 PR: https://git.openjdk.org/jdk/pull/25870 From kbarrett at openjdk.org Fri Jun 20 15:59:36 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 20 Jun 2025 15:59:36 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v3] In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Fri, 20 Jun 2025 14:03:39 GMT, Vladimir Kozlov wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> more dholmes > > doc/hotspot-style.md line 1118: > >> 1116: failure must be declared `noexcept`. >> 1117: * All other uses of `noexcept` exception specifications are forbidden. >> 1118: * `noexcept` expressions are forbidden. > > "argument-less form of `noexcept` are permitted" vs "`noexcept` expressions are forbidden". So what we should use? `noexcept()`? The example would be nice. The standard allows `noexcept` to be used in two different contexts. It can be used in a function's exception specification. It can instead be used in an ordinary expression. In either case it takes one argument expression, but what form that can take and how it is interpreted is quite different, depending on which of those contexts applies. Used in an exception specification, the argument must be a constant expression. 
So one can write a conditional exception specification, indicating that a function is non-throwing if the argument is true, and potentially throwing if false. One use-case for this is a generic container, where some operations may only be noexcept if certain operations of the element type are noexcept. In this context, `noexcept` (without arguments) is equivalent to `noexcept(true)`, and also to the deprecated `throw()` (with no arguments). At this time I think we don't ever need arguments, and the no-arg abbreviation is sufficient. That might change if we move toward using standard library facilities (particularly containers), or make our own more like those in the standard. Used as an expression, the argument is an unevaluated expression (similar to some uses of sizeof and alignof). The result of the `noexcept` expression is a bool constant, and is true if the expression never throws (based on examining exception specifications of functions called in the expression), false otherwise. There isn't an abbreviated expression form without arguments, as that doesn't make sense. At this time I think we don't need `noexcept` expressions, and might never need them. (Note that the argument expression for a `noexcept` exception specification can include `noexcept` expressions. But since we're not currently allowing either, that doesn't matter.) > doc/hotspot-style.md line 1140: > >> 1138: result. If an allocation function is not declared `noexcept` then the compiler >> 1139: may elide that checking and handling for a `new` expression calling that >> 1140: function. > > This implies that compiler may generate a `nullptr` check if `noexcept` is specified. Is it true? Is it static (during compilation) check or it can generate runtime check? > > We usually have explicit checks in such places to catch allocation failure. We are missing check in some places which may lead to crashes (reference through `nullptr`). Can compiler helps here?
An allocation function may be declared as non-throwing (via `noexcept`). If not declared non-throwing (so "potentially throwing", even though in a no-exceptions build environment it won't ever actually throw, and will instead probably terminate the program), the compiler's implementation of a using `new` expression can assume the allocation function will never return null, and does not need to generate any code to handle that possibility. If declared non-throwing, a using `new` expression must account for the possibility that the allocation function might return null. It needs to test the result of the allocation function. If it is null then the `new` expression must not call the constructor for the type and must return null as its result. (Of course, usual compiler optimizations apply, and the null handling code can be elided if the compiler can prove the allocation does not in fact ever return null, regardless of any exception specifications. That's probably pretty much never the case though.) Because of that, an allocation function that can return null *must* be declared non-throwing. And an allocation function that never returns null shouldn't be declared non-throwing, as that's just degrading performance for no reason. > doc/hotspot-style.md line 1153: > >> 1151: HotSpot code can assume no exceptions will ever be thrown, even from functions >> 1152: not declared `noexcept`. So HotSpot code doesn't ever need to check, either >> 1153: with conditional exception specifications or with `noexcept` expressions. > > "doesn't ever need to check" - what check? We still need to have nullptr checks. Right? If a `new` expression is calling a non-throwing variant then the caller of the `new` expression should be handling the possible null result. An assert is generally not the appropriate "handling". If one isn't prepared to handle a null result then don't call the variant that might return that. 
In our typical usage, if calling a potentially throwing variant, it will never return null, instead terminating the program. In such a case, there's no need for a null check by the caller, as we know that case is handled inside the call. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2159297650 PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2159297774 PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2159297894 From coleenp at openjdk.org Fri Jun 20 16:04:58 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 20 Jun 2025 16:04:58 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v8] In-Reply-To: References: Message-ID: > This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. > > The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. 
But that's not in this change. > > Tested with tier1-4, 5-7. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix missing periods and nullptr. thanks Serguei. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25267/files - new: https://git.openjdk.org/jdk/pull/25267/files/8339a6b5..50c036f4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=06-07 Stats: 3 lines in 3 files changed: 1 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25267.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25267/head:pull/25267 PR: https://git.openjdk.org/jdk/pull/25267 From kbarrett at openjdk.org Fri Jun 20 16:10:39 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 20 Jun 2025 16:10:39 GMT Subject: RFR: 8346914: UB issue in scalbnA Message-ID: Please review this change that replaces uses of our scalbnA function with using the standard scalbn function. Removed scalbnA, and also copysignA. For more details, see first comment and JBS. Testing: mach5 tier1-6. Though from discussion in https://github.com/openjdk/jdk/pull/25656, it's hard to get to our uses of scalbn/scalbnA. Before removing it, I tested scalbnA via a gtest that is attached to the JBS issue. 
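As a rough illustration of the replacement described above (not code from the PR), the standard `std::scalbn` and `std::copysign` cover what the removed helpers did. This sketch assumes the `<cmath>` versions; the wrapper names `scale_by_pow2` and `with_sign_of` are hypothetical and exist only to give the calls a home:

```cpp
#include <cassert>
#include <climits>
#include <cmath>

// Hypothetical wrapper: computes x * 2^n using the standard library.
inline double scale_by_pow2(double x, int n) {
  return std::scalbn(x, n);
}

// Hypothetical wrapper: returns the magnitude of `magnitude` with the sign of `sign`.
inline double with_sign_of(double magnitude, double sign) {
  return std::copysign(magnitude, sign);
}
```

With the library function, extreme exponents overflow to infinity or underflow to zero inside `std::scalbn`, so the caller does no exponent arithmetic of its own that could overflow.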
------------- Commit messages: - update copyrights - remove unused twom54 constant - scalbnA => scalbn, remove copysignA Changes: https://git.openjdk.org/jdk/pull/25917/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25917&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8346914 Stats: 67 lines in 4 files changed: 0 ins; 55 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/25917.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25917/head:pull/25917 PR: https://git.openjdk.org/jdk/pull/25917 From kbarrett at openjdk.org Fri Jun 20 16:10:39 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 20 Jun 2025 16:10:39 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 16:06:24 GMT, Kim Barrett wrote: > Please review this change that replaces uses of our scalbnA function with > using the standard scalbn function. Removed scalbnA, and also copysignA. > > For more details, see first comment and JBS. > > Testing: mach5 tier1-6. Though from discussion in > https://github.com/openjdk/jdk/pull/25656, it's hard to get to our uses of > scalbn/scalbnA. > > Before removing it, I tested scalbnA via a gtest that is attached to the JBS > issue. Long ago we had our own scalbn and copysign, because we couldn't get those functions from . They are C99 functions. For gcc/clang we were using C++98/03, which only includes C89 library functions. So gcc/clang version restricted them out. And MSVC++ didn't have them at all. Later, MSVC++ added them, without any version restriction since they didn't do Standard versions back then. This collided with ours, so we renamed ours. Later still we switched to C++14, which includes C99 library functions, so they are no longer version restricted by gcc/clang. scalbnA was recently being looked at because of possible signed integer overflow UB. And testing showed that was indeed the case. But testing also showed that it was (as intended) pretty much compatible with scalbn.
Since we can now get scalbn from , there's no reason to keep our own. We should just use scalbn. Our copysignA was similarly compatible with copysign, and was also only being used by scalbnA. So there's no reason to keep it either. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25917#issuecomment-2992154785 From kbarrett at openjdk.org Fri Jun 20 16:16:28 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 20 Jun 2025 16:16:28 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v3] In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Fri, 20 Jun 2025 14:13:19 GMT, Vladimir Kozlov wrote: > Do we need to update `Guide` about `throw()`? Its usage seems confusing based on bugs you pointed. `throw()` is a form of "dynamic exception specification", which this PR says are forbidden. Though as noted in the PR intro, current code is out of conformance with this, and we might update. (Or not, for backporting reasons. But I think we probably should.) ------------- PR Comment: https://git.openjdk.org/jdk/pull/25574#issuecomment-2992170488 From kbarrett at openjdk.org Fri Jun 20 16:20:29 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 20 Jun 2025 16:20:29 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 16:06:52 GMT, Kim Barrett wrote: > Long ago we had our own scalbn and copysign, because we couldn't get those functions from . There may have been similar issues with the AIX compiler. But since the current AIX compiler is based on a reasonably modern version of clang, and we're using C++14, it should work there too. Though I forgot to ask the maintainers of the aix-ppc port to verify that. I'm sure they'll let me know if this doesn't work for them.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25917#issuecomment-2992178767 From kbarrett at openjdk.org Fri Jun 20 16:26:32 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 20 Jun 2025 16:26:32 GMT Subject: RFR: 8359923: Const accessors for the Deferred class In-Reply-To: <_F_xmX17xbS4DlqUOj8L-zrzeXZpsNiPFsN3_bnqy48=.4380549b-ec98-43dd-ad72-4e1b5a64df64@github.com> References: <_F_xmX17xbS4DlqUOj8L-zrzeXZpsNiPFsN3_bnqy48=.4380549b-ec98-43dd-ad72-4e1b5a64df64@github.com> Message-ID: <0yHG618NALvVK84ex3klj-5PIFJ7bMMcAhSH1Zlgw2c=.321601c8-926b-4c82-a401-d3b2504b4018@github.com> On Wed, 18 Jun 2025 13:28:43 GMT, Joel Sikström wrote: > Hello, > > This RFE adds const accessors to the `Deferred` class. We plan on using this in a future patch in ZGC. Thanks! > > Testing: > * Currently running tier1-2 > * Works in a WIP ZGC patch I've asked that this change be backed out. I don't think this change should have been made. I've proposed a better solution for the use-case that led to making this change. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25874#issuecomment-2992190779 From bmaillard at openjdk.org Fri Jun 20 16:39:51 2025 From: bmaillard at openjdk.org (Benoît Maillard) Date: Fri, 20 Jun 2025 16:39:51 GMT Subject: RFR: 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB [v3] In-Reply-To: References: Message-ID: <38Q5LqxUlyIBHyusziq461sTdVCBm0ZO3kcVB_u7I18=.11da0a73-7a4f-4c83-abeb-f12791ad7741@github.com> > This PR adds a range constraint for the `-XX:FastAllocateSizeLimit` debug flag. This prevents undefined behavior caused by left-shift overflow of the flag value in `GraphKit::new_array`. > > ### Testing > - [x] [GitHub Actions](https://github.com/benoitmaillard/jdk/actions?query=branch%3AJDK-8356865) > - [x] tier1-3, plus some internal testing > - [x] Manual testing with values known to previously cause undefined behavior > > Thanks!
Benoît Maillard has updated the pull request incrementally with one additional commit since the last revision: 8356865: Change assert ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25834/files - new: https://git.openjdk.org/jdk/pull/25834/files/c8904a29..97f52b45 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25834&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25834&range=01-02 Stats: 3 lines in 1 file changed: 2 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25834.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25834/head:pull/25834 PR: https://git.openjdk.org/jdk/pull/25834 From bmaillard at openjdk.org Fri Jun 20 16:43:27 2025 From: bmaillard at openjdk.org (Benoît Maillard) Date: Fri, 20 Jun 2025 16:43:27 GMT Subject: RFR: 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB [v3] In-Reply-To: References: <8nXpApdLxXidwKfFpcVbKjpYgOn5EfhUvKNQRKvv2o0=.252bc291-3219-4d77-9a4d-8fe75952c2f6@github.com> Message-ID: On Wed, 18 Jun 2025 19:42:07 GMT, Evgeny Astigeevich wrote: >> Thanks for the comments! >> >> I added the assert because the issue in the JBS mentioned a specific case where we ended up with negative values. >> >> Should I leave it like this, or rather convert it to a more specific check (i.e. making sure that the `LogBytesPerLong - log2_esize` most significant bits are not used **before** shifting)? > > IMO your assert is obfuscating the overflow problem. > I think the assert should be before doing the shift. > It can be like: > > assert((fast_size_limit == 0) || (count_leading_zeros(fast_size_limit) > (LogBytesPerLong - log2_esize)), "fast_size_limit (%d) overflow when shifted left by %d", fast_size_limit, (LogBytesPerLong - log2_esize)); Thanks for the tip, I made the requested changes!
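The pre-shift check discussed above can be sketched in isolation. This is a minimal standalone version, assuming a non-negative 32-bit value; `leading_zeros32` and `shift_left_fits` are hypothetical names standing in for HotSpot's `count_leading_zeros` and the assert condition, and are not the code in the PR:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for count_leading_zeros on a 32-bit value.
// Returns 32 for zero.
inline int leading_zeros32(uint32_t v) {
  if (v == 0) return 32;
  int n = 0;
  while ((v & 0x80000000u) == 0) {
    v <<= 1;
    ++n;
  }
  return n;
}

// Returns true if (value << shift) stays a non-negative int32_t, i.e. the
// shift neither drops set bits nor moves a set bit into the sign position.
inline bool shift_left_fits(int32_t value, int shift) {
  if (value < 0 || shift < 0 || shift >= 32) return false;
  if (value == 0) return true;
  // A positive value has at least one leading zero (the sign bit); the shift
  // is safe only if strictly more leading zeros are available than are consumed.
  return leading_zeros32(static_cast<uint32_t>(value)) > shift;
}
```

Doing the test before the shift, as suggested in the review, means the condition describes the actual hazard instead of inspecting a result that is already undefined.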
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25834#discussion_r2159359020 From kvn at openjdk.org Fri Jun 20 16:46:28 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 20 Jun 2025 16:46:28 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v3] In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: <_8DB4If43Y1CAFDGw52in15ejHmNZ4Uc1YI4p1eccaA=.08537c54-408d-48f5-ba52-55578507c7b9@github.com> On Fri, 20 Jun 2025 15:56:42 GMT, Kim Barrett wrote: >> doc/hotspot-style.md line 1118: >> >>> 1116: failure must be declared `noexcept`. >>> 1117: * All other uses of `noexcept` exception specifications are forbidden. >>> 1118: * `noexcept` expressions are forbidden. >> >> "argument-less form of `noexcept` are permitted" vs "`noexcept` expressions are forbidden". So what we should use? `noexcept()`? The example would be nice. > > The standard allows `noexcept` to be used in two different contexts. It can be > used in a function's exception specification. It can instead be used in an > ordinary expression. In either case it takes one argument expression, but what > form that can take and how it is interpreted is quite different, depending on > which of those contexts applies. > > Used in an exception specification, the argument must be a constant > expression. So one can write a conditional exception specification, indicating > that a function is non-throwing if the argument is true, and potentially > throwing if false. One use-case for this is a generic container, where some > operations may only be noexcept if certain operations of the element type are > noexcept. In this context, `noexcept` (without arguments) is equivalent to > `noexcept(true)`, and also to the deprecated `throw()` (with no arguments). At > this time I think we don't ever need arguments, and the no-arg abbreviation is > sufficient.
That might change if we move toward using standard library > facilities (particularly containers), or make our own more like those in the > standard. > > Used as an expression, the argument is an unevaluated expression (similar to > some uses of sizeof and alignof). The result of the `noexcept` expression is a > bool constant, and is true if the expression never throws (based on examining > exception specifications of functions called in the expression), false > otherwise. There isn't an abbreviated expression form without arguments, as > that doesn't make sense. At this time I think we don't need `noexcept` > expressions, and might never need them. > > (Note that the argument expression for a `noexcept` exception specification > can include `noexcept` expressions. But since we're not currently allowing > either, that doesn't matter.) Got it. No `noexcept` in regular expressions. No `noexcept(foo())`. But we can use `foo() noexcept(true);` which is equivalent to `foo() noexcept;` Do we have a case for `noexcept(false);` in VM? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2159362992 From sspitsyn at openjdk.org Fri Jun 20 16:47:30 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 20 Jun 2025 16:47:30 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v8] In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 16:04:58 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time.
>> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix missing periods and nullptr. thanks Serguei. Marked as reviewed by sspitsyn (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25267#pullrequestreview-2946793683 From kvn at openjdk.org Fri Jun 20 16:56:30 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 20 Jun 2025 16:56:30 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v3] In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Fri, 20 Jun 2025 15:56:47 GMT, Kim Barrett wrote: >> doc/hotspot-style.md line 1140: >> >>> 1138: result. If an allocation function is not declared `noexcept` then the compiler >>> 1139: may elide that checking and handling for a `new` expression calling that >>> 1140: function. >> >> This implies that compiler may generate a `nullptr` check if `noexcept` is specified. Is it true? Is it static (during compilation) check or it can generate runtime check? 
>> >> We usually have explicit checks in such places to catch allocation failure. We are missing check in some places which may lead to crashes (reference through `nullptr`). Can compiler helps here? > > An allocation function may be declared as non-throwing (via `noexcept`). > > If not declared non-throwing (so "potentially throwing", even though in a > no-exceptions build environment it won't ever actually throw, and will instead > probably terminate the program), the compiler's implementation of a using > `new` expression can assume the allocation function will never return null, > and does not need to generate any code to handle that possibility. > > If declared non-throwing, a using `new` expression must account for the > possibility that the allocation function might return null. It needs to test > the result of the allocation function. If it is null then the `new` expression > must not call the constructor for the type and must return null as its result. > (Of course, usual compiler optimizations apply, and the null handling code can > be elided if the compiler can prove the allocation does not in fact ever > return null, regardless of any exception specifications. That's probably > pretty much never the case though.) > > Because of that, an allocation function that can return null *must* be > declared non-throwing. And an allocation function that never returns null > shouldn't be declared non-throwing, as that's just degrading performance for > no reason. So we need to have explicit `nullptr` checks or `assert(p != nullptr,"")` in our code. For example, for the next `new()` we can add `assert(p != nullptr,"")` because `AllocateHeap` will exit VM if it can't allocate memory. And we don't need `noexcept` here or we can use `noexcept(false)`. Right?
ALWAYSINLINE void* operator new(size_t size, MemTag mem_tag) { return AllocateHeap(size, mem_tag); } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2159374542 From kvn at openjdk.org Fri Jun 20 16:59:29 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 20 Jun 2025 16:59:29 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v3] In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Fri, 20 Jun 2025 15:56:52 GMT, Kim Barrett wrote: >> doc/hotspot-style.md line 1153: >> >>> 1151: HotSpot code can assume no exceptions will ever be thrown, even from functions >>> 1152: not declared `noexcept`. So HotSpot code doesn't ever need to check, either >>> 1153: with conditional exception specifications or with `noexcept` expressions. >> >> "doesn't ever need to check" - what check? We still need to have nullptr checks. Right? > > If a `new` expression is calling a non-throwing variant then the caller of the > `new` expression should be handling the possible null result. An assert is > generally not the appropriate "handling". If one isn't prepared to handle a > null result then don't call the variant that might return that. > > In our typical usage, if calling a potentially throwing variant, it will never > return null, instead terminating the program. In such a case, there's no need > for a null check by the caller, as we know that case is handled inside the > call. In short, `noexcept` indicates that the caller has to check for nullptr and handle it (yes, not with an assert).
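The behaviour summarised above can be shown with a small standalone sketch. It is not HotSpot code: `Node`, `FailTag`, `make` and `try_make` are hypothetical names, plain `malloc` stands in for the VM allocator, and the failing allocation is simulated by always returning null:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct FailTag {};  // hypothetical tag selecting the allocator that may fail

struct Node {
  static int constructed;  // counts constructor calls, for illustration only
  int value;
  explicit Node(int v) : value(v) { ++constructed; }

  // Never returns null (terminates instead), so it is not declared noexcept.
  // A `new` expression using it needs no null handling.
  static void* operator new(std::size_t size) {
    void* p = std::malloc(size);
    if (p == nullptr) std::abort();
    return p;
  }

  // May return null, so it must be declared noexcept. Failure is simulated.
  static void* operator new(std::size_t, FailTag) noexcept {
    return nullptr;
  }

  static void operator delete(void* p) { std::free(p); }
  static void operator delete(void* p, FailTag) noexcept { std::free(p); }
};

int Node::constructed = 0;

// Uses the terminating allocator: the result is never null.
Node* make(int v) { return new Node(v); }

// Uses the noexcept allocator: the `new` expression tests for null, skips
// the constructor on failure, and hands null back to the caller.
Node* try_make(int v) { return new (FailTag{}) Node(v); }
```

A caller of `try_make` has to handle a null result itself, while a caller of `make` can rely on the allocator having dealt with failure already.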
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2159378337 From iklam at openjdk.org Fri Jun 20 17:13:08 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 20 Jun 2025 17:13:08 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v9] In-Reply-To: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: > This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: > > Excerpt from the test case ExceptionsTest.java. > > > [0.038s][info][exceptions ] Exception > [ ] thrown in interpreter method <{method} {0x000074c408400810} 'baz2' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 9 for thread 0x000074c46402c7b0 (main) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.baz2(ExceptionsTest.java:142) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar2(ExceptionsTest.java:135) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo2(ExceptionsTest.java:127) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:110) > [0.038s][info][exceptions ] Exception > [ ] thrown in interpreter method <{method} {0x000074c408400748} 'bar2' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 0 for thread 0x000074c46402c7b0 (main) > [0.038s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar2" at BCI: 6 > [0.038s][info][exceptions ] Exception > [ ] thrown in interpreter method <{method} {0x000074c408400748} 'bar2' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 8 for thread 0x000074c46402c7b0 (main) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar2(ExceptionsTest.java:137) > [0.038s][info][exceptions,stacktrace] at 
ExceptionsTest$InternalClass.foo2(ExceptionsTest.java:127) > [0.038s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:110) > [0.038s][info][exceptions ] Exception > [ ] thrown in interpreter method <{method} {0x000074c408400670} 'foo2' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 0 for thread 0x000074c46402c7b0 (main) > [0.038s][info][exceptions ] Found m... Ioi Lam has updated the pull request incrementally with two additional commits since the last revision: - refactor - Reimplement -- print stack trace only when it is a known throwing site ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25522/files - new: https://git.openjdk.org/jdk/pull/25522/files/10e94797..2a5bdd82 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=07-08 Stats: 114 lines in 4 files changed: 58 ins; 13 del; 43 mod Patch: https://git.openjdk.org/jdk/pull/25522.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25522/head:pull/25522 PR: https://git.openjdk.org/jdk/pull/25522 From kbarrett at openjdk.org Fri Jun 20 17:23:28 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 20 Jun 2025 17:23:28 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v3] In-Reply-To: <_8DB4If43Y1CAFDGw52in15ejHmNZ4Uc1YI4p1eccaA=.08537c54-408d-48f5-ba52-55578507c7b9@github.com> References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> <_8DB4If43Y1CAFDGw52in15ejHmNZ4Uc1YI4p1eccaA=.08537c54-408d-48f5-ba52-55578507c7b9@github.com> Message-ID: On Fri, 20 Jun 2025 16:44:16 GMT, Vladimir Kozlov wrote: >> The standard allows `noexcept` to be used in two different contexts. It can be >> used in a function's exception specification. It can instead be used in an >> ordinary expression. 
In either case it takes one argument expression, but what >> form that can take and how it is interpreted is quite different, depending on >> which of those contexts applies. >> >> Used in an exception specification, the argument must be a constant >> expression. So one can write a conditional exception specification, indicating >> that a function is non-throwing if the argument is true, and potentially >> throwing if false. One use-case for this is a generic container, where some >> operations may only be noexcept if certain operations of the element type are >> noexcept. In this context, `noexcept` (without arguments) is equivalent to >> `noexcept(true)`, and also to the deprecated `throw()` (with no arguments). At >> this time I think we don't ever need arguments, and the no-arg abbreviation is >> sufficient. That might change if we move toward using standard library >> facilities (particularly containers), or make our own more like those in the >> standard. >> >> Used as an expression, the argument is an unevaluated expression (similar to >> some uses of sizeof and alignof). The result of the `noexcept` expression is a >> bool constant, and is true if the expression never throws (based on examining >> exception specifications of functions called in the expression), false >> otherwise. There isn't an abbreviated expression form without arguments, as >> that doesn't make sense. At this time I think we don't need `noexcept` >> expressions, and might never need them. >> >> (Note that the argument expression for a `noexcept` exception specification >> can include `noexcept` expressions. But since we're not currently allowing >> either, that doesn't matter.) > > Got it. No `noexcept` in regular expressions. No `noexcept(foo())`. > > But we can use `foo() noexcept(true);` which is equivalent to `foo() noexcept;` > > Do we have a case for `noexcept(false);` in VM? The guidance is to use `foo() noexcept` rather than `foo() noexcept(true)`.
The guidance is to not use an exception specification of `noexcept(false)`. That's equivalent to not having an exception specification. So reduce clutter and leave it off. (I could imagine a different style guide for a different, exception-using project, that said all functions should have explicit exception specifications. I've never seen such. Beyond that, I don't think there's a use case for `noexcept(false)` at all, VM or not.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2159412800 From kbarrett at openjdk.org Fri Jun 20 17:29:28 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 20 Jun 2025 17:29:28 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v3] In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Fri, 20 Jun 2025 16:53:54 GMT, Vladimir Kozlov wrote: >> An allocation function may be declared as non-throwing (via `noexcept`). >> >> If not declared non-throwing (so "potentially throwing", even though in a >> no-exceptions build environment it won't ever actually throw, and will instead >> probably terminate the program), the compiler's implementation of a using >> `new` expression can assume the allocation function will never return null, >> and does not need to generate any code to handle that possibility. >> >> If declared non-throwing, a using `new` expression must account for the >> possibility that the allocation function might return null. It needs to test >> the result of the allocation function. If it is null then the `new` expression >> must not call the constructor for the type and must return null as its result. >> (Of course, usual compiler optimizations apply, and the null handling code can >> be elided if the compiler can prove the allocation does not in fact ever >> return null, regardless of any exception specifications. That's probably >> pretty much never the case though.) 
>> >> Because of that, an allocation function that can return null *must* be >> declared non-throwing. And an allocation function that never returns null >> shouldn't be declared non-throwing, as that's just degrading performance for >> no reason. > > So we need to have explicit `nullptr` checks or `assert(p != nullptr,"")` in our code. > > For example, for the next `new()` we can add `assert(p != nullptr,"")` because `AllocateHeap` will exit the VM if it can't allocate memory. And we should not use `noexcept` here or we can use `noexcept(false)`. Right? > > > ALWAYSINLINE void* operator new(size_t size, MemTag mem_tag) { > return AllocateHeap(size, mem_tag); > } We could assert, but we don't need to. It's clutter. The usual strategy is that an allocation function that can return null (and so must be declared `noexcept`) needs a specific argument, like the `std::nothrow` variants, to indicate that possibility. For such cases, if the "non-throwing" variant isn't explicitly asked for, assume it won't ever return null. Some of our allocators don't follow that strategy, and that's arguably unfortunate in some cases. But an assert really isn't a good way to handle the possibility, since that means nothing in release builds.
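The strategy described above (a plain allocation function that never returns null, plus a separately requested `noexcept` variant that may) can be sketched in standard C++ roughly as follows. This is a minimal illustration under stated assumptions, not HotSpot code: `allocate_or_die`, `allocate_or_null`, `g_fail_next`, `Node` and `demo_allocation` are hypothetical names, and `std::malloc` merely stands in for a real allocator such as `AllocateHeap`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical allocator that terminates the program instead of failing.
static void* allocate_or_die(std::size_t size) {
  void* p = std::malloc(size);
  if (p == nullptr) std::abort();
  return p;
}

// Hypothetical allocator that may fail. The flag lets the demo force a failure.
static bool g_fail_next = false;
static void* allocate_or_null(std::size_t size) noexcept {
  if (g_fail_next) { g_fail_next = false; return nullptr; }
  return std::malloc(size);
}

struct Node {
  int value;
  static int constructed;  // counts constructor calls, for the demo only
  explicit Node(int v) : value(v) { ++constructed; }

  // Not declared noexcept: the compiler may assume the result is non-null,
  // so the new expression needs no null test before running the constructor.
  void* operator new(std::size_t size) { return allocate_or_die(size); }

  // Declared noexcept and requested explicitly via std::nothrow: may return
  // null, in which case the new expression skips the constructor and yields null.
  void* operator new(std::size_t size, const std::nothrow_t&) noexcept {
    return allocate_or_null(size);
  }

  void operator delete(void* p) { std::free(p); }
  void operator delete(void* p, const std::nothrow_t&) noexcept { std::free(p); }
};
int Node::constructed = 0;

void demo_allocation() {
  Node* a = new Node(1);   // plain form: used without a null check
  assert(a->value == 1);

  g_fail_next = true;      // simulate exhaustion for the next nothrow request
  Node* b = new (std::nothrow) Node(2);
  assert(b == nullptr);            // the caller has to handle this case
  assert(Node::constructed == 1);  // the constructor was skipped for b

  delete a;
}
```

In real code the null result from the `std::nothrow` form would be handled with an actual branch rather than an assert, for the reason given above: asserts do nothing in release builds.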
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25574#discussion_r2159420303 From iklam at openjdk.org Fri Jun 20 17:30:29 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 20 Jun 2025 17:30:29 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v8] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Fri, 20 Jun 2025 13:32:46 GMT, Coleen Phillimore wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> @dholmes-ora comments -- removed printing of output.getStdout() from test > > src/hotspot/share/utilities/exceptions.cpp line 619: > >> 617: // We don't want to use an OopHandle, or else we may prevent this object from being collected. >> 618: // Whenever a GC happens, this will be cleared by Exceptions::clear_logging_cache(). >> 619: static oop _last_logged_exception; > > oh gosh I don't like this at all. Save the exception string if anything. I've got rid of the caching completely. Now the stack trace is printed only at sites that are actually doing a "throw". We still have some extraneous stack traces that are printed at the exit of a `finally` block. So if you have code that looks like this:

    try (A a = new A()) {
      try (B b = new B()) {
        try (C c = new C()) {
          ((Object)(null)).toString();
        }
      }
    }

You will see the stacktrace 4 times. But I think that's acceptable for now. I added a suggestion for fixing that in comments.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25522#discussion_r2159421444 From kvn at openjdk.org Fri Jun 20 18:13:28 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 20 Jun 2025 18:13:28 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v3] In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: <1Jp3HPd0qv6B9KzOdbx1wjYQndS5Kuq1UU2dqzyflRk=.7c35addd-c731-40b1-96a4-2a372e8b1e6f@github.com> On Tue, 3 Jun 2025 11:01:06 GMT, Kim Barrett wrote: >> Please review this change to permit the use of `noexcept` under certain >> circumstances in HotSpot code. >> >> http://wg21.link/n3050 >> >> Testing: >> >> JDK-8316930 (HotSpot should use noexcept instead of throw()) showed what the >> conversion would look like. It will need to be brought up to current mainline, >> possibly with modifications. >> >> This is a modification of the Style Guide, so rough consensus among the >> HotSpot Group members is required to make this change. Only Group members >> should vote for approval (via the github PR), though reasoned objections or >> comments from anyone will be considered. A decision on this proposal will not >> be made before Friday 16-June-2025 at 12h00 UTC. >> >> Since we're piggybacking on github PRs here, please use the PR review process >> to approve (click on Review Changes > Approve), rather than sending a "vote: >> yes" email reply that would be normal for a CFV. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > more dholmes Good. Thank you for answering my questions. ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25574#pullrequestreview-2946976284 From coleenp at openjdk.org Fri Jun 20 18:24:29 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 20 Jun 2025 18:24:29 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v9] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Fri, 20 Jun 2025 17:13:08 GMT, Ioi Lam wrote: >> This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: >> >> Excerpt from the test case ExceptionsTest.java. >> >> >> [0.042s][info][exceptions] Exception >> [ ] thrown in interpreter method <{method} {0x000079e52c4005b0} 'bar' '()V' in 'ExceptionsTest$InternalClass'> >> [ ] at bci 9 for thread 0x000079e58c02c7b0 (main) >> [0.042s][info][exceptions,stacktrace] Exception >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar(ExceptionsTest.java:110) >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo(ExceptionsTest.java:105) >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:100) >> [0.042s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar" at BCI: 10 >> Exception 1 caught. >> >> >> - Note that the old `[exceptions]` log is triggered by `InterpreterRuntime::exception_handler_for_exception()` and prints one level of the stack while the interpreter is looking for a handler. However, once a handler is found (inside `bar()`), the `[exceptions]` log terminates, and we do not know about the `foo()` or `main()` methods. >> >> - The `[exceptions,stacktrace]` log is printed only when an exception is thrown: >> - By an `_athrow` bytecode. 
See comments around `Exceptions::log_exception_stacktrace(Handle exception, methodHandle method, int bci)` >> - By native code in Exceptions::special_exception() and Exceptions::_throw(). >> >> **Concurrent Exceptions** >> >> Neither the old `[exceptions]` nor the new `[exceptions,stacktrace]` logs distinguish between multiple exceptions that are thrown and handled concurrently. Users should use something like `-Xlog:exceptions,exceptions+stack::tags,tid` to include the thread ID in the log output, and use an external tool (such as a script) to demultiplex the logs. > > Ioi Lam has updated the pull request incrementally with two additional commits since the last revision: > > - refactor > - Reimplement -- print stack trace only when it is a known throwing site src/hotspot/share/utilities/exceptions.cpp line 635: > 633: // and Exceptions::_throw()). > 634: void Exceptions::log_exception_stacktrace(Handle exception, methodHandle method, int bci) { > 635: if (method->is_native() || (Bytecodes::Code) *method->bcp_from(bci) == Bytecodes::_athrow) { Do you mean !method->is_native() && athrow? How do you get here from a native method? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25522#discussion_r2159485434 From coleenp at openjdk.org Fri Jun 20 18:42:55 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 20 Jun 2025 18:42:55 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v9] In-Reply-To: References: Message-ID: > This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs and there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti in a product build with and without this change showed no difference in time.
> > The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. > > Tested with tier1-4, 5-7. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: David's comments. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25267/files - new: https://git.openjdk.org/jdk/pull/25267/files/50c036f4..1c1924c1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=07-08 Stats: 6 lines in 2 files changed: 0 ins; 4 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25267.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25267/head:pull/25267 PR: https://git.openjdk.org/jdk/pull/25267 From coleenp at openjdk.org Fri Jun 20 18:42:56 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 20 Jun 2025 18:42:56 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v7] In-Reply-To: References: Message-ID: <5m8EQOcI-LNXa8-gM_xxhwtg2vIW2sNnyJnJKuSRH18=.73adf555-921c-4ae3-b623-50de42252fce@github.com> On Thu, 19 Jun 2025 00:07:49 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Add a basic gtest. > > src/hotspot/share/oops/instanceKlass.cpp line 2395: > >> 2393: >> 2394: // Allocate the jmethodID cache. >> 2395: static jmethodID* create_jmethod_id_cache(size_t size) { > > Why isn't this used at line 2439 to create the (initial?) cache? No reason. I should have done that. > src/hotspot/share/oops/jmethodIDTable.cpp line 40: > >> 38: static uint64_t _jmethodID_counter = 0; >> 39: // Tracks the number of jmethodID entries in the _jmethod_id_table. >> 40: // Incremented on insert, decremented on remove. Use to track if we need to resize the table. > > Suggestion: > > // Incremented on insert, decremented on remove. Used to track if we need to resize the table. fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2159503221 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2159502540 From coleenp at openjdk.org Fri Jun 20 19:44:27 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 20 Jun 2025 19:44:27 GMT Subject: RFR: 8359920: Use names for frame types in stackmaps [v4] In-Reply-To: References: Message-ID: <__1QutaUqqg-7hN3eC1uYYnZQLU0GRg4gW9DLHDgas0=.0c915dc5-05af-4f66-8d62-cc3249f0ee8e@github.com> On Fri, 20 Jun 2025 15:05:03 GMT, Coleen Phillimore wrote: >> This uses names for frame types for stackmaps in the verifier and redefinition. >> Tested with tier1-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix tags (running more tests) tiers 1-4 passed with changes suggested by @dholmes-ora ------------- PR Comment: https://git.openjdk.org/jdk/pull/25870#issuecomment-2992624666 From matsaave at openjdk.org Fri Jun 20 19:49:29 2025 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Fri, 20 Jun 2025 19:49:29 GMT Subject: RFR: 8359920: Use names for frame types in stackmaps [v4] In-Reply-To: References: Message-ID: <0p-eoUP7uOmwzFwfUQseiJQGKisRaa4EQCHJXuVFvwA=.58c3a268-28a8-4c53-9281-bb54c24c7d9b@github.com> On Fri, 20 Jun 2025 15:05:03 GMT, Coleen Phillimore wrote: >> This uses names for frame types for stackmaps in the verifier and redefinition. >> Tested with tier1-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix tags (running more tests) Updated change looks good! ------------- Marked as reviewed by matsaave (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25870#pullrequestreview-2947143580 From kbarrett at openjdk.org Fri Jun 20 19:51:32 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 20 Jun 2025 19:51:32 GMT Subject: RFR: 8255082: HotSpot Style Guide should permit noexcept [v3] In-Reply-To: References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Tue, 3 Jun 2025 11:01:06 GMT, Kim Barrett wrote: >> Please review this change to permit the use of `noexcept` under certain >> circumstances in HotSpot code. >> >> http://wg21.link/n3050 >> >> Testing: >> >> JDK-8316930 (HotSpot should use noexcept instead of throw()) showed what the >> conversion would look like. It will need to be brought up to current mainline, >> possibly with modifications. >> >> This is a modification of the Style Guide, so rough consensus among the >> HotSpot Group members is required to make this change. Only Group members >> should vote for approval (via the github PR), though reasoned objections or >> comments from anyone will be considered. A decision on this proposal will not >> be made before Friday 16-June-2025 at 12h00 UTC. >> >> Since we're piggybacking on github PRs here, please use the PR review process >> to approve (click on Review Changes > Approve), rather than sending a "vote: >> yes" email reply that would be normal for a CFV. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > more dholmes Thanks for reviews. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25574#issuecomment-2992633837 From kbarrett at openjdk.org Fri Jun 20 19:51:32 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 20 Jun 2025 19:51:32 GMT Subject: Integrated: 8255082: HotSpot Style Guide should permit noexcept In-Reply-To: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> References: <-uPcWRhBsfKiRl5wRkLQ7YaAH4OCOlT0_ettXJQnUyY=.aa5c72c3-6767-41dd-8dae-45ff9a9e4884@github.com> Message-ID: On Mon, 2 Jun 2025 05:28:17 GMT, Kim Barrett wrote: > Please review this change to permit the use of `noexcept` under certain > circumstances in HotSpot code. > > http://wg21.link/n3050 > > Testing: > > JDK-8316930 (HotSpot should use noexcept instead of throw()) showed what the > conversion would look like. It will need to be brought up to current mainline, > possibly with modifications. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 16-June-2025 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. This pull request has now been integrated. 
Changeset: 96f71a9a Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/96f71a9a6bf7b52c50a1f52d4d401a48dc40480f Stats: 104 lines in 2 files changed: 104 ins; 0 del; 0 mod 8255082: HotSpot Style Guide should permit noexcept Reviewed-by: kvn, dholmes, dcubed ------------- PR: https://git.openjdk.org/jdk/pull/25574 From coleenp at openjdk.org Fri Jun 20 20:41:21 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 20 Jun 2025 20:41:21 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: > This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. > > The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. > > Tested with tier1-4, 5-7. 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix the test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25267/files - new: https://git.openjdk.org/jdk/pull/25267/files/1c1924c1..66eb4269 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=08-09 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25267.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25267/head:pull/25267 PR: https://git.openjdk.org/jdk/pull/25267 From coleenp at openjdk.org Fri Jun 20 20:50:33 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 20 Jun 2025 20:50:33 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v7] In-Reply-To: References: Message-ID: On Thu, 19 Jun 2025 00:45:30 GMT, David Holmes wrote: > This still seems racy though. What if the lookup succeeds but at the same time the class is to be unloaded? Yes, it won't be at the same time, since lookup and remove_jmethod_ids hold the JmethodIdCreation_lock. But this is a good question. It could still be a race:

    Method* Method::checked_resolve_jmethod_id(jmethodID mid) {
      if (mid == nullptr) return nullptr;
      Method* o = resolve_jmethod_id(mid);
      if (o == nullptr) {
        return nullptr;
      }
      // Method should otherwise be valid. Assert for testing.
      assert(is_valid_method(o), "should be valid jmethodid");
      // If the method's class holder object is unreferenced, but not yet marked as
      // unloaded, we need to return null here too because after a safepoint, its memory
      // will be reclaimed.
      return o->method_holder()->is_loader_alive() ? o : nullptr;
    }

I think this should prevent this race but I'd have to think about it some more.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-2992814813 From coleenp at openjdk.org Fri Jun 20 20:57:31 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 20 Jun 2025 20:57:31 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v7] In-Reply-To: References: Message-ID: On Thu, 19 Jun 2025 06:33:08 GMT, Thomas Stuefe wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Add a basic gtest. > > I feel apprehensive about this; the solution feels pretty complex and I am not fully convinced this is the simplest solution for this problem. > > How much space do we lose in real life? Side note: I see the payload of the jmethodID block in NMT is allocated with mtInternal, so we don't see it in NMT. We should add jmethodIDs as their own category to NMT. > > A pragmatic alternative solution could be to delete them, but delayed: keep the last N methodblocks undeleted. It is rare that JNI accesses jmethodIDs long after they have been deleted. Typically, the bad access happens shortly after class unloading, e.g. because of concurrency problems in customer code. > > We could then make the parameter N configurable, and thus give customers and supporters a tool to check for these kinds of errors. > > (I briefly wondered whether we could just mmap these blocks, and uncommit/mprotect them on release, so that we stop paying the memory costs but don't release the address space; but the coarser page size allocation granularity would probably make this prohibitive in terms of mem cost per class) @tstuefe I think your method would still have a race and still allow a stale pointer crash, but just less likely, with more code to manage how to delete these pointers. The memory leak for jmethodIDs has been observed in our testing, resulting in native OOMs. We discussed a few things to solve this in the bugs, and this was the best idea we had at the time.
It is a big change though. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-2992826422 From coleenp at openjdk.org Fri Jun 20 21:19:28 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 20 Jun 2025 21:19:28 GMT Subject: [jdk25] RFR: 8352075: Perf regression accessing fields In-Reply-To: <4QtV9UGiRdP0LG6rIHH_4fwO2f5wm0evnHnut2nxvjM=.ecf61194-7201-4c7e-a496-3ddc821ea079@github.com> References: <4QtV9UGiRdP0LG6rIHH_4fwO2f5wm0evnHnut2nxvjM=.ecf61194-7201-4c7e-a496-3ddc821ea079@github.com> Message-ID: On Thu, 19 Jun 2025 10:02:45 GMT, Aleksey Shipilev wrote: >> Thank you for the backport! @shipilev indicated that the backport to 21 should wait a bit, could you clarify when should I file that (e.g. end of July, ...)? > >> @shipilev indicated that the backport to 21 should wait a bit, could you clarify when should I file that (e.g. end of July, ...)? > > I would say for the fairly big change like this, we want to wait until JDK 25 GA (that would pass the all-tests-run). It would be too late for Oct 2025 release, though. So realistically, this would target January 2026 release. You can pull this patch to your downstream JDK 21 to see if there are any troubles ahead of this path, this will also soothe 21u maintainer concerns, I think. So @shipilev, JDK 25 is in rampdown phase 1. Rampdown phase 2 starts 7/17. It's a big change for JDK 25. My sense of the risk is that it's not high. If it's checked in now, I think it'll get more testing than if it gets backported to a 25.01 release. I'll discuss it with people here. I think for JDK 21, we should wait for more testing to be run after 25 GA. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25877#issuecomment-2992962430 From iklam at openjdk.org Fri Jun 20 21:31:13 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 20 Jun 2025 21:31:13 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v10] In-Reply-To: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: > This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: > > Excerpt from the test case ExceptionsTest.java. > > > [0.042s][info][exceptions] Exception > [ ] thrown in interpreter method <{method} {0x000079e52c4005b0} 'bar' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 9 for thread 0x000079e58c02c7b0 (main) > [0.042s][info][exceptions,stacktrace] Exception > [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar(ExceptionsTest.java:110) > [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo(ExceptionsTest.java:105) > [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:100) > [0.042s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar" at BCI: 10 > Exception 1 caught. > > > - Note that the old `[exceptions]` log is triggered by `InterpreterRuntime::exception_handler_for_exception()` and prints one level of the stack while the interpreter is looking for a handler. However, once a handler is found (inside `bar()`), the `[exceptions]` log terminates, and we do not know about the `foo()` or `main()` methods. > > - The `[exceptions,stacktrace]` log is printed only when an exception is thrown: > - By an `_athrow` bytecode. 
See comments around `Exceptions::log_exception_stacktrace(Handle exception, methodHandle method, int bci)` > - By native code in Exceptions::special_exception() and Exceptions::_throw(). > > **Concurrent Exceptions** > > Neither the old `[exceptions]` nor the new `[exceptions,stacktrace]` logs distinguish between multiple exceptions that are thrown and handled concurrently. Users should use something like `-Xlog:exceptions,exceptions+stack::tags,tid` to include the thread ID in the log output, and use an external tool (such as a script) to demultiplex the logs. Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @coleenp comments; Also, use ProcessTools.executeProcess() to log the output in files ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25522/files - new: https://git.openjdk.org/jdk/pull/25522/files/2a5bdd82..cd41e2ab Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=08-09 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25522.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25522/head:pull/25522 PR: https://git.openjdk.org/jdk/pull/25522 From iklam at openjdk.org Fri Jun 20 21:31:13 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 20 Jun 2025 21:31:13 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v9] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: <9vIhH763AqWgBLopVg1RMsS1t5ofNFCnpPBl6UwT4mY=.6e93a04e-34b9-48ce-8749-4fe8d5a2c5b6@github.com> On Fri, 20 Jun 2025 18:21:48 GMT, Coleen Phillimore wrote: >> Ioi Lam has updated the pull request incrementally with two additional commits since the last revision: >> >> - refactor >> - Reimplement -- print stack trace only when it is a known throwing site > >
src/hotspot/share/utilities/exceptions.cpp line 635: > >> 633: // and Exceptions::_throw()). >> 634: void Exceptions::log_exception_stacktrace(Handle exception, methodHandle method, int bci) { >> 635: if (method->is_native() || (Bytecodes::Code) *method->bcp_from(bci) == Bytecodes::_athrow) { > > Do you mean !method->is_native() && athrow? How do you get here from a native method ? I fixed it. I thought we could get there with a native method, but looking at `InterpreterRuntime::exception_handler_for_exception()`, we can't. Anyway, I left the `!method->is_native()` there just for safety. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25522#discussion_r2159683183 From coleenp at openjdk.org Fri Jun 20 21:34:29 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 20 Jun 2025 21:34:29 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v10] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Fri, 20 Jun 2025 21:31:13 GMT, Ioi Lam wrote: >> This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: >> >> Excerpt from the test case ExceptionsTest.java. >> >> >> [0.042s][info][exceptions] Exception >> [ ] thrown in interpreter method <{method} {0x000079e52c4005b0} 'bar' '()V' in 'ExceptionsTest$InternalClass'> >> [ ] at bci 9 for thread 0x000079e58c02c7b0 (main) >> [0.042s][info][exceptions,stacktrace] Exception >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar(ExceptionsTest.java:110) >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo(ExceptionsTest.java:105) >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:100) >> [0.042s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar" at BCI: 10 >> Exception 1 caught. 
>> >> >> - Note that the old `[exceptions]` log is triggered by `InterpreterRuntime::exception_handler_for_exception()` and prints one level of the stack while the interpreter is looking for a handler. However, once a handler is found (inside `bar()`), the `[exceptions]` log terminates, and we do not know about the `foo()` or `main()` methods. >> >> - The `[exceptions,stacktrace]` log is printed only when an exception is thrown: >> - By an `_athrow` bytecode. See comments around `Exceptions::log_exception_stacktrace(Handle exception, methodHandle method, int bci)` >> - By native code in Exceptions::special_exception() and and Exceptions::_throw()). >> >> **Concurrent Exceptions** >> >> Neither the old `[exceptions]` or the new `[exceptions,stacktrace]` logs distinguish between multiple exceptions that are thrown and handled concurrently. The users should use something like `-Xlog:exceptions,exceptions+stack::tags,tid` to include the thread ID into the log output, and use an external tool (such as a script) to demultiplex the logs. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @coleenp comments; Also, use ProcessTools.executeProcess() to log the output in files The !method->is_native() looks a bit strange to me but the whole change is okay. I'd somewhat prefer all this commentary in the bug rather than the source code and would be happy to approve again if you move it there. ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25522#pullrequestreview-2947310272 From cslucas at openjdk.org Sat Jun 21 04:34:35 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Sat, 21 Jun 2025 04:34:35 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v4] In-Reply-To: <2-6lbSM0y22WVEiOqLJ31lu8LkA-Ik1O4nr6eb1vpoo=.d87f5b56-7113-42a7-962a-94eb3c2ac1c7@github.com> References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <9rcA-0uAuVwpK8WPTXdFCmcBZhDTNB-KtTa0NecLKZk=.369e3dcc-6f80-400e-887c-837e78a8a19f@github.com> <2-6lbSM0y22WVEiOqLJ31lu8LkA-Ik1O4nr6eb1vpoo=.d87f5b56-7113-42a7-962a-94eb3c2ac1c7@github.com> Message-ID: On Fri, 20 Jun 2025 03:01:07 GMT, Quan Anh Mai wrote: >> src/hotspot/share/utilities/sort.hpp line 57: >> >>> 55: // backward) >>> 56: T* prev = pos - 1; >>> 57: if (comp(*prev, current_elem) <= 0) { >> >> NIT: would be better to pass pointers here? > > A `comp` usually receives references. Practically, it is almost the same as receiving pointers. Apologies, I didn't notice it was a reference. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25895#discussion_r2159860763 From aph at openjdk.org Sat Jun 21 12:21:31 2025 From: aph at openjdk.org (Andrew Haley) Date: Sat, 21 Jun 2025 12:21:31 GMT Subject: RFR: 8346914: UB issue in scalbnA In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 16:06:24 GMT, Kim Barrett wrote: > Please review this change that replaces uses of our scalbnA function with > using the standard scalbn function. Removed scalbnA, and also copysignA. > > For more details, see first comment and JBS. > > Testing: mach5 tier1-6. Though from discussion in > https://github.com/openjdk/jdk/pull/25656, it's hard to get to our uses of > scalbn/scalbnA. > > Before removing it, I tested scalbnA via a gtest that is attached to the JBS > issue. That makes sense. ------------- Marked as reviewed by aph (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25917#pullrequestreview-2947750811 From kbarrett at openjdk.org Sun Jun 22 22:24:35 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 22 Jun 2025 22:24:35 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v4] In-Reply-To: References: Message-ID: On Thu, 19 Jun 2025 12:26:15 GMT, Ivan Walulya wrote: >> Hi all, >> >> Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and Application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. >> >> The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio. >> >> - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly. >> >> - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range. >> >> - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants.
>> >> We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. >> >> Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. >> >> As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. >> >> Testing: Mach5 ... > > Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: > > Albert suggestions Changes requested by kbarrett (Reviewer). src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 894: > 892: > 893: bool should_expand; > 894: size_t resize_bytes = _heap_sizing_policy->full_collection_resize_amount(should_expand, allocation_word_size); pre-existing: I wonder why this isn't a function that returns a `ptrdiff_t` delta on the current size, removing the need for multiple values, one being returned via a by-reference out parameter. Similarly for the young collection case. Or return the size & direction as a pair-like object. (Personally, I find by-reference out parameters confusing to read, but maybe that's just me.) 
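[Editorial sketch] The suggestion above, returning a signed delta instead of a size plus a by-reference `should_expand` flag, could look roughly like the following. This is a minimal illustration under stated assumptions, not G1 code: `resize_delta_bytes`, `ResizeRequest` and `to_request` are hypothetical names, and the real policy computes its target size quite differently.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: positive result means "expand by this many bytes",
// negative means "shrink by this many bytes", zero means "leave as is".
// Assumes both sizes fit in ptrdiff_t.
inline std::ptrdiff_t resize_delta_bytes(std::size_t current_bytes,
                                         std::size_t desired_bytes) {
  return static_cast<std::ptrdiff_t>(desired_bytes) -
         static_cast<std::ptrdiff_t>(current_bytes);
}

// Hypothetical pair-like alternative: size and direction travel together,
// so no out parameter is needed.
struct ResizeRequest {
  std::size_t bytes;
  bool expand;
};

// Converts a signed delta into the pair-like form.
inline ResizeRequest to_request(std::ptrdiff_t delta) {
  if (delta >= 0) {
    return ResizeRequest{static_cast<std::size_t>(delta), true};
  }
  return ResizeRequest{static_cast<std::size_t>(-delta), false};
}
```

Either form lets the caller read the direction from the return value itself, which is the readability point made in the comment.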
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1028: > 1026: > 1027: uint num_regions_to_expand = (uint)(aligned_expand_bytes / G1HeapRegion::GrainBytes); > 1028: assert(num_regions_to_expand > 0, "Must expand by at least one region"); This assert seems like needless clutter to me. We just aligned up, and then we divided. A (to me) more useful assert would be up-front that `word_size` > 0. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1042: > 1040: uint expanded_by = _hrm.expand_by(num_regions_to_expand, pretouch_workers); > 1041: > 1042: assert(expanded_by > 0, "must have failed during commit."); pre-existing: Using an assert to detect and "handle" this seems wrong. This seems like something that simply can happen? And so should be dealt with in some graceful fashion. It actually seems like things would more or less work if the assert was just removed. src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 563: > 561: void resize_heap_after_young_collection(size_t allocation_word_size); > 562: void resize_heap_after_full_collection(size_t allocation_word_size); > 563: void resize_heap(size_t resize_bytes, bool should_expand); I think `resize_heap` is a helper for the other two? So shouldn't be public. src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 1465: > 1463: > 1464: if (_g1h->last_gc_was_periodic()) { > 1465: _g1h->resize_heap_after_full_collection(size_t(0) /* allocation_word_size */); I don't think a cast is needed here. src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 93: > 91: // Sigmoid Parameters: > 92: double inflection_point = 1.0; // Inflection point where acceleration begins (midpoint of sigmoid). > 93: double steepness = 6.0; Curious as to how these constants were determined. Maybe a comment about that? src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 95: > 93: double steepness = 6.0; > 94: > 95: return 1.0 / (1.0 + pow(M_E, -steepness * (value - inflection_point))); Can use `exp` rather than `pow`. 
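[Editorial sketch] The `exp` suggestion above amounts to the following, using the two constants quoted from the patch. The function names are hypothetical, and `std::exp(1.0)` stands in for `M_E`, which is not part of standard C++.

```cpp
#include <cassert>
#include <cmath>

// Logistic curve with the constants quoted in the review
// (inflection point 1.0, steepness 6.0), written with exp().
inline double sigmoid_with_exp(double value) {
  const double inflection_point = 1.0;
  const double steepness = 6.0;
  return 1.0 / (1.0 + std::exp(-steepness * (value - inflection_point)));
}

// The form from the quoted line: pow(e, x) computes the same quantity.
inline double sigmoid_with_pow(double value) {
  const double e = std::exp(1.0);
  const double inflection_point = 1.0;
  const double steepness = 6.0;
  return 1.0 / (1.0 + std::pow(e, -steepness * (value - inflection_point)));
}
```

Both versions agree to within floating-point rounding; `exp` simply says directly what is being computed.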
src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 165: > 163: // Ensure the expansion size is at least the minimum growth amount > 164: // and at most the remaining uncommitted byte size. > 165: return clamp((size_t)resize_bytes, min_expand_bytes, uncommitted_bytes); Don't need a cast of `resize_bytes` here. src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 302: > 300: if (_g1h->capacity() == _g1h->max_capacity()) { > 301: log_resize(short_term_pause_time_ratio, long_term_pause_time_ratio, > 302: lower_threshold, upper_threshold, pause_time_threshold, true, 0, expand); We're on the expand-side of things, but `expand` hasn't been set to true yet. Similarly, on the shrink side, we'll be passing in the default initial value for `expand` and then setting it later. Maybe `expand` should be set earlier on each of the branches. ------------- PR Review: https://git.openjdk.org/jdk/pull/25832#pullrequestreview-2935176410 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2152197151 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2160408699 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2160409675 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2160411832 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2160412727 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2160414808 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2160414939 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2160416276 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2160508087 From wenanjian at openjdk.org Mon Jun 23 02:30:58 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Mon, 23 Jun 2025 02:30:58 GMT Subject: RFR: 8360179: RISC-V: Only enable BigInteger intrinsics when AvoidUnalignedAccess == false Message-ID: This disables BigInteger.multiplyToLen(), BigInteger.squareToLen(), 
BigInteger.montgomeryMultiply() and BigInteger.montgomerySquare() on linux-riscv64 platforms where misaligned memory accesses is slow. The reason is that these four BigInteger intrinsics do 8-byte misaligned memory accesses to int arrays under -XX:+UseCompactObjectHeaders. And this will have a negative impact on SPECJvm2008 crypto tests. Testing: - [x] Tier1-3 tests. - [x] SPECJvm2008 crypto performance tests. [score.txt](https://github.com/user-attachments/files/20846531/score.txt) ------------- Commit messages: - RISC-V: Only enable BigInteger intrinsics when AvoidUnalignedAccess == false Changes: https://git.openjdk.org/jdk/pull/25923/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25923&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360179 Stats: 92 lines in 2 files changed: 21 ins; 62 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/25923.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25923/head:pull/25923 PR: https://git.openjdk.org/jdk/pull/25923 From fjiang at openjdk.org Mon Jun 23 02:57:28 2025 From: fjiang at openjdk.org (Feilong Jiang) Date: Mon, 23 Jun 2025 02:57:28 GMT Subject: RFR: 8360179: RISC-V: Only enable BigInteger intrinsics when AvoidUnalignedAccess == false In-Reply-To: References: Message-ID: <_60dXdblAsVuBuPG5q8JF9RsTsqLjx0zg0C41Wd9kro=.fc2ac6dc-5714-4ffb-b5b9-9e89cab0900c@github.com> On Sat, 21 Jun 2025 12:23:57 GMT, Anjian Wen wrote: > This disables BigInteger.multiplyToLen(), BigInteger.squareToLen(), BigInteger.montgomeryMultiply() > and BigInteger.montgomerySquare() on linux-riscv64 platforms where misaligned memory accesses is slow. > The reason is that these four BigInteger intrinsics do 8-byte misaligned memory accesses to int arrays > under -XX:+UseCompactObjectHeaders. And this will have a negative impact on SPECJvm2008 crypto tests. > > Testing: > - [x] Tier1-3 tests. > - [x] SPECJvm2008 crypto performance tests. 
> [score.txt](https://github.com/user-attachments/files/20846531/score.txt) Changes requested by fjiang (Committer). src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 5624: > 5622: Label L_second_loop_unaligned, L_third_loop, L_third_loop_exit; > 5623: > 5624: multiply_32_x_32_loop(x, xstart, x_xstart, y, y_idx, z, carry, product, idx, kdx); `multiply_32_x_32_loop` was only used in `AvoidUnalignedAccesses` case. It could be removed too. ------------- PR Review: https://git.openjdk.org/jdk/pull/25923#pullrequestreview-2948525135 PR Review Comment: https://git.openjdk.org/jdk/pull/25923#discussion_r2160627705 From wenanjian at openjdk.org Mon Jun 23 03:28:20 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Mon, 23 Jun 2025 03:28:20 GMT Subject: RFR: 8360179: RISC-V: Only enable BigInteger intrinsics when AvoidUnalignedAccess == false [v2] In-Reply-To: References: Message-ID: > This disables BigInteger.multiplyToLen(), BigInteger.squareToLen(), BigInteger.montgomeryMultiply() > and BigInteger.montgomerySquare() on linux-riscv64 platforms where misaligned memory accesses is slow. > The reason is that these four BigInteger intrinsics do 8-byte misaligned memory accesses to int arrays > under -XX:+UseCompactObjectHeaders. And this will have a negative impact on SPECJvm2008 crypto tests. > > Testing: > - [x] Tier1-3 tests. > - [x] SPECJvm2008 crypto performance tests. > [score.txt](https://github.com/user-attachments/files/20846531/score.txt) Anjian Wen has updated the pull request incrementally with one additional commit since the last revision: remove useless func multiply_32_x_32_loop which only used in AvoidUnalignedAccesses case! 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25923/files - new: https://git.openjdk.org/jdk/pull/25923/files/f33ed75b..17970631 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25923&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25923&range=00-01 Stats: 40 lines in 2 files changed: 0 ins; 40 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25923.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25923/head:pull/25923 PR: https://git.openjdk.org/jdk/pull/25923 From wenanjian at openjdk.org Mon Jun 23 03:28:20 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Mon, 23 Jun 2025 03:28:20 GMT Subject: RFR: 8360179: RISC-V: Only enable BigInteger intrinsics when AvoidUnalignedAccess == false [v2] In-Reply-To: <_60dXdblAsVuBuPG5q8JF9RsTsqLjx0zg0C41Wd9kro=.fc2ac6dc-5714-4ffb-b5b9-9e89cab0900c@github.com> References: <_60dXdblAsVuBuPG5q8JF9RsTsqLjx0zg0C41Wd9kro=.fc2ac6dc-5714-4ffb-b5b9-9e89cab0900c@github.com> Message-ID: <-HWflc7_Z5ALB2SaliXUtfNtkKrhFNloRB0-TRcw64s=.600a2135-f852-43cc-bfb1-3ec06536e6ed@github.com> On Mon, 23 Jun 2025 02:51:37 GMT, Feilong Jiang wrote: >> Anjian Wen has updated the pull request incrementally with one additional commit since the last revision: >> >> remove useless func multiply_32_x_32_loop which only used in AvoidUnalignedAccesses case! > > src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 5624: > >> 5622: Label L_second_loop_unaligned, L_third_loop, L_third_loop_exit; >> 5623: >> 5624: multiply_32_x_32_loop(x, xstart, x_xstart, y, y_idx, z, carry, product, idx, kdx); > > `multiply_32_x_32_loop` was only used in `AvoidUnalignedAccesses` case. It could be removed too. Thanks for the review, fixed.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25923#discussion_r2160648157 From dholmes at openjdk.org Mon Jun 23 05:39:35 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 23 Jun 2025 05:39:35 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v7] In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 20:47:37 GMT, Coleen Phillimore wrote: > I think this should prevent this race but I'd have to think about it some more. Aside from the fact the `checked` method is not used much, the problem is that if the caller does not have something keeping the class alive, then the resolution of the jMethodID can succeed and we will proceed with trying to call the method. In the meantime the fact the class is unreferenced could be noticed and the class then unloaded. Now that can only happen at safepoints, so it then depends on the details of the code that tries to invoke the method e.g. in jni.cpp static void jni_invoke_static(JNIEnv *env, JavaValue* result, jobject receiver, JNICallType call_type, jmethodID method_id, JNI_ArgumentPusher *args, TRAPS) { methodHandle method(THREAD, Method::resolve_jmethod_id(method_id)); // Create object to hold arguments for the JavaCall, and associate it with // the jni parser ResourceMark rm(THREAD); int number_of_parameters = method->size_of_parameters(); JavaCallArguments java_args(number_of_parameters); assert(method->is_static(), "method should be static"); // Fill out JavaCallArguments object args->push_arguments_on(&java_args); // Initialize result type result->set_type(args->return_type()); // Invoke the method. Result is returned as oop. JavaCalls::call(result, method, &java_args, CHECK); Can we hit safepoint checks anywhere on the path to the actual invocation of the method? If not, what is guaranteeing that? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-2994989682 From epeter at openjdk.org Mon Jun 23 05:54:32 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 23 Jun 2025 05:54:32 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32(C) on Ampere CPU and improve for short inputs [v4] In-Reply-To: <32uuLeizjdx7p5TeOzMvoyj0Smmra-DV4qhPZy7z-bE=.78485dd1-cc6f-4fbf-88a3-a4f78c164b0c@github.com> References: <32uuLeizjdx7p5TeOzMvoyj0Smmra-DV4qhPZy7z-bE=.78485dd1-cc6f-4fbf-88a3-a4f78c164b0c@github.com> Message-ID: On Thu, 5 Jun 2025 07:15:34 GMT, Liming Liu wrote: >> This PR is to enable the use of crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU. There is an option UseCryptoPmullForCRC32 that can enable crypto pmull, but directly enabling it on Ampere CPU will cause the following problems. >> >> 1. There will be regressions (-14% ~ -8%) on Ampere1 when the length is 64. When <= 128, both kernel_crc32_using_crc32 and kernel_crc32_using_crypto_pmull use the loop labeled as CRC_by32_loop, but their implements are a little different, and the loop in kernel_crc32_using_crc32 is better at hiding latency on Ampere1. So this PR takes the loop in kernel_crc32_using_crc32 to kernel_crc32_using_crypto_pmull, and does the same for CRC32C intrinsic. >> >> 2. The intrinsics only use crypto pmull when the length is higher than 383, while the loop in kernel_crc32_common_fold_using_crypto_pmull looks able to handle 256, and if it handles 256 on Ampere1, the improvements can be as high as 110% compared with kernel_crc32_using_crc32/kernel_crc32c_using_crc32c. However, there are regressions (~-6%) on Neoverse V1 when the length is 256. So this PR introduces a new option named CryptoPmullForCRC32LowLimit. It defaults to 256 since the code could handle 256, while it is set to 384 for V1/V2 to keep the old behavior on these platforms. 
>> >> The performance regressions and improvements were measured with the following microbenchmarks: >> org.openjdk.bench.java.util.TestCRC32.testCRC32Update >> org.openjdk.bench.java.util.TestCRC32C.testCRC32CUpdate >> >> Ran the following JTReg tests on Ampere1 and did not find problems: >> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java >> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32C.java > > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Add the message for the assertions Looks like this is progressing nicely here :) I have 2 comments below. Once you addressed them, I'll run some internal testing, and then I can approve it :) src/hotspot/cpu/aarch64/globals_aarch64.hpp line 92: > 90: product(bool, UseCryptoPmullForCRC32, false, \ > 91: "Use Crypto PMULL instructions for CRC32 computation") \ > 92: product(uint, CryptoPmullForCRC32LowLimit, 256, DIAGNOSTIC, \ Can you please add a test that uses this flag, and sets it to some selected values, and maybe even a random value? src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 126: > 124: warning("CryptoPmullForCRC32LowLimit must be a multiple of 128"); > 125: CryptoPmullForCRC32LowLimit = align_down(CryptoPmullForCRC32LowLimit, 128); > 126: } Can you describe somewhere why it has to be a multiple of `128`? Imagine someone comes across this later, and wonders if that is just some strange implementation limitation or something more fundamental, or something very subtle. ------------- Changes requested by epeter (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25609#pullrequestreview-2948733798 PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2160764901 PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2160767137 From epeter at openjdk.org Mon Jun 23 05:57:37 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 23 Jun 2025 05:57:37 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32(C) on Ampere CPU and improve for short inputs [v4] In-Reply-To: References: <32uuLeizjdx7p5TeOzMvoyj0Smmra-DV4qhPZy7z-bE=.78485dd1-cc6f-4fbf-88a3-a4f78c164b0c@github.com> Message-ID: On Mon, 23 Jun 2025 05:48:48 GMT, Emanuel Peter wrote: >> Liming Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Add the message for the assertions > > src/hotspot/cpu/aarch64/globals_aarch64.hpp line 92: > >> 90: product(bool, UseCryptoPmullForCRC32, false, \ >> 91: "Use Crypto PMULL instructions for CRC32 computation") \ >> 92: product(uint, CryptoPmullForCRC32LowLimit, 256, DIAGNOSTIC, \ > > Can you please add a test that uses this flag, and sets it to some selected values, and maybe even a random value? Is there already an IR test that checks for the presence of the crypto pmull? That could be good to ensure it occurs as expected and only when expected :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2160771012 From dholmes at openjdk.org Mon Jun 23 06:12:34 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 23 Jun 2025 06:12:34 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v10] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Fri, 20 Jun 2025 21:31:13 GMT, Ioi Lam wrote: >> This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: >> >> Excerpt from the test case ExceptionsTest.java. 
>> >> >> [0.042s][info][exceptions] Exception >> [ ] thrown in interpreter method <{method} {0x000079e52c4005b0} 'bar' '()V' in 'ExceptionsTest$InternalClass'> >> [ ] at bci 9 for thread 0x000079e58c02c7b0 (main) >> [0.042s][info][exceptions,stacktrace] Exception >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar(ExceptionsTest.java:110) >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo(ExceptionsTest.java:105) >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:100) >> [0.042s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar" at BCI: 10 >> Exception 1 caught. >> >> >> - Note that the old `[exceptions]` log is triggered by `InterpreterRuntime::exception_handler_for_exception()` and prints one level of the stack while the interpreter is looking for a handler. However, once a handler is found (inside `bar()`), the `[exceptions]` log terminates, and we do not know about the `foo()` or `main()` methods. >> >> - The `[exceptions,stacktrace]` log is printed only when an exception is thrown: >> - By an `_athrow` bytecode. See comments around `Exceptions::log_exception_stacktrace(Handle exception, methodHandle method, int bci)` >> - By native code in Exceptions::special_exception() and and Exceptions::_throw()). >> >> **Concurrent Exceptions** >> >> Neither the old `[exceptions]` or the new `[exceptions,stacktrace]` logs distinguish between multiple exceptions that are thrown and handled concurrently. The users should use something like `-Xlog:exceptions,exceptions+stack::tags,tid` to include the thread ID into the log output, and use an external tool (such as a script) to demultiplex the logs. 
> > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @coleenp comments; Also, use ProcessTools.executeProcess() to log the output in files This simplified version of the fix is also okay. If people have issues using it then we can address those as they arise. A couple of nits below. Thanks src/hotspot/share/utilities/exceptions.cpp line 156: > 154: // tracing (do this up front - so it works during boot strapping) > 155: // Note, the print_value_string() argument is not called unless logging is enabled! > 156: log_info(exceptions)("Exception <%.*s%s%.*s> (" PTR_FORMAT ") \n" The way `log_info` et al. work already implicitly checks if the logging is enabled and otherwise it doesn't evaluate anything else, so there is no need to make this change here. In the previous case there was some extra code that is now no longer executed unless needed so that was okay. The key point is that in general you do not need to guard simple `log_xxx(foo)(...)` statements with `log_is_enabled`. src/hotspot/share/utilities/exceptions.cpp line 637: > 635: if (!method->is_native() && (Bytecodes::Code) *method->bcp_from(bci) == Bytecodes::_athrow) { > 636: // TODO: it would be nice to filter out exceptions re-thrown by finally blocks (which include > 637: // try-with-resource statements): This mega-comment really doesn't belong here. Please ensure this discussion is in JBS and just use a short comment here e.g. 
// TODO: try to find a way to avoid repeated stacktraces when an exception gets re-thrown by a finally block ------------- PR Review: https://git.openjdk.org/jdk/pull/25522#pullrequestreview-2948757369 PR Review Comment: https://git.openjdk.org/jdk/pull/25522#discussion_r2160780013 PR Review Comment: https://git.openjdk.org/jdk/pull/25522#discussion_r2160784505 From stuefe at openjdk.org Mon Jun 23 06:14:31 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 23 Jun 2025 06:14:31 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v7] In-Reply-To: References: Message-ID: On Thu, 19 Jun 2025 06:33:08 GMT, Thomas Stuefe wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Add a basic gtest. > > I feel apprehensive about this; the solution feels pretty complex and I am not fully convinced this is the simplest solution for this problem. > > How much space to we lose in real life? Side note: I see the payload of the jmethodID block in NMT is allocated with mtInternal, so we don't see it in NMT. We should add jmethodIDs as an own category to NMT. > > A pragmatic alternative solution could be to do delete them, but delayed: keep the last N methodblocks undeleted. It is rare that JNI accesses jmethodIDs long after they have been deleted. Typically, the bad access happens close after class unloading, e.g. because of concurrency problems in customer code. > > We could then make the parameter N configurable, and thus give customers and supporters a tool to check for these kind of errors. 
> > (I briefly wondered whether we could just mmap these blocks, and uncommit/mprotect them on release, so that we stop paying the memory costs but don't release the address space; but the coarser page size allocation granularity would make this probably forbidding in terms of mem cost per class) > @tstuefe I think your method would still have a race and still allow a stale pointer crash, but just less likely, with more code to manage how to delete these pointers. > > The memory leak for jmethodIDs has been observed in our testing, resulting in native OOMs. We discussed a few things to solve this in the bugs, and this was the best idea we had at the time. It is a big change though. Okay, fair enough. Thank you. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-2995060097 From epeter at openjdk.org Mon Jun 23 06:19:29 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 23 Jun 2025 06:19:29 GMT Subject: RFR: 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB [v3] In-Reply-To: <38Q5LqxUlyIBHyusziq461sTdVCBm0ZO3kcVB_u7I18=.11da0a73-7a4f-4c83-abeb-f12791ad7741@github.com> References: <38Q5LqxUlyIBHyusziq461sTdVCBm0ZO3kcVB_u7I18=.11da0a73-7a4f-4c83-abeb-f12791ad7741@github.com> Message-ID: On Fri, 20 Jun 2025 16:39:51 GMT, Benoît Maillard wrote: >> This PR adds a range constraint for the `-XX:FastAllocateSizeLimit` debug flag. This prevents undefined behavior caused by left-shift overflow of the flag value in `GraphKit::new_array`. >> >> ### Testing >> - [x] [GitHub Actions](https://github.com/benoitmaillard/jdk/actions?query=branch%3AJDK-8356865) >> - [x] tier1-3, plus some internal testing >> - [x] Manual testing with values known to previously cause undefined behavior >> >> Thanks!
> > Benoît Maillard has updated the pull request incrementally with one additional commit since the last revision: > > 8356865: Change assert test/hotspot/jtreg/compiler/arguments/TestFastAllocateSizeLimit.java line 48: > 46: public static void main(String[] args) throws IOException { > 47: if (args.length == 0) { > 48: int sizeLimit = RANDOM.nextInt(1 << 28); Can you please add a quick comment why you chose `1 << 28`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25834#discussion_r2160796211 From dholmes at openjdk.org Mon Jun 23 06:35:31 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 23 Jun 2025 06:35:31 GMT Subject: RFR: 8359920: Use names for frame types in stackmaps [v4] In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 15:05:03 GMT, Coleen Phillimore wrote: >> This uses names for frame types for stackmaps in the verifier and redefinition. >> Tested with tier1-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix tags (running more tests) Thanks for applying this @coleenp! I have a couple of suggested changes that to me make it more obvious what case(s) we are dealing with, but up to you whether to apply them or not. The more I look at this code the more it cries out to be restructured, to me.
Thanks src/hotspot/share/classfile/stackMapTable.cpp line 384: > 382: _first = false; > 383: return frame; > 384: } else if (frame_type < SAME_FRAME_EXTENDED + 4) { Suggestion: } else if (frame_type <= APPEND_FRAME_END) { src/hotspot/share/classfile/stackMapTable.cpp line 387: > 385: // append_frame > 386: assert(frame_type >= APPEND_FRAME_START && frame_type <= APPEND_FRAME_END, "should be"); > 387: int appends = frame_type - SAME_FRAME_EXTENDED; Suggestion: int appends = frame_type - APPEND_FRAME_START + 1; src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 3328: > 3326: "no room for offset_delta"); > 3327: stackmap_p += 2; > 3328: u1 len = frame_type - StackMapReader::SAME_FRAME_EXTENDED; Suggestion: u1 len = frame_type - StackMapReader::APPEND_FRAME_START + 1; ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25870#pullrequestreview-2948793293 PR Review Comment: https://git.openjdk.org/jdk/pull/25870#discussion_r2160802174 PR Review Comment: https://git.openjdk.org/jdk/pull/25870#discussion_r2160805107 PR Review Comment: https://git.openjdk.org/jdk/pull/25870#discussion_r2160809684 From fjiang at openjdk.org Mon Jun 23 06:36:28 2025 From: fjiang at openjdk.org (Feilong Jiang) Date: Mon, 23 Jun 2025 06:36:28 GMT Subject: RFR: 8360179: RISC-V: Only enable BigInteger intrinsics when AvoidUnalignedAccess == false [v2] In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 03:28:20 GMT, Anjian Wen wrote: >> This disables BigInteger.multiplyToLen(), BigInteger.squareToLen(), BigInteger.montgomeryMultiply() >> and BigInteger.montgomerySquare() on linux-riscv64 platforms where misaligned memory accesses is slow. >> The reason is that these four BigInteger intrinsics do 8-byte misaligned memory accesses to int arrays >> under -XX:+UseCompactObjectHeaders. And this will have a negative impact on SPECJvm2008 crypto tests. >> >> Testing: >> - [x] Tier1-3 tests. >> - [x] SPECJvm2008 crypto performance tests. 
>> [score.txt](https://github.com/user-attachments/files/20846531/score.txt) > > Anjian Wen has updated the pull request incrementally with one additional commit since the last revision: > > remove useless func multiply_32_x_32_loop which only used in AvoidUnalignedAccesses case! Looks good, thanks! ------------- Marked as reviewed by fjiang (Committer). PR Review: https://git.openjdk.org/jdk/pull/25923#pullrequestreview-2948820271 From fyang at openjdk.org Mon Jun 23 07:05:33 2025 From: fyang at openjdk.org (Fei Yang) Date: Mon, 23 Jun 2025 07:05:33 GMT Subject: RFR: 8360179: RISC-V: Only enable BigInteger intrinsics when AvoidUnalignedAccess == false [v2] In-Reply-To: References: Message-ID: <5IpWl9LVBT16gHaLjYrRPzXi6ii9pMNcJOYpB8if9ms=.dc383b5f-f907-4380-b50d-06a7d8696978@github.com> On Mon, 23 Jun 2025 03:28:20 GMT, Anjian Wen wrote: >> This disables BigInteger.multiplyToLen(), BigInteger.squareToLen(), BigInteger.montgomeryMultiply() >> and BigInteger.montgomerySquare() on linux-riscv64 platforms where misaligned memory accesses is slow. >> The reason is that these four BigInteger intrinsics do 8-byte misaligned memory accesses to int arrays >> under -XX:+UseCompactObjectHeaders. And this will have a negative impact on SPECJvm2008 crypto tests. >> >> Testing: >> - [x] Tier1-3 tests. >> - [x] SPECJvm2008 crypto performance tests. >> [score.txt](https://github.com/user-attachments/files/20846531/score.txt) > > Anjian Wen has updated the pull request incrementally with one additional commit since the last revision: > > remove useless func multiply_32_x_32_loop which only used in AvoidUnalignedAccesses case! Thanks! ------------- Marked as reviewed by fyang (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25923#pullrequestreview-2948897930 From jsjolen at openjdk.org Mon Jun 23 07:08:37 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Mon, 23 Jun 2025 07:08:37 GMT Subject: RFR: 8357220: Introduce a BSMAttributeEntry struct [v7] In-Reply-To: References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: On Thu, 19 Jun 2025 08:03:20 GMT, Johan Sjölen wrote: >> Hi, >> >> The constant pool currently has a lot of methods specific to extracting parts of the operands array. What this array actually is, is a sequence of bootstrap method attribute entries, where each entry has the following components: >> >> ```c++ >> struct BSMAE { >> u2 bootstrap_method_index; >> u2 argument_count; >> u2 arguments[argument_count]; >> } >> ``` >> >> We can remove some of these operands-array-specific methods, and instead allow you to extract BSMAttributeEntrys which you can then use to extract their individual components. This makes for a nicer interface, and one that is a bit easier to come into as a reader of the code, as it more closely mirrors the JVMS. >> >> Please consider! >> >> Testing: Currently GHA, running tier1-tier3 > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Fix Thank you all for the reviews!
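A minimal sketch of the idea in the quoted description, reading entries of the `BSMAE` shape through a small view type instead of poking at raw array offsets. This is not the code from the PR: `BSMEntryView` and `entries_of` are hypothetical names, and the sketch assumes the entries are stored back to back in a plain `u2` array with no separate index table in front, which may differ from the layout HotSpot actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using u2 = std::uint16_t;

// Hypothetical read-only view of one entry inside a flat u2 array.
// Assumed layout, following the struct quoted in the mail:
//   [bootstrap_method_index, argument_count, arguments...]
class BSMEntryView {
 public:
  explicit BSMEntryView(const u2* base) : _base(base) {}

  u2 bootstrap_method_index() const { return _base[0]; }
  u2 argument_count() const { return _base[1]; }

  u2 argument(u2 n) const {
    assert(n < argument_count() && "argument index out of range");
    return _base[2 + n];
  }

  // Number of u2 slots this entry occupies, including its two header slots.
  std::size_t size_in_u2() const { return 2u + argument_count(); }

 private:
  const u2* _base;
};

// Walk a flat array that stores entries back to back (an assumption of this
// sketch) and collect one view per entry. The views point into `flat`, so
// `flat` must outlive them.
std::vector<BSMEntryView> entries_of(const std::vector<u2>& flat) {
  std::vector<BSMEntryView> out;
  std::size_t pos = 0;
  while (pos < flat.size()) {
    assert(pos + 2 <= flat.size() && "truncated entry header");
    BSMEntryView entry(flat.data() + pos);
    assert(pos + entry.size_in_u2() <= flat.size() && "truncated argument list");
    out.push_back(entry);
    pos += entry.size_in_u2();
  }
  return out;
}
```

With a view type like this, callers ask an entry for its bootstrap method index or its n-th argument by name, which is closer to how the JVMS describes the attribute than offset arithmetic on the array.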
------------- PR Comment: https://git.openjdk.org/jdk/pull/25298#issuecomment-2995192388 From jsjolen at openjdk.org Mon Jun 23 07:08:37 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Mon, 23 Jun 2025 07:08:37 GMT Subject: Integrated: 8357220: Introduce a BSMAttributeEntry struct In-Reply-To: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> References: <4k7ezcDyFuiEKiYMour4OWsFhtwi6by6cuBFFozuc1c=.7a77f4b1-cd8d-4555-841e-f34612f0388f@github.com> Message-ID: On Mon, 19 May 2025 07:35:16 GMT, Johan Sjölen wrote: > Hi, > > The constant pool currently has a lot of methods specific to extracting parts of the operands array. What this array actually is, is a sequence of bootstrap method attribute entries, where each entry has the following components: > > ```c++ > struct BSMAE { > u2 bootstrap_method_index; > u2 argument_count; > u2 arguments[argument_count]; > } > ``` > > We can remove some of these operands-array-specific methods, and instead allow you to extract BSMAttributeEntrys which you can then use to extract their individual components. This makes for a nicer interface, and one that is a bit easier to come into as a reader of the code, as it more closely mirrors the JVMS. > > Please consider! > > Testing: Currently GHA, running tier1-tier3 This pull request has now been integrated.
Changeset: 3d35b408 Author: Johan Sjölen URL: https://git.openjdk.org/jdk/commit/3d35b408e1e69d7e3953af142c5bf606691fbeb8 Stats: 115 lines in 7 files changed: 42 ins; 31 del; 42 mod 8357220: Introduce a BSMAttributeEntry struct Co-authored-by: John R Rose Reviewed-by: sspitsyn, coleenp, matsaave ------------- PR: https://git.openjdk.org/jdk/pull/25298 From bmaillard at openjdk.org Mon Jun 23 07:09:15 2025 From: bmaillard at openjdk.org (Benoît Maillard) Date: Mon, 23 Jun 2025 07:09:15 GMT Subject: RFR: 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB [v4] In-Reply-To: References: Message-ID: > This PR adds a range constraint for the `-XX:FastAllocateSizeLimit` debug flag. This prevents undefined behavior caused by left-shift overflow of the flag value in `GraphKit::new_array`. > > ### Testing > - [x] [GitHub Actions](https://github.com/benoitmaillard/jdk/actions?query=branch%3AJDK-8356865) > - [x] tier1-3, plus some internal testing > - [x] Manual testing with values known to previously cause undefined behavior > > Thanks!
Benoît Maillard has updated the pull request incrementally with one additional commit since the last revision: 8356865: Add comment for range in test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25834/files - new: https://git.openjdk.org/jdk/pull/25834/files/97f52b45..8241b218 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25834&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25834&range=02-03 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25834.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25834/head:pull/25834 PR: https://git.openjdk.org/jdk/pull/25834 From bmaillard at openjdk.org Mon Jun 23 07:09:15 2025 From: bmaillard at openjdk.org (Benoît Maillard) Date: Mon, 23 Jun 2025 07:09:15 GMT Subject: RFR: 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB [v3] In-Reply-To: References: <38Q5LqxUlyIBHyusziq461sTdVCBm0ZO3kcVB_u7I18=.11da0a73-7a4f-4c83-abeb-f12791ad7741@github.com> Message-ID: <5SI82szo_LQNH0Uhl-1-8tN1rISzF9zTGDs4PM7yU9Y=.542ef0d0-1b62-4b3e-9714-b2a61cbf358c@github.com> On Mon, 23 Jun 2025 06:16:54 GMT, Emanuel Peter wrote: >> Benoît Maillard has updated the pull request incrementally with one additional commit since the last revision: >> >> 8356865: Change assert > > test/hotspot/jtreg/compiler/arguments/TestFastAllocateSizeLimit.java line 48: > >> 46: public static void main(String[] args) throws IOException { >> 47: if (args.length == 0) { >> 48: int sizeLimit = RANDOM.nextInt(1 << 28); > > Can you please add a quick comment why you chose `1 << 28`? Done, thanks!
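For readers unfamiliar with the hazard this PR guards against, here is a small sketch of checking a value before left-shifting it, so that the shift can never overflow a signed 32-bit int. The helper names are hypothetical and this is not the code in `GraphKit::new_array` or the flag's range constraint. The shift amount of 3 mentioned below is only an illustration; the shift HotSpot actually applies is not stated in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True if (value << shift) is representable in a signed 32-bit int.
// Negative values and shift counts above 31 are rejected outright.
constexpr bool shift_fits_in_int32(std::int32_t value, unsigned shift) {
  return value >= 0 && shift <= 31 &&
         value <= (std::numeric_limits<std::int32_t>::max() >> shift);
}

// Largest non-negative value that can safely be shifted left by `shift` bits.
// This is the kind of upper bound a range constraint on a flag could use.
constexpr std::int32_t max_shiftable(unsigned shift) {
  return shift > 31 ? 0 : (std::numeric_limits<std::int32_t>::max() >> shift);
}

// Shift only after checking; the shift itself is done on the unsigned type,
// where it is always well defined.
inline std::int32_t checked_shl(std::int32_t value, unsigned shift) {
  assert(shift_fits_in_int32(value, shift) && "left shift would overflow");
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(value) << shift);
}
```

With an illustrative shift of 3, `max_shiftable(3)` is `(1 << 28) - 1`, so `1 << 28` is the first value for which the check fails. Rejecting out-of-range values at argument-processing time means the code doing the shift never sees them.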
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25834#discussion_r2160871327 From epeter at openjdk.org Mon Jun 23 07:36:30 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 23 Jun 2025 07:36:30 GMT Subject: RFR: 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB [v4] In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 07:09:15 GMT, Benoît Maillard wrote: >> This PR adds a range constraint for the `-XX:FastAllocateSizeLimit` debug flag. This prevents undefined behavior caused by left-shift overflow of the flag value in `GraphKit::new_array`. >> >> ### Testing >> - [x] [GitHub Actions](https://github.com/benoitmaillard/jdk/actions?query=branch%3AJDK-8356865) >> - [x] tier1-3, plus some internal testing >> - [x] Manual testing with values known to previously cause undefined behavior >> >> Thanks! > > Benoît Maillard has updated the pull request incrementally with one additional commit since the last revision: > > 8356865: Add comment for range in test Thanks for the updates! Nice work :) ------------- Marked as reviewed by epeter (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25834#pullrequestreview-2948981657 From iwalulya at openjdk.org Mon Jun 23 07:40:33 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Mon, 23 Jun 2025 07:40:33 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v13] In-Reply-To: <76NgHH-m26Nw2paJmIQvNNqio_iKtdQ_bJ2aejMfKEI=.82ff25aa-f5c5-4146-84b5-1aaaefb5efd1@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> <76NgHH-m26Nw2paJmIQvNNqio_iKtdQ_bJ2aejMfKEI=.82ff25aa-f5c5-4146-84b5-1aaaefb5efd1@github.com> Message-ID: On Wed, 18 Jun 2025 09:00:58 GMT, Albert Mingkun Yang wrote: >> This patch refines Parallel's sizing strategy to improve overall memory management and performance.
>> >> The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. >> >> `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. >> >> GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. >> >> ## Performance evaluation >> >> - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). >> - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). >> - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. >> >> PS: I have opportunistically set the obsolete/expired version to ~~25/26~~ 26/27 for now. I will update them accordingly before merging. 
>> >> Test: tier1-8 > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 19 commits: > > - review > - Merge branch 'master' into pgc-size-policy > - merge > - version > - Merge branch 'master' into pgc-size-policy > - revert-aliases > - Merge branch 'master' into pgc-size-policy > - merge > - merge-fix > - merge > - ... and 9 more: https://git.openjdk.org/jdk/compare/2b94b70e...a21e5363 Do we have other uses of `class AdaptiveSizePolicy` except for parallelGC? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25000#issuecomment-2995286022 From duke at openjdk.org Mon Jun 23 07:40:32 2025 From: duke at openjdk.org (duke) Date: Mon, 23 Jun 2025 07:40:32 GMT Subject: RFR: 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB [v4] In-Reply-To: References: Message-ID: <96LNXhcElufKdZY863SlC9Mb3u9tDjjr_-JCRNJjUrw=.47f3f82d-b1a4-4a87-873b-78dbeaa5b1f9@github.com> On Mon, 23 Jun 2025 07:09:15 GMT, Benoît Maillard wrote: >> This PR adds a range constraint for the `-XX:FastAllocateSizeLimit` debug flag. This prevents undefined behavior caused by left-shift overflow of the flag value in `GraphKit::new_array`. >> >> ### Testing >> - [x] [GitHub Actions](https://github.com/benoitmaillard/jdk/actions?query=branch%3AJDK-8356865) >> - [x] tier1-3, plus some internal testing >> - [x] Manual testing with values known to previously cause undefined behavior >> >> Thanks! > > Benoît Maillard has updated the pull request incrementally with one additional commit since the last revision: > > 8356865: Add comment for range in test @benoitmaillard Your change (at version 8241b2188b2f8334439f3824fb535ce29091eb37) is now ready to be sponsored by a Committer.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25834#issuecomment-2995283731 From bmaillard at openjdk.org Mon Jun 23 07:54:37 2025 From: bmaillard at openjdk.org (Benoît Maillard) Date: Mon, 23 Jun 2025 07:54:37 GMT Subject: Integrated: 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 14:50:46 GMT, Benoît Maillard wrote: > This PR adds a range constraint for the `-XX:FastAllocateSizeLimit` debug flag. This prevents undefined behavior caused by left-shift overflow of the flag value in `GraphKit::new_array`. > > ### Testing > - [x] [GitHub Actions](https://github.com/benoitmaillard/jdk/actions?query=branch%3AJDK-8356865) > - [x] tier1-3, plus some internal testing > - [x] Manual testing with values known to previously cause undefined behavior > > Thanks! This pull request has now been integrated. Changeset: c220b135 Author: Benoît Maillard Committer: Emanuel Peter URL: https://git.openjdk.org/jdk/commit/c220b1358c91bce2eb7515e9f600004c7b975ee6 Stats: 64 lines in 3 files changed: 64 ins; 0 del; 0 mod 8356865: C2: Unreasonable values for debug flag FastAllocateSizeLimit can lead to left-shift-overflow, which is UB Reviewed-by: epeter, mhaessig ------------- PR: https://git.openjdk.org/jdk/pull/25834 From tschatzl at openjdk.org Mon Jun 23 08:03:32 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 23 Jun 2025 08:03:32 GMT Subject: RFR: 8359924: Deprecate and obsolete ParallelRefProcEnabled In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 21:26:56 GMT, Kim Barrett wrote: > > Same with `ReferencesPerThread`, again separately, but that does not need a CSR. > > We could remove ReferencesPerThread immediately, without first deprecating, since it's experimental. I would keep it as a diagnostic option just in case, but removing this is also an option.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25875#issuecomment-2995361450 From ayang at openjdk.org Mon Jun 23 08:30:38 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 23 Jun 2025 08:30:38 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v13] In-Reply-To: References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> <76NgHH-m26Nw2paJmIQvNNqio_iKtdQ_bJ2aejMfKEI=.82ff25aa-f5c5-4146-84b5-1aaaefb5efd1@github.com> Message-ID: On Mon, 23 Jun 2025 07:38:09 GMT, Ivan Walulya wrote: > Do we have other uses of class AdaptiveSizePolicy except for parallelGC? No. A followup cleanup can be to merge this class into its sole subclass, used by Parallel. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25000#issuecomment-2995445820 From ayang at openjdk.org Mon Jun 23 08:30:43 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 23 Jun 2025 08:30:43 GMT Subject: RFR: 8359924: Deprecate and obsolete ParallelRefProcEnabled In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 14:04:28 GMT, Albert Mingkun Yang wrote: > Deprecating `ParallelRefProcEnabled`, which is used only by Parallel and G1, and both have it enabled by default via: > > > if (FLAG_IS_DEFAULT(ParallelRefProcEnabled) && ParallelGCThreads > 1) { > FLAG_SET_DEFAULT(ParallelRefProcEnabled, true); > } > > > Disabling it offers little benefit and its presence incurs some implementation complexity in the reference-processor. Thanks for review. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25875#issuecomment-2995440116 From ayang at openjdk.org Mon Jun 23 08:30:44 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 23 Jun 2025 08:30:44 GMT Subject: Integrated: 8359924: Deprecate and obsolete ParallelRefProcEnabled In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 14:04:28 GMT, Albert Mingkun Yang wrote: > Deprecating `ParallelRefProcEnabled`, which is used only by Parallel and G1, and both have it enabled by default via: > > > if (FLAG_IS_DEFAULT(ParallelRefProcEnabled) && ParallelGCThreads > 1) { > FLAG_SET_DEFAULT(ParallelRefProcEnabled, true); > } > > > Disabling it offers little benefit and its presence incurs some implementation complexity in the reference-processor. This pull request has now been integrated. Changeset: 516197f5 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/516197f50b079978a4aa1593744cef9d56e01c67 Stats: 19 lines in 3 files changed: 10 ins; 8 del; 1 mod 8359924: Deprecate and obsolete ParallelRefProcEnabled Reviewed-by: tschatzl, kbarrett, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/25875 From shade at openjdk.org Mon Jun 23 08:32:34 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 23 Jun 2025 08:32:34 GMT Subject: [jdk25] RFR: 8352075: Perf regression accessing fields In-Reply-To: <4QtV9UGiRdP0LG6rIHH_4fwO2f5wm0evnHnut2nxvjM=.ecf61194-7201-4c7e-a496-3ddc821ea079@github.com> References: <4QtV9UGiRdP0LG6rIHH_4fwO2f5wm0evnHnut2nxvjM=.ecf61194-7201-4c7e-a496-3ddc821ea079@github.com> Message-ID: On Thu, 19 Jun 2025 10:02:45 GMT, Aleksey Shipilev wrote: >> Thank you for the backport! @shipilev indicated that the backport to 21 should wait a bit, could you clarify when should I file that (e.g. end of July, ...)? > >> @shipilev indicated that the backport to 21 should wait a bit, could you clarify when should I file that (e.g. end of July, ...)? 
> > I would say for the fairly big change like this, we want to wait until JDK 25 GA (that would pass the all-tests-run). It would be too late for Oct 2025 release, though. So realistically, this would target January 2026 release. You can pull this patch to your downstream JDK 21 to see if there are any troubles ahead of this path, this will also soothe 21u maintainer concerns, I think. > So @shipilev, JDK 25 is in rampdown phase 1. Rampdown phase 2 starts 7/17. It's a big change for JDK 25. My sense of the risk is that it's not high. If it's checked in now, I think it'll get more testing than if it gets backported to a 25.01 release. I'll discuss it with people here. I think for JDK 21, we should wait for more testing to be run after 25 GA. Yeah, I am on the fence for JDK 25, can take it both ways. If it is to land in JDK 25u anyway, there is some benefit for doing it in GA for more testing. At the same time, JDK 26 is already testing it, so there is no particular need to risk JDK 25 stability, and we can just capitalize on JDK 26 testing for JDK 25u backport. For JDK 21u, it definitely needs to be released in JDK 25 first. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25877#issuecomment-2995450240 From ayang at openjdk.org Mon Jun 23 08:34:22 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 23 Jun 2025 08:34:22 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v14] In-Reply-To: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: <9oCyQapT5zkgtiWmLQoPBY10EUD6Q4LIEO4Sr6nyxXI=.963bc30b-a996-4c5a-9594-16c36c6c70db@github.com> > This patch refines Parallel's sizing strategy to improve overall memory management and performance. > > The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. 
This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. > > `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. > > GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. > > ## Performance evaluation > > - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). > - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). > - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. > > PS: I have opportunistically set the obsolete/expired version to ~~25/26~~ 26/27 for now. I will update them accordingly before merging. > > Test: tier1-8 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 20 commits: - Merge branch 'master' into pgc-size-policy - review - Merge branch 'master' into pgc-size-policy - merge - version - Merge branch 'master' into pgc-size-policy - revert-aliases - Merge branch 'master' into pgc-size-policy - merge - merge-fix - ... and 10 more: https://git.openjdk.org/jdk/compare/516197f5...41027bdf ------------- Changes: https://git.openjdk.org/jdk/pull/25000/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25000&range=13 Stats: 4371 lines in 31 files changed: 520 ins; 3452 del; 399 mod Patch: https://git.openjdk.org/jdk/pull/25000.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25000/head:pull/25000 PR: https://git.openjdk.org/jdk/pull/25000 From ayang at openjdk.org Mon Jun 23 09:20:42 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 23 Jun 2025 09:20:42 GMT Subject: RFR: 8360220: Deprecate and obsolete ParallelRefProcBalancingEnabled Message-ID: Deprecating `ParallelRefProcBalancingEnabled`, which is used only by Parallel and G1, and both have it enabled by default. Disabling it offers little benefit, so removing it reduces the number of command-line flags.
------------- Commit messages: - deprecate-balance Changes: https://git.openjdk.org/jdk/pull/25932/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25932&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360220 Stats: 2 lines in 2 files changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25932.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25932/head:pull/25932 PR: https://git.openjdk.org/jdk/pull/25932 From tschatzl at openjdk.org Mon Jun 23 10:13:36 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 23 Jun 2025 10:13:36 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v4] In-Reply-To: References: Message-ID: On Sun, 22 Jun 2025 22:15:12 GMT, Kim Barrett wrote: >> Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: >> >> Albert suggestions > > src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 302: > >> 300: if (_g1h->capacity() == _g1h->max_capacity()) { >> 301: log_resize(short_term_pause_time_ratio, long_term_pause_time_ratio, >> 302: lower_threshold, upper_threshold, pause_time_threshold, true, 0, expand); > > We're on the expand-side of things, but `expand` hasn't been set to true yet. > Similarly, on the shrink side, we'll be passing in the default initial value for `expand` and > then setting it later. Maybe `expand` should be set earlier on each of the branches. It does not matter for this case, ideally the log message would not print the information about whether we expanded or not (because we did not). Maybe you are right, we were trying to expand at least (and just pass `true` here?). 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161205385 From tschatzl at openjdk.org Mon Jun 23 10:13:35 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 23 Jun 2025 10:13:35 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v4] In-Reply-To: References: Message-ID: On Thu, 19 Jun 2025 12:26:15 GMT, Ivan Walulya wrote: >> Hi all, >> >> Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. >> >> The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio. >> >> - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly. >> >> - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range. >> >> - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants.
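The short-term policy in the first bullet can be pictured with a rough sketch like the following. This is a guess at the general shape, not the G1 implementation. The class name, the way the band around the target is derived (target GC fraction `1 / (1 + GCTimeRatio)`, widened by a percentage on each side), the sign-flip of the counter on a direction change, and the reset after a resize decision are all assumptions made for illustration.

```cpp
#include <cassert>

enum class Resize { None, Expand, Shrink };

// Hypothetical short-term counter. A positive counter means recent GCs spent
// more time than the upper bound allows; a negative counter means they spent
// less than the lower bound.
class ShortTermResizeCounter {
 public:
  ShortTermResizeCounter(double gc_time_ratio, double band_percent,
                         int expand_threshold, int shrink_threshold)
    : _target(1.0 / (1.0 + gc_time_ratio)),            // assumed target GC fraction
      _lower(_target * (1.0 - band_percent / 100.0)),  // assumed lower bound
      _upper(_target * (1.0 + band_percent / 100.0)),  // assumed upper bound
      _expand_threshold(expand_threshold),
      _shrink_threshold(shrink_threshold),
      _counter(0) {
    assert(expand_threshold > 0 && shrink_threshold > 0);
  }

  // Record the fraction of time spent in GC for the most recent interval and
  // report whether a resize should be attempted.
  Resize record(double gc_time_fraction) {
    if (gc_time_fraction > _upper) {
      _counter = (_counter < 0) ? 1 : _counter + 1;   // restart on direction change
    } else if (gc_time_fraction < _lower) {
      _counter = (_counter > 0) ? -1 : _counter - 1;  // restart on direction change
    }
    // Inside the band: leave the counter unchanged (an assumption).

    if (_counter >= _expand_threshold) {
      _counter = 0;
      return Resize::Expand;
    }
    if (_counter <= -_shrink_threshold) {
      _counter = 0;
      return Resize::Shrink;
    }
    return Resize::None;
  }

 private:
  double _target;
  double _lower;
  double _upper;
  int _expand_threshold;
  int _shrink_threshold;
  int _counter;
};
```

Under these assumptions, a ratio of 24 with a 25% band gives a target GC fraction of 4% and a band of roughly 3% to 5%. Using a larger threshold for shrinking than for expanding makes the sketch quicker to grow the heap than to give memory back.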
>> >> We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. >> >> Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. >> >> As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. >> >> Testing: Mach5 ... > > Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: > > Albert suggestions Some minor nits I think. src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 61: > 59: > 60: assert(G1ShortTermShrinkThreshold <= long_term_count_limit(), > 61: "Shrink threshold count must be less than %u", long_term_count_limit()); I would prefer if these would be part of argument processing. I see that below we just ignore what the user specified for `G1ShortTermShrinkThreshold` if it is too high when determining `ThresholdForShrink`, but I would prefer to just fail early if the diagnostic option is out of bounds. src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 225: > 223: // - lower threshold, we do not want to go under. > 224: // - mid threshold, halfway between upper and lower threshold, represents the > 225: // actual target when resizing the heap. 
Suggestion: // - actual pause time threshold, halfway between upper and lower threshold, represents the // actual target when resizing the heap. ("mid-threshold" is a remnant of the original change) src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 335: > 333: // A resize has not been triggered, but the long term counter overflowed. > 334: decay_ratio_tracking_data(); > 335: expand = true; // Does not matter. Maybe we should return `expand = false` in the cases where `resize_bytes == 0` always? Maybe in the log printing below we should suppress the `expand: ` part if `resize_bytes == 0` too. src/hotspot/share/gc/g1/g1_globals.hpp line 162: > 160: product(uint, G1ExpandByPercentOfAvailable, 20, EXPERIMENTAL, \ > 161: "When expanding, % of uncommitted space to claim.") \ > 162: range(0, 100) \ Suggestion: "When expanding, % of uncommitted space to expand the heap by in a single expand attempt.") \ range(0, 100) \ src/hotspot/share/gc/g1/g1_globals.hpp line 166: > 164: product(uint, G1ShrinkByPercentOfAvailable, 50, EXPERIMENTAL, \ > 165: "When shrinking, maximum % of free space to claim.") \ > 166: range(0, 100) \ Suggestion: "When shrinking, maximum % of free space to free for a single shrink attempt.") \ range(0, 100) \ src/hotspot/share/gc/g1/g1_globals.hpp line 169: > 167: \ > 168: product(uint, G1MinimumPercentOfGCTimeRatio, 25, EXPERIMENTAL, \ > 169: "Percentage of GCTimeRatio G1 will try to avoid going below.") \ Suggestion: "Determines lower and upper thresholds as percentage of GCTimeRatio. G1 compares these thresholds against the current gc cpu usage (gc time ratio?) to register too low or too high cpu usage events for heap resizing.") \ src/hotspot/share/gc/g1/g1_globals.hpp line 174: > 172: product(uint, G1ShortTermShrinkThreshold, 8, EXPERIMENTAL, \ > 173: "Number of consecutive GCs with the short term gc time ratio" \ > 174: "below the threshold before we attempt to shrink.") \ I think the description is somewhat confusing.
Suggestion: "Number of consecutive GCs where the current gc time ratio" \ "below the lower threshold before G1 attempts to shrink.") \ src/hotspot/share/gc/g1/g1_globals.hpp line 176: > 174: "below the threshold before we attempt to shrink.") \ > 175: range(0, 10) \ > 176: \ I would make them diagnostic, not experimental. They might need to be tweaked, and it's better to provide them as diagnostic than experimental. Just sounds more "safe" to use for end users when providing them. It's not like they trigger potentially unstable features. Also, for symmetry I think we should provide a `G1ShortTermExpandThreshold` as well instead of the constant `MinOverThresholdForExpansion` embedded in the code. ------------- Changes requested by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25832#pullrequestreview-2949374369 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161197065 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161162327 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161208940 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161191299 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161188405 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161184827 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161187759 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161179316 From stefank at openjdk.org Mon Jun 23 11:16:34 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 23 Jun 2025 11:16:34 GMT Subject: RFR: 8359923: Const accessors for the Deferred class In-Reply-To: <0yHG618NALvVK84ex3klj-5PIFJ7bMMcAhSH1Zlgw2c=.321601c8-926b-4c82-a401-d3b2504b4018@github.com> References: <_F_xmX17xbS4DlqUOj8L-zrzeXZpsNiPFsN3_bnqy48=.4380549b-ec98-43dd-ad72-4e1b5a64df64@github.com> <0yHG618NALvVK84ex3klj-5PIFJ7bMMcAhSH1Zlgw2c=.321601c8-926b-4c82-a401-d3b2504b4018@github.com> 
Message-ID: On Fri, 20 Jun 2025 16:24:02 GMT, Kim Barrett wrote: > I've asked that this change be backed out. I don't think this change should have been made. I've proposed a better solution for the use-case that led to making this change. The name Deferred was chosen when we thought that we would be allowed to use this utility to defer initialization for more situations than just static objects. If we are going to lock this down to only use it for static objects, then I think we also should rename the utility accordingly. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25874#issuecomment-2996006837 From coleenp at openjdk.org Mon Jun 23 12:06:03 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 23 Jun 2025 12:06:03 GMT Subject: RFR: 8359920: Use names for frame types in stackmaps [v5] In-Reply-To: References: Message-ID: > This uses names for frame types for stackmaps in the verifier and redefinition. > Tested with tier1-7. Coleen Phillimore has updated the pull request incrementally with three additional commits since the last revision: - Update src/hotspot/share/prims/jvmtiRedefineClasses.cpp Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> - Update src/hotspot/share/classfile/stackMapTable.cpp Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> - Update src/hotspot/share/classfile/stackMapTable.cpp Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25870/files - new: https://git.openjdk.org/jdk/pull/25870/files/7d2e60c2..edec7b8e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25870&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25870&range=03-04 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25870.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25870/head:pull/25870 PR: https://git.openjdk.org/jdk/pull/25870 
From coleenp at openjdk.org Mon Jun 23 12:06:03 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 23 Jun 2025 12:06:03 GMT Subject: RFR: 8359920: Use names for frame types in stackmaps [v5] In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 12:02:43 GMT, Coleen Phillimore wrote: >> This uses names for frame types for stackmaps in the verifier and redefinition. >> Tested with tier1-7. > > Coleen Phillimore has updated the pull request incrementally with three additional commits since the last revision: > > - Update src/hotspot/share/prims/jvmtiRedefineClasses.cpp > > Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > - Update src/hotspot/share/classfile/stackMapTable.cpp > > Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > - Update src/hotspot/share/classfile/stackMapTable.cpp > > Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> Yes, I like all of these changes. One would have saved me from a transient bug I had. @matias9927 and I chatted about restructuring this code in the future, which seems like a good idea. All in all, these tags and your suggestions are really helpful. Also verified and am testing your suggested changes. ------------- PR Review: https://git.openjdk.org/jdk/pull/25870#pullrequestreview-2949819071 From kbarrett at openjdk.org Mon Jun 23 12:14:30 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 23 Jun 2025 12:14:30 GMT Subject: RFR: 8360220: Deprecate and obsolete ParallelRefProcBalancingEnabled In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 09:14:43 GMT, Albert Mingkun Yang wrote: > Deprecating `ParallelRefProcBalancingEnabled`, which is used only by Parallel and G1, and both have it enabled by default. > > Disabling it offers little benefit, so removing it to reduce the number of command-line flags. Looks good. ------------- Marked as reviewed by kbarrett (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25932#pullrequestreview-2949857471 From coleenp at openjdk.org Mon Jun 23 12:19:32 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 23 Jun 2025 12:19:32 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 20:41:21 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. >> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix the test I'm not sure but I think in this instance, the native caller has the mirror for the class where it gets the jmethodID from, so can't unload the Method. 
JNI_ENTRY(ResultType, \ jni_CallStatic##Result##Method(JNIEnv *env, jclass cls, jmethodID methodID, ...)) \ \ ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-2996270175 From coleenp at openjdk.org Mon Jun 23 12:29:34 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 23 Jun 2025 12:29:34 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v10] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Fri, 20 Jun 2025 21:31:13 GMT, Ioi Lam wrote: >> This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: >> >> Excerpt from the test case ExceptionsTest.java. >> >> >> [0.042s][info][exceptions] Exception >> [ ] thrown in interpreter method <{method} {0x000079e52c4005b0} 'bar' '()V' in 'ExceptionsTest$InternalClass'> >> [ ] at bci 9 for thread 0x000079e58c02c7b0 (main) >> [0.042s][info][exceptions,stacktrace] Exception >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar(ExceptionsTest.java:110) >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo(ExceptionsTest.java:105) >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:100) >> [0.042s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar" at BCI: 10 >> Exception 1 caught. >> >> >> - Note that the old `[exceptions]` log is triggered by `InterpreterRuntime::exception_handler_for_exception()` and prints one level of the stack while the interpreter is looking for a handler. However, once a handler is found (inside `bar()`), the `[exceptions]` log terminates, and we do not know about the `foo()` or `main()` methods. >> >> - The `[exceptions,stacktrace]` log is printed only when an exception is thrown: >> - By an `_athrow` bytecode. 
See comments around `Exceptions::log_exception_stacktrace(Handle exception, methodHandle method, int bci)` >> - By native code in Exceptions::special_exception() and Exceptions::_throw(). >> >> **Concurrent Exceptions** >> >> Neither the old `[exceptions]` nor the new `[exceptions,stacktrace]` log distinguishes between multiple exceptions that are thrown and handled concurrently. The users should use something like `-Xlog:exceptions,exceptions+stack::tags,tid` to include the thread ID in the log output, and use an external tool (such as a script) to demultiplex the logs. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @coleenp comments; Also, use ProcessTools.executeProcess() to log the output in files Please remove the big comment about code we could implement here. ------------- Changes requested by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25522#pullrequestreview-2949904274 From coleenp at openjdk.org Mon Jun 23 12:29:35 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 23 Jun 2025 12:29:35 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v10] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Mon, 23 Jun 2025 06:06:15 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> @coleenp comments; Also, use ProcessTools.executeProcess() to log the output in files > > src/hotspot/share/utilities/exceptions.cpp line 637: > >> 635: if (!method->is_native() && (Bytecodes::Code) *method->bcp_from(bci) == Bytecodes::_athrow) { >> 636: // TODO: it would be nice to filter out exceptions re-thrown by finally blocks (which include >> 637: // try-with-resource statements): > > This mega-comment really doesn't belong here.
Please ensure this discussion is in JBS and just use a short comment here e.g. > > // TODO: try to find a way to avoid repeated stacktraces when an exception gets re-thrown by a finally block Thank you - I made this comment to Ioi privately and would really like to see it removed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25522#discussion_r2161501057 From tschatzl at openjdk.org Mon Jun 23 12:49:30 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 23 Jun 2025 12:49:30 GMT Subject: RFR: 8360220: Deprecate and obsolete ParallelRefProcBalancingEnabled In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 09:14:43 GMT, Albert Mingkun Yang wrote: > Deprecating `ParallelRefProcBalancingEnabled`, which is used only by Parallel and G1, and both have it enabled by default. > > Disabling it offers little benefit, so removing it to reduce the number of command-line flags. Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25932#pullrequestreview-2949965234 From eosterlund at openjdk.org Mon Jun 23 12:49:41 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 23 Jun 2025 12:49:41 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v8] In-Reply-To: <2gdPUDg-i40xRoO8LZPWszL2-oa6s7GNZEDRfHfDk8s=.4dbfe74a-070b-46c1-b73d-0292824d02e9@github.com> References: <2gdPUDg-i40xRoO8LZPWszL2-oa6s7GNZEDRfHfDk8s=.4dbfe74a-070b-46c1-b73d-0292824d02e9@github.com> Message-ID: On Tue, 17 Jun 2025 20:59:46 GMT, Dean Long wrote: >> This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. >> >> We used to patch the verified entry point to make sure it was not_entrant.
The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. >> >> The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. >> >> For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. >> >> This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > 2nd try at arm fix Looks good. ------------- Marked as reviewed by eosterlund (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25764#pullrequestreview-2949966322 From eosterlund at openjdk.org Mon Jun 23 12:49:41 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 23 Jun 2025 12:49:41 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: On Thu, 19 Jun 2025 00:20:05 GMT, Dean Long wrote: >> Well, yeah sort of. And hence the comment that it's probably fine in terms of correctness. They were also a bit more independent systems then though. Just thought that if we now take the step to merge compiler and GC entry trap mechanisms into the nmethod entry barrier, that we could seemingly also make it a bit less slippery here and establish some sort of invariant that if we while holding the lock protecting the entry barrier find that the nmethod entry barrier is not entrant, for whatever reason, we should not enter it. Would make it easier to understand the code I suspect. What do you think? > > I think making it less slippery in one place but still leaving other races gives a false sense of security and makes the code harder to understand. Arming the barrier is not guaranteed to be visible until there is a safepoint. Note that AArch64 and RISCV only call increment_patching_epoch() when the guard value is set to the disarmed value, so there is no invalidation of the CPU pipeline or instruction buffer (cross modification fence) when arming. Okay. I would have preferred to not enter the nmethod when we evaluate the guard bits under the lock that protects it and see that it's supposed to be not entrant. But I won't argue for it further if you prefer not to change that. Other than that, I think this looks good.
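For readers following the thread, the "sticky" bit from Dean's PR description (one reserved bit of the guard value that must stay set once the nmethod is not_entrant, with a lock making the update atomic) could be sketched roughly as follows. This is a standalone illustration, not the HotSpot code: `GuardWord` and `kNotEntrantBit` are hypothetical names, the bit position is assumed, and `std::mutex` merely stands in for `NMethodEntryBarrier_lock`.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Standalone illustration only. GuardWord and kNotEntrantBit are hypothetical
// names; std::mutex stands in for NMethodEntryBarrier_lock.
class GuardWord {
 public:
  // Assumed bit position; the real layout is platform-specific.
  static constexpr uint32_t kNotEntrantBit = 1u << 31;

  // GC-style update of the guard value. The sticky bit survives the update,
  // which is why the read-modify-write has to happen under a lock.
  void set_guard_value(uint32_t value) {
    assert((value & kNotEntrantBit) == 0 && "callers must not use the reserved bit");
    std::lock_guard<std::mutex> guard(_lock);
    _value = value | (_value & kNotEntrantBit);
  }

  // not_entrant is a final state: once the bit is set it is never cleared.
  void make_not_entrant() {
    std::lock_guard<std::mutex> guard(_lock);
    _value |= kNotEntrantBit;
  }

  // The barrier is armed whenever the stored value differs from the disarmed
  // value, so a set sticky bit keeps the barrier permanently armed.
  bool is_armed(uint32_t disarmed_value) const {
    std::lock_guard<std::mutex> guard(_lock);
    return _value != disarmed_value;
  }

  bool is_not_entrant() const {
    std::lock_guard<std::mutex> guard(_lock);
    return (_value & kNotEntrantBit) != 0;
  }

 private:
  mutable std::mutex _lock;
  uint32_t _value = 0;
};
```

In this sketch, disarming after `make_not_entrant()` has no visible effect on `is_armed()`, which mirrors the "permanently armed" idea; the alternative mentioned in the description, a compare-and-swap in platform-specific code, is not shown.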
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2161540372 From kbarrett at openjdk.org Mon Jun 23 13:01:18 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 23 Jun 2025 13:01:18 GMT Subject: RFR: 8360281: VMError::error_string has incorrect format usage Message-ID: Please review this trivial change to VMError::error_string, applying the `p2i` helper function to a pointer being used as the value for a PTR_FORMAT directive. Testing: mach5 tier1 Locally tested with gcc printf warnings enabled for jio_snprintf, and verified the warning for the changed call is no longer present. ------------- Commit messages: - fix printing in VMError::error_string Changes: https://git.openjdk.org/jdk/pull/25935/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25935&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360281 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25935.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25935/head:pull/25935 PR: https://git.openjdk.org/jdk/pull/25935 From duke at openjdk.org Mon Jun 23 14:45:44 2025 From: duke at openjdk.org (Anton Artemov) Date: Mon, 23 Jun 2025 14:45:44 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag Message-ID: This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). Lightweight locking is the default locking from now on. Tested in tiers 1 - 7. 
------------- Commit messages: - Merge remote-tracking branch 'origin/master' into 8359437_locking_mode_obsoletion - Merge remote-tracking branch 'origin/master' into 8359437_locking_mode_obsoletion - 8359437: Addressed reviewers' comments - Merge remote-tracking branch 'origin/master' into 8359437_locking_mode_obsoletion - Merge remote-tracking branch 'origin/master' into 8359437_locking_mode_obsoletion - Merge branch 'master' into _remove_locking_mode_fix_tests - Update after pre-review - First try. Changes: https://git.openjdk.org/jdk/pull/25847/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25847&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359437 Stats: 1122 lines in 33 files changed: 40 ins; 992 del; 90 mod Patch: https://git.openjdk.org/jdk/pull/25847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25847/head:pull/25847 PR: https://git.openjdk.org/jdk/pull/25847 From dholmes at openjdk.org Mon Jun 23 14:45:46 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 23 Jun 2025 14:45:46 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 08:39:49 GMT, Anton Artemov wrote: > This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. > > The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. > > In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). > > Lightweight locking is the default locking from now on. > > Tested in tiers 1 - 7. I've taken an initial pass through. Initially I misunderstood the strategy with heavy monitors - see comments below. 
src/hotspot/share/runtime/arguments.cpp line 1839: > 1837: #ifndef _LP64 > 1838: if (LockingMode == LM_LEGACY) { > 1839: LockingMode = LM_LIGHTWEIGHT; If we have prevented the locking mode from being set then surely we can never encounter this case? src/hotspot/share/utilities/globalDefinitions.cpp line 59: > 57: uint64_t OopEncodingHeapMax = 0; > 58: > 59: int LockingMode = LM_LIGHTWEIGHT; const ? src/hotspot/share/utilities/globalDefinitions.hpp line 1012: > 1010: }; > 1011: > 1012: extern int LockingMode; const ? test/hotspot/jtreg/runtime/Monitor/StressWrapper_TestRecursiveLocking_36M.java line 36: > 34: * -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI > 35: * -Xint > 36: * -XX:LockingMode=0 I was wondering why these LockingMode=0 test cases were not setting `VerifyHeavyMonitors` instead, but I'm assuming the intent now is that we will only test that mode when it is set externally by the user (or in our case a particular test task definition)? I also realized we can only test heavy monitors in tests where we explicitly control the monitor creation places and hence can call the WB method to force inflation. That obviously reduces the test coverage for that mode quite significantly - but perhaps that will be handled if in the future we implicitly reenable forced inflation and do away with the WB usage. test/hotspot/jtreg/runtime/vthread/JNIMonitor/JNIMonitor.java line 1: > 1: /* This seems to remove significant test coverage. can we not adapt the tests to not rely on logging warnings that will no longer be present? 
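On the two `const ?` remarks above, a minimal sketch of what a constant locking mode buys, assuming nothing about how the patch finally declares the variable: the enumerator names `LM_LEGACY` and `LM_LIGHTWEIGHT` come from the thread, while `LM_MONITOR`, the numeric values, the type name `LockingModeValue` and the helper `uses_lightweight_locking` are assumptions made for illustration.

```cpp
#include <cassert>

// Assumed values: LockingMode=0 is the heavy-monitor mode used by the tests
// quoted above; the values for LM_LEGACY and LM_LIGHTWEIGHT are assumed.
enum LockingModeValue : int { LM_MONITOR = 0, LM_LEGACY = 1, LM_LIGHTWEIGHT = 2 };

// With a constant, any leftover `LockingMode = ...` assignment becomes a
// compile error instead of a silent override at startup.
constexpr int LockingMode = LM_LIGHTWEIGHT;

// Branches guarded by `LockingMode == LM_LEGACY` can now never be taken.
static_assert(LockingMode != LM_LEGACY, "legacy locking paths are dead code");

// Hypothetical helper showing a check the compiler can fold to `true`.
inline bool uses_lightweight_locking() {
  return LockingMode == LM_LIGHTWEIGHT;
}
```

A declaration split across a header and a source file, as in globalDefinitions.hpp and globalDefinitions.cpp, would need `extern const int` rather than `constexpr`; the single-file form here just keeps the sketch self-contained.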
------------- PR Review: https://git.openjdk.org/jdk/pull/25847#pullrequestreview-2938076106 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2153873165 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2153924585 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2153924946 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2153884907 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2153911036 From coleenp at openjdk.org Mon Jun 23 14:45:47 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 23 Jun 2025 14:45:47 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 08:39:49 GMT, Anton Artemov wrote: > This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. > > The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. > > In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). > > Lightweight locking is the default locking from now on. > > Tested in tiers 1 - 7. I have a request for some more deleting. This looks great. Thank you! test/hotspot/jtreg/runtime/Monitor/TestRecursiveLocking.java line 125: > 123: public class TestRecursiveLocking { > 124: static final WhiteBox WB = WhiteBox.getWhiteBox(); > 125: static final boolean flagHeavyMonitors = WB.getBooleanVMFlag("VerifyHeavyMonitors"); I think you should take out the VerifyHeavyMonitors cases. @fbredber originally had that flag turn on the reintroduced UseHeavyMonitors option, but the UseHeavyMonitors option doesn't actually do that with this change. I don't think this test will pass with -XX:+VerifyHeavyMonitors.
If we reintroduce UseHeavyMonitors, save this diff and fix this test then. Right now it's not correct. test/hotspot/jtreg/runtime/lockStack/TestLockStackCapacity.java line 42: > 40: public class TestLockStackCapacity { > 41: static final WhiteBox WB = WhiteBox.getWhiteBox(); > 42: static final boolean flagHeavyMonitors = WB.getBooleanVMFlag("VerifyHeavyMonitors"); I think this should also not have cases for VerifyHeavyMonitors. We can add back tests if we want UseHeavyMonitors. As of now, removing the Legacy locking code will remove code that reaches the VerifyHeavyMonitors branches. test/jtreg-ext/requires/VMProps.java line 424: > 422: * Note: Lightweight locking does not support RTM (for now). > 423: */ > 424: protected String vmRTMCompiler() { There's an issue to remove this function since it's now unused. ------------- Changes requested by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25847#pullrequestreview-2940262141 Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25847#pullrequestreview-2949835857 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2155234150 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2155246651 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2155256340 From duke at openjdk.org Mon Jun 23 14:45:47 2025 From: duke at openjdk.org (Anton Artemov) Date: Mon, 23 Jun 2025 14:45:47 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 07:37:31 GMT, David Holmes wrote: >> This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. >> >> The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. >> >> In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. 
All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). >> >> Lightweight locking is the default locking from now on. >> >> Tested in tiers 1 - 7. > > src/hotspot/share/runtime/arguments.cpp line 1839: > >> 1837: #ifndef _LP64 >> 1838: if (LockingMode == LM_LEGACY) { >> 1839: LockingMode = LM_LIGHTWEIGHT; > > If we have prevented the locking mode from being set then surely we can never encounter this case? Looks like yes, this whole check then can be removed. Addressed in the latest commit. > src/hotspot/share/utilities/globalDefinitions.cpp line 59: > >> 57: uint64_t OopEncodingHeapMax = 0; >> 58: >> 59: int LockingMode = LM_LIGHTWEIGHT; > > const ? This can be done provided that one removes assignment on line 3763 in arguments.cpp. That assignment looks redundant as LockingMode is always LM_LIGHTWEIGHT from now on. > src/hotspot/share/utilities/globalDefinitions.hpp line 1012: > >> 1010: }; >> 1011: >> 1012: extern int LockingMode; > > const ? This can be done provided that one removes assignment on line 3763 in arguments.cpp. That assignment looks redundant as LockingMode is always LM_LIGHTWEIGHT from now on. > test/hotspot/jtreg/runtime/Monitor/StressWrapper_TestRecursiveLocking_36M.java line 36: > >> 34: * -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI >> 35: * -Xint >> 36: * -XX:LockingMode=0 > > I was wondering why these LockingMode=0 test cases were not setting `VerifyHeavyMonitors` instead, but I'm assuming the intent now is that we will only test that mode when it is set externally by the user (or in our case a particular test task definition)? > > I also realized we can only test heavy monitors in tests where we explicitly control the monitor creation places and hence can call the WB method to force inflation. 
That obviously reduces the test coverage for that mode quite significantly - but perhaps that will be handled if in the future we implicitly reenable forced inflation and do away with the WB usage. My understanding is that VerifyHeavyMonitors requires LockingMode = 0, see line 1852 of arguments.cpp. So one has to set both at the same time, not one instead of another. Now locking mode is hardcoded to lightweight, and there is no way to use the incompatible `VerifyHeavyMonitors` option. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2161284370 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2161120254 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2161119703 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2161284143 From duke at openjdk.org Mon Jun 23 14:45:47 2025 From: duke at openjdk.org (Anton Artemov) Date: Mon, 23 Jun 2025 14:45:47 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 18:23:58 GMT, Coleen Phillimore wrote: >> This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. >> >> The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. >> >> In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). >> >> Lightweight locking is the default locking from now on. >> >> Tested in tiers 1 - 7. > > test/hotspot/jtreg/runtime/Monitor/TestRecursiveLocking.java line 125: > >> 123: public class TestRecursiveLocking { >> 124: static final WhiteBox WB = WhiteBox.getWhiteBox(); >> 125: static final boolean flagHeavyMonitors = WB.getBooleanVMFlag("VerifyHeavyMonitors"); > > I think you should take out the VerifyHeavyMonitors cases. 
@fbredber originally had that flag turn on the reintroduced UseHeavyMonitors option, but the UseHeavyMonitors option doesn't actually do that with this change. I don't think this test will pass with -XX:+VerifyHeavyMonitors. > If we reintroduce UseHeavyMonitors, save this diff and fix this test then. Right now it's not correct. Removed in the latest commit. > test/hotspot/jtreg/runtime/lockStack/TestLockStackCapacity.java line 42: > >> 40: public class TestLockStackCapacity { >> 41: static final WhiteBox WB = WhiteBox.getWhiteBox(); >> 42: static final boolean flagHeavyMonitors = WB.getBooleanVMFlag("VerifyHeavyMonitors"); > > I think this should also not have cases for VerifyHeavyMonitors. We can add back tests if we want UseHeavyMonitors. As of now, removing the Legacy locking code will remove code that reaches the VerifyHeavyMonitors branches. Removed in the latest commit. > test/jtreg-ext/requires/VMProps.java line 424: > >> 422: * Note: Lightweight locking does not support RTM (for now). >> 423: */ >> 424: protected String vmRTMCompiler() { > > There's an issue to remove this function since it's now unused. Removed in the latest commit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2161282585 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2161282345 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2161282101 From coleenp at openjdk.org Mon Jun 23 14:45:47 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 23 Jun 2025 14:45:47 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 07:52:02 GMT, David Holmes wrote: >> This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. >> >> The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation.
>> >> In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). >> >> Lightweight locking is the default locking from now on. >> >> Tested in tiers 1 - 7. > > test/hotspot/jtreg/runtime/vthread/JNIMonitor/JNIMonitor.java line 1: > >> 1: /* > > This seems to remove significant test coverage. can we not adapt the tests to not rely on logging warnings that will no longer be present? The premise of this test is now invalid. We could write a fresh new test if we'd like to see what happens with UseHeavyMonitors, and/or retrieve this from git history. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2155250065 From coleenp at openjdk.org Mon Jun 23 15:39:27 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 23 Jun 2025 15:39:27 GMT Subject: RFR: 8360281: VMError::error_string has incorrect format usage In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 12:55:24 GMT, Kim Barrett wrote: > Please review this trivial change to VMError::error_string, applying the `p2i` > helper function to a pointer being used as the value for a PTR_FORMAT directive. > > Testing: mach5 tier1 > Locally tested with gcc printf warnings enabled for jio_snprintf, and verified > the warning for the changed call is no longer present. Ok, looks trivial. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25935#pullrequestreview-2950586111 From alanb at openjdk.org Mon Jun 23 15:51:34 2025 From: alanb at openjdk.org (Alan Bateman) Date: Mon, 23 Jun 2025 15:51:34 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 08:39:49 GMT, Anton Artemov wrote: > This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. 
> > The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. > > In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). > > Lightweight locking is the default locking from now on. > > Tested in tiers 1 - 7. test/jdk/java/lang/Thread/virtual/Parking.java line 388: > 386: @ParameterizedTest > 387: @ValueSource(booleans = { true, false }) > 388: @DisabledIf("LockingMode#isLegacy") Would you mind checking if the import DisabledIf can be removed from these tests? I think we only used it to conditionally run when not in legacy mode. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2161946231 From lmesnik at openjdk.org Mon Jun 23 17:09:32 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Mon, 23 Jun 2025 17:09:32 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 08:39:49 GMT, Anton Artemov wrote: > This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. > > The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. > > In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). > > Lightweight locking is the default locking from now on. > > Tested in tiers 1 - 7. Marked as reviewed by lmesnik (Reviewer). Sorry, I mistakenly approved the fix instead of requesting changes. Please find my change requests in previous comments.
test/hotspot/jtreg/gtest/LockStackGtests.java line 26: > 24: > 25: /* @test > 26: * @summary Run LockStack gtests with LockingMode=2 All gtests are executed with default VM flags in GTestWrapper.java so this whole test should just be removed. test/hotspot/jtreg/runtime/Monitor/ConcurrentDeflation.java line 37: > 35: * @bug 8318757 > 36: * @summary Test concurrent monitor deflation by MonitorDeflationThread and thread dumping > 37: * @library /test/lib / The '/' shouldn't be required for whitebox. Could you please remove it? ------------- PR Review: https://git.openjdk.org/jdk/pull/25847#pullrequestreview-2950786910 Changes requested by lmesnik (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25847#pullrequestreview-2950827379 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2162077784 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2162055605 From stefank at openjdk.org Mon Jun 23 18:06:28 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 23 Jun 2025 18:06:28 GMT Subject: RFR: 8360023: Add an insertion sort implementation to Hotspot [v4] In-Reply-To: References: <7-Pyowek_M5CSJRJQV73o6SJ8GoHKVuDz0pFjqxVjAg=.6ba9b804-39aa-4379-b689-f55fda4dd17e@github.com> <9rcA-0uAuVwpK8WPTXdFCmcBZhDTNB-KtTa0NecLKZk=.369e3dcc-6f80-400e-887c-837e78a8a19f@github.com> <2-6lbSM0y22WVEiOqLJ31lu8LkA-Ik1O4nr6eb1vpoo=.d87f5b56-7113-42a7-962a-94eb3c2ac1c7@github.com> Message-ID: On Fri, 20 Jun 2025 10:58:23 GMT, Quan Anh Mai wrote: >> Sort the "#include" lines alphabetically. > > I assume you want to have `unittest.hpp` above the `utilities` files. I have done that. I was confused because the convention in this area is pretty blurry as many files have the `unittest.hpp` as their last include. The includes in our gtests are a bit of a mess because there's a requirement to put unittest.hpp last, but helper files tend to also include unittest.hpp, breaking that rule.
I would suggest that you sort the normal hotspot includes and put the test includes (none here) last. You also need to fix: to be "utilities/globalDefinitions.hpp" for all HotSpot header files. In case you haven't seen it, here's the style guide for HotSpot includes: https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.md#source-files and here's a tool you can run on your non-gtest HotSpot files: https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/sources/SortIncludes.java ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25895#discussion_r2162184028 From kbarrett at openjdk.org Mon Jun 23 18:22:31 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 23 Jun 2025 18:22:31 GMT Subject: RFR: 8360281: VMError::error_string has incorrect format usage In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 15:36:25 GMT, Coleen Phillimore wrote: >> Please review this trivial change to VMError::error_string, applying the `p2i` >> helper function to a pointer being used as the value for a PTR_FORMAT directive. >> >> Testing: mach5 tier1 >> Locally tested with gcc printf warnings enabled for jio_snprintf, and verified >> the warning for the changed call is no longer present. > > Ok, looks trivial. Thanks @coleenp ------------- PR Comment: https://git.openjdk.org/jdk/pull/25935#issuecomment-2997454580 From stefank at openjdk.org Mon Jun 23 18:22:32 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 23 Jun 2025 18:22:32 GMT Subject: RFR: 8360281: VMError::error_string has incorrect format usage In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 12:55:24 GMT, Kim Barrett wrote: > Please review this trivial change to VMError::error_string, applying the `p2i` > helper function to a pointer being used as the value for a PTR_FORMAT directive. > > Testing: mach5 tier1 > Locally tested with gcc printf warnings enabled for jio_snprintf, and verified > the warning for the changed call is no longer present. 
But why didn't the compiler complain about this? If we send this format string and args through UL we do get an error: log_info(gc)("%s (0x%x) at pc=" PTR_FORMAT ", pid=%d, tid=%zu", signame, _id, _pc, os::current_process_id(), os::current_thread_id()); /Users/stefank/git/jdk/open/src/hotspot/share/utilities/vmError.cpp:268:32: error: format specifies type 'unsigned long' but the argument has type 'address' (aka 'unsigned char *') [-Werror,-Wformat] signame, _id, _pc, ------------- PR Comment: https://git.openjdk.org/jdk/pull/25935#issuecomment-2997462691 From kbarrett at openjdk.org Mon Jun 23 18:22:33 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 23 Jun 2025 18:22:33 GMT Subject: Integrated: 8360281: VMError::error_string has incorrect format usage In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 12:55:24 GMT, Kim Barrett wrote: > Please review this trivial change to VMError::error_string, applying the `p2i` > helper function to a pointer being used as the value for a PTR_FORMAT directive. > > Testing: mach5 tier1 > Locally tested with gcc printf warnings enabled for jio_snprintf, and verified > the warning for the changed call is no longer present. This pull request has now been integrated. Changeset: 6df0f5e3 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/6df0f5e390ecf874c1eca7284c51efa65ce23737 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8360281: VMError::error_string has incorrect format usage Reviewed-by: coleenp ------------- PR: https://git.openjdk.org/jdk/pull/25935 From kbarrett at openjdk.org Mon Jun 23 18:26:34 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 23 Jun 2025 18:26:34 GMT Subject: RFR: 8360281: VMError::error_string has incorrect format usage In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 18:19:32 GMT, Stefan Karlsson wrote: > But why didn't the compiler complain about this? 
If we send this format string and args through UL we do get an error: > > ``` > log_info(gc)("%s (0x%x) at pc=" PTR_FORMAT ", pid=%d, tid=%zu", > signame, _id, _pc, > os::current_process_id(), os::current_thread_id()); > ``` > > ``` > /Users/stefank/git/jdk/open/src/hotspot/share/utilities/vmError.cpp:268:32: error: format specifies type 'unsigned long' but the argument has type 'address' (aka 'unsigned char *') [-Werror,-Wformat] > signame, _id, _pc, > ``` Because of https://bugs.openjdk.org/browse/JDK-8198918 jio_snprintf and friends are not checked by -Wformat ------------- PR Comment: https://git.openjdk.org/jdk/pull/25935#issuecomment-2997488351 From stefank at openjdk.org Mon Jun 23 18:30:31 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 23 Jun 2025 18:30:31 GMT Subject: RFR: 8360281: VMError::error_string has incorrect format usage In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 12:55:24 GMT, Kim Barrett wrote: > Please review this trivial change to VMError::error_string, applying the `p2i` > helper function to a pointer being used as the value for a PTR_FORMAT directive. > > Testing: mach5 tier1 > Locally tested with gcc printf warnings enabled for jio_snprintf, and verified > the warning for the changed call is no longer present. Ouch. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25935#issuecomment-2997516062 From dlong at openjdk.org Mon Jun 23 19:12:29 2025 From: dlong at openjdk.org (Dean Long) Date: Mon, 23 Jun 2025 19:12:29 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 12:47:07 GMT, Erik Österlund wrote: >> I think making it less slippery in one place but still leaving other races gives a false sense of security and makes the code harder to understand. Arming the barrier is not guaranteed to be visible until there is a safepoint.
Note that AArch64 and RISCV only call increment_patching_epoch() when the guard value is set to the disarmed value, so there is no invalidation of the CPU pipeline or instruction buffer (cross modification fence) when arming. > > Okay. I would have preferred to not enter the nmethod when we evaluate the guard bits under the lock that protects it and see that it's supposed to be not entrant. But I won't argue for it further if you prefer not to change that. Other than that, I think this looks good. I think it's OK if there is a race to have a point of no return, and if one thread gets there first, it wins, and we don't need to check again. It's tempting to want to do an extra check when we disarm under the lock, but then it would need a comment explaining why we do it, even though the make_not_entrant could come in right after and we would miss it. And we have already done the work of healing the oops by this point. Finally, I like the encapsulation that only nmethod_stub_entry_barrier needs to know about not_entrant, and nmethod_entry_barrier doesn't need to know. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2162359865 From dlong at openjdk.org Mon Jun 23 19:18:31 2025 From: dlong at openjdk.org (Dean Long) Date: Mon, 23 Jun 2025 19:18:31 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v8] In-Reply-To: References: <2gdPUDg-i40xRoO8LZPWszL2-oa6s7GNZEDRfHfDk8s=.4dbfe74a-070b-46c1-b73d-0292824d02e9@github.com> Message-ID: On Fri, 20 Jun 2025 08:58:39 GMT, Martin Doerr wrote: >> Dean Long has updated the pull request incrementally with one additional commit since the last revision: >> >> 2nd try at arm fix > >> > Tests look good on our side. I'm only a bit concerned that the lock may become a bottleneck when many Java threads need to patch all nmethods. Especially with ZGC which does that more often. I think we should check performance. 
>> >> For ZGC I am using a per-nmethod lock: ZLocker locker(ZNMethod::lock_for_nmethod(nm)); > > Ah, right. So, ZGC should be fine. > >> I don't know what benchmarks to run to check the performance for functions like Deoptimization::deoptimize_all_marked, so I welcome any help with this. > > I have tried some SPEC benchmarks with G1 on PPC64, but couldn't observe a regression. (If there is one, it was below noise.) > >> One possible optimization that might help is skipping the lock if the make_not_entrant call is done during a safepoint. > > I guess the most critical scenario is when many Java threads need to disarm a large number of nmethod entry barriers. That doesn't happen at a safepoint. Not sure if other scenarios are worth optimizing by this idea. > > I guess this PR is ok as it is. Maybe other reviewers have more comments. @TheRealMDoerr @fisk @offamitkumar Thanks again everyone for the reviews and contributions. I think this is ready to integrate. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-2997672327 From dlong at openjdk.org Mon Jun 23 19:26:11 2025 From: dlong at openjdk.org (Dean Long) Date: Mon, 23 Jun 2025 19:26:11 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v9] In-Reply-To: References: Message-ID: > This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. > > We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. > > The solution I went with reserves one bit of the entry barrier guard value. 
This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. > > For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. > > This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. Dean Long has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: - Merge branch 'master' into 8358821-patch-verified-entry - 2nd try at arm fix - rename arm_with to guard_with - arm32 fix - s390 fix courtesy of Amit Kumar - remove is_sigill_not_entrant - more cleanup - more TheRealMDoerr suggestions - TheRealMDoerr suggestions - remove trailing space - ... 
and 6 more: https://git.openjdk.org/jdk/compare/6df0f5e3...a39c458c ------------- Changes: https://git.openjdk.org/jdk/pull/25764/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25764&range=08 Stats: 603 lines in 43 files changed: 97 ins; 459 del; 47 mod Patch: https://git.openjdk.org/jdk/pull/25764.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25764/head:pull/25764 PR: https://git.openjdk.org/jdk/pull/25764 From mdoerr at openjdk.org Mon Jun 23 19:26:11 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 23 Jun 2025 19:26:11 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v9] In-Reply-To: References: Message-ID: <3NeAP1nLxZGaY1zq8RfpkCy3L3KYXNyjN6Eg1vS1DRE=.e43394f9-33f6-48af-937c-48062b6b8125@github.com> On Mon, 23 Jun 2025 19:22:45 GMT, Dean Long wrote: >> This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. >> >> We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. >> >> The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. >> >> For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). 
I suspect we could do the same for Shenandoah, if needed for performance. >> >> This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. > > Dean Long has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: > > - Merge branch 'master' into 8358821-patch-verified-entry > - 2nd try at arm fix > - rename arm_with to guard_with > - arm32 fix > - s390 fix courtesy of Amit Kumar > - remove is_sigill_not_entrant > - more cleanup > - more TheRealMDoerr suggestions > - TheRealMDoerr suggestions > - remove trailing space > - ... and 6 more: https://git.openjdk.org/jdk/compare/6df0f5e3...a39c458c Marked as reviewed by mdoerr (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25764#pullrequestreview-2951288332 From dcubed at openjdk.org Tue Jun 24 00:19:29 2025 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Tue, 24 Jun 2025 00:19:29 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 20:41:21 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. 
>> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix the test Thumbs up. Reviewed the changes between v02 and v11 and the changes look good. ------------- Marked as reviewed by dcubed (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25267#pullrequestreview-2951838637 From dcubed at openjdk.org Tue Jun 24 00:29:29 2025 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Tue, 24 Jun 2025 00:29:29 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 20:41:21 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time.
>> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix the test > > I think this should prevent this race but I'd have to think about it some more. > > Aside from the fact the `checked` method is not used much, the problem is > that if the caller does not have something keeping the class alive, then the > resolution of the jMethodID can succeed and we will proceed with trying to > call the method. In the meantime the fact the class is unreferenced could be > noticed and the class then unloaded. Now that can only happen at safepoints, > so it then depends on the details of the code that tries to invoke the method > e.g. in jni.cpp Consider this code from above: > static void jni_invoke_static(JNIEnv *env, JavaValue* result, jobject receiver, JNICallType call_type, jmethodID method_id, JNI_ArgumentPusher *args, TRAPS) { > methodHandle method(THREAD, Method::resolve_jmethod_id(method_id)); Once `Method::resolve_jmethod_id(method_id)` returns to the caller, how can the class be unreferenced?
------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-2998367651 From dholmes at openjdk.org Tue Jun 24 02:38:31 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 24 Jun 2025 02:38:31 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 12:17:08 GMT, Coleen Phillimore wrote: > I'm not sure but I think in this instance, the native caller has the mirror for the class where it gets the jmethodID from, so can't unload the Method. > > ``` > JNI_ENTRY(ResultType, \ > jni_CallStatic##Result##Method(JNIEnv *env, jclass cls, jmethodID methodID, ...)) \ > \ > ``` @coleenp Take a look at the complete code: JNI_ENTRY(ResultType, \ jni_CallStatic##Result##Method(JNIEnv *env, jclass cls, jmethodID methodID, ...)) \ \ EntryProbe; \ ResultType ret{}; \ DT_RETURN_MARK_FOR(Result, CallStatic##Result##Method, ResultType, \ (const ResultType&)ret);\ \ va_list args; \ va_start(args, methodID); \ JavaValue jvalue(Tag); \ JNI_ArgumentPusherVaArg ap(methodID, args); \ jni_invoke_static(env, &jvalue, nullptr, JNI_STATIC, methodID, &ap, CHECK_(ResultType{})); \ va_end(args); \ ret = jvalue.get_##ResultType(); \ return ret;\ JNI_END The `cls` parameter is never actually used. So while it is supposed to refer to the class you have the static method jMethodID for, there is no requirement that it actually does, and it could even be null. > Once Method::resolve_jmethod_id(method_id) returns to the caller, how can the class be unreferenced? @dcubed-ojdk Because the caller does not hold a reference to it initially, and resolving the jMethodID does not create a new reference to it. So once we have resolved, at the next safepoint the class could be seen as "not alive" and at the safepoint after that it can be unloaded. Hence we are relying on there being no safepoint checks in the upcall code, after the resolution, to ensure this can't happen.
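
For readers following the 8268406 thread, the table-based lookup described in the quoted PR summary (an id maps to a Method* through a hash table, and a removed id resolves to nullptr) might be sketched roughly as below. This is only an illustration under stated assumptions: `Method`, `MethodId` and `MethodIdTable` are hypothetical stand-ins, and a mutex-protected `std::unordered_map` is swapped in for HotSpot's ConcurrentHashTable. It is not the code in the PR.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-ins; not the HotSpot Method or jmethodID types.
struct Method { const char* name; };
using MethodId = std::uint64_t;

// Maps ids to methods. A removed id resolves to nullptr instead of
// pointing at memory that was kept alive "just in case".
class MethodIdTable {
 public:
  // Register a method and hand out a fresh id for it.
  MethodId add(Method* m) {
    assert(m != nullptr);
    std::lock_guard<std::mutex> lock(_lock);
    MethodId id = _next_id++;
    _table.emplace(id, m);
    return id;
  }

  // Called when the owning class is unloaded (assumed call site).
  void remove(MethodId id) {
    std::lock_guard<std::mutex> lock(_lock);
    _table.erase(id);
  }

  // Returns nullptr for ids that were never issued or were removed.
  Method* resolve(MethodId id) const {
    std::lock_guard<std::mutex> lock(_lock);
    auto it = _table.find(id);
    return it == _table.end() ? nullptr : it->second;
  }

 private:
  mutable std::mutex _lock;
  std::unordered_map<MethodId, Method*> _table;
  MethodId _next_id = 1;
};
```

Note that a non-null result from `resolve` says nothing about how long the method stays valid afterwards. That is the concern raised above: in the sketch, as in the discussion, something else has to keep the class alive between resolving the id and using the result.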
------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-2998571489 From wenanjian at openjdk.org Tue Jun 24 02:49:29 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Tue, 24 Jun 2025 02:49:29 GMT Subject: RFR: 8360179: RISC-V: Only enable BigInteger intrinsics when AvoidUnalignedAccess == false [v2] In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 06:34:06 GMT, Feilong Jiang wrote: >> Anjian Wen has updated the pull request incrementally with one additional commit since the last revision: >> >> remove useless func multiply_32_x_32_loop which only used in AvoidUnalignedAccesses case! > > Looks good, thanks! @feilongjiang @RealFYang Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25923#issuecomment-2998585660 From duke at openjdk.org Tue Jun 24 02:49:30 2025 From: duke at openjdk.org (duke) Date: Tue, 24 Jun 2025 02:49:30 GMT Subject: RFR: 8360179: RISC-V: Only enable BigInteger intrinsics when AvoidUnalignedAccess == false [v2] In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 03:28:20 GMT, Anjian Wen wrote: >> This disables BigInteger.multiplyToLen(), BigInteger.squareToLen(), BigInteger.montgomeryMultiply() >> and BigInteger.montgomerySquare() on linux-riscv64 platforms where misaligned memory accesses is slow. >> The reason is that these four BigInteger intrinsics do 8-byte misaligned memory accesses to int arrays >> under -XX:+UseCompactObjectHeaders. And this will have a negative impact on SPECJvm2008 crypto tests. >> >> Testing: >> - [x] Tier1-3 tests. >> - [x] SPECJvm2008 crypto performance tests. >> [score.txt](https://github.com/user-attachments/files/20846531/score.txt) > > Anjian Wen has updated the pull request incrementally with one additional commit since the last revision: > > remove useless func multiply_32_x_32_loop which only used in AvoidUnalignedAccesses case! 
@Anjian-Wen Your change (at version 17970631aa26341d5cb0e15203134653f8544ce4) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25923#issuecomment-2998587813 From dholmes at openjdk.org Tue Jun 24 03:01:31 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 24 Jun 2025 03:01:31 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 10:39:59 GMT, Anton Artemov wrote: >> test/hotspot/jtreg/runtime/Monitor/StressWrapper_TestRecursiveLocking_36M.java line 36: >> >>> 34: * -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI >>> 35: * -Xint >>> 36: * -XX:LockingMode=0 >> >> I was wondering why these LockingMode=0 test cases were not setting `VerifyHeavyMonitors` instead, but I'm assuming the intent now is that we will only test that mode when it is set externally by the user (or in our case a particular test task definition)? >> >> I also realized we can only test heavy monitors in tests where we explicitly control the monitor creation places and hence can call the WB method to force inflation. That obviously reduces the test coverage for that mode quite significantly - but perhaps that will be handled if in the future we implicitly reenable forced inflation and do away with the WB usage. > > My understanding is that VerifyHeavyMonitors requires LockingMode = 0, see line 1852 of arguments.cpp. So one has to set both at the same time, not one instead of another. Now locking mode is hardcoded to lightweight, and there is no way to use the incompatible `VerifyHeavyMonitors` option. My understanding was that `VerifyHeavyMonitors` was to be used as a replacement for `LockingMode=0` aka `UseHeavyMonitors`. But as Coleen has requested all `VerifyHeavyMonitors` testing be removed this is now a moot point. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2162866307 From wenanjian at openjdk.org Tue Jun 24 03:11:34 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Tue, 24 Jun 2025 03:11:34 GMT Subject: Integrated: 8360179: RISC-V: Only enable BigInteger intrinsics when AvoidUnalignedAccess == false In-Reply-To: References: Message-ID: On Sat, 21 Jun 2025 12:23:57 GMT, Anjian Wen wrote: > This disables BigInteger.multiplyToLen(), BigInteger.squareToLen(), BigInteger.montgomeryMultiply() > and BigInteger.montgomerySquare() on linux-riscv64 platforms where misaligned memory accesses is slow. > The reason is that these four BigInteger intrinsics do 8-byte misaligned memory accesses to int arrays > under -XX:+UseCompactObjectHeaders. And this will have a negative impact on SPECJvm2008 crypto tests. > > Testing: > - [x] Tier1-3 tests. > - [x] SPECJvm2008 crypto performance tests. > [score.txt](https://github.com/user-attachments/files/20846531/score.txt) This pull request has now been integrated. Changeset: 34412da5 Author: Anjian Wen Committer: Feilong Jiang URL: https://git.openjdk.org/jdk/commit/34412da52b41e9374168e67e3b6129576c8e4402 Stats: 132 lines in 3 files changed: 21 ins; 102 del; 9 mod 8360179: RISC-V: Only enable BigInteger intrinsics when AvoidUnalignedAccess == false Reviewed-by: fjiang, fyang ------------- PR: https://git.openjdk.org/jdk/pull/25923 From dholmes at openjdk.org Tue Jun 24 03:12:32 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 24 Jun 2025 03:12:32 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag In-Reply-To: References: Message-ID: <8zcivFIhzh6QymXS119iTVwdzyufdXBaeOR4_dEjlig=.b1683573-13cb-4b40-baf3-8b609680e86f@github.com> On Tue, 17 Jun 2025 08:39:49 GMT, Anton Artemov wrote: > This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. 
> > The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. > > In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). > > Lightweight locking is the default locking from now on. > > Tested in tiers 1 - 7. Changes are looking okay to me, but we have an issue with bug management that needs to be resolved - and probably need a new bug and PR. test/jtreg-ext/requires/VMProps.java line 424: > 422: * Note: Lightweight locking does not support RTM (for now). > 423: */ > 424: protected String vmRTMCompiler() { [JDK-8358542](https://bugs.openjdk.org/browse/JDK-8358542) exists to remove this so you would need to add that bug id to this PR. However, it seems the bug management for this has gotten completely messed up so you may need to scrap this PR and file a new bug and PR for this part. ------------- PR Review: https://git.openjdk.org/jdk/pull/25847#pullrequestreview-2952042310 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2162873188 From fyang at openjdk.org Tue Jun 24 03:24:38 2025 From: fyang at openjdk.org (Fei Yang) Date: Tue, 24 Jun 2025 03:24:38 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v9] In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 19:26:11 GMT, Dean Long wrote: >> This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. >> >> We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. 
The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. >> >> The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. >> >> For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. >> >> This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. > > Dean Long has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: > > - Merge branch 'master' into 8358821-patch-verified-entry > - 2nd try at arm fix > - rename arm_with to guard_with > - arm32 fix > - s390 fix courtesy of Amit Kumar > - remove is_sigill_not_entrant > - more cleanup > - more TheRealMDoerr suggestions > - TheRealMDoerr suggestions > - remove trailing space > - ... and 6 more: https://git.openjdk.org/jdk/compare/6df0f5e3...a39c458c Just FYI: My local tier1-3 test on linux-riscv64 is good. And I didn't witness an obvious change on specjbb performance with g1gc. 
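For readers following the quoted PR description, the "sticky" bit idea can be sketched roughly as below. This is only an illustration in Java, not HotSpot code: the class `StickyGuardSketch`, the bit position `STICKY_BIT`, and all method names are assumptions made up for this example. It shows why a plain store is no longer sufficient once one bit of the guard value must survive later updates, and how a lock makes the read-modify-write atomic.

```java
import java.util.concurrent.locks.ReentrantLock;

class StickyGuardSketch {
    // Hypothetical bit reserved to mean "not_entrant"; once set it must stay set.
    static final int STICKY_BIT = 1 << 31;

    // Stand-in for a lock whose only job is to make guard updates atomic.
    private final ReentrantLock lock = new ReentrantLock();
    private int guard;

    // Update the non-sticky part of the guard while preserving the sticky bit.
    void setGuardValue(int value) {
        lock.lock();
        try {
            guard = (value & ~STICKY_BIT) | (guard & STICKY_BIT);
        } finally {
            lock.unlock();
        }
    }

    // Permanently arm the barrier: the sticky bit is never cleared afterwards.
    void makeNotEntrant() {
        lock.lock();
        try {
            guard |= STICKY_BIT;
        } finally {
            lock.unlock();
        }
    }

    boolean isNotEntrant() {
        lock.lock();
        try {
            return (guard & STICKY_BIT) != 0;
        } finally {
            lock.unlock();
        }
    }

    // The guard value as seen by the inner barrier, with the sticky bit masked out.
    int guardValue() {
        lock.lock();
        try {
            return guard & ~STICKY_BIT;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        StickyGuardSketch g = new StickyGuardSketch();
        g.setGuardValue(5);
        g.makeNotEntrant();
        g.setGuardValue(7); // later re-arming must not clear the sticky bit
        System.out.println(g.isNotEntrant() + " " + g.guardValue());
    }
}
```

A compare-and-swap loop over the guard word would be the lock-free alternative that the description mentions for the platform-specific code.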
------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-2998661809 From iklam at openjdk.org Tue Jun 24 04:50:02 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 24 Jun 2025 04:50:02 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v11] In-Reply-To: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: > This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: > > Excerpt from the test case ExceptionsTest.java. > > > [0.042s][info][exceptions] Exception > [ ] thrown in interpreter method <{method} {0x000079e52c4005b0} 'bar' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 9 for thread 0x000079e58c02c7b0 (main) > [0.042s][info][exceptions,stacktrace] Exception > [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar(ExceptionsTest.java:110) > [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo(ExceptionsTest.java:105) > [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:100) > [0.042s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar" at BCI: 10 > Exception 1 caught. > > > - Note that the old `[exceptions]` log is triggered by `InterpreterRuntime::exception_handler_for_exception()` and prints one level of the stack while the interpreter is looking for a handler. However, once a handler is found (inside `bar()`), the `[exceptions]` log terminates, and we do not know about the `foo()` or `main()` methods. > > - The `[exceptions,stacktrace]` log is printed only when an exception is thrown: > - By an `_athrow` bytecode. 
See comments around `Exceptions::log_exception_stacktrace(Handle exception, methodHandle method, int bci)` > - By native code in Exceptions::special_exception() and and Exceptions::_throw()). > > **Concurrent Exceptions** > > Neither the old `[exceptions]` or the new `[exceptions,stacktrace]` logs distinguish between multiple exceptions that are thrown and handled concurrently. The users should use something like `-Xlog:exceptions,exceptions+stack::tags,tid` to include the thread ID into the log output, and use an external tool (such as a script) to demultiplex the logs. Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @coleenp and @dholmes-ora comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25522/files - new: https://git.openjdk.org/jdk/pull/25522/files/cd41e2ab..3055ddbb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=09-10 Stats: 44 lines in 1 file changed: 0 ins; 34 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/25522.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25522/head:pull/25522 PR: https://git.openjdk.org/jdk/pull/25522 From iklam at openjdk.org Tue Jun 24 04:50:02 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 24 Jun 2025 04:50:02 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v10] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Mon, 23 Jun 2025 12:26:13 GMT, Coleen Phillimore wrote: >> src/hotspot/share/utilities/exceptions.cpp line 637: >> >>> 635: if (!method->is_native() && (Bytecodes::Code) *method->bcp_from(bci) == Bytecodes::_athrow) { >>> 636: // TODO: it would be nice to filter out exceptions re-thrown by finally blocks (which include >>> 637: // try-with-resource statements): >> >> This mega-comment really doesn't belong here. 
Please ensure this discussion is in JBS and just use a short comment here e.g. >> >> // TODO: try to find a way to avoid repeated stacktraces when an exception gets re-thrown by a finally block > > Thank you - I made this comment to Ioi privately and would really like to see it removed. I removed the long comment and used David's suggestion instead. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25522#discussion_r2162950819 From lliu at openjdk.org Tue Jun 24 06:03:32 2025 From: lliu at openjdk.org (Liming Liu) Date: Tue, 24 Jun 2025 06:03:32 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32(C) on Ampere CPU and improve for short inputs [v4] In-Reply-To: References: <32uuLeizjdx7p5TeOzMvoyj0Smmra-DV4qhPZy7z-bE=.78485dd1-cc6f-4fbf-88a3-a4f78c164b0c@github.com> Message-ID: <5_ZwIhp41TM72YbqLgC452lYegSA4bwlc9zwIcyyN38=.1ffa2f57-86b2-4f42-ab01-70f2c6e33fb7@github.com> On Mon, 23 Jun 2025 05:54:32 GMT, Emanuel Peter wrote: >> src/hotspot/cpu/aarch64/globals_aarch64.hpp line 92: >> >>> 90: product(bool, UseCryptoPmullForCRC32, false, \ >>> 91: "Use Crypto PMULL instructions for CRC32 computation") \ >>> 92: product(uint, CryptoPmullForCRC32LowLimit, 256, DIAGNOSTIC, \ >> >> Can you please add a test that uses this flag, and sets it to some selected values, and maybe even a random value? > > Is there already an IR test that checks for the presence of the crypto pmull? That could be good to ensure it occurs as expected and only when expected :) There are test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java and TestCRC32C.java. They cover various input lengths and test the intrinsics with the default value of the flag. They do not cover different values of the flag, which I think could be covered by VM_OPTIONS. I feel that it is not suitable to add the flag in the @run tag, since it is aarch64-specific while the test is generic.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2163029715 From lliu at openjdk.org Tue Jun 24 06:08:31 2025 From: lliu at openjdk.org (Liming Liu) Date: Tue, 24 Jun 2025 06:08:31 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32(C) on Ampere CPU and improve for short inputs [v4] In-Reply-To: References: <32uuLeizjdx7p5TeOzMvoyj0Smmra-DV4qhPZy7z-bE=.78485dd1-cc6f-4fbf-88a3-a4f78c164b0c@github.com> Message-ID: On Mon, 23 Jun 2025 05:50:57 GMT, Emanuel Peter wrote: >> Liming Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Add the message for the assertions > > src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 126: > >> 124: warning("CryptoPmullForCRC32LowLimit must be a multiple of 128"); >> 125: CryptoPmullForCRC32LowLimit = align_down(CryptoPmullForCRC32LowLimit, 128); >> 126: } > > Can you describe somewhere why it has to be a multiple of `128`? Imagine someone comes across this later, and wonders if that is just some strange implementation limitation or something more fundamental, or something very subtle. For example, if the flag is 266 which is 128x2+10, then for 266 bytes of inputs, the code path is the same as if the flag is 256. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2163034891 From lliu at openjdk.org Tue Jun 24 06:32:32 2025 From: lliu at openjdk.org (Liming Liu) Date: Tue, 24 Jun 2025 06:32:32 GMT Subject: RFR: 8358032: Use crypto pmull for CRC32(C) on Ampere CPU and improve for short inputs [v4] In-Reply-To: References: <32uuLeizjdx7p5TeOzMvoyj0Smmra-DV4qhPZy7z-bE=.78485dd1-cc6f-4fbf-88a3-a4f78c164b0c@github.com> Message-ID: On Mon, 23 Jun 2025 05:50:57 GMT, Emanuel Peter wrote: >> Liming Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Add the message for the assertions > > src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 126: > >> 124: warning("CryptoPmullForCRC32LowLimit must be a multiple of 128"); >> 125: CryptoPmullForCRC32LowLimit = align_down(CryptoPmullForCRC32LowLimit, 128); >> 126: } > > Can you describe somewhere why it has to be a multiple of `128`? Imagine someone comes across this later, and wonders if that is just some strange implementation limitation or something more fundamental, or something very subtle. There are 4 kinds of loops labeled as CRC_by128_loop, CRC_by32_loop, CRC_by4_loop and CRC_by1_loop. If the flag is 266, which is 128x2+10, then for 265 bytes of input, 256 bytes are handled by CRC_by32_loop, while for 266 bytes of input, the corresponding 256 bytes are handled by CRC_by128_loop, and I think this causes inconsistency. If CRC_by32_loop handles 256 bytes better than CRC_by128_loop on a platform, it should be used for 266 bytes as well.
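To make the 265/266-byte example concrete, here is a small Java model of the dispatch. It is a sketch under assumptions, not the AArch64 stub code: `bulkLoopFor` and its `>=` comparison against the limit are invented for illustration and only mirror the behaviour described in this mail, while `alignDown` imitates what `align_down(CryptoPmullForCRC32LowLimit, 128)` does in the quoted C++ for a power-of-two alignment.

```java
class CrcLowLimitSketch {
    // Round value down to a multiple of alignment (alignment must be a power of two).
    static int alignDown(int value, int alignment) {
        return value & -alignment;
    }

    // Hypothetical model: which loop handles the leading 256 bytes of an input.
    // Only meaningful for inputs of at least 256 bytes.
    static String bulkLoopFor(int inputLength, int lowLimit) {
        return inputLength >= lowLimit ? "CRC_by128_loop" : "CRC_by32_loop";
    }

    public static void main(String[] args) {
        int requested = 266;                     // 128 * 2 + 10, not a multiple of 128
        int aligned = alignDown(requested, 128); // what the VM would use instead
        for (int len : new int[] {265, 266}) {
            System.out.println(len + " bytes: limit " + requested + " -> "
                    + bulkLoopFor(len, requested) + ", limit " + aligned + " -> "
                    + bulkLoopFor(len, aligned));
        }
    }
}
```

With the unaligned limit the two input lengths take different loops for the same 256-byte prefix; with the aligned limit they take the same one, which is the consistency argument made above.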
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2163069514 From ayang at openjdk.org Tue Jun 24 07:37:34 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 24 Jun 2025 07:37:34 GMT Subject: RFR: 8360220: Deprecate and obsolete ParallelRefProcBalancingEnabled In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 09:14:43 GMT, Albert Mingkun Yang wrote: > Deprecating `ParallelRefProcBalancingEnabled`, which is used only by Parallel and G1, and both have it enabled by default. > > Disabling it offers little benefit, so removing it reduces the number of command-line flags. Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25932#issuecomment-2999161923 From ayang at openjdk.org Tue Jun 24 07:37:35 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 24 Jun 2025 07:37:35 GMT Subject: Integrated: 8360220: Deprecate and obsolete ParallelRefProcBalancingEnabled In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 09:14:43 GMT, Albert Mingkun Yang wrote: > Deprecating `ParallelRefProcBalancingEnabled`, which is used only by Parallel and G1, and both have it enabled by default. > > Disabling it offers little benefit, so removing it reduces the number of command-line flags. This pull request has now been integrated.
Changeset: 54fec2b9 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/54fec2b98ba2197a588df37d805c3ad495fd0e61 Stats: 2 lines in 2 files changed: 1 ins; 0 del; 1 mod 8360220: Deprecate and obsolete ParallelRefProcBalancingEnabled Reviewed-by: kbarrett, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/25932 From tschatzl at openjdk.org Tue Jun 24 08:39:31 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 24 Jun 2025 08:39:31 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v4] In-Reply-To: References: Message-ID: On Thu, 19 Jun 2025 12:26:15 GMT, Ivan Walulya wrote: >> Hi all, >> >> Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and Application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. >> >> The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio. >> >> - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly. >> >> - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range.
>> >> - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants. >> >> We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. >> >> Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. >> >> As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. >> >> Testing: Mach5 ... > > Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: > > Albert suggestions Some typos src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 410: > 408: // with respect to the heap max size as it's an upper bound (i.e., > 409: // we'll try to make the capacity smaller than it, not greater). > 410: maximum_desired_capacity = MAX2(maximum_desired_capacity, _g1h->min_capacity()); Suggestion: maximum_desired_capacity = MAX2(maximum_desired_capacity, _g1h->min_capacity()); src/hotspot/share/gc/g1/g1HeapSizingPolicy.hpp line 39: > 37: // For young collections, this heuristics is based on gc time ratio, i.e. 
trying > 38: // to change the heap so that current gc time ratio stays approximately as > 39: // selected by the user. Suggestion: // selected by the user. src/hotspot/share/gc/g1/g1HeapSizingPolicy.hpp line 44: > 42: // change. > 43: // > 44: // Short term tracking is based on the short-term gc time ratio i.e we count Suggestion: // Short term tracking is based on the short-term gc time ratio i.e we count src/hotspot/share/gc/g1/g1HeapSizingPolicy.hpp line 65: > 63: // Long term behavior is solely managed by regularly comparing actual long term gc > 64: // time ratio with the boundaries of above range in regular long term intervals. > 65: // If current long term gc time ratio is outside, expand or shrink respectively. Suggestion: // time ratio with the boundaries of above range in regular long term intervals. // If current long term gc time ratio is outside, expand or shrink respectively. src/hotspot/share/gc/g1/g1HeapSizingPolicy.hpp line 79: > 77: uint long_term_count_limit() const; > 78: // Number of times short-term gc time ratio crossed the lower or upper threshold > 79: // recently; every time the upper threshold is exceeded, it is incremented, and Suggestion: // recently; every time the upper threshold is exceeded, it is incremented, and ------------- Changes requested by tschatzl (Reviewer). 
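As a rough aid for the GCTimeRatio discussion quoted in this review: the flag is commonly described as setting a goal of 1/(1+GCTimeRatio) of total time spent in GC. The small Java helper below only does that arithmetic for the old and new defaults mentioned in the PR description; the class and method names are made up for this note, and it makes no claim about how G1 applies the target internally.

```java
class GcTimeRatioSketch {
    // Target share of total time spent in GC, in percent, assuming the
    // commonly documented relation 1 / (1 + GCTimeRatio).
    static double gcTimePercent(int gcTimeRatio) {
        return 100.0 / (1.0 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // Old default (12) versus the proposed default (24).
        System.out.printf("GCTimeRatio=12 -> %.2f%% GC time%n", gcTimePercent(12));
        System.out.printf("GCTimeRatio=24 -> %.2f%% GC time%n", gcTimePercent(24));
    }
}
```

Under that relation, moving from 12 to 24 lowers the tolerated GC share from roughly 7.7% to 4%, which fits the remark that the old default allows a lot more GC overhead.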
PR Review: https://git.openjdk.org/jdk/pull/25832#pullrequestreview-2952720651 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2163309395 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2163311368 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2163300891 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2163310414 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2163309782 From dholmes at openjdk.org Tue Jun 24 09:18:16 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 24 Jun 2025 09:18:16 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 08:39:49 GMT, Anton Artemov wrote: > This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. > > The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. > > In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). > > Lightweight locking is the default locking from now on. > > Tested in tiers 1 - 7. Bug management issue should be fixed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25847#issuecomment-2999459491 From duke at openjdk.org Tue Jun 24 09:18:16 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 24 Jun 2025 09:18:16 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v2] In-Reply-To: References: Message-ID: <2evgySdW1pYxc1xd3rTGwfq0boMISr07cgPlzkFWTwo=.5d91ad21-c829-4080-ba57-20a92ab82bbf@github.com> > This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. > > The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. 
> > In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). > > Lightweight locking is the default locking from now on. > > Tested in tiers 1 - 7. Anton Artemov has updated the pull request incrementally with two additional commits since the last revision: - 8359437: Addressed reviewers' comments - 8359437: Addressed reviewers' comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25847/files - new: https://git.openjdk.org/jdk/pull/25847/files/30de697d..215ee92d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25847&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25847&range=00-01 Stats: 39 lines in 8 files changed: 0 ins; 38 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25847/head:pull/25847 PR: https://git.openjdk.org/jdk/pull/25847 From dholmes at openjdk.org Tue Jun 24 09:18:16 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 24 Jun 2025 09:18:16 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v2] In-Reply-To: <2evgySdW1pYxc1xd3rTGwfq0boMISr07cgPlzkFWTwo=.5d91ad21-c829-4080-ba57-20a92ab82bbf@github.com> References: <2evgySdW1pYxc1xd3rTGwfq0boMISr07cgPlzkFWTwo=.5d91ad21-c829-4080-ba57-20a92ab82bbf@github.com> Message-ID: On Tue, 24 Jun 2025 09:14:42 GMT, Anton Artemov wrote: >> This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. >> >> The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. >> >> In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). 
>> >> Lightweight locking is the default locking from now on. >> >> Tested in tiers 1 - 7. > > Anton Artemov has updated the pull request incrementally with two additional commits since the last revision: > > - 8359437: Addressed reviewers' comments > - 8359437: Addressed reviewers' comments Changes requested by dholmes (Reviewer). test/hotspot/jtreg/runtime/Monitor/ConcurrentDeflation.java line 80: > 78: monitors[index] = new Object(); > 79: synchronized (monitors[index]) { > 80: WB.forceInflateMonitorLockedObject(monitors[index]); This is now the only use of the new WB method and we can replace this with a simple: `monitors[index].wait(1);` as the `wait` forces inflation. Then we can delete the new WB stuff. ------------- PR Review: https://git.openjdk.org/jdk/pull/25847#pullrequestreview-2952909815 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2163410222 From duke at openjdk.org Tue Jun 24 09:18:17 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 24 Jun 2025 09:18:17 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v2] In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 15:48:41 GMT, Alan Bateman wrote: >> Anton Artemov has updated the pull request incrementally with two additional commits since the last revision: >> >> - 8359437: Addressed reviewers' comments >> - 8359437: Addressed reviewers' comments > > test/jdk/java/lang/Thread/virtual/Parking.java line 388: > >> 386: @ParameterizedTest >> 387: @ValueSource(booleans = { true, false }) >> 388: @DisabledIf("LockingMode#isLegacy") > > Would you mind checking if the import DisabledIf can be removed from these tests? I think we only used it to conditionally run when not legacy mode. Those imports became unused everywhere, so I removed them.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2163397316 From duke at openjdk.org Tue Jun 24 09:18:17 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 24 Jun 2025 09:18:17 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v2] In-Reply-To: <8zcivFIhzh6QymXS119iTVwdzyufdXBaeOR4_dEjlig=.b1683573-13cb-4b40-baf3-8b609680e86f@github.com> References: <8zcivFIhzh6QymXS119iTVwdzyufdXBaeOR4_dEjlig=.b1683573-13cb-4b40-baf3-8b609680e86f@github.com> Message-ID: On Tue, 24 Jun 2025 03:09:06 GMT, David Holmes wrote: >> Anton Artemov has updated the pull request incrementally with two additional commits since the last revision: >> >> - 8359437: Addressed reviewers' comments >> - 8359437: Addressed reviewers' comments > > test/jtreg-ext/requires/VMProps.java line 424: > >> 422: * Note: Lightweight locking does not support RTM (for now). >> 423: */ >> 424: protected String vmRTMCompiler() { > > [JDK-8358542](https://bugs.openjdk.org/browse/JDK-8358542) exists to remove this so you would need to add that bug id to this PR. However, it seems the bug management for this has gotten completely messed up so you may need to scrap this PR and file a new bug and PR for this part. Issue added. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2163399542 From duke at openjdk.org Tue Jun 24 09:18:17 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 24 Jun 2025 09:18:17 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v2] In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 17:04:37 GMT, Leonid Mesnik wrote: >> Anton Artemov has updated the pull request incrementally with two additional commits since the last revision: >> >> - 8359437: Addressed reviewers' comments >> - 8359437: Addressed reviewers' comments > > test/hotspot/jtreg/gtest/LockStackGtests.java line 26: > >> 24: >> 25: /* @test >> 26: * @summary Run LockStack gtests with LockingMode=2 > > All gtests are executed with default vm flags in GTestWrapper.java so this whole test should just be removed. Addressed in the latest commits. > test/hotspot/jtreg/runtime/Monitor/ConcurrentDeflation.java line 37: > >> 35: * @bug 8318757 >> 36: * @summary Test concurrent monitor deflation by MonitorDeflationThread and thread dumping >> 37: * @library /test/lib / > > The '/' shouldn't be required for whitebox. Could you please remove it? Addressed in the latest commits. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2163386476 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2163386162 From dholmes at openjdk.org Tue Jun 24 09:52:32 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 24 Jun 2025 09:52:32 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v11] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Tue, 24 Jun 2025 04:50:02 GMT, Ioi Lam wrote: >> This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: >> >> Excerpt from the test case ExceptionsTest.java.
>> >> >> [0.042s][info][exceptions] Exception >> [ ] thrown in interpreter method <{method} {0x000079e52c4005b0} 'bar' '()V' in 'ExceptionsTest$InternalClass'> >> [ ] at bci 9 for thread 0x000079e58c02c7b0 (main) >> [0.042s][info][exceptions,stacktrace] Exception >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar(ExceptionsTest.java:110) >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo(ExceptionsTest.java:105) >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:100) >> [0.042s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar" at BCI: 10 >> Exception 1 caught. >> >> >> - Note that the old `[exceptions]` log is triggered by `InterpreterRuntime::exception_handler_for_exception()` and prints one level of the stack while the interpreter is looking for a handler. However, once a handler is found (inside `bar()`), the `[exceptions]` log terminates, and we do not know about the `foo()` or `main()` methods. >> >> - The `[exceptions,stacktrace]` log is printed only when an exception is thrown: >> - By an `_athrow` bytecode. See comments around `Exceptions::log_exception_stacktrace(Handle exception, methodHandle method, int bci)` >> - By native code in Exceptions::special_exception() and and Exceptions::_throw()). >> >> **Concurrent Exceptions** >> >> Neither the old `[exceptions]` or the new `[exceptions,stacktrace]` logs distinguish between multiple exceptions that are thrown and handled concurrently. The users should use something like `-Xlog:exceptions,exceptions+stack::tags,tid` to include the thread ID into the log output, and use an external tool (such as a script) to demultiplex the logs. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @coleenp and @dholmes-ora comments Still good. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25522#pullrequestreview-2953034428 From dholmes at openjdk.org Tue Jun 24 09:56:30 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 24 Jun 2025 09:56:30 GMT Subject: RFR: 8359920: Use names for frame types in stackmaps [v5] In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 12:06:03 GMT, Coleen Phillimore wrote: >> This uses names for frame types for stackmaps in the verifier and redefinition. >> Tested with tier1-7. > > Coleen Phillimore has updated the pull request incrementally with three additional commits since the last revision: > > - Update src/hotspot/share/prims/jvmtiRedefineClasses.cpp > > Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > - Update src/hotspot/share/classfile/stackMapTable.cpp > > Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > - Update src/hotspot/share/classfile/stackMapTable.cpp > > Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> Thanks Coleen! ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25870#pullrequestreview-2953046279 From duke at openjdk.org Tue Jun 24 09:59:50 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 24 Jun 2025 09:59:50 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v3] In-Reply-To: References: Message-ID: > This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. > > The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. > > In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). > > Lightweight locking is the default locking from now on. > > Tested in tiers 1 - 7. 
Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8359437: Addressed reviewer's comments. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25847/files - new: https://git.openjdk.org/jdk/pull/25847/files/215ee92d..9a8d5191 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25847&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25847&range=01-02 Stats: 30 lines in 3 files changed: 6 ins; 21 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25847/head:pull/25847 PR: https://git.openjdk.org/jdk/pull/25847 From duke at openjdk.org Tue Jun 24 09:59:51 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 24 Jun 2025 09:59:51 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v2] In-Reply-To: References: <2evgySdW1pYxc1xd3rTGwfq0boMISr07cgPlzkFWTwo=.5d91ad21-c829-4080-ba57-20a92ab82bbf@github.com> Message-ID: On Tue, 24 Jun 2025 09:14:37 GMT, David Holmes wrote: >> Anton Artemov has updated the pull request incrementally with two additional commits since the last revision: >> >> - 8359437: Addressed reviewers' comments >> - 8359437: Addressed reviewers' comments > > test/hotspot/jtreg/runtime/Monitor/ConcurrentDeflation.java line 80: > >> 78: monitors[index] = new Object(); >> 79: synchronized (monitors[index]) { >> 80: WB.forceInflateMonitorLockedObject(monitors[index]); > > This is now the only use of the new WB method and we can replace this with a simple: > > monitors[index].wait(1); > > as the `wait` forces inflation. Then we can deleted the new WB stuff. Good catch! It required adding try/catch to the test as `wait()` is throwing `InterruptedException`, addressed in the latest commit. 
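For illustration, the replacement discussed here could look roughly like the following. This is a standalone sketch, not the actual ConcurrentDeflation.java change: the class and method names are hypothetical, and the statement that a timed `wait` forces monitor inflation is taken from the review comment, since it cannot be observed from plain Java code.

```java
class ForceInflationSketch {
    // Create 'count' objects and perform a short timed wait on each while holding its lock.
    static Object[] createMonitors(int count) {
        Object[] monitors = new Object[count];
        for (int index = 0; index < count; index++) {
            monitors[index] = new Object();
            synchronized (monitors[index]) {
                try {
                    // Per the review: the timed wait is what forces inflation.
                    monitors[index].wait(1);
                } catch (InterruptedException e) {
                    // Restore the interrupt status and give up; a test would fail here.
                    Thread.currentThread().interrupt();
                    throw new RuntimeException(e);
                }
            }
        }
        return monitors;
    }

    public static void main(String[] args) {
        Object[] monitors = createMonitors(4);
        System.out.println("created " + monitors.length + " monitors");
    }
}
```

Keeping the try/catch tightly around the `wait` call avoids changing the signature of the enclosing method.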
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2163501835 From dholmes at openjdk.org Tue Jun 24 10:08:36 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 24 Jun 2025 10:08:36 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v3] In-Reply-To: References: Message-ID: On Tue, 24 Jun 2025 09:59:50 GMT, Anton Artemov wrote: >> This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. >> >> The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. >> >> In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). >> >> Lightweight locking is the default locking from now on. >> >> Tested in tiers 1 - 7. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8359437: Addressed reviewer's comments. test/hotspot/jtreg/runtime/Monitor/ConcurrentDeflation.java line 79: > 77: } > 78: > 79: static private void createMonitors() throws InterruptedException { I would have put the try/catch around the wait to minimise the number of changes. 
test/hotspot/jtreg/runtime/Monitor/ConcurrentDeflation.java line 86: > 84: monitors[index] = new Object(); > 85: synchronized (monitors[index]) { > 86: monitors[index].wait(1); Suggestion: // Force inflation monitors[index].wait(1); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2163523429 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2163521570 From haosun at openjdk.org Tue Jun 24 11:06:56 2025 From: haosun at openjdk.org (Hao Sun) Date: Tue, 24 Jun 2025 11:06:56 GMT Subject: RFR: JDK-8331859 : [PPC64] Remove support for Power7 and older [v17] In-Reply-To: References: Message-ID: On Fri, 23 May 2025 13:41:31 GMT, Suchismith Roy wrote: >> JBS Issue: [JDK-8331859](https://bugs.openjdk.org/browse/JDK-8331859) >> Linux PPC64le requires Power8 since the beginning. >> AIX requires Power8 with the new OpenXL based build ([JDK-8307520](https://bugs.openjdk.org/browse/JDK-8307520)). The old build has been removed in JDK 23 ([JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701)). >> Linux PPC64 Big Endian is no longer officially supported (only kept alive for development, debugging and testing purposes). >> >> The following checks for old processors are no longer needed: >> 8: VM_Version::has_lqarx() >> 7: VM_Version::has_popcntw() >> 6: VM_Version::has_cmpb() >> 5: VM_Version::has_popcntb() >> These ones and some more checks for old instructions are no longer needed. All code which is no longer reachable when removing them should also get removed. >> Checks like "PowerArchitecturePPC64 >= 8" (or older) can be removed. >> >> Atomic::PlatformCmpxchg<1>::operator() can be simplified by using sub-word instructions (lharx, lbarx). >> >> Temp registers can be removed from cmpxchgb and cmpxchgh. >> >> Build flags "-mcpu=powerpc64 -mtune=power5" for Big Endian linux should get replaced by "-mcpu=power8 -mtune=power8" as already used for linux PPC64le. 
> > Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: > > switch case Hi, I came across a VM crash with SIGILL after this patch. ## My local test environment As I don't have ppc64 hardware, I built a `ppc64le` Java binary with **cross-compilation** on Ubuntu-24.04 and ran `java -version` with **QEMU**. Here is a snippet of the error log: $ sudo chroot /sysroot/ppc64el /tmp/build-ppc64el/images/jdk/bin/java --version # # A fatal error has been detected by the Java Runtime Environment: # # SIGILL (0x4) at pc=0x00007718c7817380, pid=121270, tid=121272 # # JRE version: (26.0) (fastdebug build ) # Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-git-9c3eaa49f7f, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-ppc64le) # Problematic frame: # v ~BufferBlob::config_dscr 0x00007718c7817380 # # Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to //core.121270) # # An error report file with more information is saved as: # //hs_err_pid121270.log [2.177s][warning][os] Loading hsdis library failed # # qemu: uncaught target signal 6 (Aborted) - core dumped environment: line 1: 121269 Aborted "$@" ## The unsupported instruction is `mfdscr` I checked the instruction at the `pc=0x00007718c7817380`. It is `A6 02 63 7C mfspr r3, 3`. Hence I suspected that the following change in this PR might be related, i.e. `src/hotspot/cpu/ppc/vm_version_ppc.cpp` See https://github.com/openjdk/jdk/pull/20262/files#diff-409e1c95e0846f4dbf17f425616970846bbc5fa105c5fb7e80402dc4663416beL95 // Power 8: Configure Data Stream Control Register. config_dscr(); ## Check the support of `mfdscr` in my environment I built another Java binary from the commit just before this PR, i.e. `8357649: IGV: add block index to the supplemental node properties`.
Then I tried the following commands to check the value of `PowerArchitecturePPC64` and support for the `mfdscr` instruction in my environment. As shown below, it seems that `mfdscr` is **NOT** supported by default in my qemu environment. 1. It's Power9, which is >= 8, as expected. $ sudo chroot /sysroot/ppc64el /tmp/build-ppc64el/images/jdk/bin/java -XX:+PrintFlagsFinal --version | grep PowerArchitecturePPC64 uintx PowerArchitecturePPC64 = 9 {ARCH diagnostic} {ergonomic} 2. There is no `mfdscr` $ sudo chroot /sysroot/ppc64el /tmp/build-ppc64el/images/jdk/bin/java -XX:+Verbose -version Version: ppc64 fsqrt isel lxarxeh cmpb popcntb popcntw fcfids vand lqarx aes vpmsumb vsx ldbrx stdbrx sha darn L1_data_cache_line_size=128 ContendedPaddingWidth 128 openjdk version "25-internal" 2025-09-16 OpenJDK Runtime Environment (fastdebug build 25-internal-git-unknown) OpenJDK 64-Bit Server VM (fastdebug build 25-internal-git-unknown, mixed mode) 3. `VM_Version::determine_features()` also treats `mfdscr` as an illegal instruction. Note that `a602 037c` is changed to `0000 0000` after decoding.
$ sudo chroot /sysroot/ppc64el /tmp/build-ppc64el/images/jdk/bin/java -XX:+PrintAssembly --version OpenJDK 64-Bit Server VM warning: PrintAssembly is enabled; turning on DebugNonSafepoints to gain additional output Decoding cpu-feature detection stub at 0x00007964b385f380 before execution: [0.068s][warning][os] Loading hsdis library failed [MachCode] 0x00007964b385f380: 2c20 60fc | 2c20 60ec | 1e30 e57c | a920 e37c | f833 a77c | f400 a77c | f402 a77c | 9c26 60ec 0x00007964b385f3a0: 0404 0010 | 2922 c37c | 0815 0110 | 0814 0110 | a602 037c | 981e 007c | 2824 e37c | 2825 e37c 0x00007964b385f3c0: 82fe 0110 | e605 e17c | 3601 c57c | 2000 804e | ec1f 007c | 2000 804e [/MachCode] Decoding cpu-feature detection stub at 0x00007964b385f380 after execution: [MachCode] 0x00007964b385f380: 2c20 60fc | 2c20 60ec | 1e30 e57c | a920 e37c | f833 a77c | f400 a77c | f402 a77c | 9c26 60ec 0x00007964b385f3a0: 0404 0010 | 2922 c37c | 0815 0110 | 0814 0110 | 0000 0000 | 981e 007c | 2824 e37c | 2825 e37c 0x00007964b385f3c0: 82fe 0110 | e605 e17c | 0000 0000 | 2000 804e | ec1f 007c | 2000 804e [/MachCode] ## The question From this PR and the follow-up patch (https://github.com/openjdk/jdk/pull/25495) by @TheRealMDoerr, it seems that the `mfdscr` instruction should be **always available for Power>=8**. Unfortunately, it is not available in my local environment. I'm not sure whether this is an issue with QEMU or with this PR. Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20262#issuecomment-2999872960 From duke at openjdk.org Tue Jun 24 11:16:21 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 24 Jun 2025 11:16:21 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v4] In-Reply-To: References: Message-ID: > This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. > > The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation.
> > In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). > > Lightweight locking is the default locking from now on. > > Tested in tiers 1 - 7. Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8359437: Addressed reviewer's comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25847/files - new: https://git.openjdk.org/jdk/pull/25847/files/9a8d5191..17f3b2d4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25847&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25847&range=02-03 Stats: 14 lines in 1 file changed: 5 ins; 6 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25847/head:pull/25847 PR: https://git.openjdk.org/jdk/pull/25847 From duke at openjdk.org Tue Jun 24 11:16:23 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 24 Jun 2025 11:16:23 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v3] In-Reply-To: References: Message-ID: On Tue, 24 Jun 2025 10:05:36 GMT, David Holmes wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8359437: Addressed reviewer's comments. > > test/hotspot/jtreg/runtime/Monitor/ConcurrentDeflation.java line 79: > >> 77: } >> 78: >> 79: static private void createMonitors() throws InterruptedException { > > I would have put the try/catch around the wait to minimise the number of changes. Makes sense! Addressed in the latest commit. 
> test/hotspot/jtreg/runtime/Monitor/ConcurrentDeflation.java line 86: > >> 84: monitors[index] = new Object(); >> 85: synchronized (monitors[index]) { >> 86: monitors[index].wait(1); > > Suggestion: > > // Force inflation > monitors[index].wait(1); Addressed in the latest commit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2163681754 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2163681481 From dholmes at openjdk.org Tue Jun 24 12:00:38 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 24 Jun 2025 12:00:38 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v4] In-Reply-To: References: Message-ID: <0zbpdvXTBgLoPsgCBLfAoCNRshJJElA12NIN5hHuF1Y=.512fdbfd-cc1d-4137-8e17-22b438de7eef@github.com> On Tue, 24 Jun 2025 11:16:21 GMT, Anton Artemov wrote: >> This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. >> >> The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. >> >> In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). >> >> Lightweight locking is the default locking from now on. >> >> Tested in tiers 1 - 7. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8359437: Addressed reviewer's comments LGTM! Thanks ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25847#pullrequestreview-2953475858 From coleenp at openjdk.org Tue Jun 24 12:33:36 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 24 Jun 2025 12:33:36 GMT Subject: RFR: 8359920: Use names for frame types in stackmaps [v5] In-Reply-To: References: Message-ID: <51dGYjMxe6B2AipdwdKDhFCKon3SczezZDtGQGJWPVY=.c798feac-da3f-4cb8-bfb2-bdede9c65034@github.com> On Mon, 23 Jun 2025 12:06:03 GMT, Coleen Phillimore wrote: >> This uses names for frame types for stackmaps in the verifier and redefinition. >> Tested with tier1-7. > > Coleen Phillimore has updated the pull request incrementally with three additional commits since the last revision: > > - Update src/hotspot/share/prims/jvmtiRedefineClasses.cpp > > Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > - Update src/hotspot/share/classfile/stackMapTable.cpp > > Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > - Update src/hotspot/share/classfile/stackMapTable.cpp > > Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> I think this came out really well. Thanks for your suggestions, David. Thanks for reviewing Matias, Serguei and Johan. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25870#issuecomment-3000213571 From coleenp at openjdk.org Tue Jun 24 12:33:36 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 24 Jun 2025 12:33:36 GMT Subject: Integrated: 8359920: Use names for frame types in stackmaps In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 12:13:38 GMT, Coleen Phillimore wrote: > This uses names for frame types for stackmaps in the verifier and redefinition. > Tested with tier1-7. This pull request has now been integrated. 
Changeset: 28e96e33 Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/28e96e333b61dfe60a84a48ff59bdf10c529f8be Stats: 41 lines in 3 files changed: 18 ins; 0 del; 23 mod 8359920: Use names for frame types in stackmaps Reviewed-by: dholmes, jsjolen, matsaave, sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/25870 From alanb at openjdk.org Tue Jun 24 12:47:35 2025 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 24 Jun 2025 12:47:35 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v4] In-Reply-To: References: Message-ID: On Tue, 24 Jun 2025 11:16:21 GMT, Anton Artemov wrote: >> This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. >> >> The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. >> >> In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). >> >> Lightweight locking is the default locking from now on. >> >> Tested in tiers 1 - 7. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8359437: Addressed reviewer's comments test/hotspot/jtreg/serviceability/jvmti/vthread/StopThreadTest/StopThreadTest.java line 280: > 278: > 279: static boolean preemptableVirtualThread() { > 280: return is_virtual && !isBoundVThread; I think this is the last usage of ManagementFactory and HotSpotDiagnosticMXBean in this test so the imports can be expunged. test/jdk/jdk/internal/vm/Continuation/Basic.java line 426: > 424: return ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class) > 425: .getVMOption("LockingMode").getValue().equals("1"); > 426: } Likely the same here. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2163891432 PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2163896486 From coleenp at openjdk.org Tue Jun 24 12:53:28 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 24 Jun 2025 12:53:28 GMT Subject: [jdk25] RFR: 8352075: Perf regression accessing fields In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 14:44:54 GMT, Coleen Phillimore wrote: > Hi all, > > This pull request contains a backport of commit [e18277b4](https://github.com/openjdk/jdk/commit/e18277b470a162b9668297e8e286c812c4b0b604) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Radim Vansa on 12 Jun 2025 and was reviewed by Coleen Phillimore, Ioi Lam and Johan Sjölen. > > This has been running cleanly in CI for a week now. > > Thanks! Hi Aleksey. The consensus around here is that this should go in JDK 25 so it gets the usual end of release testing, and doesn't go through a special process for update release. This fixes a regression where we have no workaround. If any disasters happen, we can change the BinarySearchThreshold in the source code as a fix, and/or make it a diagnostic option. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25877#issuecomment-3000349581 From shade at openjdk.org Tue Jun 24 13:04:28 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 24 Jun 2025 13:04:28 GMT Subject: [jdk25] RFR: 8352075: Perf regression accessing fields In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 14:44:54 GMT, Coleen Phillimore wrote: > Hi all, > > This pull request contains a backport of commit [e18277b4](https://github.com/openjdk/jdk/commit/e18277b470a162b9668297e8e286c812c4b0b604) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Radim Vansa on 12 Jun 2025 and was reviewed by Coleen Phillimore, Ioi Lam and Johan Sjölen.
> > This has been running cleanly in CI for a week now. > > Thanks! Yeah, as I said, can take it both ways. Picking this up to JDK 25 works for me as well. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25877#issuecomment-3000404473 From coleenp at openjdk.org Tue Jun 24 13:06:33 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 24 Jun 2025 13:06:33 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: <_f9CuTsqV-WEq5F8dLzaSFckSe3YdyUGVVTOZWdnWXQ=.4008de9c-1812-4768-aaf8-a464ab100925@github.com> On Fri, 20 Jun 2025 20:41:21 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. >> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. 
> > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix the test jclass is a strong root, so jclass will keep the Method alive until the JNI local is released, if the method belongs to jclass. I think the whole worry about stale jmethodIDs is if there's native or JVMTI (also native) code that squirrels them away somewhere then tries to call JNI call Method or JVMTI with this old jmethodID value. Once we validate the method inside the VM, a safepoint cannot make the method go away if its holder is alive. We might need to strengthen this by holding a class holder if we don't already. This predates the change here. The jmethodID table is so that jmethodID isn't a stale pointer itself and doesn't require us to hold a stale pointer, but whether it can return a stale Method* (now and before this change) is something we should figure out. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-3000411580 From duke at openjdk.org Tue Jun 24 13:25:20 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 24 Jun 2025 13:25:20 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v5] In-Reply-To: References: Message-ID: > This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. > > The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. > > In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). > > Lightweight locking is the default locking from now on. > > Tested in tiers 1 - 7.
Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8359437: Addressed reviewer's comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25847/files - new: https://git.openjdk.org/jdk/pull/25847/files/17f3b2d4..02565157 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25847&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25847&range=03-04 Stats: 6 lines in 2 files changed: 0 ins; 6 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25847/head:pull/25847 PR: https://git.openjdk.org/jdk/pull/25847 From duke at openjdk.org Tue Jun 24 13:25:20 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 24 Jun 2025 13:25:20 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v5] In-Reply-To: References: Message-ID: On Tue, 24 Jun 2025 12:44:52 GMT, Alan Bateman wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8359437: Addressed reviewer's comment > > test/jdk/jdk/internal/vm/Continuation/Basic.java line 426: > >> 424: return ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class) >> 425: .getVMOption("LockingMode").getValue().equals("1"); >> 426: } > > Likely the same here. Correct, addressed in the latest commit. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2163993099 From duke at openjdk.org Tue Jun 24 13:25:20 2025 From: duke at openjdk.org (Anton Artemov) Date: Tue, 24 Jun 2025 13:25:20 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v4] In-Reply-To: References: Message-ID: <-UvM8zh4XOerhMCMCBPHzOvWu7yrRX4F90ZrM0uHT5o=.3f96c2c7-29b2-4670-bbe5-c1e82932d22c@github.com> On Tue, 24 Jun 2025 12:43:36 GMT, Alan Bateman wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8359437: Addressed reviewer's comments > > test/hotspot/jtreg/serviceability/jvmti/vthread/StopThreadTest/StopThreadTest.java line 280: > >> 278: >> 279: static boolean preemptableVirtualThread() { >> 280: return is_virtual && !isBoundVThread; > > I think this is the last usage of ManagementFactory and HotSpotDiagnosticMXBean in this test so the imports can be expunged. Correct, addressed in the latest commit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2163992589 From coleenp at openjdk.org Tue Jun 24 13:55:29 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 24 Jun 2025 13:55:29 GMT Subject: [jdk25] RFR: 8352075: Perf regression accessing fields In-Reply-To: References: Message-ID: <8820nqtQq82MXXPZpDkQWttOVzUFqgzNYoSQVMNpRP4=.44be289d-9053-4720-93f2-23a91350d444@github.com> On Wed, 18 Jun 2025 14:44:54 GMT, Coleen Phillimore wrote: > Hi all, > > This pull request contains a backport of commit [e18277b4](https://github.com/openjdk/jdk/commit/e18277b470a162b9668297e8e286c812c4b0b604) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Radim Vansa on 12 Jun 2025 and was reviewed by Coleen Phillimore, Ioi Lam and Johan Sjölen. > > This has been running cleanly in CI for a week now. > > Thanks! Thank you for your feedback and comments for this.
Here goes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25877#issuecomment-3000595520 From eosterlund at openjdk.org Tue Jun 24 14:48:33 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 24 Jun 2025 14:48:33 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 20:41:21 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. >> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix the test This looks good to me. Thanks for building the table! ------------- Marked as reviewed by eosterlund (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25267#pullrequestreview-2954192642 From eosterlund at openjdk.org Tue Jun 24 14:48:34 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 24 Jun 2025 14:48:34 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Tue, 24 Jun 2025 02:36:14 GMT, David Holmes wrote: > the `cls` parameter is never actually used. So while it is supposed to refer to the class you have the static method jMethodID for, there is no requirement that it actually does, and could even be null. Not passing in the cls parameter, would be a clear user error though, right? And one that would have crashed before, because if you racingly execute bytecodes of a class that is being unloaded, things would blow up one way or another. To me it seems like the user should just pass in the class as intended and then all is good. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-3000808219 From aboldtch at openjdk.org Tue Jun 24 14:48:34 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 24 Jun 2025 14:48:34 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 20:41:21 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. >> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. 
With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix the test Looks good. Thanks. _Feels like `JmethodIdCreation_lock` might not be needed now but it is also easier to reason about things when we serialize the modifications._ ------------- Marked as reviewed by aboldtch (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25267#pullrequestreview-2954200289 From coleenp at openjdk.org Tue Jun 24 15:02:30 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 24 Jun 2025 15:02:30 GMT Subject: [jdk25] RFR: 8352075: Perf regression accessing fields In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 14:44:54 GMT, Coleen Phillimore wrote: > Hi all, > > This pull request contains a backport of commit [e18277b4](https://github.com/openjdk/jdk/commit/e18277b470a162b9668297e8e286c812c4b0b604) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Radim Vansa on 12 Jun 2025 and was reviewed by Coleen Phillimore, Ioi Lam and Johan Sjölen. > > This has been running cleanly in CI for a week now. > > Thanks! oops I guess I need approval, even though it was a clean backport.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25877#issuecomment-3000854295 From shade at openjdk.org Tue Jun 24 15:08:29 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 24 Jun 2025 15:08:29 GMT Subject: [jdk25] RFR: 8352075: Perf regression accessing fields In-Reply-To: References: Message-ID: <6zTZj2C_IgbPbPaZkkvNLVjqb_cyRXTky6NZjxJYoG4=.9b083646-d8a5-49ea-a539-efeaeb725ea1@github.com> On Wed, 18 Jun 2025 14:44:54 GMT, Coleen Phillimore wrote: > Hi all, > > This pull request contains a backport of commit [e18277b4](https://github.com/openjdk/jdk/commit/e18277b470a162b9668297e8e286c812c4b0b604) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Radim Vansa on 12 Jun 2025 and was reviewed by Coleen Phillimore, Ioi Lam and Johan Sjölen. > > This has been running cleanly in CI for a week now. > > Thanks! Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25877#pullrequestreview-2954275829 From iklam at openjdk.org Tue Jun 24 15:08:29 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 24 Jun 2025 15:08:29 GMT Subject: [jdk25] RFR: 8352075: Perf regression accessing fields In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 14:44:54 GMT, Coleen Phillimore wrote: > Hi all, > > This pull request contains a backport of commit [e18277b4](https://github.com/openjdk/jdk/commit/e18277b470a162b9668297e8e286c812c4b0b604) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Radim Vansa on 12 Jun 2025 and was reviewed by Coleen Phillimore, Ioi Lam and Johan Sjölen. > > This has been running cleanly in CI for a week now. > > Thanks! Marked as reviewed by iklam (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/25877#pullrequestreview-2954281986 From mdoerr at openjdk.org Tue Jun 24 15:24:43 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 24 Jun 2025 15:24:43 GMT Subject: RFR: 8360405: [PPC64] some environments don't support mfdscr instruction Message-ID: See https://github.com/openjdk/jdk/pull/20262#issuecomment-2999872960 ------------- Commit messages: - 8360405: [PPC64] some environments don't support mfdscr instruction Changes: https://git.openjdk.org/jdk/pull/25953/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25953&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360405 Stats: 70 lines in 4 files changed: 32 ins; 0 del; 38 mod Patch: https://git.openjdk.org/jdk/pull/25953.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25953/head:pull/25953 PR: https://git.openjdk.org/jdk/pull/25953 From mdoerr at openjdk.org Tue Jun 24 15:24:59 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 24 Jun 2025 15:24:59 GMT Subject: RFR: JDK-8331859 : [PPC64] Remove support for Power7 and older [v17] In-Reply-To: References: Message-ID: On Tue, 24 Jun 2025 11:03:37 GMT, Hao Sun wrote: >> Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: >> >> switch case > > Hi, I came across a VM crash with SIGILL after this patch. > > ## My local test environment > > As I don't have ppc64 hardware, I built a `ppc64le` Java binary with **cross-compilation** on Ubuntu-24.04 and ran `java -version` with **QEMU**.
Here is a snippet of the error log: > > > $ sudo chroot /sysroot/ppc64el /tmp/build-ppc64el/images/jdk/bin/java --version > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGILL (0x4) at pc=0x00007718c7817380, pid=121270, tid=121272 > # > # JRE version: (26.0) (fastdebug build ) > # Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-git-9c3eaa49f7f, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-ppc64le) > # Problematic frame: > # v ~BufferBlob::config_dscr 0x00007718c7817380 > # > # Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to //core.121270) > # > # An error report file with more information is saved as: > # //hs_err_pid121270.log > [2.177s][warning][os] Loading hsdis library failed > # > # > qemu: uncaught target signal 6 (Aborted) - core dumped > environment: line 1: 121269 Aborted "$@" > > > ## The unsupported instruction is `mfdscr` > > I checked the instruction at the `pc=0x00007718c7817380`. It is `A6 02 63 7C mfspr r3, 3`. > Hence I suspected that the following change in this PR might be related, i.e. `src/hotspot/cpu/ppc/vm_version_ppc.cpp` > See https://github.com/openjdk/jdk/pull/20262/files#diff-409e1c95e0846f4dbf17f425616970846bbc5fa105c5fb7e80402dc4663416beL95 > > > // Power 8: Configure Data Stream Control Register. > config_dscr(); > > > ## Check the support of `mfdscr` in my environment > > I built another Java binary at the commit before this PR, i.e. `8357649: IGV: add block index to the supplemental node properties`. > > Then I tried the following commands to check the value of `PowerArchitecturePPC64` and the support of the `mfdscr` instruction in my environment. As shown below, it seems that `mfdscr` is **NOT** supported by default in my qemu environment. > > 1. it's power9 which is >=8. it's as expected.
> > > > $ sudo chroot /sysroot/ppc64el /tmp/build-ppc64el/images/jdk/bin/java -XX:+PrintFlagsFinal --version | grep PowerArchitecturePPC64 > uintx PowerArchitecturePPC64 = 9 {ARCH diagnostic} {ergonomic} > > > 2. there is no `mfdscr` > > > $ sudo chroot /sysroot/ppc64el /tmp/build-ppc64el/images/jdk/bin/java -XX:+Verbose ... @shqking: Thanks for the detailed analysis! Not sure if it is a QEMU bug. If this is the only problem, let's just add the check back. See new PR linked above. Can you verify and review it, please? ------------- PR Comment: https://git.openjdk.org/jdk/pull/20262#issuecomment-3000944124 From mablakatov at openjdk.org Tue Jun 24 15:43:14 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Tue, 24 Jun 2025 15:43:14 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method Message-ID: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Modify the C2 compiler to share trampoline stubs between static calls that resolve to the same callee method. Since the relocation target for all static calls is initially set to the static call resolver stub, the call's target alone cannot be used to distinguish between different static method calls. Instead, trampoline stubs should be shared based on the actual callee. The `SharedTrampolineTest.java` was designed to verify the sharing of trampolines among static calls. However, due to imprecise log analysis, the test currently passes even when trampolines are not shared. Additionally, comments within the test suggest ambiguity regarding whether it was intended to assess trampoline sharing for static calls or runtime calls. To address these issues and eliminate ambiguity, this patch renames and updates the existing test. Furthermore, a new test is introduced, using the existing one as a foundation, to accurately evaluate trampoline sharing for both static and runtime calls. 
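The sharing rule described above can be illustrated with a minimal sketch. This is not the C2 or AArch64 backend code from the patch: `StaticCall`, `assign_trampolines`, and the use of a string as the callee identity are all hypothetical names chosen for illustration. The only point it shows is that trampolines are keyed on the callee rather than on the call's initial relocation target, which is the same resolver stub for every static call.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of a static call site; not a HotSpot type.
struct StaticCall {
    std::size_t call_site_offset;  // offset of the call instruction in the code buffer
    std::string callee;            // stand-in for the identity of the callee method
};

// Assigns one trampoline index per distinct callee and returns the mapping
// from call-site offset to trampoline index. Keying on the callee is the
// point of the sketch: keying on the relocation target would not work,
// because all static calls initially target the same resolver stub.
std::map<std::size_t, std::size_t> assign_trampolines(
        const std::vector<StaticCall>& calls, std::size_t* trampoline_count) {
    assert(trampoline_count != nullptr);
    std::map<std::string, std::size_t> by_callee;     // callee -> trampoline index
    std::map<std::size_t, std::size_t> by_call_site;  // call site -> trampoline index
    for (const StaticCall& call : calls) {
        auto it = by_callee.find(call.callee);
        if (it == by_callee.end()) {
            // First call to this callee: allocate the next trampoline index.
            it = by_callee.emplace(call.callee, by_callee.size()).first;
        }
        by_call_site[call.call_site_offset] = it->second;
    }
    *trampoline_count = by_callee.size();
    return by_call_site;
}
```

In this model, three call sites of which two call the same method need only two trampolines, which is the kind of property a test would have to check precisely rather than infer from loosely matched log output.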
This has passed tier1-3 and jcstress testing on AArch64. ------------- Commit messages: - 8359359: AArch64: share trampolines between static calls to the same method Changes: https://git.openjdk.org/jdk/pull/25954/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25954&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359359 Stats: 486 lines in 10 files changed: 346 ins; 110 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/25954.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25954/head:pull/25954 PR: https://git.openjdk.org/jdk/pull/25954 From tschatzl at openjdk.org Tue Jun 24 15:52:32 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 24 Jun 2025 15:52:32 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v4] In-Reply-To: References: Message-ID: On Thu, 19 Jun 2025 12:26:15 GMT, Ivan Walulya wrote: >> Hi all, >> >> Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and Application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. >> >> The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio. >> >> - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly.
>> >> - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range. >> >> - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants. >> >> We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. >> >> Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. >> >> As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. >> >> Testing: Mach5 ... 
> > Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: > > Albert suggestions src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 231: > 229: const double min_gc_time_ratio_ratio = G1MinimumPercentOfGCTimeRatio / 100.0; > 230: double upper_threshold = pause_time_threshold * (1 + min_gc_time_ratio_ratio); > 231: double lower_threshold = pause_time_threshold * (1 - min_gc_time_ratio_ratio); There are some inconsistencies naming variables in this change: we use `pause_time_threshold`, `pause_time_ratio` and other similar terms that are under(de)fined for, essentially, (parts of) the cpu resources that G1 uses. Also some things are called "ratio", but are percentages (e.g. `G1MinimumPercentOfGCTimeRatio`), and the comments somewhat interchangeably use cpu usage (in percent) and gctimeratio (the inverse of cpu usage). I think it is useful to clean this up, also looking forward to making G1 start using actual GC CPU usage (thread user times), e.g. pause_time_threshold -> gc_cpu_usage_threshold long_term_pause_time_ratio -> long_term_cpu_usage [and so on] I admit that right now `cpu_usage` might be slightly misleading because we only approximate it using pause times and total run times (which do not necessarily reflect actual cpu usage), but it seems good enough for now, and future changes (https://bugs.openjdk.org/browse/JDK-8359348; even with that we might need to fallback to current mechanism). What do others think? 
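To make the ratio-versus-percentage distinction concrete, here is a small worked sketch of the quoted threshold computation. It assumes that `pause_time_threshold` is the fraction 1 / (1 + GCTimeRatio), which is the conventional meaning of GCTimeRatio; the function and struct names are hypothetical, and the tolerance value of 25 percent used in the note below is an assumption for illustration, not a confirmed default of `G1MinimumPercentOfGCTimeRatio`.

```cpp
#include <cassert>

// GCTimeRatio N conventionally means one part GC time to N parts application
// time, so the GC share of total time is 1 / (1 + N). The name follows the
// renaming proposed above and is not taken from the patch.
double gc_cpu_usage_threshold(unsigned gc_time_ratio) {
    return 1.0 / (1.0 + static_cast<double>(gc_time_ratio));
}

// Hypothetical holder for the band around the threshold.
struct ThresholdBand {
    double lower;
    double upper;
};

// Mirrors the quoted lines: the tolerance is given as a percentage and is
// converted to a fraction before being applied to the threshold.
ThresholdBand threshold_band(double threshold, double tolerance_percent) {
    assert(tolerance_percent >= 0.0 && tolerance_percent <= 100.0);
    const double tolerance = tolerance_percent / 100.0;
    return ThresholdBand{threshold * (1.0 - tolerance),
                         threshold * (1.0 + tolerance)};
}
```

With GCTimeRatio=24 the threshold is 1/25, that is 4 percent of total time spent in GC; with GCTimeRatio=12 it is 1/13, roughly 7.7 percent. Under the assumed 25 percent tolerance, the band around the 4 percent threshold runs from 3 percent to 5 percent. Note that a larger GCTimeRatio means a smaller allowed GC share, which is why mixing the two terms in comments is easy to get wrong.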
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2164372113 From dlong at openjdk.org Tue Jun 24 16:16:36 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 24 Jun 2025 16:16:36 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v9] In-Reply-To: References: Message-ID: <2hLKCRKzNs19ZW_ntM7yJ2ynW0Hj7SwBrN9hlcOSxTM=.555bb43a-8fb4-4157-9cdb-a18b28178932@github.com> On Tue, 24 Jun 2025 03:21:39 GMT, Fei Yang wrote: >> Dean Long has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: >> >> - Merge branch 'master' into 8358821-patch-verified-entry >> - 2nd try at arm fix >> - rename arm_with to guard_with >> - arm32 fix >> - s390 fix courtesy of Amit Kumar >> - remove is_sigill_not_entrant >> - more cleanup >> - more TheRealMDoerr suggestions >> - TheRealMDoerr suggestions >> - remove trailing space >> - ... and 6 more: https://git.openjdk.org/jdk/compare/6df0f5e3...a39c458c > > Just FYI: My local tier1-3 test on linux-riscv64 is good. And I didn't witness an obvious change on specjbb performance with g1gc. Thanks @RealFYang. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-3001102616 From eosterlund at openjdk.org Tue Jun 24 16:43:35 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 24 Jun 2025 16:43:35 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v9] In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 19:26:11 GMT, Dean Long wrote: >> This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. >> >> We used to patch the verified entry point to make sure it was not_entrant.
The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. >> >> The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. >> >> For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. >> >> This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. > > Dean Long has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: > > - Merge branch 'master' into 8358821-patch-verified-entry > - 2nd try at arm fix > - rename arm_with to guard_with > - arm32 fix > - s390 fix courtesy of Amit Kumar > - remove is_sigill_not_entrant > - more cleanup > - more TheRealMDoerr suggestions > - TheRealMDoerr suggestions > - remove trailing space > - ... 
and 6 more: https://git.openjdk.org/jdk/compare/6df0f5e3...a39c458c Marked as reviewed by eosterlund (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25764#pullrequestreview-2954594316 From eosterlund at openjdk.org Tue Jun 24 16:43:35 2025 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 24 Jun 2025 16:43:35 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v4] In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 19:10:03 GMT, Dean Long wrote: > I think it's OK if there is a race to have a point of no return, and if one thread gets there first, it wins, and we don't need to check again. It's tempting to want to do an extra check when we disarm under the lock, but then it would need a comment explaining why we do it, even though the make_not_entrant could come in right after and we would miss it. And we have already done the work of healing the oops by this point. Finally, I like the encapsulation that only nmethod_stub_entry_barrier needs to know about not_entrant, and nmethod_entry_barrier doesn't need to know. Fair enough! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2164467913 From coleenp at openjdk.org Tue Jun 24 17:12:41 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 24 Jun 2025 17:12:41 GMT Subject: [jdk25] Integrated: 8352075: Perf regression accessing fields In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 14:44:54 GMT, Coleen Phillimore wrote: > Hi all, > > This pull request contains a backport of commit [e18277b4](https://github.com/openjdk/jdk/commit/e18277b470a162b9668297e8e286c812c4b0b604) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Radim Vansa on 12 Jun 2025 and was reviewed by Coleen Phillimore, Ioi Lam and Johan Sjölen. > > This has been running cleanly in CI for a week now. > > Thanks!
This pull request has now been integrated. Changeset: 0694cc1d Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/0694cc1d527db17f2e5cdd4f9d2489ba04adfef5 Stats: 924 lines in 18 files changed: 854 ins; 20 del; 50 mod 8352075: Perf regression accessing fields Reviewed-by: shade, iklam Backport-of: e18277b470a162b9668297e8e286c812c4b0b604 ------------- PR: https://git.openjdk.org/jdk/pull/25877 From coleenp at openjdk.org Tue Jun 24 17:12:39 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 24 Jun 2025 17:12:39 GMT Subject: [jdk25] RFR: 8352075: Perf regression accessing fields In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 14:44:54 GMT, Coleen Phillimore wrote: > Hi all, > > This pull request contains a backport of commit [e18277b4](https://github.com/openjdk/jdk/commit/e18277b470a162b9668297e8e286c812c4b0b604) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Radim Vansa on 12 Jun 2025 and was reviewed by Coleen Phillimore, Ioi Lam and Johan Sjölen. > > This has been running cleanly in CI for a week now. > > Thanks! Thank you Aleksey and Ioi. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25877#issuecomment-3001247661 From coleenp at openjdk.org Tue Jun 24 17:17:33 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 24 Jun 2025 17:17:33 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 20:41:21 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time.
>> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix the test Thank you for reviewing from the GC and class unloading perspective, Axel and Erik. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-3001264941 From iwalulya at openjdk.org Tue Jun 24 17:38:31 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Tue, 24 Jun 2025 17:38:31 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v4] In-Reply-To: References: Message-ID: <76uGVn8M29SoJxIA5AwwfYB13Gb0w8VT-zY5VTn8ymc=.3e1e2bd3-59b6-4d18-925e-bb53561f1cdd@github.com> On Tue, 24 Jun 2025 15:49:56 GMT, Thomas Schatzl wrote: >> Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: >> >> Albert suggestions > > src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 231: > >> 229: const double min_gc_time_ratio_ratio = G1MinimumPercentOfGCTimeRatio / 100.0; >> 230: double upper_threshold = pause_time_threshold * (1 + min_gc_time_ratio_ratio); >> 231: double lower_threshold = pause_time_threshold * (1 - min_gc_time_ratio_ratio); > > There are some inconsistencies naming variables in this change: we use `pause_time_threshold`, `pause_time_ratio` and other similar terms that are under(de)fined for, essentially, (parts of) the cpu resources that G1 uses. > > Also some things are called "ratio", but are percentages (e.g. `G1MinimumPercentOfGCTimeRatio`), and the comments somewhat interchangeably use cpu usage (in percent) and gctimeratio (the inverse of cpu usage). > > I think it is useful to clean this up, also looking forward to making G1 start using actual GC CPU usage (thread user times), e.g. 
> > > pause_time_threshold -> gc_cpu_usage_threshold > long_term_pause_time_ratio -> long_term_cpu_usage > [and so on] > > > I admit that right now `cpu_usage` might be slightly misleading because we only approximate it using pause times and total run times (which do not necessarily reflect actual cpu usage), but it seems good enough for now, and future changes (https://bugs.openjdk.org/browse/JDK-8359348; even with that we might need to fallback to current mechanism). > > What do others think? Sounds good to me, I may suggest naming it as a `gc_cpu_usage_target` or `target_gc_cpu_usage` then it is easier to reason about the deviations from the target in the model. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2164557309 From sspitsyn at openjdk.org Tue Jun 24 18:02:31 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 24 Jun 2025 18:02:31 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 20:41:21 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. >> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. 
When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix the test Marked as reviewed by sspitsyn (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25267#pullrequestreview-2954810575 From sparasa at openjdk.org Tue Jun 24 18:22:28 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Tue, 24 Jun 2025 18:22:28 GMT Subject: RFR: 8359965: Enable paired pushp and popp instruction usage for APX enabled CPUs In-Reply-To: References: Message-ID: On Thu, 19 Jun 2025 06:39:52 GMT, David Holmes wrote: > Just a drive-by comment as this isn't code I normally have much to do with but to me it would look a lot cleaner to define `push_paired`/`pop_paired` (maybe abbreviating directly to `pushp`/`popp`?) rather than passing the boolean. Hi David (@dholmes-ora), Thanks for the suggestion! We're open to changes in the API as suggested by the community. The users need to be aware that `push_paired`/`pop_paired` or `pushp`/`popp` will fallback to the legacy push/pop instructions if the CPU does not support APX features. 
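A minimal sketch of that fallback behavior, for illustration only. `SketchAssembler` is a hypothetical stand-in that records mnemonics as strings; it is not the HotSpot `MacroAssembler`, and the method names simply follow the `push_paired`/`pop_paired` spelling suggested above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an assembler: records mnemonics instead of
// emitting machine code, so the fallback decision is easy to observe.
class SketchAssembler {
public:
    explicit SketchAssembler(bool cpu_supports_apx)
        : _supports_apx(cpu_supports_apx) {}

    // Emits the paired form only when APX is available; otherwise falls
    // back to the legacy instruction, as described above.
    void push_paired(const std::string& reg) {
        emit(_supports_apx ? "pushp" : "push", reg);
    }
    void pop_paired(const std::string& reg) {
        emit(_supports_apx ? "popp" : "pop", reg);
    }

    const std::vector<std::string>& code() const { return _code; }

private:
    void emit(const std::string& mnemonic, const std::string& reg) {
        assert(!reg.empty());
        _code.push_back(mnemonic + " " + reg);
    }

    bool _supports_apx;
    std::vector<std::string> _code;
};
```

Whichever naming is chosen, the caller writes the same code on both kinds of CPU and must not assume that the paired encoding was actually emitted.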
Thanks, Vamsi ------------- PR Comment: https://git.openjdk.org/jdk/pull/25889#issuecomment-3001458932 From mdoerr at openjdk.org Tue Jun 24 18:44:45 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 24 Jun 2025 18:44:45 GMT Subject: RFR: 8360405: [PPC64] some environments don't support mfdscr instruction [v2] In-Reply-To: References: Message-ID: > See https://github.com/openjdk/jdk/pull/20262#issuecomment-2999872960 Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Add comment. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25953/files - new: https://git.openjdk.org/jdk/pull/25953/files/48d7cc95..d7f1c1d8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25953&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25953&range=00-01 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25953.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25953/head:pull/25953 PR: https://git.openjdk.org/jdk/pull/25953 From mdoerr at openjdk.org Tue Jun 24 18:53:17 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 24 Jun 2025 18:53:17 GMT Subject: RFR: 8360405: [PPC64] some environments don't support mfdscr instruction [v3] In-Reply-To: References: Message-ID: > See https://github.com/openjdk/jdk/pull/20262#issuecomment-2999872960 Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Newline. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25953/files - new: https://git.openjdk.org/jdk/pull/25953/files/d7f1c1d8..7394dd28 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25953&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25953&range=01-02 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25953.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25953/head:pull/25953 PR: https://git.openjdk.org/jdk/pull/25953 From coleenp at openjdk.org Tue Jun 24 21:35:44 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 24 Jun 2025 21:35:44 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: <-CyJHR3KLJKebYK5HMcpQkY5AoRsPTrI6U4iu0CL3_Q=.d5689e6c-aefa-422a-9e66-75a8d080c696@github.com> On Fri, 20 Jun 2025 20:41:21 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. >> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. 
This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix the test Thank you Serguei and Dan for reviewing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-3001955294 From xpeng at openjdk.org Tue Jun 24 21:43:50 2025 From: xpeng at openjdk.org (Xiaolong Peng) Date: Tue, 24 Jun 2025 21:43:50 GMT Subject: RFR: 8354555: Add generic JFR events for TaskTerminator [v7] In-Reply-To: <_7FP2wNe8p3N8SxKdmCN1x4zKO8TT5JWRcWEt51i35c=.4fbac292-3cb7-48b9-922e-1114f74e0549@github.com> References: <_7FP2wNe8p3N8SxKdmCN1x4zKO8TT5JWRcWEt51i35c=.4fbac292-3cb7-48b9-922e-1114f74e0549@github.com> Message-ID: > The purpose of the PR is to add generic JFR events for TaskTerminator to track the attempts and timings that GC threads have tried to terminate GC tasks. > > Today only G1 emits JFR event with name `Termination` from [G1ParEvacuateFollowersClosure](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/g1/g1YoungCollector.cpp#L555-L563), all other garbage collectors don't emit any JFR event for the termination attempt at all. > > By adding this, it gives performance engineers the visibility to the termination attempts and termination time when GC threads trying to finish GC tasks, we could build tool to analyze the jfr events to determine if there is potential data structure issue in application code, e.g. very large LinkedList or LinkedBlockingQueue. 
> > For the test, I have manually tested different GCs with Flight Recording enabled and verified the events: > G1: > > jdk.GCPhaseParallel { > startTime = 23:09:34.124 (2025-05-22) > duration = 0.0108 ms > gcId = 0 > gcWorkerId = 8 > name = "Termination" > eventThread = "GC Thread#4" (osThreadId = 20483) > } > > jdk.GCPhaseParallel { > startTime = 23:09:34.124 (2025-05-22) > duration = 0.0467 ms > gcId = 0 > gcWorkerId = 2 > name = "Termination" > eventThread = "GC Thread#2" (osThreadId = 21251) > } > > jdk.GCPhaseParallel { > startTime = 23:09:34.124 (2025-05-22) > duration = 0.0474 ms > gcId = 0 > gcWorkerId = 1 > name = "Termination" > eventThread = "GC Thread#8" (osThreadId = 36359) > } > jdk.GCPhaseParallel { > startTime = 23:09:41.925 (2025-05-22) > duration = 0.000834 ms > gcId = 14 > gcWorkerId = 7 > name = "Termination: Parallel Marking" > eventThread = "GC Thread#1" (osThreadId = 21507) > } > > jdk.GCPhaseParallel { > startTime = 23:09:41.925 (2025-05-22) > duration = 0.000166 ms > gcId = 14 > gcWorkerId = 7 > name = "Termination: Parallel Marking" > eventThread = "GC Thread#1" (osThreadId = 21507) > } > > > Shenandoah: > > jdk.GCPhaseParallel { > startTime = 23:39:58.890 (2025-05-22) > duration = 0.0202 ms > gcId = 0 > gcWorkerId = 0 > name = "Termination: Concurrent Mark" > eventThread = "Shenandoah GC Threads#3" (osThreadId = 13827) > } > > jdk.GCPhaseParallel { > startTime = 23:39:58.890 (2025-05-22) > duration = 0.0205 ms > gcId = 0 > gcWorkerId = 1 > name = "Termination: Concurrent Mark" > eventThread = "Shenandoah GC Threads#1" (osThreadId = 14339) > } > > jdk.GCPhaseParallel { > startTime = 23:39:58.890 (2025-05-22) > duration = 0.0127 ms > gcId = 0 > gcWorkerId = 5 > name = "Termination: Final Mark" > eventThread = "Shenandoah G... Xiaolong Peng has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 31 commits: - Merge branch 'openjdk:master' into JDK-8354555 - Merge branch 'openjdk:master' into JDK-8354555 - Merge branch 'openjdk:master' into JDK-8354555 - Fix jft test failure - Merge branch 'master' into JDK-8354555 - Patch to fix the PR concerns - Emit exact same events for G1 as G1 is emitting today from G1EvacuateRegionsBaseTask and G1STWRefProcProxyTask - Add include "workerThread.hpp" - Touch up - Move TERMINATION_EVENT_NAME_PREFIX_ASSERT to taskTerminator.cpp - ... and 21 more: https://git.openjdk.org/jdk/compare/ba0c1223...c6ff7070 ------------- Changes: https://git.openjdk.org/jdk/pull/24676/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24676&range=06 Stats: 90 lines in 10 files changed: 68 ins; 7 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/24676.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24676/head:pull/24676 PR: https://git.openjdk.org/jdk/pull/24676 From liach at openjdk.org Tue Jun 24 22:01:14 2025 From: liach at openjdk.org (Chen Liang) Date: Tue, 24 Jun 2025 22:01:14 GMT Subject: RFR: 8360163: Create annotations to mark dumping method handles and runtime setup required classes Message-ID: Currently, the list of classes that have interdependencies and those that need runtimeSetup are maintained in a hardcoded list in CDS. This makes it risky for core library developers as they might introduce new interdependencies and observe CDS to fail. By moving the mechanism of these lists to core library annotations as a first step, we can gradually expose the AOT contracts as program semantics described by internal annotations, and also helps us to explore how we can expose these functionalities to the public later. ------------- Commit messages: - Merge branch 'master' of https://github.com/openjdk/jdk into exp/cds-mh-anno - Name this AOTCI - Rename MHArchived to AotInitializable - Years - Can this fix? 
- Seems redundant - Runtime setup - Merge branch 'master' of https://github.com/openjdk/jdk into exp/cds-mh-anno - Fix - Try pass flag as annotation? Changes: https://git.openjdk.org/jdk/pull/25922/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25922&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360163 Stats: 454 lines in 39 files changed: 304 ins; 108 del; 42 mod Patch: https://git.openjdk.org/jdk/pull/25922.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25922/head:pull/25922 PR: https://git.openjdk.org/jdk/pull/25922 From haosun at openjdk.org Wed Jun 25 00:11:28 2025 From: haosun at openjdk.org (Hao Sun) Date: Wed, 25 Jun 2025 00:11:28 GMT Subject: RFR: 8360405: [PPC64] some environments don't support mfdscr instruction [v3] In-Reply-To: References: Message-ID: <6lExDuYhWOezT9WGfdBj4IrKxpnaCJY5q44KdVZWKwI=.70b7e2b7-950f-4643-97fb-604f956a9d86@github.com> On Tue, 24 Jun 2025 18:53:17 GMT, Martin Doerr wrote: >> See https://github.com/openjdk/jdk/pull/20262#issuecomment-2999872960 > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Newline. Thanks for your fix. It works in my local test environment(QEMU) now. ------------- Marked as reviewed by haosun (Committer). PR Review: https://git.openjdk.org/jdk/pull/25953#pullrequestreview-2955762254 From jrose at openjdk.org Wed Jun 25 00:36:30 2025 From: jrose at openjdk.org (John R Rose) Date: Wed, 25 Jun 2025 00:36:30 GMT Subject: RFR: 8360163: Create annotations to mark dumping method handles and runtime setup required classes In-Reply-To: References: Message-ID: On Sat, 21 Jun 2025 00:03:26 GMT, Chen Liang wrote: > Currently, the list of classes that have interdependencies and those that need runtimeSetup are maintained in a hardcoded list in CDS. This makes it risky for core library developers as they might introduce new interdependencies and observe CDS to fail. 
By moving the mechanism of these lists to core library annotations as a first step, we can gradually expose the AOT contracts as program semantics described by internal annotations, and also helps us to explore how we can expose these functionalities to the public later. Please update the comment in `aotArtifactFinder.hpp` to mention the annotation; it currently mentions the C++ accessor but should also mention the annotation that drives it. A class is AOT-init if EITHER it has the `@AOTClInit` annotation OR it is in the heap for other reasons. So it's harmless to put `@AOTClInit` on those extra classes that need `runtimeSetup` calls. If you agree with that logic, then I recommend not having a separate `@RTS` anno as well. I don't think that second anno pulls its weight. Just execute the `runtimeSetup` method in any annotated class, as part of the contract of that annotation. In the future we may also want an `assemblyCleanup` hook in the same potential places. src/java.base/share/classes/java/lang/invoke/VarHandles.java line 751: > 749: // System.out.println("import jdk.internal.vm.annotation.ForceInline;"); > 750: // System.out.println("import jdk.internal.vm.annotation.Hidden;"); > 751: // System.out.println("import jdk.internal.vm.annotation.MethodHandleArchived;"); this name needs to be adjusted in the comments ------------- Changes requested by jrose (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25922#pullrequestreview-2955770916 PR Review Comment: https://git.openjdk.org/jdk/pull/25922#discussion_r2165174485 From cjplummer at openjdk.org Wed Jun 25 00:40:42 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 25 Jun 2025 00:40:42 GMT Subject: RFR: 8360312: Serviceability Agent tests fail with JFR enabled due to unknown thread type JfrRecorderThread Message-ID: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> Update SA to know about JfrRecorderThread, which was made a JavaThread in JDK 25 by [JDK-8352251](https://bugs.openjdk.org/browse/JDK-8352251). I'm also fixing ClhsdbJstackWithConcurrentLock, which was also failing with JFR enabled, but for a different reason (specified heap size was too small). Testing (in progress): - [x] tier1 ci - [x] tier1 ci with -XX:StartFlightRecording - [ ] tier5 ci ------------- Commit messages: - revisit my ABC's - Make JfrRecorderThread known to SA Changes: https://git.openjdk.org/jdk/pull/25960/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25960&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360312 Stats: 25 lines in 5 files changed: 13 ins; 9 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25960.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25960/head:pull/25960 PR: https://git.openjdk.org/jdk/pull/25960 From kbarrett at openjdk.org Wed Jun 25 00:41:52 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 25 Jun 2025 00:41:52 GMT Subject: RFR: 8360458: Rename Deferred<> to DeferredStatic<> and improve usage description Message-ID: <3rcBLXCifQLIBxub4PEXZlyKvjZYw8sTKCpQhOoOJ3M=.b24f11cf-7273-4ee4-9f2e-2f6d2875c8a8@github.com> Please review this change that renames Deferred<> to DeferredStatic<>, to better reflect the intended usage. (This involves renaming the source file.) This change also revises the documentation comment for the class to better describe the intended usage. 
In addition, there are a number of cleanups: (1) The include guard didn't get updated when the name was previously changed to Deferred. It's updated here to reflect the new name. (2) There were problems with the include block that are fixed here. (3) The changes from JDK-8359923 are backed out. They aren't useful with the intended usage model. (4) A gtest is added to test the class's functionality. Testing: mach5 tier1, including new gtest ------------- Commit messages: - gtest for DeferredStatic - update uses - update - rename deferred.hpp => deferredStatic.hpp Changes: https://git.openjdk.org/jdk/pull/25964/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25964&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360458 Stats: 306 lines in 7 files changed: 210 ins; 91 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/25964.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25964/head:pull/25964 PR: https://git.openjdk.org/jdk/pull/25964 From dholmes at openjdk.org Wed Jun 25 02:38:31 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 25 Jun 2025 02:38:31 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 20:41:21 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. >> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. 
With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix the test Marked as reviewed by dholmes (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25267#pullrequestreview-2956079608 From dholmes at openjdk.org Wed Jun 25 02:38:32 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 25 Jun 2025 02:38:32 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Tue, 24 Jun 2025 14:46:12 GMT, Erik Österlund wrote: > > the `cls` parameter is never actually used. So while it is supposed to refer to the class you have the static method jMethodID for, there is no requirement that it actually does, and could even be null. > > Not passing in the cls parameter, would be a clear user error though, right? And one that would have crashed before, because if you racingly execute bytecodes of a class that is being unloaded, things would blow up one way or another. To me it seems like the user should just pass in the class as intended and then all is good. The point is to try and make things more robust when the user does the unexpected.
As Coleen stated we are trying to handle cases where JNI code looks up a jMethodID in one place, stashes it away and then uses it elsewhere with no guarantee the class is being kept alive. I agree you would think they would have, and pass in, the original jclass reference, but the fact we don't actually use that value is not hard to determine and JNI code can take advantage of that - treating the `jMethodId` as an effective raw pointer to a method. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-3002494069 From dholmes at openjdk.org Wed Jun 25 02:38:32 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 25 Jun 2025 02:38:32 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: <_f9CuTsqV-WEq5F8dLzaSFckSe3YdyUGVVTOZWdnWXQ=.4008de9c-1812-4768-aaf8-a464ab100925@github.com> References: <_f9CuTsqV-WEq5F8dLzaSFckSe3YdyUGVVTOZWdnWXQ=.4008de9c-1812-4768-aaf8-a464ab100925@github.com> Message-ID: On Tue, 24 Jun 2025 13:04:19 GMT, Coleen Phillimore wrote: > The jmethodID table is so that jmethodID isn't a stale pointer itself and doesn't require us to hold a stale pointer, but whether it can return a stale Method* (now and before this change) is something we should figure out how it should work. That's fine. I (per my response to Thomas) thought the new approach also closed the door on unsafe usage of the `jMethodID`, but that is not the case. It probably does close the door quite a bit in relation to current approach though. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-3002503846 From dholmes at openjdk.org Wed Jun 25 02:51:36 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 25 Jun 2025 02:51:36 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v5] In-Reply-To: References: Message-ID: On Tue, 24 Jun 2025 13:25:20 GMT, Anton Artemov wrote: >> This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. 
>> >> The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. >> >> In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). >> >> Lightweight locking is the default locking from now on. >> >> Tested in tiers 1 - 7. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8359437: Addressed reviewer's comment Still good. Marked as reviewed by dholmes (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25847#pullrequestreview-2956124578 PR Review: https://git.openjdk.org/jdk/pull/25847#pullrequestreview-2956125925 From dholmes at openjdk.org Wed Jun 25 04:50:27 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 25 Jun 2025 04:50:27 GMT Subject: RFR: 8360312: Serviceability Agent tests fail with JFR enabled due to unknown thread type JfrRecorderThread In-Reply-To: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> References: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> Message-ID: <-vXX2eYTSYa1uGoZpVKeI3co2ygb5tvf133u5u7lFCA=.d668d80a-f55d-4d32-b205-ed64e9a94c1a@github.com> On Tue, 24 Jun 2025 21:15:06 GMT, Chris Plummer wrote: > Update SA to know about JfrRecorderThread, which was made a JavaThread in JDK 25 by [JDK-8352251](https://bugs.openjdk.org/browse/JDK-8352251). > > I'm also fixing ClhsdbJstackWithConcurrentLock, which was also failing with JFR enabled, but for a different reason (specified heap size was too small). 
> > Testing (in progress): > > - [x] tier1 ci > - [x] tier1 ci with -XX:StartFlightRecording > - [ ] tier5 ci Would a simple forward declaration of `class JfrRecordThread` in vmStructs.cpp avoid the need to move the class definition to the header file? Otherwise fix looks good. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25960#pullrequestreview-2956528984 From sspitsyn at openjdk.org Wed Jun 25 05:52:29 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 25 Jun 2025 05:52:29 GMT Subject: RFR: 8360312: Serviceability Agent tests fail with JFR enabled due to unknown thread type JfrRecorderThread In-Reply-To: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> References: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> Message-ID: On Tue, 24 Jun 2025 21:15:06 GMT, Chris Plummer wrote: > Update SA to know about JfrRecorderThread, which was made a JavaThread in JDK 25 by [JDK-8352251](https://bugs.openjdk.org/browse/JDK-8352251). > > I'm also fixing ClhsdbJstackWithConcurrentLock, which was also failing with JFR enabled, but for a different reason (specified heap size was too small). > > Testing (in progress): > > - [x] tier1 ci > - [x] tier1 ci with -XX:StartFlightRecording > - [ ] tier5 ci This looks good except the copyright header update in last file. test/hotspot/jtreg/serviceability/sa/ClhsdbJstackWithConcurrentLock.java line 2: > 1: /* > 2: * Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved. This seems to be an incorrect way to update the copyright header. ------------- Marked as reviewed by sspitsyn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25960#pullrequestreview-2956718918 PR Review Comment: https://git.openjdk.org/jdk/pull/25960#discussion_r2165838975 From cjplummer at openjdk.org Wed Jun 25 06:34:10 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 25 Jun 2025 06:34:10 GMT Subject: RFR: 8360312: Serviceability Agent tests fail with JFR enabled due to unknown thread type JfrRecorderThread [v2] In-Reply-To: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> References: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> Message-ID: > Update SA to know about JfrRecorderThread, which was made a JavaThread in JDK 25 by [JDK-8352251](https://bugs.openjdk.org/browse/JDK-8352251). > > I'm also fixing ClhsdbJstackWithConcurrentLock, which was also failing with JFR enabled, but for a different reason (specified heap size was too small). > > Testing (in progress): > > - [x] tier1 ci > - [x] tier1 ci with -XX:StartFlightRecording > - [x] tier5 ci Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: fix copyright ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25960/files - new: https://git.openjdk.org/jdk/pull/25960/files/a679b236..28a97a11 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25960&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25960&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25960.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25960/head:pull/25960 PR: https://git.openjdk.org/jdk/pull/25960 From cjplummer at openjdk.org Wed Jun 25 06:34:10 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 25 Jun 2025 06:34:10 GMT Subject: RFR: 8360312: Serviceability Agent tests fail with JFR enabled due to unknown thread type JfrRecorderThread [v2] In-Reply-To: 
<-vXX2eYTSYa1uGoZpVKeI3co2ygb5tvf133u5u7lFCA=.d668d80a-f55d-4d32-b205-ed64e9a94c1a@github.com> References: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> <-vXX2eYTSYa1uGoZpVKeI3co2ygb5tvf133u5u7lFCA=.d668d80a-f55d-4d32-b205-ed64e9a94c1a@github.com> Message-ID: <8e1B3OlCE23T-_4_mHcJ6GIxpAAv0NNTDx8MSkZMxlA=.1046aa86-0a10-4abc-8d0a-7f1374e283a5@github.com> On Wed, 25 Jun 2025 04:48:07 GMT, David Holmes wrote: > Would a simple forward declaration of `class JfrRecordThread` in vmStructs.cpp avoid the need to move the class definition to the header file? vmStructs does a sizeof(JfrRecordThread) so it needs the full class definition, not just a declaration. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25960#issuecomment-3003507738 From cjplummer at openjdk.org Wed Jun 25 06:34:10 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 25 Jun 2025 06:34:10 GMT Subject: RFR: 8360312: Serviceability Agent tests fail with JFR enabled due to unknown thread type JfrRecorderThread [v2] In-Reply-To: References: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> Message-ID: On Wed, 25 Jun 2025 05:48:24 GMT, Serguei Spitsyn wrote: >> Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: >> >> fix copyright > > test/hotspot/jtreg/serviceability/sa/ClhsdbJstackWithConcurrentLock.java line 2: > >> 1: /* >> 2: * Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved. > > This seems to be an incorrect way to update the copyright header. Yes, I will fix. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25960#discussion_r2165892011 From qxing at openjdk.org Wed Jun 25 06:51:42 2025 From: qxing at openjdk.org (Qizheng Xing) Date: Wed, 25 Jun 2025 06:51:42 GMT Subject: RFR: 8360474: Add missing include guards for some HotSpot headers Message-ID: Some HotSpot header files are missing include guards, which may cause compilation errors if these files are included multiple times. This patch adds include guards for them. ------------- Commit messages: - Fix trailing whirespace - Add missing include guards Changes: https://git.openjdk.org/jdk/pull/25968/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25968&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360474 Stats: 21 lines in 4 files changed: 19 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25968.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25968/head:pull/25968 PR: https://git.openjdk.org/jdk/pull/25968 From dholmes at openjdk.org Wed Jun 25 06:52:34 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 25 Jun 2025 06:52:34 GMT Subject: RFR: 8360312: Serviceability Agent tests fail with JFR enabled due to unknown thread type JfrRecorderThread [v2] In-Reply-To: References: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> Message-ID: On Wed, 25 Jun 2025 06:34:10 GMT, Chris Plummer wrote: >> Update SA to know about JfrRecorderThread, which was made a JavaThread in JDK 25 by [JDK-8352251](https://bugs.openjdk.org/browse/JDK-8352251). >> >> I'm also fixing ClhsdbJstackWithConcurrentLock, which was also failing with JFR enabled, but for a different reason (specified heap size was too small). 
>> >> Testing (in progress): >> >> - [x] tier1 ci >> - [x] tier1 ci with -XX:StartFlightRecording >> - [x] tier5 ci > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > fix copyright Marked as reviewed by dholmes (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25960#pullrequestreview-2956862618 From iveresov at openjdk.org Wed Jun 25 07:02:05 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 25 Jun 2025 07:02:05 GMT Subject: [jdk25] RFR: 8359788: Internal Error: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded Message-ID: 8359788: Internal Error: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded ------------- Commit messages: - Backport 5c4f92ba9a2b820fa12920400c9037b5d3c37aa4 Changes: https://git.openjdk.org/jdk/pull/25969/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25969&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359788 Stats: 7 lines in 1 file changed: 6 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25969.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25969/head:pull/25969 PR: https://git.openjdk.org/jdk/pull/25969 From dholmes at openjdk.org Wed Jun 25 07:12:28 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 25 Jun 2025 07:12:28 GMT Subject: RFR: 8360474: Add missing include guards for some HotSpot headers In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 06:44:31 GMT, Qizheng Xing wrote: > Some HotSpot header files are missing include guards, which may cause compilation errors if these files are included multiple times. This patch adds include guards for them. That looks fine to me. Thanks ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25968#pullrequestreview-2956923970 From mhaessig at openjdk.org Wed Jun 25 07:25:28 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Wed, 25 Jun 2025 07:25:28 GMT Subject: RFR: 8360474: Add missing include guards for some HotSpot headers In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 06:44:31 GMT, Qizheng Xing wrote: > Some HotSpot header files are missing include guards, which may cause compilation errors if these files are included multiple times. This patch adds include guards for them. Good catch! Thank you for fixing this. Looks good to me! ------------- Marked as reviewed by mhaessig (Committer). PR Review: https://git.openjdk.org/jdk/pull/25968#pullrequestreview-2956965167 From jsikstro at openjdk.org Wed Jun 25 07:33:28 2025 From: jsikstro at openjdk.org (Joel =?UTF-8?B?U2lrc3Ryw7Zt?=) Date: Wed, 25 Jun 2025 07:33:28 GMT Subject: RFR: 8360458: Rename Deferred<> to DeferredStatic<> and improve usage description In-Reply-To: <3rcBLXCifQLIBxub4PEXZlyKvjZYw8sTKCpQhOoOJ3M=.b24f11cf-7273-4ee4-9f2e-2f6d2875c8a8@github.com> References: <3rcBLXCifQLIBxub4PEXZlyKvjZYw8sTKCpQhOoOJ3M=.b24f11cf-7273-4ee4-9f2e-2f6d2875c8a8@github.com> Message-ID: On Wed, 25 Jun 2025 00:36:44 GMT, Kim Barrett wrote: > Please review this change that renames Deferred<> to DeferredStatic<>, to > better reflect the intended usage. (This involves renaming the source file.) > This change also revises the documentation comment for the class to better > describe the intended usage. > > In addition, there are a number of cleanups: > > (1) The include guard didn't get updated when the name was previously changed > to Deferred. It's updated here to reflect the new name. > > (2) There were problems with the include block that are fixed here. > > (3) The changes from JDK-8359923 are backed out. They aren't useful with the > intended usage model. > > (4) A gtest is added to test the class's functionality. 
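For readers who want a concrete picture of the usage the description above refers to, here is a minimal sketch of deferred initialization for an object with static storage duration. It is an illustration only and is not the HotSpot `DeferredStatic<>` class: the names `ExampleDeferredStatic`, `Counter`, `the_counter`, and `init_and_read` are hypothetical, and the sketch assumes the intended usage is "reserve storage statically, construct explicitly later, never destruct".

```cpp
#include <cassert>
#include <new>
#include <utility>

// Hypothetical illustration only; this is NOT the HotSpot DeferredStatic<> class.
// It reserves suitably aligned storage for a T and constructs the T only when
// initialize() is called. The contained object is never destructed.
template <typename T>
class ExampleDeferredStatic {
  alignas(T) unsigned char _storage[sizeof(T)];
  bool _initialized = false;

public:
  // Construct the contained object in place, forwarding the arguments to T's
  // constructor. Must be called exactly once, before any call to get().
  template <typename... Args>
  void initialize(Args&&... args) {
    assert(!_initialized);
    ::new (static_cast<void*>(_storage)) T(std::forward<Args>(args)...);
    _initialized = true;
  }

  // Access the contained object; only valid after initialize().
  T* get() {
    assert(_initialized);
    return std::launder(reinterpret_cast<T*>(_storage));
  }

  T* operator->() { return get(); }
};

// Hypothetical payload type with a non-trivial constructor.
struct Counter {
  int value;
  explicit Counter(int start) : value(start) {}
};

// Static storage is reserved here, but no Counter constructor runs yet.
static ExampleDeferredStatic<Counter> the_counter;

// Explicitly choose the initialization point, then use the object.
int init_and_read() {
  the_counter.initialize(41);
  the_counter->value += 1;
  return the_counter->value;
}
```

The point of the sketch is that the time of construction is chosen by the caller of `initialize()` rather than by the unspecified order of static initialization across translation units.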
> > Testing: mach5 tier1, including new gtest Thank you for this Kim! As the author of JDK-8359923, which is backed out here, I think this patch makes the intended use case of (now) DeferredStatic very clear. test/hotspot/gtest/utilities/test_deferredStatic.cpp line 31: > 29: #include > 30: > 31: #include "unittest.hpp" I see you've discussed the order of this include in https://github.com/openjdk/jdk/pull/25927#discussion_r2164307139 already. I agree with David that we should document an agreed upon style. I don't have anything against this style. ------------- Marked as reviewed by jsikstro (Committer). PR Review: https://git.openjdk.org/jdk/pull/25964#pullrequestreview-2956971731 PR Review Comment: https://git.openjdk.org/jdk/pull/25964#discussion_r2165996528 From jsjolen at openjdk.org Wed Jun 25 08:01:44 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 25 Jun 2025 08:01:44 GMT Subject: RFR: 8360458: Rename Deferred<> to DeferredStatic<> and improve usage description In-Reply-To: <3rcBLXCifQLIBxub4PEXZlyKvjZYw8sTKCpQhOoOJ3M=.b24f11cf-7273-4ee4-9f2e-2f6d2875c8a8@github.com> References: <3rcBLXCifQLIBxub4PEXZlyKvjZYw8sTKCpQhOoOJ3M=.b24f11cf-7273-4ee4-9f2e-2f6d2875c8a8@github.com> Message-ID: On Wed, 25 Jun 2025 00:36:44 GMT, Kim Barrett wrote: > Please review this change that renames Deferred<> to DeferredStatic<>, to > better reflect the intended usage. (This involves renaming the source file.) > This change also revises the documentation comment for the class to better > describe the intended usage. > > In addition, there are a number of cleanups: > > (1) The include guard didn't get updated when the name was previously changed > to Deferred. It's updated here to reflect the new name. > > (2) There were problems with the include block that are fixed here. > > (3) The changes from JDK-8359923 are backed out. They aren't useful with the > intended usage model. > > (4) A gtest is added to test the class's functionality. 
> > Testing: mach5 tier1, including new gtest These changes look good to me, thank you. ------------- Marked as reviewed by jsjolen (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25964#pullrequestreview-2957070347 From sspitsyn at openjdk.org Wed Jun 25 08:03:29 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 25 Jun 2025 08:03:29 GMT Subject: RFR: 8360312: Serviceability Agent tests fail with JFR enabled due to unknown thread type JfrRecorderThread [v2] In-Reply-To: References: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> Message-ID: On Wed, 25 Jun 2025 06:34:10 GMT, Chris Plummer wrote: >> Update SA to know about JfrRecorderThread, which was made a JavaThread in JDK 25 by [JDK-8352251](https://bugs.openjdk.org/browse/JDK-8352251). >> >> I'm also fixing ClhsdbJstackWithConcurrentLock, which was also failing with JFR enabled, but for a different reason (specified heap size was too small). >> >> Testing (in progress): >> >> - [x] tier1 ci >> - [x] tier1 ci with -XX:StartFlightRecording >> - [x] tier5 ci > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > fix copyright Marked as reviewed by sspitsyn (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25960#pullrequestreview-2957075591 From stefank at openjdk.org Wed Jun 25 08:06:29 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 25 Jun 2025 08:06:29 GMT Subject: RFR: 8360474: Add missing include guards for some HotSpot headers In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 06:44:31 GMT, Qizheng Xing wrote: > Some HotSpot header files are missing include guards, which may cause compilation errors if these files are included multiple times. This patch adds include guards for them. 
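As a reminder of why a missing include guard can break compilation, the following sketch simulates one header being pulled in twice within a single translation unit. The guard macro `EXAMPLE_DECODER_HPP`, the struct `ExampleDecoder`, and the function `decoder_id` are hypothetical names chosen for illustration; they are not taken from the patch.

```cpp
#include <cassert>

// First "inclusion" of a hypothetical header. The guard macro is not yet
// defined, so the body is compiled and the macro becomes defined.
#ifndef EXAMPLE_DECODER_HPP
#define EXAMPLE_DECODER_HPP
struct ExampleDecoder {
  int id;
};
#endif // EXAMPLE_DECODER_HPP

// Second "inclusion" of the same header text. The guard macro is now defined,
// so the preprocessor skips the body. Without the guard, this would be a
// redefinition of ExampleDecoder and the translation unit would not compile.
#ifndef EXAMPLE_DECODER_HPP
#define EXAMPLE_DECODER_HPP
struct ExampleDecoder {
  int id;
};
#endif // EXAMPLE_DECODER_HPP

// The type is usable and was defined exactly once.
int decoder_id() {
  ExampleDecoder d{7};
  return d.id;
}
```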
src/hotspot/os/aix/decoder_aix.hpp line 51: > 49: > 50: #endif // OS_AIX_DECODER_AIX_HPP > 51: Suggestion: #endif // OS_AIX_DECODER_AIX_HPP ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25968#discussion_r2166069377 From stefank at openjdk.org Wed Jun 25 08:06:30 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 25 Jun 2025 08:06:30 GMT Subject: RFR: 8360474: Add missing include guards for some HotSpot headers In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 08:03:09 GMT, Stefan Karlsson wrote: >> Some HotSpot header files are missing include guards, which may cause compilation errors if these files are included multiple times. This patch adds include guards for them. > > src/hotspot/os/aix/decoder_aix.hpp line 51: > >> 49: >> 50: #endif // OS_AIX_DECODER_AIX_HPP >> 51: > > Suggestion: > > #endif // OS_AIX_DECODER_AIX_HPP It looks like there's an extra blankline here (at least in the GitHub view) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25968#discussion_r2166070641 From stefank at openjdk.org Wed Jun 25 08:11:31 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 25 Jun 2025 08:11:31 GMT Subject: RFR: 8360458: Rename Deferred<> to DeferredStatic<> and improve usage description In-Reply-To: <3rcBLXCifQLIBxub4PEXZlyKvjZYw8sTKCpQhOoOJ3M=.b24f11cf-7273-4ee4-9f2e-2f6d2875c8a8@github.com> References: <3rcBLXCifQLIBxub4PEXZlyKvjZYw8sTKCpQhOoOJ3M=.b24f11cf-7273-4ee4-9f2e-2f6d2875c8a8@github.com> Message-ID: On Wed, 25 Jun 2025 00:36:44 GMT, Kim Barrett wrote: > Please review this change that renames Deferred<> to DeferredStatic<>, to > better reflect the intended usage. (This involves renaming the source file.) > This change also revises the documentation comment for the class to better > describe the intended usage. > > In addition, there are a number of cleanups: > > (1) The include guard didn't get updated when the name was previously changed > to Deferred. 
It's updated here to reflect the new name. > > (2) There were problems with the include block that are fixed here. > > (3) The changes from JDK-8359923 are backed out. They aren't useful with the > intended usage model. > > (4) A gtest is added to test the class's functionality. > > Testing: mach5 tier1, including new gtest Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25964#pullrequestreview-2957100542 From kevinw at openjdk.org Wed Jun 25 08:11:32 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 25 Jun 2025 08:11:32 GMT Subject: RFR: 8360312: Serviceability Agent tests fail with JFR enabled due to unknown thread type JfrRecorderThread [v2] In-Reply-To: References: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> Message-ID: On Wed, 25 Jun 2025 06:34:10 GMT, Chris Plummer wrote: >> Update SA to know about JfrRecorderThread, which was made a JavaThread in JDK 25 by [JDK-8352251](https://bugs.openjdk.org/browse/JDK-8352251). >> >> I'm also fixing ClhsdbJstackWithConcurrentLock, which was also failing with JFR enabled, but for a different reason (specified heap size was too small). >> >> Testing (in progress): >> >> - [x] tier1 ci >> - [x] tier1 ci with -XX:StartFlightRecording >> - [x] tier5 ci > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > fix copyright Marked as reviewed by kevinw (Reviewer). src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Threads.java line 200: > 198: " (expected type JavaThread, CompilerThread, MonitorDeflationThread, AttachListenerThread," + > 199: " DeoptimizeObjectsALotThread, StringDedupThread, NotificationThread, ServiceThread," + > 200: "JfrRecorderThread, or JvmtiAgentThread)", e); nitpicking: space? 
------------- PR Review: https://git.openjdk.org/jdk/pull/25960#pullrequestreview-2957100831 PR Review Comment: https://git.openjdk.org/jdk/pull/25960#discussion_r2166077918 From qxing at openjdk.org Wed Jun 25 08:13:43 2025 From: qxing at openjdk.org (Qizheng Xing) Date: Wed, 25 Jun 2025 08:13:43 GMT Subject: RFR: 8360474: Add missing include guards for some HotSpot headers [v2] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 08:03:50 GMT, Stefan Karlsson wrote: >> src/hotspot/os/aix/decoder_aix.hpp line 51: >> >>> 49: >>> 50: #endif // OS_AIX_DECODER_AIX_HPP >>> 51: >> >> Suggestion: >> >> #endif // OS_AIX_DECODER_AIX_HPP > > It looks like there's an extra blankline here (at least in the GitHub view) Got it, I removed the extra trailing new line. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25968#discussion_r2166081004 From qxing at openjdk.org Wed Jun 25 08:13:43 2025 From: qxing at openjdk.org (Qizheng Xing) Date: Wed, 25 Jun 2025 08:13:43 GMT Subject: RFR: 8360474: Add missing include guards for some HotSpot headers [v2] In-Reply-To: References: Message-ID: > Some HotSpot header files are missing include guards, which may cause compilation errors if these files are included multiple times. This patch adds include guards for them. 
Qizheng Xing has updated the pull request incrementally with one additional commit since the last revision: Remove extra trailing new line ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25968/files - new: https://git.openjdk.org/jdk/pull/25968/files/db5e263f..75d90ef0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25968&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25968&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25968.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25968/head:pull/25968 PR: https://git.openjdk.org/jdk/pull/25968 From duke at openjdk.org Wed Jun 25 08:17:42 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 25 Jun 2025 08:17:42 GMT Subject: Withdrawn: 8357086: os::xxx functions returning memory size should return size_t In-Reply-To: References: Message-ID: On Mon, 26 May 2025 12:56:26 GMT, Anton Artemov wrote: > Hi, > > in this PR the output value type for functions which return memory are changed, namely: > > > static julong available_memory(); --> static bool available_memory(size_t& value); > static julong used_memory(); --> static bool used_memory(size_t& value); > static julong free_memory(); --> static bool free_memory(size_t& value); > static jlong total_swap_space(); --> static bool total_swap_space(size_t& value); > static jlong free_swap_space(); --> static bool free_swap_space(size_t& value); > static julong physical_memory(); --> static bool physical_memory(size_t& value); > > > The return boolean value indicates success, whereas the actual value is assigned to the input argument. The following usage pattern is added: value is initialized to zero -> method call -> in case of a failure `os::abort()` is executed. 
> > `physical_memory()` has slightly different mechanism, as the `_physical_memory` variable is assigned in different methods on different operating systems, `std::numeric_limits::max()` value is used to indicate an error. > > Later, the return value should be attributed with `[[nodiscard]]`. > > Tested in GHA and Tiers 1-4. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/25450 From mdoerr at openjdk.org Wed Jun 25 08:24:30 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 25 Jun 2025 08:24:30 GMT Subject: RFR: 8360405: [PPC64] some environments don't support mfdscr instruction [v3] In-Reply-To: References: Message-ID: On Tue, 24 Jun 2025 18:53:17 GMT, Martin Doerr wrote: >> Add back `has_mfdscr()` checks which were recently removed. See https://github.com/openjdk/jdk/pull/20262#issuecomment-2999872960 > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Newline. Thanks for testing and for the review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25953#issuecomment-3003839009 From ayang at openjdk.org Wed Jun 25 08:39:42 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 25 Jun 2025 08:39:42 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v15] In-Reply-To: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: > This patch refines Parallel's sizing strategy to improve overall memory management and performance. > > The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. 
Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. > > `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. > > GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. > > ## Performance evaluation > > - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). > - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). > - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. > > PS: I have opportunistically set the obsolete/expired version to ~~25/26~~ 26/27 for now. I will update them accordingly before merging. > > Test: tier1-8 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 21 commits: - Merge branch 'master' into pgc-size-policy - Merge branch 'master' into pgc-size-policy - review - Merge branch 'master' into pgc-size-policy - merge - version - Merge branch 'master' into pgc-size-policy - revert-aliases - Merge branch 'master' into pgc-size-policy - merge - ... and 11 more: https://git.openjdk.org/jdk/compare/75ce44aa...7f733137 ------------- Changes: https://git.openjdk.org/jdk/pull/25000/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25000&range=14 Stats: 4371 lines in 31 files changed: 520 ins; 3452 del; 399 mod Patch: https://git.openjdk.org/jdk/pull/25000.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25000/head:pull/25000 PR: https://git.openjdk.org/jdk/pull/25000 From rrich at openjdk.org Wed Jun 25 08:57:30 2025 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 25 Jun 2025 08:57:30 GMT Subject: RFR: 8360405: [PPC64] some environments don't support mfdscr instruction [v3] In-Reply-To: References: Message-ID: On Tue, 24 Jun 2025 18:53:17 GMT, Martin Doerr wrote: >> Add back `has_mfdscr()` checks which were recently removed. See https://github.com/openjdk/jdk/pull/20262#issuecomment-2999872960 > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Newline. Lgtm. Cheers, Richard. ------------- Marked as reviewed by rrich (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25953#pullrequestreview-2957294349 From mdoerr at openjdk.org Wed Jun 25 09:02:48 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 25 Jun 2025 09:02:48 GMT Subject: RFR: 8360405: [PPC64] some environments don't support mfdscr instruction [v3] In-Reply-To: References: Message-ID: On Tue, 24 Jun 2025 18:53:17 GMT, Martin Doerr wrote: >> Add back `has_mfdscr()` checks which were recently removed. 
See https://github.com/openjdk/jdk/pull/20262#issuecomment-2999872960 > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Newline. Thanks for the review! GHA and our internal tests have passed. We'll also need a JDK25 backport. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25953#issuecomment-3003983656 From mdoerr at openjdk.org Wed Jun 25 09:02:49 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 25 Jun 2025 09:02:49 GMT Subject: Integrated: 8360405: [PPC64] some environments don't support mfdscr instruction In-Reply-To: References: Message-ID: On Tue, 24 Jun 2025 15:19:50 GMT, Martin Doerr wrote: > Add back `has_mfdscr()` checks which were recently removed. See https://github.com/openjdk/jdk/pull/20262#issuecomment-2999872960 This pull request has now been integrated. Changeset: f71d64fb Author: Martin Doerr URL: https://git.openjdk.org/jdk/commit/f71d64fbeb0c196fd825241ff86d3a103d05a842 Stats: 73 lines in 4 files changed: 33 ins; 0 del; 40 mod 8360405: [PPC64] some environments don't support mfdscr instruction Reviewed-by: haosun, rrich ------------- PR: https://git.openjdk.org/jdk/pull/25953 From stefank at openjdk.org Wed Jun 25 09:07:29 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 25 Jun 2025 09:07:29 GMT Subject: RFR: 8360474: Add missing include guards for some HotSpot headers [v2] In-Reply-To: References: Message-ID: <72uKPIy19K7ZSM2Xa9Kaj2gy43mTb6d1QVf55OX5CTg=.a61fab90-0e05-4d1f-a47b-63cf9c47a30e@github.com> On Wed, 25 Jun 2025 08:13:43 GMT, Qizheng Xing wrote: >> Some HotSpot header files are missing include guards, which may cause compilation errors if these files are included multiple times. This patch adds include guards for them. > > Qizheng Xing has updated the pull request incrementally with one additional commit since the last revision: > > Remove extra trailing new line Marked as reviewed by stefank (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25968#pullrequestreview-2957328124 From mdoerr at openjdk.org Wed Jun 25 09:12:41 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 25 Jun 2025 09:12:41 GMT Subject: [jdk25] RFR: 8360405: [PPC64] some environments don't support mfdscr instruction Message-ID: <0z1WyiZXaJ6dSi3SeFSrXLRexSLEPa2_kNX8udMFkn8=.2fcbfdea-5449-4958-b438-700270f586cf@github.com> Clean backport of [JDK-8360405](https://bugs.openjdk.org/browse/JDK-8360405). ------------- Commit messages: - Backport f71d64fbeb0c196fd825241ff86d3a103d05a842 Changes: https://git.openjdk.org/jdk/pull/25972/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25972&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360405 Stats: 73 lines in 4 files changed: 33 ins; 0 del; 40 mod Patch: https://git.openjdk.org/jdk/pull/25972.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25972/head:pull/25972 PR: https://git.openjdk.org/jdk/pull/25972 From xgong at openjdk.org Wed Jun 25 09:16:48 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Wed, 25 Jun 2025 09:16:48 GMT Subject: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API [v2] In-Reply-To: References: Message-ID: > JDK-8318650 introduced hotspot intrinsification of subword gather load APIs for X86 platforms [1]. However, the current implementation is not optimal for AArch64 SVE platform, which natively supports vector instructions for subword gather load operations using an int vector for indices (see [2][3]). > > Two key areas require improvement: > 1. At the Java level, vector indices generated for range validation could be reused for the subsequent gather load operation on architectures with native vector instructions like AArch64 SVE. However, the current implementation prevents compiler reuse of these index vectors due to divergent control flow, potentially impacting performance. > 2. 
At the compiler IR level, the additional `offset` input for `LoadVectorGather`/`LoadVectorGatherMasked` with subword types increases IR complexity and complicates backend implementation. Furthermore, generating `add` instructions before each memory access negatively impacts performance. > > This patch refactors the implementation at both the Java level and compiler mid-end to improve efficiency and maintainability across different architectures. > > Main changes: > 1. Java-side API refactoring: > - Explicitly passes generated index vectors to hotspot, eliminating duplicate index vectors for gather load instructions on > architectures like AArch64. > 2. C2 compiler IR refactoring: > - Refactors `LoadVectorGather`/`LoadVectorGatherMasked` IR for subword types by removing the memory offset input and incorporating it into the memory base `addr` at the IR level. This simplifies backend implementation, reduces add operations, and unifies the IR across all types. > 3. Backend changes: > - Streamlines X86 implementation of subword gather operations following the removal of the offset input from the IR level. > > Performance: > The performance of the relative JMH improves up to 27% on a X86 AVX512 system. Please see the data below: > > Benchmark Mode Cnt Unit SIZE Before After Gain > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 64 53682.012 52650.325 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 256 14484.252 14255.156 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 1024 3664.900 3595.615 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 4096 908.312 935.269 1.02 > GatherOperationsBenchmark.micr... Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains three commits: - Address review comments - Merge 'jdk:master' into JDK-8355563 - 8355563: VectorAPI: Refactor current implementation of subword gather load API ------------- Changes: https://git.openjdk.org/jdk/pull/25138/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25138&range=01 Stats: 450 lines in 15 files changed: 109 ins; 176 del; 165 mod Patch: https://git.openjdk.org/jdk/pull/25138.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25138/head:pull/25138 PR: https://git.openjdk.org/jdk/pull/25138 From xgong at openjdk.org Wed Jun 25 09:16:48 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Wed, 25 Jun 2025 09:16:48 GMT Subject: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API In-Reply-To: References: Message-ID: On Fri, 9 May 2025 07:35:41 GMT, Xiaohong Gong wrote: > JDK-8318650 introduced hotspot intrinsification of subword gather load APIs for X86 platforms [1]. However, the current implementation is not optimal for AArch64 SVE platform, which natively supports vector instructions for subword gather load operations using an int vector for indices (see [2][3]). > > Two key areas require improvement: > 1. At the Java level, vector indices generated for range validation could be reused for the subsequent gather load operation on architectures with native vector instructions like AArch64 SVE. However, the current implementation prevents compiler reuse of these index vectors due to divergent control flow, potentially impacting performance. > 2. At the compiler IR level, the additional `offset` input for `LoadVectorGather`/`LoadVectorGatherMasked` with subword types increases IR complexity and complicates backend implementation. Furthermore, generating `add` instructions before each memory access negatively impacts performance. 
> > This patch refactors the implementation at both the Java level and compiler mid-end to improve efficiency and maintainability across different architectures. > > Main changes: > 1. Java-side API refactoring: > - Explicitly passes generated index vectors to hotspot, eliminating duplicate index vectors for gather load instructions on > architectures like AArch64. > 2. C2 compiler IR refactoring: > - Refactors `LoadVectorGather`/`LoadVectorGatherMasked` IR for subword types by removing the memory offset input and incorporating it into the memory base `addr` at the IR level. This simplifies backend implementation, reduces add operations, and unifies the IR across all types. > 3. Backend changes: > - Streamlines X86 implementation of subword gather operations following the removal of the offset input from the IR level. > > Performance: > The performance of the relative JMH improves up to 27% on a X86 AVX512 system. Please see the data below: > > Benchmark Mode Cnt Unit SIZE Before After Gain > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 64 53682.012 52650.325 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 256 14484.252 14255.156 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 1024 3664.900 3595.615 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 4096 908.312 935.269 1.02 > GatherOperationsBenchmark.micr... Hi, the counted loop recognizer patch mentioned above has been merged, so I've rebased this PR onto the latest jdk master.
The following is the new performance data for the subword gather JMH benchmarks on X86:

Benchmark SIZE Mode Cnt Unit Before After Gain
GatherOperationsBenchmark.microByteGather128 64 thrpt 30 ops/ms 44221.691 46837.124 1.05
GatherOperationsBenchmark.microByteGather128 256 thrpt 30 ops/ms 11245.455 12243.045 1.08
GatherOperationsBenchmark.microByteGather128 1024 thrpt 30 ops/ms 2825.246 3096.460 1.09
GatherOperationsBenchmark.microByteGather128 4096 thrpt 30 ops/ms 705.927 775.039 1.09
GatherOperationsBenchmark.microByteGather128_MASK 64 thrpt 30 ops/ms 46783.479 46357.684 0.99
GatherOperationsBenchmark.microByteGather128_MASK 256 thrpt 30 ops/ms 12810.405 12880.347 1.00
GatherOperationsBenchmark.microByteGather128_MASK 1024 thrpt 30 ops/ms 3150.320 3239.281 1.02
GatherOperationsBenchmark.microByteGather128_MASK 4096 thrpt 30 ops/ms 794.151 830.464 1.04
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF 64 thrpt 30 ops/ms 43189.395 47127.449 1.09
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF 256 thrpt 30 ops/ms 11543.128 13196.158 1.14
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF 1024 thrpt 30 ops/ms 2835.053 3300.357 1.16
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF 4096 thrpt 30 ops/ms 719.470 843.290 1.17
GatherOperationsBenchmark.microByteGather128_NZ_OFF 64 thrpt 30 ops/ms 44143.887 46836.788 1.06
GatherOperationsBenchmark.microByteGather128_NZ_OFF 256 thrpt 30 ops/ms 12206.908 12255.677 1.00
GatherOperationsBenchmark.microByteGather128_NZ_OFF 1024 thrpt 30 ops/ms 3094.232 3095.931 1.00
GatherOperationsBenchmark.microByteGather128_NZ_OFF 4096 thrpt 30 ops/ms 776.293 774.336 0.99
GatherOperationsBenchmark.microByteGather256 64 thrpt 30 ops/ms 46247.977 46803.899 1.01
GatherOperationsBenchmark.microByteGather256 256 thrpt 30 ops/ms 12198.878 12250.315 1.00
GatherOperationsBenchmark.microByteGather256 1024 thrpt 30 ops/ms 3093.356 3100.107 1.00
GatherOperationsBenchmark.microByteGather256 4096 thrpt 30 ops/ms 774.611 774.890 1.00
GatherOperationsBenchmark.microByteGather256_MASK 64 thrpt 30 ops/ms 46873.725 47967.422 1.02
GatherOperationsBenchmark.microByteGather256_MASK 256 thrpt 30 ops/ms 13025.578 13481.477 1.03
GatherOperationsBenchmark.microByteGather256_MASK 1024 thrpt 30 ops/ms 3317.651 3396.208 1.02
GatherOperationsBenchmark.microByteGather256_MASK 4096 thrpt 30 ops/ms 846.0888 864.8407 1.02
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF 64 thrpt 30 ops/ms 44488.365 48769.036 1.09
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF 256 thrpt 30 ops/ms 11988.552 13326.306 1.11
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF 1024 thrpt 30 ops/ms 2851.132 3377.599 1.18
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF 4096 thrpt 30 ops/ms 734.368 872.331 1.18
GatherOperationsBenchmark.microByteGather256_NZ_OFF 64 thrpt 30 ops/ms 44716.846 46816.743 1.04
GatherOperationsBenchmark.microByteGather256_NZ_OFF 256 thrpt 30 ops/ms 11885.251 12255.916 1.03
GatherOperationsBenchmark.microByteGather256_NZ_OFF 1024 thrpt 30 ops/ms 3016.645 3096.172 1.02
GatherOperationsBenchmark.microByteGather256_NZ_OFF 4096 thrpt 30 ops/ms 756.903 776.363 1.02
GatherOperationsBenchmark.microByteGather512 64 thrpt 30 ops/ms 44742.221 46848.590 1.04
GatherOperationsBenchmark.microByteGather512 256 thrpt 30 ops/ms 12081.443 12236.973 1.01
GatherOperationsBenchmark.microByteGather512 1024 thrpt 30 ops/ms 3086.873 3088.040 1.00
GatherOperationsBenchmark.microByteGather512 4096 thrpt 30 ops/ms 774.243 770.209 0.99
GatherOperationsBenchmark.microByteGather512_MASK 64 thrpt 30 ops/ms 50588.210 48220.741 0.95
GatherOperationsBenchmark.microByteGather512_MASK 256 thrpt 30 ops/ms 13535.785 13675.499 1.01
GatherOperationsBenchmark.microByteGather512_MASK 1024 thrpt 30 ops/ms 3355.724 3421.323 1.01
GatherOperationsBenchmark.microByteGather512_MASK 4096 thrpt 30 ops/ms 859.103 872.009 1.01
GatherOperationsBenchmark.microByteGather512_MASK_NZ_OFF 64 thrpt 30 ops/ms 44139.269 48320.364 1.09
GatherOperationsBenchmark.microByteGather512_MASK_NZ_OFF 256 thrpt 30 ops/ms 12500.697 13801.124 1.10
GatherOperationsBenchmark.microByteGather512_MASK_NZ_OFF 1024 thrpt 30 ops/ms 3135.082 3492.312 1.11
GatherOperationsBenchmark.microByteGather512_MASK_NZ_OFF 4096 thrpt 30 ops/ms 794.338 897.249 1.12
GatherOperationsBenchmark.microByteGather512_NZ_OFF 64 thrpt 30 ops/ms 45754.147 46421.300 1.01
GatherOperationsBenchmark.microByteGather512_NZ_OFF 256 thrpt 30 ops/ms 12133.467 12253.848 1.00
GatherOperationsBenchmark.microByteGather512_NZ_OFF 1024 thrpt 30 ops/ms 3074.637 3091.207 1.00
GatherOperationsBenchmark.microByteGather512_NZ_OFF 4096 thrpt 30 ops/ms 755.250 774.367 1.02
GatherOperationsBenchmark.microByteGather64 64 thrpt 30 ops/ms 58625.196 59263.141 1.01
GatherOperationsBenchmark.microByteGather64 256 thrpt 30 ops/ms 15745.329 17377.889 1.10
GatherOperationsBenchmark.microByteGather64 1024 thrpt 30 ops/ms 4121.997 4471.261 1.08
GatherOperationsBenchmark.microByteGather64 4096 thrpt 30 ops/ms 1044.419 1125.721 1.07
GatherOperationsBenchmark.microByteGather64_MASK 64 thrpt 30 ops/ms 48754.131 49028.183 1.00
GatherOperationsBenchmark.microByteGather64_MASK 256 thrpt 30 ops/ms 13248.349 13537.811 1.02
GatherOperationsBenchmark.microByteGather64_MASK 1024 thrpt 30 ops/ms 3308.839 3356.109 1.01
GatherOperationsBenchmark.microByteGather64_MASK 4096 thrpt 30 ops/ms 843.688 859.161 1.01
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF 64 thrpt 30 ops/ms 43523.662 48868.373 1.12
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF 256 thrpt 30 ops/ms 12242.984 13519.719 1.10
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF 1024 thrpt 30 ops/ms 3055.772 3394.342 1.11
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF 4096 thrpt 30 ops/ms 754.532 870.302 1.15
GatherOperationsBenchmark.microByteGather64_NZ_OFF 64 thrpt 30 ops/ms 51858.935 58869.325 1.13
GatherOperationsBenchmark.microByteGather64_NZ_OFF 256 thrpt 30 ops/ms 14235.928 17381.117 1.22
GatherOperationsBenchmark.microByteGather64_NZ_OFF 1024 thrpt 30 ops/ms 3684.506 4483.270 1.21
GatherOperationsBenchmark.microByteGather64_NZ_OFF 4096 thrpt 30 ops/ms 922.368 1127.66 1.22
GatherOperationsBenchmark.microShortGather128 64 thrpt 30 ops/ms 44399.870 45016.972 1.01
GatherOperationsBenchmark.microShortGather128 256 thrpt 30 ops/ms 11679.775 12629.207 1.08
GatherOperationsBenchmark.microShortGather128 1024 thrpt 30 ops/ms 1277.328 3206.762 2.51
GatherOperationsBenchmark.microShortGather128 4096 thrpt 30 ops/ms 761.846 817.159 1.07
GatherOperationsBenchmark.microShortGather128_MASK 64 thrpt 30 ops/ms 37165.399 36484.534 0.98
GatherOperationsBenchmark.microShortGather128_MASK 256 thrpt 30 ops/ms 9875.757 9958.754 1.00
GatherOperationsBenchmark.microShortGather128_MASK 1024 thrpt 30 ops/ms 2519.580 2554.210 1.01
GatherOperationsBenchmark.microShortGather128_MASK 4096 thrpt 30 ops/ms 615.867 652.092 1.05
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF 64 thrpt 30 ops/ms 34049.203 33669.772 0.98
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF 256 thrpt 30 ops/ms 9010.587 8779.455 0.97
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF 1024 thrpt 30 ops/ms 2253.432 2415.560 1.07
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF 4096 thrpt 30 ops/ms 559.163 577.659 1.03
GatherOperationsBenchmark.microShortGather128_NZ_OFF 64 thrpt 30 ops/ms 39892.023 43978.899 1.10
GatherOperationsBenchmark.microShortGather128_NZ_OFF 256 thrpt 30 ops/ms 10697.817 12424.189 1.16
GatherOperationsBenchmark.microShortGather128_NZ_OFF 1024 thrpt 30 ops/ms 2681.286 3145.941 1.17
GatherOperationsBenchmark.microShortGather128_NZ_OFF 4096 thrpt 30 ops/ms 682.330 803.364 1.17
GatherOperationsBenchmark.microShortGather256 64 thrpt 30 ops/ms 42335.033 43194.212 1.02
GatherOperationsBenchmark.microShortGather256 256 thrpt 30 ops/ms 10760.015 11149.020 1.03
GatherOperationsBenchmark.microShortGather256 1024 thrpt 30 ops/ms 2688.410 2806.389 1.04
GatherOperationsBenchmark.microShortGather256 4096 thrpt 30 ops/ms 675.401 703.849 1.04
GatherOperationsBenchmark.microShortGather256_MASK 64 thrpt 30 ops/ms 38760.990 41844.197 1.07
GatherOperationsBenchmark.microShortGather256_MASK 256 thrpt 30 ops/ms 11339.217 10951.141 0.96
GatherOperationsBenchmark.microShortGather256_MASK 1024 thrpt 30 ops/ms 2840.081 2718.823 0.95
GatherOperationsBenchmark.microShortGather256_MASK 4096 thrpt 30 ops/ms 725.334 696.343 0.96
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF 64 thrpt 30 ops/ms 39059.271 42199.055 1.08
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF 256 thrpt 30 ops/ms 10440.036 11467.941 1.09
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF 1024 thrpt 30 ops/ms 2563.378 2790.541 1.08
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF 4096 thrpt 30 ops/ms 642.642 751.287 1.16
GatherOperationsBenchmark.microShortGather256_NZ_OFF 64 thrpt 30 ops/ms 38963.881 42675.099 1.09
GatherOperationsBenchmark.microShortGather256_NZ_OFF 256 thrpt 30 ops/ms 10628.469 11168.949 1.05
GatherOperationsBenchmark.microShortGather256_NZ_OFF 1024 thrpt 30 ops/ms 2702.591 2806.074 1.03
GatherOperationsBenchmark.microShortGather256_NZ_OFF 4096 thrpt 30 ops/ms 683.690 704.498 1.03
GatherOperationsBenchmark.microShortGather512 64 thrpt 30 ops/ms 41117.094 41269.397 1.00
GatherOperationsBenchmark.microShortGather512 256 thrpt 30 ops/ms 10565.519 10652.618 1.00
GatherOperationsBenchmark.microShortGather512 1024 thrpt 30 ops/ms 2681.894 2705.963 1.00
GatherOperationsBenchmark.microShortGather512 4096 thrpt 30 ops/ms 673.821 679.631 1.00
GatherOperationsBenchmark.microShortGather512_MASK 64 thrpt 30 ops/ms 41318.510 42372.271 1.02
GatherOperationsBenchmark.microShortGather512_MASK 256 thrpt 30 ops/ms 11587.465 10674.598 0.92
GatherOperationsBenchmark.microShortGather512_MASK 1024 thrpt 30 ops/ms 2902.731 2629.739 0.90
GatherOperationsBenchmark.microShortGather512_MASK 4096 thrpt 30 ops/ms 741.546 671.124 0.90
GatherOperationsBenchmark.microShortGather512_MASK_NZ_OFF 64 thrpt 30 ops/ms 39524.127 40623.622 1.02
GatherOperationsBenchmark.microShortGather512_MASK_NZ_OFF 256 thrpt 30 ops/ms 10642.152 11392.025 1.07
GatherOperationsBenchmark.microShortGather512_MASK_NZ_OFF 1024 thrpt 30 ops/ms 2650.143 2819.185 1.06
GatherOperationsBenchmark.microShortGather512_MASK_NZ_OFF 4096 thrpt 30 ops/ms 672.674 739.882 1.09
GatherOperationsBenchmark.microShortGather512_NZ_OFF 64 thrpt 30 ops/ms 39861.745 41600.729 1.04
GatherOperationsBenchmark.microShortGather512_NZ_OFF 256 thrpt 30 ops/ms 10531.312 10586.255 1.00
GatherOperationsBenchmark.microShortGather512_NZ_OFF 1024 thrpt 30 ops/ms 2667.839 2678.026 1.00
GatherOperationsBenchmark.microShortGather512_NZ_OFF 4096 thrpt 30 ops/ms 667.607 677.434 1.01
GatherOperationsBenchmark.microShortGather64 64 thrpt 30 ops/ms 45716.109 50726.590 1.10
GatherOperationsBenchmark.microShortGather64 256 thrpt 30 ops/ms 12383.842 13608.216 1.09
GatherOperationsBenchmark.microShortGather64 1024 thrpt 30 ops/ms 3025.989 3443.097 1.13
GatherOperationsBenchmark.microShortGather64 4096 thrpt 30 ops/ms 771.995 897.890 1.16
GatherOperationsBenchmark.microShortGather64_MASK 64 thrpt 30 ops/ms 39758.975 39155.984 0.98
GatherOperationsBenchmark.microShortGather64_MASK 256 thrpt 30 ops/ms 10594.260 10622.428 1.00
GatherOperationsBenchmark.microShortGather64_MASK 1024 thrpt 30 ops/ms 2654.849 2771.674 1.04
GatherOperationsBenchmark.microShortGather64_MASK 4096 thrpt 30 ops/ms 677.508 684.557 1.01
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF 64 thrpt 30 ops/ms 37729.191 40552.172 1.07
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF 256 thrpt 30 ops/ms 10087.184 11121.611 1.10
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF 1024 thrpt 30 ops/ms 2510.133 2788.778 1.11
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF 4096 thrpt 30 ops/ms 642.370 658.808 1.02
GatherOperationsBenchmark.microShortGather64_NZ_OFF 64 thrpt 30 ops/ms 40632.099 50718.706 1.24
GatherOperationsBenchmark.microShortGather64_NZ_OFF 256 thrpt 30 ops/ms 10984.671 14155.624 1.28
GatherOperationsBenchmark.microShortGather64_NZ_OFF 1024 thrpt 30 ops/ms 2733.285 3668.118 1.34
GatherOperationsBenchmark.microShortGather64_NZ_OFF 4096 thrpt 30 ops/ms 679.524 932.748 1.37

------------- PR Comment: https://git.openjdk.org/jdk/pull/25138#issuecomment-3004026787 From xgong at openjdk.org Wed Jun 25 09:16:48 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Wed, 25 Jun 2025 09:16:48 GMT Subject: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API In-Reply-To: References: Message-ID: <-_jqrYt3RDwwrdFt12v0cv8yefopGBAKLjUg8B6lBTM=.e60b57b8-c867-43a1-a793-093730810b3d@github.com> On Mon, 2 Jun 2025 10:48:25 GMT, Emanuel Peter wrote: >>> > @XiaohongGong Thanks for splitting this one out, and for investigating the regressions here. >>> > Putting the permalink here, fixed to the current change (the link you pasted will always refer to the newest, which may later on point to the wrong line when lines above are inserted / deleted): >>> > https://github.com/openjdk/jdk/blob/7077535c0b0a6ea0a2a167f9135b1504a3d71fb3/src/hotspot/share/opto/loopnode.cpp#L1659-L1661 >>> > >>> > I wonder if we should just use `Node::uncast` there? But I'm quite unsure about that. >>> >>> Sounds good to me. I will have a deep investigation for it. Thanks! >> >> Hi @eme64 @jatin-bhateja, I'v created a PR https://github.com/openjdk/jdk/pull/25539 to fix this issue. With this change, the performance regression can be fixed as well. Could you please take a look at that change and help to run the test on different X86 machines? Thanks a lot! > > @XiaohongGong I reviewed https://github.com/openjdk/jdk/pull/25539. Since it is a relatively simple patch, I suggest that we integrate that one first, and come back to this here later. Is that ok for you? Hi @eme64 I've updated the patch to fix the comment issue you pointed out above.
Could you please take another look? Thanks a lot! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25138#issuecomment-3004029058 From aph at openjdk.org Wed Jun 25 09:21:29 2025 From: aph at openjdk.org (Andrew Haley) Date: Wed, 25 Jun 2025 09:21:29 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method In-Reply-To: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Message-ID: On Tue, 24 Jun 2025 15:38:00 GMT, Mikhail Ablakatov wrote: > Modify the C2 compiler to share trampoline stubs between static calls that resolve to the same callee method. Since the relocation target for all static calls is initially set to the static call resolver stub, the call's target alone cannot be used to distinguish between different static method calls. Instead, trampoline stubs should be shared based on the actual callee. > > The `SharedTrampolineTest.java` was designed to verify the sharing of trampolines among static calls. However, due to imprecise log analysis, the test currently passes even when trampolines are not shared. Additionally, comments within the test suggest ambiguity regarding whether it was intended to assess trampoline sharing for static calls or runtime calls. To address these issues and eliminate ambiguity, this patch renames and updates the existing test. Furthermore, a new test is introduced, using the existing one as a foundation, to accurately evaluate trampoline sharing for both static and runtime calls. > > This has passed tier1-3 and jcstress testing on AArch64. One thing that looks a little odd: why do you maintain two separate lists and two sets of trampoline stub handlers?
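The sharing rule in the quoted description (key the trampoline on the actual callee, because every static call initially targets the same resolver stub) can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyTrampolineTable`, `stub_for`, and the string callee names are hypothetical and do not correspond to the C2 or AArch64 code in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy model: one trampoline stub per distinct callee, shared by all call
// sites that resolve to that callee. All names here are hypothetical.
struct ToyTrampolineTable {
  std::map<std::string, std::size_t> by_callee;  // callee -> stub index
  std::vector<std::string> stubs;                // emitted stubs, in order

  // Return the stub index for a call to `callee`, emitting a new stub only
  // the first time this callee is seen.
  std::size_t stub_for(const std::string& callee) {
    auto it = by_callee.find(callee);
    if (it != by_callee.end()) {
      return it->second;  // reuse the stub emitted for an earlier call site
    }
    stubs.push_back(callee);
    std::size_t index = stubs.size() - 1;
    by_callee.emplace(callee, index);
    return index;
  }
};

// Three call sites, two distinct callees: only two stubs are emitted.
inline std::size_t toy_stub_count_for_three_calls() {
  ToyTrampolineTable table;
  table.stub_for("Foo::bar");
  table.stub_for("Foo::baz");
  table.stub_for("Foo::bar");
  return table.stubs.size();
}
```

If the table were keyed on the initial call target instead, all three call sites in this model would collapse onto one key (the resolver stub), which is the ambiguity the description points out.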
------------- PR Comment: https://git.openjdk.org/jdk/pull/25954#issuecomment-3004044956 From shade at openjdk.org Wed Jun 25 09:40:29 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 25 Jun 2025 09:40:29 GMT Subject: [jdk25] RFR: 8359788: Internal Error: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 06:54:35 GMT, Igor Veresov wrote: > 8359788: Internal Error: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded Indeed, and this is what we do in other places as well. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25969#pullrequestreview-2957444926 From mdoerr at openjdk.org Wed Jun 25 09:52:37 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 25 Jun 2025 09:52:37 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v9] In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 19:26:11 GMT, Dean Long wrote: >> This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. >> >> We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. >> >> The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. 
>> >> For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. >> >> This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. > > Dean Long has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: > > - Merge branch 'master' into 8358821-patch-verified-entry > - 2nd try at arm fix > - rename arm_with to guard_with > - arm32 fix > - s390 fix courtesy of Amit Kumar > - remove is_sigill_not_entrant > - more cleanup > - more TheRealMDoerr suggestions > - TheRealMDoerr suggestions > - remove trailing space > - ... and 6 more: https://git.openjdk.org/jdk/compare/6df0f5e3...a39c458c src/hotspot/share/gc/shared/barrierSetNMethod.hpp line 52: > 50: > 51: public: > 52: BarrierSetNMethod() : _current_phase(initial) {} @fisk: The initial value doesn't match our initialization in the nmethod entry barrier code where we use 0. That causes all new nmethods to run through the barrier code when they are called for the first time. I think that is unnecessary and it slows down the startup a bit. All oops should already be correct after the nmethod got installed. And for ZGC, we call `nmethod_patch_barriers` in `ZNMethod::register_nmethod`. So, I don't see any need to execute the barrier code. We could change the initialization to use `_current_phase`. Do you agree? Should I file a new JBS issue?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2166306862 From mdoerr at openjdk.org Wed Jun 25 09:59:34 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 25 Jun 2025 09:59:34 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v9] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 09:49:25 GMT, Martin Doerr wrote: >> Dean Long has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: >> >> - Merge branch 'master' into 8358821-patch-verified-entry >> - 2nd try at arm fix >> - rename arm_with to guard_with >> - arm32 fix >> - s390 fix courtesy of Amit Kumar >> - remove is_sigill_not_entrant >> - more cleanup >> - more TheRealMDoerr suggestions >> - TheRealMDoerr suggestions >> - remove trailing space >> - ... and 6 more: https://git.openjdk.org/jdk/compare/6df0f5e3...a39c458c > > src/hotspot/share/gc/shared/barrierSetNMethod.hpp line 52: > >> 50: >> 51: public: >> 52: BarrierSetNMethod() : _current_phase(initial) {} > > @fisk: The initial value doesn't match our initialization in the nmethod entry barrier code where we use 0. That causes all new nmethods to run through the barrier code when they are called for the first time. I think that is unnecessary and it slows down the startup a bit. All oops should already be correct after the nmethod got installed. And for ZGC, we call `nmethod_patch_barriers` in `ZNMethod::register_nmethod`. So, I don't see any need to execute the barrier code. We could change the initialization to use `_current_phase`. Do you agree? Should I file a new JBS issue? In other words, I think `disarm(nm);` is missing for some GCs. ZGC has it in `ZNMethod::register_nmethod`.
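The guard-value behaviour discussed in this thread can be modelled roughly in Java. This is a sketch under assumptions, not HotSpot code: it assumes the entry barrier is taken whenever the guard word differs from the current "disarmed" value, that disarmed values never have the top bit set, and it uses a hypothetical bit position for the sticky not_entrant bit. It also uses a compare-and-swap loop, which the PR description mentions only as an alternative to the lock that was actually chosen.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of an nmethod entry-barrier guard word; all names are hypothetical. */
final class GuardValueSketch {
    /** Hypothetical position of the sticky "not entrant" bit. */
    static final int NOT_ENTRANT = 1 << 31;

    private final AtomicInteger guard;

    GuardValueSketch(int initialGuard) {
        this.guard = new AtomicInteger(initialGuard);
    }

    /** Stores a new guard value but preserves the sticky bit if it is already set. */
    void guardWith(int value) {
        int old;
        int updated;
        do {
            old = guard.get();
            updated = (value & ~NOT_ENTRANT) | (old & NOT_ENTRANT);
        } while (!guard.compareAndSet(old, updated));
    }

    /** Sets the sticky bit; nothing in this class ever clears it again. */
    void makeNotEntrant() {
        guard.getAndUpdate(g -> g | NOT_ENTRANT);
    }

    /** The barrier is taken whenever the guard differs from the current disarmed value. */
    boolean isArmed(int disarmedValue) {
        return guard.get() != disarmedValue;
    }

    /** Disarming stores the current disarmed value (the sticky bit survives). */
    void disarm(int disarmedValue) {
        guardWith(disarmedValue);
    }
}
```

In this model a guard initialised to 0 while the disarmed value is 1 is armed on first entry unless `disarm` is called at registration time, which mirrors the concern raised above. Once `makeNotEntrant` has run, no later `disarm` can make the guard equal to a disarmed value, so the barrier stays permanently armed.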
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2166322229 From kevinw at openjdk.org Wed Jun 25 10:06:13 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 25 Jun 2025 10:06:13 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch Message-ID: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. ------------- Commit messages: - ThreadSnapshot comment - Merge remote-tracking branch 'upstream/master' into 8359870_threadexited - 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch Changes: https://git.openjdk.org/jdk/pull/25958/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25958&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359870 Stats: 10 lines in 4 files changed: 8 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25958.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25958/head:pull/25958 PR: https://git.openjdk.org/jdk/pull/25958 From alanb at openjdk.org Wed Jun 25 10:18:27 2025 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 25 Jun 2025 10:18:27 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch In-Reply-To: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: On Tue, 24 Jun 2025 17:00:19 GMT, Kevin Walls wrote: > ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. 
src/java.base/share/classes/jdk/internal/vm/ThreadDumper.java line 184: > 182: if (snapshot == null) { > 183: return; // Terminated > 184: } Would it be possible to use this instead, otherwise the thread counts will be confusing. --- a/src/java.base/share/classes/jdk/internal/vm/ThreadDumper.java +++ b/src/java.base/share/classes/jdk/internal/vm/ThreadDumper.java @@ -178,8 +178,11 @@ private static void dumpThreads(ThreadContainer container, TextWriter writer) { } private static void dumpThread(Thread thread, TextWriter writer) { - ThreadSnapshot snapshot = ThreadSnapshot.of(thread); Instant now = Instant.now(); + ThreadSnapshot snapshot = ThreadSnapshot.of(thread); + if (snapshot == null) { + return; // thread terminated + } Thread.State state = snapshot.threadState(); writer.println("#" + thread.threadId() + " \"" + snapshot.threadName() + "\" " + (thread.isVirtual() ? "virtual " : "") + state + " " + now); @@ -284,8 +287,9 @@ private static void dumpThreads(ThreadContainer container, JsonWriter jsonWriter Iterator<Thread> threads = container.threads().iterator(); while (threads.hasNext()) { Thread thread = threads.next(); - dumpThread(thread, jsonWriter); - threadCount++; + if (dumpThread(thread, jsonWriter)) { + threadCount++; + } } jsonWriter.endArray(); // threads @@ -305,9 +309,12 @@ private static void dumpThreads(ThreadContainer container, JsonWriter jsonWriter * Write a thread to the given JSON writer.
* @throws UncheckedIOException if an I/O error occurs */ - private static void dumpThread(Thread thread, JsonWriter jsonWriter) { + private static boolean dumpThread(Thread thread, JsonWriter jsonWriter) { Instant now = Instant.now(); ThreadSnapshot snapshot = ThreadSnapshot.of(thread); + if (snapshot == null) { + return false; // thread terminated + } Thread.State state = snapshot.threadState(); StackTraceElement[] stackTrace = snapshot.stackTrace(); @@ -369,6 +376,7 @@ private static void dumpThread(Thread thread, JsonWriter jsonWriter) { } jsonWriter.endObject(); + return true; } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2166358603 From kevinw at openjdk.org Wed Jun 25 10:21:27 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 25 Jun 2025 10:21:27 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: On Wed, 25 Jun 2025 10:15:54 GMT, Alan Bateman wrote: >> ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. > > src/java.base/share/classes/jdk/internal/vm/ThreadDumper.java line 184: > >> 182: if (snapshot == null) { >> 183: return; // Terminated >> 184: } > > Would it be possible to use this instead, otherwise the thread counts will be confusing.
> > > --- a/src/java.base/share/classes/jdk/internal/vm/ThreadDumper.java > +++ b/src/java.base/share/classes/jdk/internal/vm/ThreadDumper.java > @@ -178,8 +178,11 @@ private static void dumpThreads(ThreadContainer container, TextWriter writer) { > } > > private static void dumpThread(Thread thread, TextWriter writer) { > - ThreadSnapshot snapshot = ThreadSnapshot.of(thread); > Instant now = Instant.now(); > + ThreadSnapshot snapshot = ThreadSnapshot.of(thread); > + if (snapshot == null) { > + return; // thread terminated > + } > Thread.State state = snapshot.threadState(); > writer.println("#" + thread.threadId() + " \"" + snapshot.threadName() > + "\" " + (thread.isVirtual() ? "virtual " : "") + state + " " + now); > @@ -284,8 +287,9 @@ private static void dumpThreads(ThreadContainer container, JsonWriter jsonWriter > Iterator<Thread> threads = container.threads().iterator(); > while (threads.hasNext()) { > Thread thread = threads.next(); > - dumpThread(thread, jsonWriter); > - threadCount++; > + if (dumpThread(thread, jsonWriter)) { > + threadCount++; > + } > } > jsonWriter.endArray(); // threads > > @@ -305,9 +309,12 @@ private static void dumpThreads(ThreadContainer container, JsonWriter jsonWriter > * Write a thread to the given JSON writer. > * @throws UncheckedIOException if an I/O error occurs > */ > - private static void dumpThread(Thread thread, JsonWriter jsonWriter) { > + private static boolean dumpThread(Thread thread, JsonWriter jsonWriter) { > Instant now = Instant.now(); > ThreadSnapshot snapshot = ThreadSnapshot.of(thread); > + if (snapshot == null) { > + return false; // thread terminated > + } > Thread.State state = snapshot.threadState(); > StackTraceElement[] stackTrace = snapshot.stackTrace(); > > @@ -369,6 +376,7 @@ private static void dumpThread(Thread thread, JsonWriter jsonWriter) { > } > > jsonWriter.endObject(); > + return true; > } Yes, will do that to fix the threadCount...
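The counting change suggested in the quoted diff can be shown in isolation with a small stand-alone sketch. It is a simplified model, not the ThreadDumper code: threads are represented by plain names, and the `snapshotOf` function is a hypothetical stand-in for `ThreadSnapshot.of`, returning null for a thread that has already terminated.

```java
import java.util.List;
import java.util.function.Function;

/** Simplified stand-in for the dump loop; names and types are illustrative only. */
final class ThreadCountSketch {
    /** Writes one entry and reports whether anything was written. */
    static boolean dumpThread(String name, Function<String, String> snapshotOf, StringBuilder out) {
        String snapshot = snapshotOf.apply(name);
        if (snapshot == null) {
            return false; // thread terminated
        }
        out.append(name).append(": ").append(snapshot).append('\n');
        return true;
    }

    /** Counts only the threads that were actually written to the output. */
    static int dumpThreads(List<String> names, Function<String, String> snapshotOf, StringBuilder out) {
        int threadCount = 0;
        for (String name : names) {
            if (dumpThread(name, snapshotOf, out)) {
                threadCount++;
            }
        }
        return threadCount;
    }
}
```

Returning a boolean from the per-thread method keeps the reported count equal to the number of entries written, which appears to be the point of the review comment.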
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2166363495 From mhaessig at openjdk.org Wed Jun 25 10:22:30 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Wed, 25 Jun 2025 10:22:30 GMT Subject: RFR: 8360474: Add missing include guards for some HotSpot headers [v2] In-Reply-To: References: Message-ID: <6dMdy8vwFeJGFcIk2QEbM5IY1MQmGp0t5VslDC5CpNE=.8f14c057-fd5d-4317-a581-ce42e397673d@github.com> On Wed, 25 Jun 2025 08:13:43 GMT, Qizheng Xing wrote: >> Some HotSpot header files are missing include guards, which may cause compilation errors if these files are included multiple times. This patch adds include guards for them. > > Qizheng Xing has updated the pull request incrementally with one additional commit since the last revision: > > Remove extra trailing new line Marked as reviewed by mhaessig (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25968#pullrequestreview-2957579429 From jbhateja at openjdk.org Wed Jun 25 10:35:08 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 25 Jun 2025 10:35:08 GMT Subject: RFR: 8360116: Add support for AVX10 floating point minmax instruction Message-ID: Intel@ AVX10 ISA [1] extensions added new floating point MIN/MAX instructions which comply with definitions in IEEE-754-2019 standard section 9.6 and can directly emulate Math.min/max semantics without the need for any special handling for NaN, +0.0 or -0.0 detection. **The following pseudo-code describes the existing algorithm for min/max[FD]:** Move the non-negative value to the second operand; this will ensure that we correctly handle 0.0 and -0.0 values, if values being compared are both 0.0s (of either sign), the value in the second operand (source operand) is returned. Existing MINPS and MAXPS semantics only check for NaN as the second operand; hence, we need special handling to check for NaN at the first operand. btmp = (b < +0.0) ? a : b atmp = (b < +0.0) ? 
b : a Tmp = Max_Float(atmp , btmp) Res = (atmp == NaN) ? atmp : Tmp For min[FD] we need a small tweak in the above algorithm, i.e., move the non-negative value to the first operand, this will ensure that we correctly select -0.0 if both the operands being compared are 0.0 or -0.0. btmp = (b < +0.0) ? b : a atmp = (b < +0.0) ? a : b Tmp = Min_Float(atmp , btmp) Res = (atmp == NaN) ? atmp : Tmp Thus, we need additional special handling for NaNs and +/-0.0 to compute floating-point min/max values to comply with the semantics of Math.max/min APIs using existing MINPS / MAXPS instructions. AVX10.2 added a new instruction, VMINMAX[SH,SS,SD]/[PH,PS,PD], which comprehensively handles special cases, thereby eliminating the need for special handling. Patch emits new instructions for reduction and non-reduction operations for single, double, and Float16 type. Kindly review and share your feedback. Best Regards, Jatin [1] https://www.intel.com/content/www/us/en/content-details/856721/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html?wapkw=AVX10 ------------- Commit messages: - Extending the patch to cover reduction operations - 8360116: Add support for AVX10 floating point minmax instruction Changes: https://git.openjdk.org/jdk/pull/25914/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25914&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360116 Stats: 420 lines in 7 files changed: 379 ins; 4 del; 37 mod Patch: https://git.openjdk.org/jdk/pull/25914.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25914/head:pull/25914 PR: https://git.openjdk.org/jdk/pull/25914 From jbhateja at openjdk.org Wed Jun 25 10:41:04 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 25 Jun 2025 10:41:04 GMT Subject: RFR: 8360116: Add support for AVX10 floating point minmax instruction [v2] In-Reply-To: References: Message-ID: > Intel@ AVX10 ISA [1] extensions added new floating point MIN/MAX instructions which comply with definitions
in IEEE-754-2019 standard section 9.6 and can directly emulate Math.min/max semantics without the need for any special handling for NaN, +0.0 or -0.0 detection. > > **The following pseudo-code describes the existing algorithm for min/max[FD]:** > > Move the non-negative value to the second operand; this will ensure that we correctly handle 0.0 and -0.0 values, if values being compared are both 0.0s (of either sign), the value in the second operand (source operand) is returned. Existing MINPS and MAXPS semantics only check for NaN as the second operand; hence, we need special handling to check for NaN at the first operand. > > btmp = (b < +0.0) ? a : b > atmp = (b < +0.0) ? b : a > Tmp = Max_Float(atmp , btmp) > Res = (atmp == NaN) ? atmp : Tmp > > For min[FD] we need a small tweak in the above algorithm, i.e., move the non-negative value to the first operand, this will ensure that we correctly select -0.0 if both the operands being compared are 0.0 or -0.0. > > btmp = (b < +0.0) ? b : a > atmp = (b < +0.0) ? a : b > Tmp = Min_Float(atmp , btmp) > Res = (atmp == NaN) ? atmp : Tmp > > Thus, we need additional special handling for NaNs and +/-0.0 to compute floating-point min/max values to comply with the semantics of Math.max/min APIs using existing MINPS / MAXPS instructions. AVX10.2 added a new instruction, VMINMAX[SH,SS,SD]/[PH,PS,PD], which comprehensively handles special cases, thereby eliminating the need for special handling. > > Patch emits new instructions for reduction and non-reduction operations for single, double, and Float16 type. > > Kindly review and share your feedback.
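The emulation described in the quoted text can be modelled in scalar Java as a rough sketch. Several things here are assumptions rather than statements about the patch: `maxps` and `minps` are hypothetical helpers that mimic the legacy per-lane behaviour (the second operand is returned when either input is NaN or both are zero), the `(b < +0.0)` selection is modelled as a sign-bit test so that -0.0 counts as negative, the NaN check is modelled with `Float.isNaN`, and the min variant uses a min primitive.

```java
/** Scalar model of the blend-based min/max emulation; not HotSpot code. */
final class MinMaxSketch {
    /** Models legacy MAXPS per lane: returns b if either input is NaN or both are zero. */
    static float maxps(float a, float b) {
        return a > b ? a : b;
    }

    /** Models legacy MINPS per lane with the same second-operand rule. */
    static float minps(float a, float b) {
        return a < b ? a : b;
    }

    /** True when the sign bit is set, which includes -0.0. */
    static boolean signBit(float f) {
        return Float.floatToRawIntBits(f) < 0;
    }

    /** Emulates Math.max: the non-negative value ends up in the second operand. */
    static float emulatedMax(float a, float b) {
        float btmp = signBit(b) ? a : b;
        float atmp = signBit(b) ? b : a;
        float tmp = maxps(atmp, btmp);
        return Float.isNaN(atmp) ? atmp : tmp;
    }

    /** Emulates Math.min: the non-negative value ends up in the first operand. */
    static float emulatedMin(float a, float b) {
        float btmp = signBit(b) ? b : a;
        float atmp = signBit(b) ? a : b;
        float tmp = minps(atmp, btmp);
        return Float.isNaN(atmp) ? atmp : tmp;
    }
}
```

Comparing the results with `Math.max` and `Math.min` via `Float.compare` (which distinguishes 0.0 from -0.0 and treats all NaNs as equal) is one way to check the model against the special cases the text lists.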
> > Best Regards, > Jatin > > [1] https://www.intel.com/content/www/us/en/content-details/856721/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html?wapkw=AVX10 Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Update comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25914/files - new: https://git.openjdk.org/jdk/pull/25914/files/e7753571..b6e55157 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25914&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25914&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25914.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25914/head:pull/25914 PR: https://git.openjdk.org/jdk/pull/25914 From coleenp at openjdk.org Wed Jun 25 11:09:36 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 25 Jun 2025 11:09:36 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v5] In-Reply-To: References: <8zcivFIhzh6QymXS119iTVwdzyufdXBaeOR4_dEjlig=.b1683573-13cb-4b40-baf3-8b609680e86f@github.com> Message-ID: On Tue, 24 Jun 2025 09:12:00 GMT, Anton Artemov wrote: >> test/jtreg-ext/requires/VMProps.java line 424: >> >>> 422: * Note: Lightweight locking does not support RTM (for now). >>> 423: */ >>> 424: protected String vmRTMCompiler() { >> >> [JDK-8358542](https://bugs.openjdk.org/browse/JDK-8358542) exists to remove this so you would need to add that bug id to this PR. However, it seems the bug management for this has gotten completely messed up so you may need to scrap this PR and file a new bug and PR for this part. > > Issue added. Since you added issue JDK-8358542, can you also remove the function under this: vmRTMCPU - that function isn't referenced anywhere either. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2166447777 From coleenp at openjdk.org Wed Jun 25 11:13:28 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 25 Jun 2025 11:13:28 GMT Subject: RFR: 8360474: Add missing include guards for some HotSpot headers [v2] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 08:13:43 GMT, Qizheng Xing wrote: >> Some HotSpot header files are missing include guards, which may cause compilation errors if these files are included multiple times. This patch adds include guards for them. > > Qizheng Xing has updated the pull request incrementally with one additional commit since the last revision: > > Remove extra trailing new line src/hotspot/share/utilities/packedTable.hpp line 27: > 25: #ifndef SHARE_UTILITIES_PACKEDTABLE_HPP > 26: #define SHARE_UTILITIES_PACKEDTABLE_HPP > 27: I just backported this to JDK 25. I wonder if you should omit this change and put it under another CR so we can backport that too. Or not backport this, assuming that JDK 25 will never have double inclusions of this file ? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25968#discussion_r2166454866 From jsikstro at openjdk.org Wed Jun 25 11:14:40 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Wed, 25 Jun 2025 11:14:40 GMT Subject: RFR: 8360515: PROPERFMTARGS should always use size_t template specialization for unit Message-ID: Hello, PROPERFMT is defined as the format string "%zu%s", which expects a size_t as input argument. When used in combination with PROPERFMTARGS, which uses the templated byte_size_in_proper_unit, the byte size may not be size_t if the input is some other type. To minimize confusion, PROPERFMTARGS should always use the size_t template specialization of byte_size_in_proper_unit, to match PROPERFMT. Places that use byte_size_in_proper_unit with other types can still use it, but should use their own format strings instead of PROPERFMT.
ProcSmapsSummary::print_on in memMapPrinter_macosx is the only place that uses PROPERFMTARGS with a type that is not size_t. I have changed those places to use the expanded version of the macro, which uses the templated version of byte_size_in_proper_unit instead. Testing: * Currently running Oracle's tier1-2 ------------- Commit messages: - 8360515: PROPERFMTARGS should always use size_t template specialization for unit Changes: https://git.openjdk.org/jdk/pull/25975/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25975&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360515 Stats: 5 lines in 2 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/25975.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25975/head:pull/25975 PR: https://git.openjdk.org/jdk/pull/25975 From alanb at openjdk.org Wed Jun 25 11:16:46 2025 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 25 Jun 2025 11:16:46 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v5] In-Reply-To: References: Message-ID: On Tue, 24 Jun 2025 13:25:20 GMT, Anton Artemov wrote: >> This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. >> >> The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. >> >> In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). >> >> Lightweight locking is the default locking from now on. >> >> Tested in tiers 1 - 7. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8359437: Addressed reviewer's comment test/jdk/jdk/internal/vm/Continuation/Fuzz.java line 477: > 475: boolean shouldPin() { > 476: // Returns false since we never pin after we removed legacy locking. 
> 477: return traceHas(Op.PIN::contains) && false; Are you planning to remove this method and update verifyPin, or maybe there will be a follow-on JBS issue for this cleanup? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2166459367 From jsikstro at openjdk.org Wed Jun 25 11:19:47 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Wed, 25 Jun 2025 11:19:47 GMT Subject: RFR: 8360515: PROPERFMTARGS should always use size_t template specialization for unit [v2] In-Reply-To: References: Message-ID: > Hello, > > PROPERFMT is defined as the format string "%zu%s", which expects a size_t as input argument. When used in combination with PROPERFMTARGS, which uses the templated byte_size_in_proper_unit, the byte size may not be size_t if the input is some other type. > > To minimize confusion, PROPERFMTARGS should always use the size_t template specialization of byte_size_in_proper_unit, to match PROPERFMT. Places that use byte_size_in_proper_unit with other types can still use it, but should use their own format strings instead of PROPERFMT. > > ProcSmapsSummary::print_on in memMapPrinter_macosx is the only place that uses PROPERFMTARGS with a type that is not size_t. I have changed those places to use the expanded version of the macro, which uses the templated version of byte_size_in_proper_unit instead. > > Testing: > * Currently running Oracle's tier1-2 Joel Sikström has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains two additional commits since the last revision: - Merge branch 'master' into JDK-8360515_properfmtargs - 8360515: PROPERFMTARGS should always use size_t template specialization for unit ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25975/files - new: https://git.openjdk.org/jdk/pull/25975/files/1e340fab..0d51cb39 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25975&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25975&range=00-01 Stats: 1778 lines in 93 files changed: 1075 ins; 359 del; 344 mod Patch: https://git.openjdk.org/jdk/pull/25975.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25975/head:pull/25975 PR: https://git.openjdk.org/jdk/pull/25975 From duke at openjdk.org Wed Jun 25 11:26:22 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 25 Jun 2025 11:26:22 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v6] In-Reply-To: References: Message-ID: > This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. > > The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. > > In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). > > Lightweight locking is the default locking from now on. > > Tested in tiers 1 - 7. 
Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8359437: Removed vmRTMCPU from VMProps.java ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25847/files - new: https://git.openjdk.org/jdk/pull/25847/files/02565157..6534afaa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25847&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25847&range=04-05 Stats: 8 lines in 1 file changed: 0 ins; 8 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25847/head:pull/25847 PR: https://git.openjdk.org/jdk/pull/25847 From coleenp at openjdk.org Wed Jun 25 11:28:42 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 25 Jun 2025 11:28:42 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v6] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 11:26:22 GMT, Anton Artemov wrote: >> This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. >> >> The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. >> >> In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). >> >> Lightweight locking is the default locking from now on. >> >> Tested in tiers 1 - 7. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8359437: Removed vmRTMCPU from VMProps.java Looks good! ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25847#pullrequestreview-2957771841 From duke at openjdk.org Wed Jun 25 11:38:35 2025 From: duke at openjdk.org (Anton Artemov) Date: Wed, 25 Jun 2025 11:38:35 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v5] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 11:13:32 GMT, Alan Bateman wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8359437: Addressed reviewer's comment > > test/jdk/jdk/internal/vm/Continuation/Fuzz.java line 477: > >> 475: boolean shouldPin() { >> 476: // Returns false since we never pin after we removed legacy locking. >> 477: return traceHas(Op.PIN::contains) && false; > > Are you planning to remove this method and update verifyPin, or maybe there will be a follow-on JBS issue for this cleanup? Removal will be done in phase 2. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2166498312 From kevinw at openjdk.org Wed Jun 25 11:50:42 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 25 Jun 2025 11:50:42 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v2] In-Reply-To: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: <3i_6AJxEAX1a4dfB2lYuPRLSfF6UInmOF1KG5_PfTpA=.f54ff353-78e7-4ad8-9906-fd9a5963ea42@github.com> > ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. 
Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: ThreadDumper thread count ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25958/files - new: https://git.openjdk.org/jdk/pull/25958/files/33248d9d..e4a7b546 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25958&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25958&range=00-01 Stats: 11 lines in 1 file changed: 6 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/25958.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25958/head:pull/25958 PR: https://git.openjdk.org/jdk/pull/25958 From kevinw at openjdk.org Wed Jun 25 11:53:44 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 25 Jun 2025 11:53:44 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v3] In-Reply-To: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: > ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. 
Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: Correct THROW macro ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25958/files - new: https://git.openjdk.org/jdk/pull/25958/files/e4a7b546..089dcf49 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25958&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25958&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25958.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25958/head:pull/25958 PR: https://git.openjdk.org/jdk/pull/25958 From kevinw at openjdk.org Wed Jun 25 12:02:11 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 25 Jun 2025 12:02:11 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v4] In-Reply-To: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: > ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. 
Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: - newline - Test fails on minimal VM: require jvmti feature ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25958/files - new: https://git.openjdk.org/jdk/pull/25958/files/089dcf49..0dc95941 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25958&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25958&range=02-03 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25958.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25958/head:pull/25958 PR: https://git.openjdk.org/jdk/pull/25958 From alanb at openjdk.org Wed Jun 25 12:33:28 2025 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 25 Jun 2025 12:33:28 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v4] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: <-NYiRIpr54nXkuUcOJ0O1zr_xo5PtSULMbMKsxjoYUY=.4a44e324-584c-4ff9-8d24-8884563b07e9@github.com> On Wed, 25 Jun 2025 12:02:11 GMT, Kevin Walls wrote: >> ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. > > Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: > > - newline > - Test fails on minimal VM: require jvmti feature src/java.base/share/classes/jdk/internal/vm/ThreadDumper.java line 317: > 315: ThreadSnapshot snapshot = ThreadSnapshot.of(thread); > 316: if (snapshot == null) { > 317: return false; // Terminated This is okay. Do you mind change these two to say "thread terminated"? We will eventually replace this as it's only a temporary that this scenario is even possible. 
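For illustration, here is a minimal sketch of the null-check pattern discussed above, assuming a stand-in for the snapshot type. `TerminatedThreadSketch`, `Snapshot`, `dumpThread`, and `finishedThread` are hypothetical names, and `Thread.isAlive()` is used only to model "the VM thread could not be resolved"; none of this is the actual `ThreadDumper`/`ThreadSnapshot` code.

```java
import java.util.List;

final class TerminatedThreadSketch {
    // Hypothetical stand-in for the internal snapshot type: of() returns
    // null when the thread can no longer be resolved (modelled here as
    // "the thread is not alive").
    static final class Snapshot {
        final String threadName;

        private Snapshot(String threadName) {
            this.threadName = threadName;
        }

        static Snapshot of(Thread thread) {
            return thread.isAlive() ? new Snapshot(thread.getName()) : null;
        }
    }

    // Appends the thread's name to 'out' and returns true, or returns
    // false without touching 'out' when no snapshot could be taken.
    static boolean dumpThread(Thread thread, List<String> out) {
        Snapshot snapshot = Snapshot.of(thread);
        if (snapshot == null) {
            return false; // thread terminated
        }
        out.add(snapshot.threadName);
        return true;
    }

    // Helper for trying the sketch: returns a thread that has already run
    // to completion.
    static Thread finishedThread() {
        Thread t = new Thread(() -> { }, "finished");
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return t;
    }
}
```

A caller would treat a `false` return as "skip this thread" rather than as an error, which matches the "thread terminated" wording requested for the comment.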
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2166601207 From mablakatov at openjdk.org Wed Jun 25 12:51:28 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Wed, 25 Jun 2025 12:51:28 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method In-Reply-To: References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Message-ID: On Wed, 25 Jun 2025 09:19:20 GMT, Andrew Haley wrote: >> Modify the C2 compiler to share trampoline stubs between static calls that resolve to the same callee method. Since the relocation target for all static calls is initially set to the static call resolver stub, the call's target alone cannot be used to distinguish between different static method calls. Instead, trampoline stubs should be shared based on the actual callee. >> >> The `SharedTrampolineTest.java` was designed to verify the sharing of trampolines among static calls. However, due to imprecise log analysis, the test currently passes even when trampolines are not shared. Additionally, comments within the test suggest ambiguity regarding whether it was intended to assess trampoline sharing for static calls or runtime calls. To address these issues and eliminate ambiguity, this patch renames and updates the existing test. Furthermore, a new test is introduced, using the existing one as a foundation, to accurately evaluate trampoline sharing for both static and runtime calls. >> >> This has passed tier1-3 and jcstress testing on AArch64. > > One thing that looks a little odd: why do you maintain two separate lists and two sets of trampoline stub handlers? Hi @theRealAph , thank you for taking a look. JIC, these are hash tables with (target address; list of requests (call site offsets)) as (k;v) pairs. I've considered unifying the two hash tables. The problem here is the keys. 
We need a way to distinguish between runtime call keys and static call keys as we iterate through the table. We could use `CodeCache::is_non_nmethod(address addr)` for that but this would only work when segmented code cache is enabled as far as I understand. We could use something like `Pair` for keys instead and use the `first` to distinguish between runtime and static calls. I'd need to extend `template<...> class Pair` so it properly supports hashing and comparison (at least equality) for this to work. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25954#issuecomment-3004652098 From kbarrett at openjdk.org Wed Jun 25 12:57:32 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 25 Jun 2025 12:57:32 GMT Subject: RFR: 8360458: Rename Deferred<> to DeferredStatic<> and improve usage description In-Reply-To: References: <3rcBLXCifQLIBxub4PEXZlyKvjZYw8sTKCpQhOoOJ3M=.b24f11cf-7273-4ee4-9f2e-2f6d2875c8a8@github.com> Message-ID: On Wed, 25 Jun 2025 07:24:53 GMT, Joel Sikström wrote: >> Please review this change that renames Deferred<> to DeferredStatic<>, to >> better reflect the intended usage. (This involves renaming the source file.) >> This change also revises the documentation comment for the class to better >> describe the intended usage. >> >> In addition, there are a number of cleanups: >> >> (1) The include guard didn't get updated when the name was previously changed >> to Deferred. It's updated here to reflect the new name. >> >> (2) There were problems with the include block that are fixed here. >> >> (3) The changes from JDK-8359923 are backed out. They aren't useful with the >> intended usage model. >> >> (4) A gtest is added to test the class's functionality.
>> >> Testing: mach5 tier1, including new gtest > > test/hotspot/gtest/utilities/test_deferredStatic.cpp line 31: > >> 29: #include >> 30: >> 31: #include "unittest.hpp" > > I see you've discussed the order of this include in https://github.com/openjdk/jdk/pull/25927#discussion_r2164307139 already. I agree with David that we should document an agreed upon style. I don't have anything against this style. https://bugs.openjdk.org/browse/JDK-8360524 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25964#discussion_r2166648738 From kevinw at openjdk.org Wed Jun 25 13:02:03 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 25 Jun 2025 13:02:03 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: > ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. 
Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: - comment update - comment update ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25958/files - new: https://git.openjdk.org/jdk/pull/25958/files/0dc95941..d8143785 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25958&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25958&range=03-04 Stats: 4 lines in 3 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25958.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25958/head:pull/25958 PR: https://git.openjdk.org/jdk/pull/25958 From kevinw at openjdk.org Wed Jun 25 13:02:03 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 25 Jun 2025 13:02:03 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v4] In-Reply-To: <-NYiRIpr54nXkuUcOJ0O1zr_xo5PtSULMbMKsxjoYUY=.4a44e324-584c-4ff9-8d24-8884563b07e9@github.com> References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> <-NYiRIpr54nXkuUcOJ0O1zr_xo5PtSULMbMKsxjoYUY=.4a44e324-584c-4ff9-8d24-8884563b07e9@github.com> Message-ID: On Wed, 25 Jun 2025 12:30:35 GMT, Alan Bateman wrote: >> Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: >> >> - newline >> - Test fails on minimal VM: require jvmti feature > > src/java.base/share/classes/jdk/internal/vm/ThreadDumper.java line 317: > >> 315: ThreadSnapshot snapshot = ThreadSnapshot.of(thread); >> 316: if (snapshot == null) { >> 317: return false; // Terminated > > This is okay. Do you mind change these two to say "thread terminated"? We will eventually replace this as it's only a temporary that this scenario is even possible. Sure! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2166653464 From haosun at openjdk.org Wed Jun 25 13:46:30 2025 From: haosun at openjdk.org (Hao Sun) Date: Wed, 25 Jun 2025 13:46:30 GMT Subject: [jdk25] RFR: 8360405: [PPC64] some environments don't support mfdscr instruction In-Reply-To: <0z1WyiZXaJ6dSi3SeFSrXLRexSLEPa2_kNX8udMFkn8=.2fcbfdea-5449-4958-b438-700270f586cf@github.com> References: <0z1WyiZXaJ6dSi3SeFSrXLRexSLEPa2_kNX8udMFkn8=.2fcbfdea-5449-4958-b438-700270f586cf@github.com> Message-ID: On Wed, 25 Jun 2025 09:08:38 GMT, Martin Doerr wrote: > Clean backport of [JDK-8360405](https://bugs.openjdk.org/browse/JDK-8360405). Thanks for your fix. As I tested with jdk25 branch in my environment(QEMU), it failed to run `java -version` before this PR, and it can pass with this PR. ------------- Marked as reviewed by haosun (Committer). PR Review: https://git.openjdk.org/jdk/pull/25972#pullrequestreview-2958221350 From mbaesken at openjdk.org Wed Jun 25 14:14:39 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 25 Jun 2025 14:14:39 GMT Subject: RFR: 8360518: Docker tests do not work when asan is configured Message-ID: When the address sanitizer ASAN is configured, we run into errors in the docker tests. Example hotspot/jtreg/containers/docker/DockerBasicTest.java : [STDOUT] /jdk/bin/java: error while loading shared libraries: libasan.so.8: cannot open shared object file: No such file or directory Reason is that the asan-enabled binaries need additional dependencies and those are not available in the current docker/container setups. Maybe we should skip those tests when asan is enabled. 
------------- Commit messages: - JDK-8360518 Changes: https://git.openjdk.org/jdk/pull/25980/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25980&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360518 Stats: 23 lines in 23 files changed: 23 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25980.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25980/head:pull/25980 PR: https://git.openjdk.org/jdk/pull/25980 From mablakatov at openjdk.org Wed Jun 25 14:29:29 2025 From: mablakatov at openjdk.org (Mikhail Ablakatov) Date: Wed, 25 Jun 2025 14:29:29 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method In-Reply-To: References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Message-ID: On Wed, 25 Jun 2025 09:19:20 GMT, Andrew Haley wrote: >> Modify the C2 compiler to share trampoline stubs between static calls that resolve to the same callee method. Since the relocation target for all static calls is initially set to the static call resolver stub, the call's target alone cannot be used to distinguish between different static method calls. Instead, trampoline stubs should be shared based on the actual callee. >> >> The `SharedTrampolineTest.java` was designed to verify the sharing of trampolines among static calls. However, due to imprecise log analysis, the test currently passes even when trampolines are not shared. Additionally, comments within the test suggest ambiguity regarding whether it was intended to assess trampoline sharing for static calls or runtime calls. To address these issues and eliminate ambiguity, this patch renames and updates the existing test. Furthermore, a new test is introduced, using the existing one as a foundation, to accurately evaluate trampoline sharing for both static and runtime calls. >> >> This has passed tier1-3 and jcstress testing on AArch64. 
> > One thing that looks a little odd: why do you maintain two separate lists and two sets of trampoline stub handlers? @theRealAph, I've tried the above and it seems to work as intended. Please let me know if you find that solution more suitable and I'll prepare a patch for submission. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25954#issuecomment-3004996650 From mdoerr at openjdk.org Wed Jun 25 14:51:33 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 25 Jun 2025 14:51:33 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v9] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 09:56:52 GMT, Martin Doerr wrote: >> src/hotspot/share/gc/shared/barrierSetNMethod.hpp line 52: >> >>> 50: >>> 51: public: >>> 52: BarrierSetNMethod() : _current_phase(initial) {} >> >> @fisk: The initial value doesn't match our initialization in the nmethod entry barrier code where we use 0. That causes all new nmethods to run through the barrier code when they are called for the first time. I think that is unnecessary and it slows down startup a bit. All oops should already be correct after the nmethod got installed. And for ZGC, we call `nmethod_patch_barriers` in `ZNMethod::register_nmethod`. So, I don't see any need to execute the barrier code. We could change the initialization to use `_current_phase`. Do you agree? Should I file a new JBS issue? > > In other words, I think `disarm(nm);` is missing for some GCs. ZGC has it in `ZNMethod::register_nmethod`. I've filed https://bugs.openjdk.org/browse/JDK-8360540. We don't need to discuss it here.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25764#discussion_r2166924488 From mhaessig at openjdk.org Wed Jun 25 15:45:33 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Wed, 25 Jun 2025 15:45:33 GMT Subject: RFR: 8360116: Add support for AVX10 floating point minmax instruction [v2] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 10:41:04 GMT, Jatin Bhateja wrote: >> Intel@ AVX10 ISA [1] extensions added new floating point MIN/MAX instructions which comply with definitions in IEEE-754-2019 standard section 9.6 and can directly emulate Math.min/max semantics without the need for any special handling for NaN, +0.0 or -0.0 detection. >> >> **The following pseudo-code describes the existing algorithm for min/max[FD]:** >> >> Move the non-negative value to the second operand; this will ensure that we correctly handle 0.0 and -0.0 values, if values being compared are both 0.0s (of either sign), the value in the second operand (source operand) is returned. Existing MINPS and MAXPS semantics only check for NaN as the second operand; hence, we need special handling to check for NaN at the first operand. >> >> btmp = (b < +0.0) ? a : b >> atmp = (b < +0.0) ? b : a >> Tmp = Max_Float(atmp , btmp) >> Res = (atmp == NaN) ? atmp : Tmp >> >> For min[FD] we need a small tweak in the above algorithm, i.e., move the non-negative value to the first operand, this will ensure that we correctly select -0.0 if both the operands being compared are 0.0 or -0.0. >> >> btmp = (b < +0.0) ? b : a >> atmp = (b < +0.0) ? a : b >> Tmp = Max_Float(atmp , btmp) >> Res = (atmp == NaN) ? atmp : Tmp >> >> Thus, we need additional special handling for NaNs and +/-0.0 to compute floating-point min/max values to comply with the semantics of Math.max/min APIs using existing MINPS / MAXPS instructions. 
AVX10.2 added a new instruction, VPMINMAX[SH,SS,SD]/[PH,PS,PD], which comprehensively handles special cases, thereby eliminating the need for special handling. >> >> Patch emits new instructions for reduction and non-reduction operations for single, double, and Float16 type. >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin >> >> [1] https://www.intel.com/content/www/us/en/content-details/856721/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html?wapkw=AVX10 > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Update comments Thank you for implementing these new instructions! I had a look at your changes and have a few minor suggestions and questions. I am quite new to this part of the codebase, so feel free to disagree if I am way off base. How did you test these changes? Also, if you merge the current master branch, the Windows build failures in the Github Actions will be fixed. 
src/hotspot/cpu/x86/assembler_x86.cpp line 8693: > 8691: } > 8692: > 8693: Suggestion: Nit: superfluous empty line src/hotspot/cpu/x86/assembler_x86.cpp line 8785: > 8783: void Assembler::evminmaxps(XMMRegister dst, KRegister mask, XMMRegister nds, XMMRegister src, bool merge, int imm8, int vector_len) { > 8784: assert(VM_Version::supports_avx10_2(), ""); > 8785: InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true); Suggestion: InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ false, /* uses_vl */ true); Nit: missing space src/hotspot/cpu/x86/assembler_x86.hpp line 2752: > 2750: void eminmaxss(XMMRegister dst, XMMRegister nds, XMMRegister src, int imm8); > 2751: void eminmaxsd(XMMRegister dst, XMMRegister nds, XMMRegister src, int imm8); > 2752: void evminmaxph(XMMRegister dst, KRegister mask, XMMRegister nds, XMMRegister src, bool merge, int imm8, int vector_len); Is there a reason `evminmaxph` does not have a version where `src` has type `Address`? src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1241: > 1239: } > 1240: > 1241: void C2_MacroAssembler::vminmax_fp(int opc, BasicType elem_bt, XMMRegister dst, KRegister mask, Line 1122 mentions the differences between `vminps/vmaxps` and Java semantics. Perhaps a mention of the new instructions introduced in this PR might help people who are confused about the fact that `vminmax_fp` is overloaded. src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1246: > 1244: opc == Op_MaxV || opc == Op_MaxReductionV, "sanity"); > 1245: if (elem_bt == T_FLOAT) { > 1246: evminmaxps(dst, mask, src1, src2, true, opc == Op_MinV || opc == Op_MinReductionV ? 0x4 : 0x5, vlen_enc); Perhaps `0x4` and `0x5` should be factored into named constants since they are used in multiple places and it would also help readability if one does not have the documentation handy when reading the code. 
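As a side note on the operand-swap algorithm quoted from the PR description, the following is a scalar sketch of how the legacy instructions can be made to match `Math.max`/`Math.min`. It rests on several assumptions: `legacyMax`/`legacyMin` only model the documented "return the second operand on NaN or on two zeros" behaviour of MAXPS/MINPS on a single lane, the `b < +0.0` test is modelled as a sign-bit test (so that -0.0 counts as negative), and the min variant is assumed to use the min instruction. All names are hypothetical and this is not the HotSpot implementation.

```java
final class MinMaxEmulationSketch {
    // One-lane model of legacy MAXPS/MAXSS: yields the second operand when
    // either input is NaN or when both inputs are zeros of either sign.
    static float legacyMax(float src1, float src2) {
        return src1 > src2 ? src1 : src2;
    }

    // One-lane model of legacy MINPS/MINSS, with the same second-operand rule.
    static float legacyMin(float src1, float src2) {
        return src1 < src2 ? src1 : src2;
    }

    // Assumption: the "b < +0.0" selection is a sign-bit test, so -0.0 is
    // treated as negative.
    static boolean signBitSet(float f) {
        return (Float.floatToRawIntBits(f) & 0x80000000) != 0;
    }

    // Max: put the non-negative value in the second operand, then patch up
    // the case where the first operand is NaN.
    static float javaMax(float a, float b) {
        boolean neg = signBitSet(b);
        float btmp = neg ? a : b;
        float atmp = neg ? b : a;
        float tmp = legacyMax(atmp, btmp);
        return Float.isNaN(atmp) ? atmp : tmp;
    }

    // Min: put the negative value in the second operand so that -0.0 wins
    // over +0.0, then apply the same NaN fix-up.
    static float javaMin(float a, float b) {
        boolean neg = signBitSet(b);
        float btmp = neg ? b : a;
        float atmp = neg ? a : b;
        float tmp = legacyMin(atmp, btmp);
        return Float.isNaN(atmp) ? atmp : tmp;
    }
}
```

The extra selects and the NaN fix-up in this sketch are the special handling that, according to the description, the new AVX10.2 min/max instructions make unnecessary.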
------------- Changes requested by mhaessig (Committer). PR Review: https://git.openjdk.org/jdk/pull/25914#pullrequestreview-2958407187 PR Review Comment: https://git.openjdk.org/jdk/pull/25914#discussion_r2166859511 PR Review Comment: https://git.openjdk.org/jdk/pull/25914#discussion_r2166896645 PR Review Comment: https://git.openjdk.org/jdk/pull/25914#discussion_r2167019420 PR Review Comment: https://git.openjdk.org/jdk/pull/25914#discussion_r2167008970 PR Review Comment: https://git.openjdk.org/jdk/pull/25914#discussion_r2166994321 From iveresov at openjdk.org Wed Jun 25 16:15:33 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 25 Jun 2025 16:15:33 GMT Subject: [jdk25] RFR: 8359788: Internal Error: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded In-Reply-To: References: Message-ID: <8JF1CEAXp6lliwGkTOXojtv3Rz3v1nYRL06MUj8OArk=.d8241342-6ca1-4d9b-82ab-8b1451bb76cd@github.com> On Wed, 25 Jun 2025 06:54:35 GMT, Igor Veresov wrote: > 8359788: Internal Error: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded Thanks, Alexey! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25969#issuecomment-3005356040 From iveresov at openjdk.org Wed Jun 25 16:15:34 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 25 Jun 2025 16:15:34 GMT Subject: [jdk25] Integrated: 8359788: Internal Error: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 06:54:35 GMT, Igor Veresov wrote: > 8359788: Internal Error: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded This pull request has now been integrated. 
Changeset: fdb3e37c Author: Igor Veresov URL: https://git.openjdk.org/jdk/commit/fdb3e37c714a5fd5aa78f9a5528a182c6e961485 Stats: 7 lines in 1 file changed: 6 ins; 0 del; 1 mod 8359788: Internal Error: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded Reviewed-by: shade Backport-of: 5c4f92ba9a2b820fa12920400c9037b5d3c37aa4 ------------- PR: https://git.openjdk.org/jdk/pull/25969 From cjplummer at openjdk.org Wed Jun 25 17:31:47 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 25 Jun 2025 17:31:47 GMT Subject: RFR: 8360312: Serviceability Agent tests fail with JFR enabled due to unknown thread type JfrRecorderThread [v3] In-Reply-To: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> References: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> Message-ID: <0_4zYtZFNx5QA5h_4sQsF1dFV8Zr8dPZZHfKk-UuGRk=.acc9c10c-e720-4101-9e50-5a8edff6035b@github.com> > Update SA to know about JfrRecorderThread, which was made a JavaThread in JDK 25 by [JDK-8352251](https://bugs.openjdk.org/browse/JDK-8352251). > > I'm also fixing ClhsdbJstackWithConcurrentLock, which was also failing with JFR enabled, but for a different reason (specified heap size was too small). 
> > Testing (in progress): > > - [x] tier1 ci > - [x] tier1 ci with -XX:StartFlightRecording > - [x] tier5 ci Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: add missing space ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25960/files - new: https://git.openjdk.org/jdk/pull/25960/files/28a97a11..8de19276 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25960&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25960&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25960.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25960/head:pull/25960 PR: https://git.openjdk.org/jdk/pull/25960 From cjplummer at openjdk.org Wed Jun 25 17:31:47 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 25 Jun 2025 17:31:47 GMT Subject: RFR: 8360312: Serviceability Agent tests fail with JFR enabled due to unknown thread type JfrRecorderThread [v2] In-Reply-To: References: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> Message-ID: On Wed, 25 Jun 2025 08:07:41 GMT, Kevin Walls wrote: >> Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: >> >> fix copyright > > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Threads.java line 200: > >> 198: " (expected type JavaThread, CompilerThread, MonitorDeflationThread, AttachListenerThread," + >> 199: " DeoptimizeObjectsALotThread, StringDedupThread, NotificationThread, ServiceThread," + >> 200: "JfrRecorderThread, or JvmtiAgentThread)", e); > > nitpicking: space? Good catch. Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25960#discussion_r2167253978 From dcubed at openjdk.org Wed Jun 25 17:51:36 2025 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Wed, 25 Jun 2025 17:51:36 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Wed, 18 Jun 2025 11:59:56 GMT, Coleen Phillimore wrote: >> src/hotspot/share/oops/jmethodIDTable.hpp line 31: >> >>> 29: #include "memory/allocation.hpp" >>> 30: >>> 31: // Class for associating Method with jmethodID >> >> nit typo: please add an ending period to the comment. > > fixed. Hmmm... I'm still not seeing the ending period... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2167251132 From dcubed at openjdk.org Wed Jun 25 17:51:35 2025 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Wed, 25 Jun 2025 17:51:35 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 20:41:21 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. >> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. 
In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix the test I've made another pass through this PR. Just a few more nits. src/hotspot/share/oops/instanceKlass.cpp line 2481: > 2479: > 2480: // Make a jmethodID for all methods in this class. > 2481: // This makes getting all method ids much, much faster with classes with more than 8 Perhaps: // Make a jmethodID for all methods in this class. This makes getting // all method ids much, much faster with classes with more than 8 for a better-looking flow. src/hotspot/share/oops/instanceKlass.cpp line 4279: > 4277: // This nulls out jmethodIDs for all obsolete methods in the previous version of the 'klass'. > 4278: // These obsolete methods only exist in the previous version and we're about to delete the memory for them. > 4279: // The jmethodID for these are deallocated when we unload the class, so this doesn't remove them from the table. nit typo: s/jmethodID/jmethodIDs/ src/hotspot/share/oops/jmethodIDTable.cpp line 190: > 188: // - multiple redefined versions may share jmethodID slots and if a method > 189: // has already been rewired to a newer version we could be clearing reference > 190: // to a still existing method instance. Perhaps: // has already been rewired to a newer version we could be clearing the // reference to a still existing method instance.
src/hotspot/share/runtime/mutexLocker.cpp line 236: > 234: MUTEX_DEFN(Notification_lock , PaddedMonitor, service); // used for notification thread operations > 235: > 236: MUTEX_DEFN(JmethodIdCreation_lock , PaddedMutex , nosafepoint-1); // used for creating jmethodIDs locks HandshakeState_lock Perhaps: MUTEX_DEFN(JmethodIdCreation_lock , PaddedMutex , nosafepoint-1); // used for creating jmethodIDs, can lock HandshakeState_lock which is at nosafepoint ------------- Marked as reviewed by dcubed (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25267#pullrequestreview-2958999974 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2167214224 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2167231725 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2167249362 PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2167284891 From dcubed at openjdk.org Wed Jun 25 18:17:38 2025 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Wed, 25 Jun 2025 18:17:38 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 20:41:21 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. >> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. 
Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix the test Another way to look at the safety of the new mechanism is to consider the use of jmethodIDs from the `AsyncGetCallTrace` API: src/hotspot/share/prims/forte.cpp: 108 2025.06.25 12:09:14 $ cat /tmp/fred // Forte Analyzer AsyncGetCallTrace() entry point. Currently supported // on Linux X86, Solaris SPARC and Solaris X86. // // Async-safe version of GetCallTrace being called from a signal handler // when a LWP gets interrupted by SIGPROF but the stack traces are filled // with different content (see below). // // This function must only be called when JVM/TI // CLASS_LOAD events have been enabled since agent startup. The enabled // event will cause the jmethodIDs to be allocated at class load time. // The jmethodIDs cannot be allocated in a signal handler because locks // cannot be grabbed in a signal handler safely. // // void (*AsyncGetCallTrace)(ASGCT_CallTrace *trace, jint depth, void* ucontext) // // Called by the profiler to obtain the current method call stack trace for // a given thread. The thread is identified by the env_id field in the // ASGCT_CallTrace structure. The profiler agent should allocate a ASGCT_CallTrace // structure with enough memory for the requested stack depth. The VM fills in // the frames buffer and the num_frames field. 
// // Arguments: // // trace - trace data structure to be filled by the VM. // depth - depth of the call stack trace. // ucontext - ucontext_t of the LWP // // ASGCT_CallTrace: // typedef struct { // JNIEnv *env_id; // jint num_frames; // ASGCT_CallFrame *frames; // } ASGCT_CallTrace; // // Fields: // env_id - ID of thread which executed this trace. // num_frames - number of frames in the trace. // (< 0 indicates the frame is not walkable). // frames - the ASGCT_CallFrames that make up this trace. Callee followed by callers. // // ASGCT_CallFrame: // typedef struct { // jint lineno; // jmethodID method_id; // } ASGCT_CallFrame; // // Fields: // 1) For Java frame (interpreted and compiled), // lineno - bci of the method being executed or -1 if bci is not available // method_id - jmethodID of the method being executed // 2) For native method // lineno - (-3) // method_id - jmethodID of the method being executed The `AsyncGetCallTrace` API returns jmethodIDs without explicitly associated jclass references. The `AsyncGetCallTrace` API relies on JVM/TI CLASS_LOAD events to have been enabled since agent startup for jmethodID creation. Implicit in that requirement is the agent keeping a jclass reference for each CLASS_LOAD event so that the class cannot be unloaded while the data returned by `AsyncGetCallTrace` is processed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-3005716494 From kbarrett at openjdk.org Wed Jun 25 18:24:45 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 25 Jun 2025 18:24:45 GMT Subject: RFR: 8352565: Add native method implementation of Reference.get() [v11] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 07:09:39 GMT, Kim Barrett wrote: >> Please review this change which adds a native method providing the >> implementation of Reference::get. Reference::get is an intrinsic candidate, so >> this native method implementation is only used when the intrinsic is not.
>> >> Currently there is intrinsic support by the interpreter, C1, C2, and graal, >> which are always used. With this change we can later remove all the >> per-platform interpreter intrinsic implementations, and might also remove the >> C1 intrinsic implementation. >> >> Testing: >> (1) mach5 tier1-6 normal (so using all the existing intrinsics). >> (2) mach5 tier1-6 with interpreter and C1 Reference::get intrinsics disabled. > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 17 additional commits since the last revision: > > - Merge branch 'master' into native-reference-get > - add pseudo-native entry for Reference.get0 > - tidy CallGenerator lookup in Compile ctor > - fix comment alignment > - Merge branch 'master' into native-reference-get > - make private native Reference.get0 the intrinsic > - Merge branch 'master' into native-reference-get > - Merge branch 'master' into native-reference-get > - use new waitForRefProc, some tidying > - Merge branch 'master' into native-reference-get > - ... and 7 more: https://git.openjdk.org/jdk/compare/61c1c30d...877e64ca Thanks for all the reviews and discussions. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24315#issuecomment-3005731224 From kbarrett at openjdk.org Wed Jun 25 18:24:46 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 25 Jun 2025 18:24:46 GMT Subject: Integrated: 8352565: Add native method implementation of Reference.get() In-Reply-To: References: Message-ID: On Sat, 29 Mar 2025 21:47:18 GMT, Kim Barrett wrote: > Please review this change which adds a native method providing the > implementation of Reference::get. Reference::get is an intrinsic candidate, so > this native method implementation is only used when the intrinsic is not.
> > Currently there is intrinsic support by the interpreter, C1, C2, and graal, > which are always used. With this change we can later remove all the > per-platform interpreter intrinsic implementations, and might also remove the > C1 intrinsic implementation. > > Testing: > (1) mach5 tier1-6 normal (so using all the existing intrinsics). > (2) mach5 tier1-6 with interpreter and C1 Reference::get intrinsics disabled. This pull request has now been integrated. Changeset: 56c75453 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/56c75453cd69e80b9411b4e1794c953998406342 Stats: 247 lines in 20 files changed: 207 ins; 11 del; 29 mod 8352565: Add native method implementation of Reference.get() Reviewed-by: vlivanov, tschatzl, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/24315 From dcubed at openjdk.org Wed Jun 25 18:25:33 2025 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Wed, 25 Jun 2025 18:25:33 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Fri, 20 Jun 2025 20:41:21 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. >> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. 
When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix the test Ping @jbachorik for discussions about the safety of jmethodIDs... ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-3005733626 From lmesnik at openjdk.org Wed Jun 25 18:29:36 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 25 Jun 2025 18:29:36 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v6] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 11:26:22 GMT, Anton Artemov wrote: >> This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. >> >> The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. >> >> In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). >> >> Lightweight locking is the default locking from now on. >> >> Tested in tiers 1 - 7. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8359437: Removed vmRTMCPU from VMProps.java Thanks for addressing feedback. ------------- Marked as reviewed by lmesnik (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25847#pullrequestreview-2959215575 From fbredberg at openjdk.org Wed Jun 25 19:00:38 2025 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Wed, 25 Jun 2025 19:00:38 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v6] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 11:26:22 GMT, Anton Artemov wrote: >> This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. >> >> The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. >> >> In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). >> >> Lightweight locking is the default locking from now on. >> >> Tested in tiers 1 - 7. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8359437: Removed vmRTMCPU from VMProps.java Great work, and thanks for bringing this over the finishing line. ------------- Marked as reviewed by fbredberg (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/25847#pullrequestreview-2959309747 From kevinw at openjdk.org Wed Jun 25 19:03:28 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 25 Jun 2025 19:03:28 GMT Subject: RFR: 8360312: Serviceability Agent tests fail with JFR enabled due to unknown thread type JfrRecorderThread [v3] In-Reply-To: <0_4zYtZFNx5QA5h_4sQsF1dFV8Zr8dPZZHfKk-UuGRk=.acc9c10c-e720-4101-9e50-5a8edff6035b@github.com> References: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> <0_4zYtZFNx5QA5h_4sQsF1dFV8Zr8dPZZHfKk-UuGRk=.acc9c10c-e720-4101-9e50-5a8edff6035b@github.com> Message-ID: On Wed, 25 Jun 2025 17:31:47 GMT, Chris Plummer wrote: >> Update SA to know about JfrRecorderThread, which was made a JavaThread in JDK 25 by [JDK-8352251](https://bugs.openjdk.org/browse/JDK-8352251). >> >> I'm also fixing ClhsdbJstackWithConcurrentLock, which was also failing with JFR enabled, but for a different reason (specified heap size was too small). >> >> Testing (in progress): >> >> - [x] tier1 ci >> - [x] tier1 ci with -XX:StartFlightRecording >> - [x] tier5 ci > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > add missing space Marked as reviewed by kevinw (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25960#pullrequestreview-2959316852 From dlong at openjdk.org Wed Jun 25 19:28:35 2025 From: dlong at openjdk.org (Dean Long) Date: Wed, 25 Jun 2025 19:28:35 GMT Subject: Integrated: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead In-Reply-To: References: Message-ID: On Thu, 12 Jun 2025 01:51:09 GMT, Dean Long wrote: > This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. > > We used to patch the verified entry point to make sure it was not_entrant. 
The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. > > The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. > > For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. > > This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. This pull request has now been integrated. 
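The "sticky" bit idea described above can be illustrated outside HotSpot. The following is only a sketch in Java, not the actual change (which lives in platform-specific C++): the class name `StickyGuard`, the choice of the top bit, and all method names are hypothetical. It shows why a plain store to the guard is no longer enough once one bit must be preserved, and contrasts the lock-based approach with the compare-and-swap alternative mentioned in the description.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names exist in HotSpot.
final class StickyGuard {
    // Assumption: the top bit is the reserved "sticky" (not_entrant) bit,
    // and the remaining bits hold the ordinary guard value.
    static final int STICKY_BIT = 1 << 31;

    private final Object lock = new Object();
    private int guard;                                  // only touched under 'lock'
    private final AtomicInteger casGuard = new AtomicInteger();

    // Lock-based variant: a read-modify-write that keeps the sticky bit intact.
    void setGuardValue(int value) {
        synchronized (lock) {
            guard = (guard & STICKY_BIT) | (value & ~STICKY_BIT);
        }
    }

    // Once set, the sticky bit is never cleared (not_entrant is a final state).
    void makeNotEntrant() {
        synchronized (lock) {
            guard |= STICKY_BIT;
        }
    }

    int guardValue() {
        synchronized (lock) {
            return guard;
        }
    }

    // Compare-and-swap variant: same update, retried until it applies atomically.
    void setGuardValueCas(int value) {
        casGuard.updateAndGet(old -> (old & STICKY_BIT) | (value & ~STICKY_BIT));
    }

    void makeNotEntrantCas() {
        casGuard.updateAndGet(old -> old | STICKY_BIT);
    }

    int guardValueCas() {
        return casGuard.get();
    }
}
```

In both variants a later `setGuardValue` call cannot disarm a guard that has been made not_entrant, which is the property the description relies on.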
Changeset: cf75f1f9 Author: Dean Long URL: https://git.openjdk.org/jdk/commit/cf75f1f9c6d2bc70c7133cb81c73a0ce0946dff9 Stats: 603 lines in 43 files changed: 97 ins; 459 del; 47 mod 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead Co-authored-by: Martin Doerr Co-authored-by: Amit Kumar Reviewed-by: mdoerr, eosterlund ------------- PR: https://git.openjdk.org/jdk/pull/25764 From amenkov at openjdk.org Wed Jun 25 20:21:29 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 25 Jun 2025 20:21:29 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: On Wed, 25 Jun 2025 13:02:03 GMT, Kevin Walls wrote: >> ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. > > Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: > > - comment update > - comment update Marked as reviewed by amenkov (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25958#pullrequestreview-2959538506 From coleenp at openjdk.org Wed Jun 25 20:32:32 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 25 Jun 2025 20:32:32 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v11] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Tue, 24 Jun 2025 04:50:02 GMT, Ioi Lam wrote: >> This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: >> >> Excerpt from the test case ExceptionsTest.java. 
>> >> >> [0.042s][info][exceptions] Exception >> [ ] thrown in interpreter method <{method} {0x000079e52c4005b0} 'bar' '()V' in 'ExceptionsTest$InternalClass'> >> [ ] at bci 9 for thread 0x000079e58c02c7b0 (main) >> [0.042s][info][exceptions,stacktrace] Exception >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar(ExceptionsTest.java:110) >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo(ExceptionsTest.java:105) >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:100) >> [0.042s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar" at BCI: 10 >> Exception 1 caught. >> >> >> - Note that the old `[exceptions]` log is triggered by `InterpreterRuntime::exception_handler_for_exception()` and prints one level of the stack while the interpreter is looking for a handler. However, once a handler is found (inside `bar()`), the `[exceptions]` log terminates, and we do not know about the `foo()` or `main()` methods. >> >> - The `[exceptions,stacktrace]` log is printed only when an exception is thrown: >> - By an `_athrow` bytecode. See comments around `Exceptions::log_exception_stacktrace(Handle exception, methodHandle method, int bci)` >> - By native code in Exceptions::special_exception() and Exceptions::_throw(). >> >> **Concurrent Exceptions** >> >> Neither the old `[exceptions]` nor the new `[exceptions,stacktrace]` log distinguishes between multiple exceptions that are thrown and handled concurrently. Users should use something like `-Xlog:exceptions,exceptions+stack::tags,tid` to include the thread ID in the log output, and use an external tool (such as a script) to demultiplex the logs. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @coleenp and @dholmes-ora comments Thanks for removing the big comment.
src/hotspot/share/utilities/exceptions.cpp line 34: > 32: #include "memory/resourceArea.hpp" > 33: #include "memory/universe.hpp" > 34: #include "oops/access.inline.hpp" This is not needed anymore either (?) src/hotspot/share/utilities/exceptions.hpp line 120: > 118: > 119: // Logging > 120: static void maybe_log_call_stack(Handle exception, bool omit_if_same); I think this is not needed anymore. ------------- PR Review: https://git.openjdk.org/jdk/pull/25522#pullrequestreview-2959566568 PR Review Comment: https://git.openjdk.org/jdk/pull/25522#discussion_r2167572522 PR Review Comment: https://git.openjdk.org/jdk/pull/25522#discussion_r2167571739 From dholmes at openjdk.org Wed Jun 25 20:51:30 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 25 Jun 2025 20:51:30 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: <_qD_bLdpHkOQRTWkFV43-L_IGMMgOfosOn7zxb7I7gM=.e2afadc1-1ce1-45da-a695-1920686f8f5f@github.com> On Wed, 25 Jun 2025 13:02:03 GMT, Kevin Walls wrote: >> ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. > > Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: > > - comment update > - comment update Something still bugging me about this one. From JBS it looked to me like we were dealing with a virtual thread but your change is for the non-virtual thread. And Alan says something about this only being possible due to a temporary condition. So I'm still unclear exactly what the problem is, or why it appeared. Where does the initial "thread" argument come from in the Java code? 
Is it the one that has terminated, if so why is there not an `isAlive()` check somewhere? And how does this lead to the bad oop? src/hotspot/share/services/threadService.cpp line 1477: > 1475: java_thread = java_lang_Thread::thread(thread_h()); > 1476: if (java_thread == nullptr) { > 1477: return nullptr; // thread terminated If you return here what does that mean for the null check at line 1483? Is that code now dead? ------------- PR Review: https://git.openjdk.org/jdk/pull/25958#pullrequestreview-2959619173 PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2167605967 From iklam at openjdk.org Wed Jun 25 20:54:48 2025 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 25 Jun 2025 20:54:48 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v12] In-Reply-To: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: > This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: > > Excerpt from the test case ExceptionsTest.java. > > > [0.042s][info][exceptions] Exception > [ ] thrown in interpreter method <{method} {0x000079e52c4005b0} 'bar' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 9 for thread 0x000079e58c02c7b0 (main) > [0.042s][info][exceptions,stacktrace] Exception > [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar(ExceptionsTest.java:110) > [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo(ExceptionsTest.java:105) > [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:100) > [0.042s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar" at BCI: 10 > Exception 1 caught. 
> > > - Note that the old `[exceptions]` log is triggered by `InterpreterRuntime::exception_handler_for_exception()` and prints one level of the stack while the interpreter is looking for a handler. However, once a handler is found (inside `bar()`), the `[exceptions]` log terminates, and we do not know about the `foo()` or `main()` methods. > > - The `[exceptions,stacktrace]` log is printed only when an exception is thrown: > - By an `_athrow` bytecode. See comments around `Exceptions::log_exception_stacktrace(Handle exception, methodHandle method, int bci)` > - By native code in Exceptions::special_exception() and Exceptions::_throw(). > > **Concurrent Exceptions** > > Neither the old `[exceptions]` nor the new `[exceptions,stacktrace]` log distinguishes between multiple exceptions that are thrown and handled concurrently. Users should use something like `-Xlog:exceptions,exceptions+stack::tags,tid` to include the thread ID in the log output, and use an external tool (such as a script) to demultiplex the logs.
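As an illustration of the "external tool" mentioned above, here is a minimal sketch of a demultiplexer in Java. It is not part of the PR. It assumes that, with the `tags,tid` decorators, each decorated log line starts with `[<tid>][<tags>]` where the thread ID is a decimal number; that line shape, the class name `ExceptionLogDemux`, and the "unknown" bucket are assumptions made for this example. Lines without a numeric thread ID prefix (for example continuation lines of a multi-line message) are kept with the most recently seen thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; groups unified-logging lines by thread ID.
final class ExceptionLogDemux {
    // Assumed line shape: "[<tid>][<tags>] message".
    private static final Pattern TID_PREFIX = Pattern.compile("^\\[(\\d+)\\]");

    // Returns the lines grouped per thread ID, in order of first appearance.
    static Map<String, List<String>> demux(List<String> lines) {
        Map<String, List<String>> byThread = new LinkedHashMap<>();
        String current = "unknown"; // bucket for lines seen before any thread ID
        for (String line : lines) {
            Matcher m = TID_PREFIX.matcher(line);
            if (m.find()) {
                current = m.group(1);
            }
            // Lines without a numeric prefix stay with the last thread seen.
            byThread.computeIfAbsent(current, k -> new ArrayList<>()).add(line);
        }
        return byThread;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: ExceptionLogDemux <logfile>");
            return;
        }
        demux(Files.readAllLines(Path.of(args[0]))).forEach((tid, group) -> {
            System.out.println("=== thread " + tid + " ===");
            group.forEach(System.out::println);
        });
    }
}
```

Attaching undecorated lines to the previous thread is a heuristic: if two threads log at the same moment, continuation lines could in principle be attributed to the wrong thread.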
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @coleenp comments: Removed dead code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25522/files - new: https://git.openjdk.org/jdk/pull/25522/files/3055ddbb..6b57611d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=10-11 Stats: 4 lines in 2 files changed: 0 ins; 4 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25522.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25522/head:pull/25522 PR: https://git.openjdk.org/jdk/pull/25522 From kevinw at openjdk.org Wed Jun 25 21:04:29 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 25 Jun 2025 21:04:29 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: <_qD_bLdpHkOQRTWkFV43-L_IGMMgOfosOn7zxb7I7gM=.e2afadc1-1ce1-45da-a695-1920686f8f5f@github.com> References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> <_qD_bLdpHkOQRTWkFV43-L_IGMMgOfosOn7zxb7I7gM=.e2afadc1-1ce1-45da-a695-1920686f8f5f@github.com> Message-ID: On Wed, 25 Jun 2025 20:48:17 GMT, David Holmes wrote: >> Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: >> >> - comment update >> - comment update > > src/hotspot/share/services/threadService.cpp line 1477: > >> 1475: java_thread = java_lang_Thread::thread(thread_h()); >> 1476: if (java_thread == nullptr) { >> 1477: return nullptr; // thread terminated > > If you return here what does that mean for the null check at line 1483? Is that code now dead? Here, we have this extra null check when is_virtual is false. If is_virtual is false, we really need a java_thread, or we give up. Down at 1483 we might have is_virtual true, and we may also have found a java_thread, so I think we need both checks.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2167627054 From kevinw at openjdk.org Wed Jun 25 21:26:29 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 25 Jun 2025 21:26:29 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: <_qD_bLdpHkOQRTWkFV43-L_IGMMgOfosOn7zxb7I7gM=.e2afadc1-1ce1-45da-a695-1920686f8f5f@github.com> References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> <_qD_bLdpHkOQRTWkFV43-L_IGMMgOfosOn7zxb7I7gM=.e2afadc1-1ce1-45da-a695-1920686f8f5f@github.com> Message-ID: On Wed, 25 Jun 2025 20:48:26 GMT, David Holmes wrote: > Something still bugging me about this one. From JBS it looked to me like we were dealing with a virtual thread but your change is for the non-virtual thread. And Alan says something about this only being possible due to a temporary condition. So I'm still unclear exactly what the problem is, or why it appeared. Where does the initial "thread" argument come from in the Java code? Is it the one that has terminated, if so why is there not an `isAlive()` check somewhere? > > And how does this lead to the bad oop? Yes, I was reproducing with a regular non-virtual thread exiting. We have the j.l.Thread object and could check for it being TERMINATED earlier in HeapDumper/Snapshot, but leaving it to the last moment avoids a bigger window where it could terminate. (Maybe there is somewhere this should intersect with ThreadSMR...?)
On the bad oop: I enabled the test to run in debug vm for my own testing, but in one of the earlier release crashes at: V [libjvm.so+0x47bb10] AccessInternal::PostRuntimeDispatch, (AccessInternal::BarrierType)3, 286822ul>::oop_access_barrier(oopDesc*, long)+0x0 (accessBackend.hpp:228) V [libjvm.so+0x10e1c1a] vframeStream::vframeStream(oopDesc*, Handle)+0x7a (vframe.cpp:523) V [libjvm.so+0x1068a51] GetThreadSnapshotClosure::do_thread(Thread*)+0x7d1 (threadService.cpp:1319) V [libjvm.so+0x106691d] ThreadSnapshotFactory::get_thread_snapshot(_jobject*, JavaThread*)+0x80d (threadService.cpp:1482) V [libjvm.so+0xae23d5] JVM_CreateThreadSnapshot+0x75 (jvm.cpp:2966) j jdk.internal.vm.ThreadSnapshot.create(Ljava/lang/Thread;)Ljdk/internal/vm/ThreadSnapshot;+0 java.base at 25-ea ... Line number info puts it in the _java_thread == null branch of: threadService.cpp 1317 vframeStream vfst(_java_thread != nullptr 1318 ? vframeStream(_java_thread, false, true, vthread_carrier) 1319 : vframeStream(java_lang_VirtualThread::continuation(_thread_h()))); <--- And it's looking inside the Handle _thread_h() within GetThreadSnapshotClosure which was setup by get_thread_snapshot, and it's a null pointer, as Instructions: =>0x00007ffadc251b10: 8b 14 37 31 c0 85 d2 74 18 89 d0 48 8d 15 1e ee mov edx,DWORD PTR [rdi+rsi*1] and RDI=0x0000000000000000 ------------- PR Comment: https://git.openjdk.org/jdk/pull/25958#issuecomment-3006190183 From dcubed at openjdk.org Wed Jun 25 22:13:29 2025 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Wed, 25 Jun 2025 22:13:29 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: <6sxeRs0jaGtjCoxcJLBksWiyacOvRkSn40GXcLNKEos=.e4736687-f826-4a11-977e-9b0cf765e046@github.com> On Wed, 25 Jun 2025 13:02:03 GMT, Kevin Walls wrote: >> ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. > > Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: > > - comment update > - comment update Changes requested by dcubed (Reviewer). src/hotspot/share/services/threadService.cpp line 1477: > 1475: java_thread = java_lang_Thread::thread(thread_h()); > 1476: if (java_thread == nullptr) { > 1477: return nullptr; // thread terminated This is not the right way to determine if you have a valid JavaThread when you have created a ThreadsListHandle. This code near the top of `ThreadSnapshotFactory::get_thread_snapshot` is not right: ThreadsListHandle tlh(THREAD); ResourceMark rm(THREAD); HandleMark hm(THREAD); Handle thread_h(THREAD, JNIHandles::resolve(jthread)); The above code was added by: [JDK-8357650](https://bugs.openjdk.org/browse/JDK-8357650) ThreadSnapshot to take snapshot of thread for thread dumps Here's the example code from src/hotspot/share/runtime/threadSMR.hpp: // JNI jobject example: // jobject jthread = ...; // : // ThreadsListHandle tlh; // JavaThread* jt = nullptr; // bool is_alive = tlh.cv_internal_thread_to_JavaThread(jthread, &jt, nullptr); // if (is_alive) { // : // do stuff with 'jt'... 
// } So instead of this line: Handle thread_h(THREAD, JNIHandles::resolve(jthread)); which does not guarantee you a valid JavaThread handle, you should use `tlh.cv_internal_thread_to_JavaThread` to get a `JavaThread*`. ------------- PR Review: https://git.openjdk.org/jdk/pull/25958#pullrequestreview-2959809320 PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2167723492 From iklam at openjdk.org Wed Jun 25 22:41:53 2025 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 25 Jun 2025 22:41:53 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v13] In-Reply-To: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: > This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: > > Excerpt from the test case ExceptionsTest.java. > > > [0.042s][info][exceptions] Exception > [ ] thrown in interpreter method <{method} {0x000079e52c4005b0} 'bar' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 9 for thread 0x000079e58c02c7b0 (main) > [0.042s][info][exceptions,stacktrace] Exception > [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar(ExceptionsTest.java:110) > [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo(ExceptionsTest.java:105) > [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:100) > [0.042s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar" at BCI: 10 > Exception 1 caught. > > > - Note that the old `[exceptions]` log is triggered by `InterpreterRuntime::exception_handler_for_exception()` and prints one level of the stack while the interpreter is looking for a handler. 
However, once a handler is found (inside `bar()`), the `[exceptions]` log terminates, and we do not know about the `foo()` or `main()` methods. > > - The `[exceptions,stacktrace]` log is printed only when an exception is thrown: > - By an `_athrow` bytecode. See comments around `Exceptions::log_exception_stacktrace(Handle exception, methodHandle method, int bci)` > - By native code in Exceptions::special_exception() and Exceptions::_throw(). > > **Concurrent Exceptions** > > Neither the old `[exceptions]` nor the new `[exceptions,stacktrace]` log distinguishes between multiple exceptions that are thrown and handled concurrently. Users should use something like `-Xlog:exceptions,exceptions+stack::tags,tid` to include the thread ID in the log output, and use an external tool (such as a script) to demultiplex the logs. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision: - Merge branch 'master' into 8358080-print-thread-stack-with-xlog-exceptions-trace - Fix crash in runtime/logging/RedefineClasses.java: cannot print stack trace in Exceptions::special_exception() - @coleenp comments: Removed dead code - @coleenp and @dholmes-ora comments - @coleenp comments; Also, use ProcessTools.executeProcess() to log the output in files - refactor - Reimplement -- print stack trace only when it is a known throwing site - @dholmes-ora comments -- removed printing of output.getStdout() from test - Print callstack for rethrown exceptions - @dholmes-ora comments - use JavaThread::current() instead - ...
and 5 more: https://git.openjdk.org/jdk/compare/963ea0c1...3eb7c622 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25522/files - new: https://git.openjdk.org/jdk/pull/25522/files/6b57611d..3eb7c622 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25522&range=11-12 Stats: 82332 lines in 1637 files changed: 44217 ins; 24426 del; 13689 mod Patch: https://git.openjdk.org/jdk/pull/25522.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25522/head:pull/25522 PR: https://git.openjdk.org/jdk/pull/25522 From dholmes at openjdk.org Wed Jun 25 23:53:28 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 25 Jun 2025 23:53:28 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: <6sxeRs0jaGtjCoxcJLBksWiyacOvRkSn40GXcLNKEos=.e4736687-f826-4a11-977e-9b0cf765e046@github.com> References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> <6sxeRs0jaGtjCoxcJLBksWiyacOvRkSn40GXcLNKEos=.e4736687-f826-4a11-977e-9b0cf765e046@github.com> Message-ID: On Wed, 25 Jun 2025 22:08:17 GMT, Daniel D. Daugherty wrote: >> Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: >> >> - comment update >> - comment update > > src/hotspot/share/services/threadService.cpp line 1477: > >> 1475: java_thread = java_lang_Thread::thread(thread_h()); >> 1476: if (java_thread == nullptr) { >> 1477: return nullptr; // thread terminated > > This is not the right way to determine if you have a valid JavaThread > when you have created a ThreadsListHandle. 
This code near the top > of `ThreadSnapshotFactory::get_thread_snapshot` is not right: > > > ThreadsListHandle tlh(THREAD); > ResourceMark rm(THREAD); > HandleMark hm(THREAD); > Handle thread_h(THREAD, JNIHandles::resolve(jthread)); > > > The above code was added by: > [JDK-8357650](https://bugs.openjdk.org/browse/JDK-8357650) ThreadSnapshot to take snapshot of thread for thread dumps > > Here's the example code from src/hotspot/share/runtime/threadSMR.hpp: > > // JNI jobject example: > // jobject jthread = ...; > // : > // ThreadsListHandle tlh; > // JavaThread* jt = nullptr; > // bool is_alive = tlh.cv_internal_thread_to_JavaThread(jthread, &jt, nullptr); > // if (is_alive) { > // : // do stuff with 'jt'... > // } > > > So instead of this line: > > Handle thread_h(THREAD, JNIHandles::resolve(jthread)); > > which does not guarantee you a valid JavaThread handle, you should > use `tlh.cv_internal_thread_to_JavaThread` to get a `JavaThread*`. Great catch Dan! I totally missed the TLH at the start of `get_thread_snapshot`. I knew something was off here but couldn't quite put my finger on it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2167822932 From dholmes at openjdk.org Thu Jun 26 00:40:28 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 26 Jun 2025 00:40:28 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> <_qD_bLdpHkOQRTWkFV43-L_IGMMgOfosOn7zxb7I7gM=.e2afadc1-1ce1-45da-a695-1920686f8f5f@github.com> Message-ID: On Wed, 25 Jun 2025 21:24:13 GMT, Kevin Walls wrote: > Line number info puts it in the _java_thread == null branch of: threadService.cpp > 1317 vframeStream vfst(_java_thread != nullptr > 1318 ? 
vframeStream(_java_thread, false, true, vthread_carrier) > 1319 : vframeStream(java_lang_VirtualThread::continuation(_thread_h()))); <--- > > And it's looking inside the Handle _thread_h() within GetThreadSnapshotClosure which was setup by get_thread_snapshot, and it's a null pointer, But `_thread_h()` has already been used a number of times before we get here and if it were null we should have crashed long ago. ??? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25958#issuecomment-3006621489 From dholmes at openjdk.org Thu Jun 26 00:59:42 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 26 Jun 2025 00:59:42 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v13] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: <0g8bgJTlTjVQuQVGWnNODo93WfuRJUVqjID6FpEybd0=.4ecc22fa-8b6c-46a7-a107-1be646e12d22@github.com> On Wed, 25 Jun 2025 22:41:53 GMT, Ioi Lam wrote: >> This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: >> >> Excerpt from the test case ExceptionsTest.java. >> >> >> [0.042s][info][exceptions] Exception >> [ ] thrown in interpreter method <{method} {0x000079e52c4005b0} 'bar' '()V' in 'ExceptionsTest$InternalClass'> >> [ ] at bci 9 for thread 0x000079e58c02c7b0 (main) >> [0.042s][info][exceptions,stacktrace] Exception >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar(ExceptionsTest.java:110) >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo(ExceptionsTest.java:105) >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:100) >> [0.042s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar" at BCI: 10 >> Exception 1 caught. 
>> >> >> - Note that the old `[exceptions]` log is triggered by `InterpreterRuntime::exception_handler_for_exception()` and prints one level of the stack while the interpreter is looking for a handler. However, once a handler is found (inside `bar()`), the `[exceptions]` log terminates, and we do not know about the `foo()` or `main()` methods. >> >> - The `[exceptions,stacktrace]` log is printed only when an exception is thrown: >> - By an `_athrow` bytecode. See comments around `Exceptions::log_exception_stacktrace(Handle exception, methodHandle method, int bci)` >> - By native code in `Exceptions::special_exception()` and `Exceptions::_throw()`. >> >> **Concurrent Exceptions** >> >> Neither the old `[exceptions]` nor the new `[exceptions,stacktrace]` log distinguishes between multiple exceptions that are thrown and handled concurrently. Users should use something like `-Xlog:exceptions,exceptions+stack::tags,tid` to include the thread ID in the log output, and use an external tool (such as a script) to demultiplex the logs. > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision: > > - Merge branch 'master' into 8358080-print-thread-stack-with-xlog-exceptions-trace > - Fix crash in runtime/logging/RedefineClasses.java: cannot print stack trace in Exceptions::special_exception() > - @coleenp comments: Removed dead code > - @coleenp and @dholmes-ora comments > - @coleenp comments; Also, use ProcessTools.executeProcess() to log the output in files > - refactor > - Reimplement -- print stack trace only when it is a known throwing site > - @dholmes-ora comments -- removed printing of output.getStdout() from test > - Print callstack for rethrown exceptions > - @dholmes-ora comments - use JavaThread::current() instead > - ...
and 5 more: https://git.openjdk.org/jdk/compare/ff922b51...3eb7c622 Marked as reviewed by dholmes (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25522#pullrequestreview-2960063811 From qxing at openjdk.org Thu Jun 26 01:58:32 2025 From: qxing at openjdk.org (Qizheng Xing) Date: Thu, 26 Jun 2025 01:58:32 GMT Subject: RFR: 8360474: Add missing include guards for some HotSpot headers [v2] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 11:10:51 GMT, Coleen Phillimore wrote: >> Qizheng Xing has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove extra trailing new line > > src/hotspot/share/utilities/packedTable.hpp line 27: > >> 25: #ifndef SHARE_UTILITIES_PACKEDTABLE_HPP >> 26: #define SHARE_UTILITIES_PACKEDTABLE_HPP >> 27: > > I just backported this to JDK 25. I wonder if you should omit this change and put it under another CR so we can backport that too. Or not backport this, assuming that JDK 25 will never have double inclusions of this file ? Can I backport the entire PR to JDK 25 instead of splitting it up? I noticed that all the changes can be applied to JDK 25 as well. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25968#discussion_r2167937010 From kbarrett at openjdk.org Thu Jun 26 02:14:16 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 26 Jun 2025 02:14:16 GMT Subject: RFR: 8314488: Compiling the JDK with C++17 Message-ID: I'm hijacking the PR mechanism as a way to discuss new C++17 features that can be more easily structured and captured than bare email. Once discussion settles down I'll turn the results into HotSpot Style Guide changes. I don't intend to integrate any version of this document to the OpenJDK repository. 
------------- Commit messages: - initial draft Changes: https://git.openjdk.org/jdk/pull/25992/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25992&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8314488 Stats: 1113 lines in 1 file changed: 1113 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25992.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25992/head:pull/25992 PR: https://git.openjdk.org/jdk/pull/25992 From dholmes at openjdk.org Thu Jun 26 02:38:29 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 26 Jun 2025 02:38:29 GMT Subject: RFR: 8360515: PROPERFMTARGS should always use size_t template specialization for unit [v2] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 11:19:47 GMT, Joel Sikström wrote: >> Hello, >> >> PROPERFMT is defined as the format string "%zu%s", which expects a size_t as input argument. When used in combination with PROPERFMTARGS, which uses the templated byte_size_in_proper_unit, the byte size may not be size_t if the input is some other type. >> >> To minimize confusion, PROPERFMTARGS should always use the size_t template specialization of byte_size_in_proper_unit, to match PROPERFMT. Places that use byte_size_in_proper_unit with other types can still use it, but should use their own format strings instead of PROPERFMT. >> >> ProcSmapsSummary::print_on in memMapPrinter_macosx is the only place that uses PROPERFMTARGS with a type that is not size_t. I have changed those places to use the expanded version of the macro, which uses the templated version of byte_size_in_proper_unit instead. >> >> Testing: >> * Currently running Oracle's tier1-2 > > Joel Sikström has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains two additional commits since the last revision: > > - Merge branch 'master' into JDK-8360515_properfmtargs > - 8360515: PROPERFMTARGS should always use size_t template specialization for unit src/hotspot/os/bsd/memMapPrinter_macosx.cpp line 239: > 237: st->print_cr(" rss: %llu (%llu%s)", vm_info.resident_size, byte_size_in_proper_unit(vm_info.resident_size), proper_unit_for_byte_size(vm_info.resident_size)); > 238: st->print_cr(" peak rss: %llu (%llu%s)", vm_info.resident_size_peak, byte_size_in_proper_unit(vm_info.resident_size_peak), proper_unit_for_byte_size(vm_info.resident_size_peak)); > 239: st->print_cr(" page size: %d (" PROPERFMT ")", vm_info.page_size, PROPERFMTARGS((size_t)vm_info.page_size)); Just to clarify something for the reader here, as it tripped me up, the `vm_info` fields are declared as `mach_vm_size_t`, which one might expect is some kind of `size_t` but alas no [1]: typedef uint64_t mach_vm_size_t; But given you cast `page_size` to size_t (it is `int32_t`) why not cast the others too and use `PROPERFMT`? 
[1] https://developer.apple.com/documentation/kernel/mach_vm_size_t ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25975#discussion_r2167979715 From sspitsyn at openjdk.org Thu Jun 26 05:06:29 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 26 Jun 2025 05:06:29 GMT Subject: RFR: 8360312: Serviceability Agent tests fail with JFR enabled due to unknown thread type JfrRecorderThread [v3] In-Reply-To: <0_4zYtZFNx5QA5h_4sQsF1dFV8Zr8dPZZHfKk-UuGRk=.acc9c10c-e720-4101-9e50-5a8edff6035b@github.com> References: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> <0_4zYtZFNx5QA5h_4sQsF1dFV8Zr8dPZZHfKk-UuGRk=.acc9c10c-e720-4101-9e50-5a8edff6035b@github.com> Message-ID: On Wed, 25 Jun 2025 17:31:47 GMT, Chris Plummer wrote: >> Update SA to know about JfrRecorderThread, which was made a JavaThread in JDK 25 by [JDK-8352251](https://bugs.openjdk.org/browse/JDK-8352251). >> >> I'm also fixing ClhsdbJstackWithConcurrentLock, which was also failing with JFR enabled, but for a different reason (specified heap size was too small). >> >> Testing (in progress): >> >> - [x] tier1 ci >> - [x] tier1 ci with -XX:StartFlightRecording >> - [x] tier5 ci > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > add missing space Marked as reviewed by sspitsyn (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25960#pullrequestreview-2960559224 From alanb at openjdk.org Thu Jun 26 05:59:35 2025 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 26 Jun 2025 05:59:35 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v6] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 11:26:22 GMT, Anton Artemov wrote: >> This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. >> >> The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. 
>> >> In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). >> >> Lightweight locking is the default locking from now on. >> >> Tested in tiers 1 - 7. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8359437: Removed vmRTMCPU from VMProps.java Marked as reviewed by alanb (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25847#pullrequestreview-2960685033 From alanb at openjdk.org Thu Jun 26 05:59:36 2025 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 26 Jun 2025 05:59:36 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v5] In-Reply-To: References: Message-ID: <-uNZ-9Lyqc6JzRJNBqSqw6KbARrJowx8a6MDMT-OIZc=.5cc39187-9b92-492d-a949-d7d33a8e25f9@github.com> On Wed, 25 Jun 2025 11:35:51 GMT, Anton Artemov wrote: >> test/jdk/jdk/internal/vm/Continuation/Fuzz.java line 477: >> >>> 475: boolean shouldPin() { >>> 476: // Returns false since we never pin after we removed legacy locking. >>> 477: return traceHas(Op.PIN::contains) && false; >> >> Are you planning to remove this method and update verifyPin, or maybe there will be a follow-on JBS issue for this cleanup? > > Removal will be done in phase 2. Next phase is okay too, just need to remember as it will be confusing for a time to have it return false unconditionally. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25847#discussion_r2168188065 From alanb at openjdk.org Thu Jun 26 06:06:29 2025 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 26 Jun 2025 06:06:29 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: <4hLF410lF9EFIriy-NuDlj97eLzkM2hclxro5Hf6xlo=.b96e68e9-83d6-4823-bac4-c3d944bc67dc@github.com> On Wed, 25 Jun 2025 13:02:03 GMT, Kevin Walls wrote: >> ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. > > Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: > > - comment update > - comment update test/jdk/com/sun/management/HotSpotDiagnosticMXBean/DumpThreadsWithEliminatedLock.java line 31: > 29: * @requires !vm.debug & (vm.compMode != "Xcomp") > 30: * @requires (vm.opt.TieredStopAtLevel == null | vm.opt.TieredStopAtLevel == 4) > 31: * @requires vm.jvmti This seems like a separate discussion, as the minimal VM doesn't have the M&M support, so something isn't right if someone is somehow testing a run-time image that contains jdk.management and only the minimal VM. I assume it's just a drive-by change here, but I think it is part of a bigger discussion.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2168202465 From duke at openjdk.org Thu Jun 26 07:29:37 2025 From: duke at openjdk.org (duke) Date: Thu, 26 Jun 2025 07:29:37 GMT Subject: RFR: 8359437: Make users and test suite not able to set LockingMode flag [v6] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 11:26:22 GMT, Anton Artemov wrote: >> This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. >> >> The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. >> >> In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). >> >> Lightweight locking is the default locking from now on. >> >> Tested in tiers 1 - 7. > > Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: > > 8359437: Removed vmRTMCPU from VMProps.java @toxaart Your change (at version 6534afaaeb8777fcb068710f6488efc35ff55af5) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25847#issuecomment-3007469560 From duke at openjdk.org Thu Jun 26 07:44:42 2025 From: duke at openjdk.org (Anton Artemov) Date: Thu, 26 Jun 2025 07:44:42 GMT Subject: Integrated: 8359437: Make users and test suite not able to set LockingMode flag In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 08:39:49 GMT, Anton Artemov wrote: > This PR contains changes for the 1st phase of the `LockingMode` flag obsoletion. > > The work is done by @fbredber, I have taken it over and am finishing it while he's on vacation. > > In the 1st phase one keeps the `LockingMode` variable in all places, but makes it non-settable from the command line. 
All the C1 and C2 code related to legacy locking will still be in place (but as dead code) and removed later (phase 2). > > Lightweight locking is the default locking from now on. > > Tested in tiers 1 - 7. This pull request has now been integrated. Changeset: 5039b42d Author: Anton Artemov Committer: David Holmes URL: https://git.openjdk.org/jdk/commit/5039b42de170769797312969185ee9d67f34cf24 Stats: 1154 lines in 34 files changed: 24 ins; 1044 del; 86 mod 8359437: Make users and test suite not able to set LockingMode flag 8358542: Remove RTM test VMProps Co-authored-by: Fredrik Bredberg Reviewed-by: coleenp, lmesnik, fbredberg, alanb, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/25847 From jsikstro at openjdk.org Thu Jun 26 08:19:28 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Thu, 26 Jun 2025 08:19:28 GMT Subject: RFR: 8360515: PROPERFMTARGS should always use size_t template specialization for unit [v2] In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 02:35:49 GMT, David Holmes wrote: >> Joel Sikström has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains two additional commits since the last revision: >> >> - Merge branch 'master' into JDK-8360515_properfmtargs >> - 8360515: PROPERFMTARGS should always use size_t template specialization for unit > > src/hotspot/os/bsd/memMapPrinter_macosx.cpp line 239: > >> 237: st->print_cr(" rss: %llu (%llu%s)", vm_info.resident_size, byte_size_in_proper_unit(vm_info.resident_size), proper_unit_for_byte_size(vm_info.resident_size)); >> 238: st->print_cr(" peak rss: %llu (%llu%s)", vm_info.resident_size_peak, byte_size_in_proper_unit(vm_info.resident_size_peak), proper_unit_for_byte_size(vm_info.resident_size_peak)); >> 239: st->print_cr(" page size: %d (" PROPERFMT ")", vm_info.page_size, PROPERFMTARGS((size_t)vm_info.page_size)); > > Just to clarify something for the reader here, as it tripped me up, the `vm_info` fields are declared as `mach_vm_size_t`, which one might expect is some kind of `size_t` but alas no [1]: > > typedef uint64_t mach_vm_size_t; > > But given you cast `page_size` to size_t (it is `int32_t`) why not cast the others too and use `PROPERFMT`? > > [1] https://developer.apple.com/documentation/kernel/mach_vm_size_t I originally wanted this patch to be minimally invasive and just fix the mismatch between PROPERFMT and PROPERFMTARGS. The cast to size_t wasn't added by me, I just changed the format specifier to match the type in PROPERFMTARGS. I agree that it looks a bit weird to have both the expanded macro and PROPERFMTARGS next to each other, and I'm not 100% sure why vm_info.page_size is casted to size_t. I'm fine with making the usage consistent in this patch, making all the prints use PROPERFMT+PROPERFMTARGS with casts to size_t, or the other way around using the expanded macro. 
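
The mismatch discussed above can be sketched in a few lines of C++. This is a minimal illustration, not the HotSpot code: every name ending in `_sketch`, the `format_bytes` helper, and the unit thresholds are assumptions made for the example. It shows the format string `"%zu%s"` paired with arguments that are forced through the `size_t` instantiation of the templated helper, so a 64-bit source value (such as a `uint64_t` field) is cast once before formatting.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Simplified stand-ins for illustration only; these are NOT the HotSpot
// definitions, and the unit thresholds below are an assumption.
static const std::size_t K = 1024;
static const std::size_t M = K * K;
static const std::size_t G = M * K;

// Templated scaling helper: returns the value scaled to its "proper" unit.
// The return type follows the argument type, which is why a fixed "%zu"
// format is only safe when T is size_t.
template <typename T>
inline T byte_size_in_proper_unit_sketch(T s) {
  if (s >= static_cast<T>(G)) return static_cast<T>(s / static_cast<T>(G));
  if (s >= static_cast<T>(M)) return static_cast<T>(s / static_cast<T>(M));
  if (s >= static_cast<T>(K)) return static_cast<T>(s / static_cast<T>(K));
  return s;
}

// Unit suffix matching the scaling above.
inline const char* proper_unit_for_byte_size_sketch(std::size_t s) {
  if (s >= G) return "G";
  if (s >= M) return "M";
  if (s >= K) return "K";
  return "B";
}

// Format string and argument list kept in agreement: "%zu" always receives
// the size_t instantiation of the helper.
#define PROPERFMT_SKETCH "%zu%s"
#define PROPERFMTARGS_SKETCH(s) \
  byte_size_in_proper_unit_sketch<std::size_t>(s), proper_unit_for_byte_size_sketch(s)

// Hypothetical caller holding a 64-bit value: cast to size_t once, then
// use the matching format/argument pair.
inline std::string format_bytes(std::uint64_t bytes) {
  const std::size_t s = static_cast<std::size_t>(bytes);
  char buf[32];
  const int n = std::snprintf(buf, sizeof(buf), PROPERFMT_SKETCH, PROPERFMTARGS_SKETCH(s));
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
  (void)n;  // silence unused-variable warnings when assertions are disabled
  return std::string(buf);
}
```

Under these assumed thresholds, `format_bytes(4096)` yields `4K`. Passing the 64-bit value straight to the unspecialized template would instead produce a 64-bit argument for `%zu`, which is exactly the format/argument disagreement the thread is trying to rule out.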
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25975#discussion_r2168479074 From aph at openjdk.org Thu Jun 26 08:20:27 2025 From: aph at openjdk.org (Andrew Haley) Date: Thu, 26 Jun 2025 08:20:27 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method In-Reply-To: References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Message-ID: <_PIZLj4M2dZ3LH1VANe43lZPgExy7gabgW8llHpEcQA=.04b3eda8-05c4-4daa-b8c3-1175fa4cfbad@github.com> On Wed, 25 Jun 2025 12:48:31 GMT, Mikhail Ablakatov wrote: > We could use something like `Pair` for keys instead and use the `first` to distinguish between runtime and static calls. I'd need to extend `template<...> class Pair` so it properly supports hashing and comparison (at least equality) for this to work. Sounds good. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25954#issuecomment-3007613728 From kevinw at openjdk.org Thu Jun 26 08:29:29 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Thu, 26 Jun 2025 08:29:29 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> <6sxeRs0jaGtjCoxcJLBksWiyacOvRkSn40GXcLNKEos=.e4736687-f826-4a11-977e-9b0cf765e046@github.com> Message-ID: On Wed, 25 Jun 2025 23:50:59 GMT, David Holmes wrote: >> src/hotspot/share/services/threadService.cpp line 1477: >> >>> 1475: java_thread = java_lang_Thread::thread(thread_h()); >>> 1476: if (java_thread == nullptr) { >>> 1477: return nullptr; // thread terminated >> >> This is not the right way to determine if you have a valid JavaThread >> when you have created a ThreadsListHandle. 
This code near the top >> of `ThreadSnapshotFactory::get_thread_snapshot` is not right: >> >> >> ThreadsListHandle tlh(THREAD); >> ResourceMark rm(THREAD); >> HandleMark hm(THREAD); >> Handle thread_h(THREAD, JNIHandles::resolve(jthread)); >> >> >> The above code was added by: >> [JDK-8357650](https://bugs.openjdk.org/browse/JDK-8357650) ThreadSnapshot to take snapshot of thread for thread dumps >> >> Here's the example code from src/hotspot/share/runtime/threadSMR.hpp: >> >> // JNI jobject example: >> // jobject jthread = ...; >> // : >> // ThreadsListHandle tlh; >> // JavaThread* jt = nullptr; >> // bool is_alive = tlh.cv_internal_thread_to_JavaThread(jthread, &jt, nullptr); >> // if (is_alive) { >> // : // do stuff with 'jt'... >> // } >> >> >> So instead of this line: >> >> Handle thread_h(THREAD, JNIHandles::resolve(jthread)); >> >> which does not guarantee you a valid JavaThread handle, you should >> use `tlh.cv_internal_thread_to_JavaThread` to get a `JavaThread*`. > > Great catch Dan! I totally missed the TLH at the start of `get_thread_snapshot`. I knew something was off here but couldn't quite put my finger on it. Yes thanks Dan! Will update. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2168498780 From jbhateja at openjdk.org Thu Jun 26 08:47:12 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Thu, 26 Jun 2025 08:47:12 GMT Subject: RFR: 8360116: Add support for AVX10 floating point minmax instruction [v3] In-Reply-To: References: Message-ID: > Intel@ AVX10 ISA [1] extensions added new floating point MIN/MAX instructions which comply with definitions in IEEE-754-2019 standard section 9.6 and can directly emulate Math.min/max semantics without the need for any special handling for NaN, +0.0 or -0.0 detection. 
> > **The following pseudo-code describes the existing algorithm for min/max[FD]:** > > Move the non-negative value to the second operand; this ensures that we correctly handle 0.0 and -0.0 values: if the values being compared are both 0.0s (of either sign), the value in the second operand (source operand) is returned. Existing MINPS and MAXPS semantics only check for NaN as the second operand; hence, we need special handling to check for NaN at the first operand. > > btmp = (b < +0.0) ? a : b > atmp = (b < +0.0) ? b : a > Tmp = Max_Float(atmp , btmp) > Res = (atmp == NaN) ? atmp : Tmp > > For min[FD] we need a small tweak in the above algorithm, i.e., move the non-negative value to the first operand; this ensures that we correctly select -0.0 if both the operands being compared are 0.0 or -0.0. > > btmp = (b < +0.0) ? b : a > atmp = (b < +0.0) ? a : b > Tmp = Min_Float(atmp , btmp) > Res = (atmp == NaN) ? atmp : Tmp > > Thus, we need additional special handling for NaNs and +/-0.0 to compute floating-point min/max values to comply with the semantics of Math.max/min APIs using existing MINPS / MAXPS instructions. AVX10.2 added a new instruction, VMINMAX[SH,SS,SD]/[PH,PS,PD], which comprehensively handles special cases, thereby eliminating the need for special handling. > > The patch emits new instructions for reduction and non-reduction operations for single, double, and Float16 types. > > Kindly review and share your feedback. > > Best Regards, > Jatin > > [1] https://www.intel.com/content/www/us/en/content-details/856721/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html?wapkw=AVX10 Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains five additional commits since the last revision: - Review comments resolutions - Merge branch 'master' of https://github.com/openjdk/jdk into JDK-8360116 - Update comments - Extending the patch to cover reduction operations - 8360116: Add support for AVX10 floating point minmax instruction ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25914/files - new: https://git.openjdk.org/jdk/pull/25914/files/b6e55157..382c9b9e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25914&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25914&range=01-02 Stats: 6650 lines in 365 files changed: 3468 ins; 1485 del; 1697 mod Patch: https://git.openjdk.org/jdk/pull/25914.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25914/head:pull/25914 PR: https://git.openjdk.org/jdk/pull/25914 From jbhateja at openjdk.org Thu Jun 26 08:47:13 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Thu, 26 Jun 2025 08:47:13 GMT Subject: RFR: 8360116: Add support for AVX10 floating point minmax instruction [v2] In-Reply-To: References: Message-ID: <-BcDsdCnWIW95ESaZ5UIRIFDVOqEy7vDTW4e5xWfTe8=.42c9fe64-2cb3-46d7-99a2-25ae08239f17@github.com> On Wed, 25 Jun 2025 15:31:46 GMT, Manuel Hässig wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Update comments > > src/hotspot/cpu/x86/assembler_x86.hpp line 2752: > >> 2750: void eminmaxss(XMMRegister dst, XMMRegister nds, XMMRegister src, int imm8); >> 2751: void eminmaxsd(XMMRegister dst, XMMRegister nds, XMMRegister src, int imm8); >> 2752: void evminmaxph(XMMRegister dst, KRegister mask, XMMRegister nds, XMMRegister src, bool merge, int imm8, int vector_len); > > Is there a reason `evminmaxph` does not have a version where `src` has type `Address`? Currently, we do not have a matcher pattern to consume it, as the MIN/MAX sequence was a bulky one anyway.
I have added a new pattern for the memory-operand flavor, specifically for AVX10, along with this patch. The patch has been regression-tested with the following tests using Intel SDE https://www.intel.com/content/www/us/en/download/684897/intel-software-development-emulator.html (Version 9.53). - test/jdk/jdk/incubator/vector/Double*VectorTests:: (min/max all variants including reduction) - test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorOperations.java - test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java e.g. command line /home/jatinbha/softwares/sde-external-9.53.0-2025-03-16-lin/sde64 -future -ptr_raise -icount -- java > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1241: > >> 1239: } >> 1240: >> 1241: void C2_MacroAssembler::vminmax_fp(int opc, BasicType elem_bt, XMMRegister dst, KRegister mask, > > Line 1122 mentions the differences between `vminps/vmaxps` and Java semantics. Perhaps a mention of the new instructions introduced in this PR might help people who are confused about the fact that `vminmax_fp` is overloaded. Details on instruction semantics can be found in section 11.2 of the AVX10 manual https://www.intel.com/content/www/us/en/content-details/856721/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html?wapkw=AVX10 > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1246: > >> 1244: opc == Op_MaxV || opc == Op_MaxReductionV, "sanity"); >> 1245: if (elem_bt == T_FLOAT) { >> 1246: evminmaxps(dst, mask, src1, src2, true, opc == Op_MinV || opc == Op_MinReductionV ? 0x4 : 0x5, vlen_enc); > > Perhaps `0x4` and `0x5` should be factored into named constants since they are used in multiple places and it would also help readability if one does not have the documentation handy when reading the code. Hi @mhaessig, Command bits are in accordance with Tables 11.1 and 11.2 of section 11.2. The first two bits [1:0] signify the operation kind, 00 for min and 01 for max.
The next two bits [3:2] signify the sign selection logic, and bit 4 is 0 for both min and max; with this command word we can emulate the semantics of Math.max/min using a single AVX10 instruction. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25914#discussion_r2168533731 PR Review Comment: https://git.openjdk.org/jdk/pull/25914#discussion_r2168533872 PR Review Comment: https://git.openjdk.org/jdk/pull/25914#discussion_r2168533554 From mhaessig at openjdk.org Thu Jun 26 09:09:36 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Thu, 26 Jun 2025 09:09:36 GMT Subject: RFR: 8360116: Add support for AVX10 floating point minmax instruction [v2] In-Reply-To: <-BcDsdCnWIW95ESaZ5UIRIFDVOqEy7vDTW4e5xWfTe8=.42c9fe64-2cb3-46d7-99a2-25ae08239f17@github.com> References: <-BcDsdCnWIW95ESaZ5UIRIFDVOqEy7vDTW4e5xWfTe8=.42c9fe64-2cb3-46d7-99a2-25ae08239f17@github.com> Message-ID: <8PZvUMxqZPm_wCGlcNEJbVTzueBZynS0mHLbpROOMDg=.855b2b72-2106-46b5-a4b7-79b0d77c1d6c@github.com> On Thu, 26 Jun 2025 08:43:27 GMT, Jatin Bhateja wrote: >> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1246: >> >>> 1244: opc == Op_MaxV || opc == Op_MaxReductionV, "sanity"); >>> 1245: if (elem_bt == T_FLOAT) { >>> 1246: evminmaxps(dst, mask, src1, src2, true, opc == Op_MinV || opc == Op_MinReductionV ? 0x4 : 0x5, vlen_enc); >> >> Perhaps `0x4` and `0x5` should be factored into named constants since they are used in multiple places and it would also help readability if one does not have the documentation handy when reading the code. > > Hi @mhaessig, > Command bits are in accordance with Tables 11.1 and 11.2 of section 11.2. The first two bits [1:0] signify the operation kind, 00 for min and 01 for max. The next two bits [3:2] signify the sign selection logic, and bit 4 is 0 for both min and max; with this command word we can emulate the semantics of Math.max/min using a single AVX10 instruction. I got that from the documentation you kindly linked in the description.
My question was rather to define a constant like `AVX10_MINMAX_MAX_COMPARE_SIGN = 0x5` that can be used instead of the plain magic numbers. Because people looking at the code later will not have the "luxury" of being provided a link to the relevant documentation right when puzzling about what `0x4` means in this case. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25914#discussion_r2168582443 From rrich at openjdk.org Thu Jun 26 09:17:35 2025 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 26 Jun 2025 09:17:35 GMT Subject: [jdk25] RFR: 8360405: [PPC64] some environments don't support mfdscr instruction In-Reply-To: <0z1WyiZXaJ6dSi3SeFSrXLRexSLEPa2_kNX8udMFkn8=.2fcbfdea-5449-4958-b438-700270f586cf@github.com> References: <0z1WyiZXaJ6dSi3SeFSrXLRexSLEPa2_kNX8udMFkn8=.2fcbfdea-5449-4958-b438-700270f586cf@github.com> Message-ID: On Wed, 25 Jun 2025 09:08:38 GMT, Martin Doerr wrote: > Clean backport of [JDK-8360405](https://bugs.openjdk.org/browse/JDK-8360405). Marked as reviewed by rrich (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25972#pullrequestreview-2961357178 From mdoerr at openjdk.org Thu Jun 26 09:17:36 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 26 Jun 2025 09:17:36 GMT Subject: [jdk25] RFR: 8360405: [PPC64] some environments don't support mfdscr instruction In-Reply-To: <0z1WyiZXaJ6dSi3SeFSrXLRexSLEPa2_kNX8udMFkn8=.2fcbfdea-5449-4958-b438-700270f586cf@github.com> References: <0z1WyiZXaJ6dSi3SeFSrXLRexSLEPa2_kNX8udMFkn8=.2fcbfdea-5449-4958-b438-700270f586cf@github.com> Message-ID: On Wed, 25 Jun 2025 09:08:38 GMT, Martin Doerr wrote: > Clean backport of [JDK-8360405](https://bugs.openjdk.org/browse/JDK-8360405). Thanks for the reviews and for verifying! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25972#issuecomment-3007779195 From mdoerr at openjdk.org Thu Jun 26 09:17:36 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 26 Jun 2025 09:17:36 GMT Subject: [jdk25] Integrated: 8360405: [PPC64] some environments don't support mfdscr instruction In-Reply-To: <0z1WyiZXaJ6dSi3SeFSrXLRexSLEPa2_kNX8udMFkn8=.2fcbfdea-5449-4958-b438-700270f586cf@github.com> References: <0z1WyiZXaJ6dSi3SeFSrXLRexSLEPa2_kNX8udMFkn8=.2fcbfdea-5449-4958-b438-700270f586cf@github.com> Message-ID: On Wed, 25 Jun 2025 09:08:38 GMT, Martin Doerr wrote: > Clean backport of [JDK-8360405](https://bugs.openjdk.org/browse/JDK-8360405). This pull request has now been integrated. Changeset: 274a2dd7 Author: Martin Doerr URL: https://git.openjdk.org/jdk/commit/274a2dd729875f98401ef845fbc69ad1049a0c1f Stats: 73 lines in 4 files changed: 33 ins; 0 del; 40 mod 8360405: [PPC64] some environments don't support mfdscr instruction Reviewed-by: haosun, rrich Backport-of: f71d64fbeb0c196fd825241ff86d3a103d05a842 ------------- PR: https://git.openjdk.org/jdk/pull/25972 From ayang at openjdk.org Thu Jun 26 10:01:20 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 26 Jun 2025 10:01:20 GMT Subject: RFR: 8338474: Parallel: Deprecate and obsolete PSChunkLargeArrays Message-ID: <1ZjYlg9V9HUV0H2Tk2222vuMl1rOAGJdSqFivaez1LU=.9f7213d9-34db-48c9-9faa-8e042eaadaf6@github.com> Deprecating `PSChunkLargeArrays`, which is used only by Parallel and is enabled by default. Disabling it offers little benefit, so removing it reduces the number of command-line flags.
------------- Commit messages: - pgc-deprecate Changes: https://git.openjdk.org/jdk/pull/25997/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25997&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8338474 Stats: 2 lines in 2 files changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25997.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25997/head:pull/25997 PR: https://git.openjdk.org/jdk/pull/25997 From ayang at openjdk.org Thu Jun 26 10:17:13 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 26 Jun 2025 10:17:13 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v16] In-Reply-To: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: <0FO5rCVeYxxJCB8J7GD-4WOtG7E6MA8gcja1YxGvWus=.22bd07e1-c3c3-40d1-b251-cc855ea48704@github.com> > This patch refines Parallel's sizing strategy to improve overall memory management and performance. > > The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. > > `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. 
> > GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. > > ## Performance evaluation > > - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). > - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). > - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. > > PS: I have opportunistically set the obsolete/expired version to ~~25/26~~ 26/27 for now. I will update them accordingly before merging. 
> > Test: tier1-8 Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: remove-young-resize-after-full-gc ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25000/files - new: https://git.openjdk.org/jdk/pull/25000/files/7f733137..271a8916 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25000&range=15 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25000&range=14-15 Stats: 59 lines in 3 files changed: 9 ins; 40 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/25000.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25000/head:pull/25000 PR: https://git.openjdk.org/jdk/pull/25000 From ayang at openjdk.org Thu Jun 26 10:17:13 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 26 Jun 2025 10:17:13 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v3] In-Reply-To: <0l1GXiRVXTfUaPsDPyirWY0RnyyjxO95GfqnED2O1nw=.6f9d7504-3708-48f0-9e28-689772339276@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> <0l1GXiRVXTfUaPsDPyirWY0RnyyjxO95GfqnED2O1nw=.6f9d7504-3708-48f0-9e28-689772339276@github.com> Message-ID: <9F00iiZ3YclXm-0yJsaCC5z_MqZjUqvlKg8a8UvpRnE=.d010c44c-10d8-4ab8-8af2-77db2952d128@github.com> On Mon, 19 May 2025 11:05:40 GMT, Guoxiong Li wrote: >> Added checking for `from_space`. >> >> If all live-objs don't fit in old-gen, leftovers will be kept in its own space. >> >> >> // Summarize the remaining spaces in the young gen. The initial target space >> // is the old gen. If a space does not fit entirely into the target, then the >> // remainder is compacted into the space itself and that space becomes the new >> // target. > >> If all live-objs don't fit in old-gen, leftovers will be kept in its own space. > > Thanks for clarifying. I removed this method, and added comments why it's undesirable to do young-gen resizing after a full-gc. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2168706034 From coleenp at openjdk.org Thu Jun 26 11:46:33 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 26 Jun 2025 11:46:33 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v13] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Wed, 25 Jun 2025 22:41:53 GMT, Ioi Lam wrote: >> This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: >> >> Excerpt from the test case ExceptionsTest.java. >> >> >> [0.042s][info][exceptions] Exception >> [ ] thrown in interpreter method <{method} {0x000079e52c4005b0} 'bar' '()V' in 'ExceptionsTest$InternalClass'> >> [ ] at bci 9 for thread 0x000079e58c02c7b0 (main) >> [0.042s][info][exceptions,stacktrace] Exception >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar(ExceptionsTest.java:110) >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo(ExceptionsTest.java:105) >> [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:100) >> [0.042s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar" at BCI: 10 >> Exception 1 caught. >> >> >> - Note that the old `[exceptions]` log is triggered by `InterpreterRuntime::exception_handler_for_exception()` and prints one level of the stack while the interpreter is looking for a handler. However, once a handler is found (inside `bar()`), the `[exceptions]` log terminates, and we do not know about the `foo()` or `main()` methods. >> >> - The `[exceptions,stacktrace]` log is printed only when an exception is thrown: >> - By an `_athrow` bytecode. 
See comments around `Exceptions::log_exception_stacktrace(Handle exception, methodHandle method, int bci)` >> - By native code in Exceptions::special_exception() and Exceptions::_throw(). >> >> **Concurrent Exceptions** >> >> Neither the old `[exceptions]` nor the new `[exceptions,stacktrace]` log distinguishes between multiple exceptions that are thrown and handled concurrently. Users should use something like `-Xlog:exceptions,exceptions+stack::tags,tid` to include the thread ID in the log output, and use an external tool (such as a script) to demultiplex the logs. > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision: > > - Merge branch 'master' into 8358080-print-thread-stack-with-xlog-exceptions-trace > - Fix crash in runtime/logging/RedefineClasses.java: cannot print stack trace in Exceptions::special_exception() > - @coleenp comments: Removed dead code > - @coleenp and @dholmes-ora comments > - @coleenp comments; Also, use ProcessTools.executeProcess() to log the output in files > - refactor > - Reimplement -- print stack trace only when it is a known throwing site > - @dholmes-ora comments -- removed printing of output.getStdout() from test > - Print callstack for rethrown exceptions > - @dholmes-ora comments - use JavaThread::current() instead > - ... and 5 more: https://git.openjdk.org/jdk/compare/db1e3918...3eb7c622 Looks good and it'll be really helpful. ------------- Marked as reviewed by coleenp (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25522#pullrequestreview-2961791003 From coleenp at openjdk.org Thu Jun 26 12:10:34 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 26 Jun 2025 12:10:34 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 17:48:02 GMT, Daniel D. Daugherty wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix the test > > src/hotspot/share/runtime/mutexLocker.cpp line 236: > >> 234: MUTEX_DEFN(Notification_lock , PaddedMonitor, service); // used for notification thread operations >> 235: >> 236: MUTEX_DEFN(JmethodIdCreation_lock , PaddedMutex , nosafepoint-1); // used for creating jmethodIDs locks HandshakeState_lock > > Perhaps: > > MUTEX_DEFN(JmethodIdCreation_lock , PaddedMutex , nosafepoint-1); // used for creating jmethodIDs, can lock HandshakeState_lock which is at nosafepoint Ugh, that makes the line too long for me. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2168911755 From coleenp at openjdk.org Thu Jun 26 12:20:57 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 26 Jun 2025 12:20:57 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v11] In-Reply-To: References: Message-ID: > This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs and there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti in a product build with and without this change showed no difference in time.
> > The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. > > Tested with tier1-4, 5-7. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: More comment grammar fixes, thank you for reading the comments again, Dan! 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25267/files - new: https://git.openjdk.org/jdk/pull/25267/files/66eb4269..cc0d6dda Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=09-10 Stats: 5 lines in 2 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/25267.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25267/head:pull/25267 PR: https://git.openjdk.org/jdk/pull/25267 From kevinw at openjdk.org Thu Jun 26 12:22:32 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Thu, 26 Jun 2025 12:22:32 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: <4hLF410lF9EFIriy-NuDlj97eLzkM2hclxro5Hf6xlo=.b96e68e9-83d6-4823-bac4-c3d944bc67dc@github.com> References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> <4hLF410lF9EFIriy-NuDlj97eLzkM2hclxro5Hf6xlo=.b96e68e9-83d6-4823-bac4-c3d944bc67dc@github.com> Message-ID: On Thu, 26 Jun 2025 06:04:20 GMT, Alan Bateman wrote: > This seems like a separate discussion as the minimal VM doesn't have the M&M support Sure, it's maybe off topic. I now see that few tests handle this. I hit it as a minimal build showed me I had the wrong THROW macro, so in getting that right, I want to build and test such a VM. This test fails to load ManagementFactory with such a build (expected but confusing). It should skip. I realise many tests will fail on such a build. We could use @requires vm.flavor != "minimal" or @requires vm.jvmti or just leave the failure in. I don't need to commit this test param change now. Relatedly: I saw the test time out in debug builds on Windows and macOS (as expected). On Linux, fastdebug builds run OK and could be useful.
If further testing doesn't find an issue, I can do: - * @requires !vm.debug & (vm.compMode != "Xcomp") + * @requires vm.compMode != "Xcomp" + * @requires !vm.debug || (os.family == "linux") ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2168933479 From sgehwolf at openjdk.org Thu Jun 26 12:35:28 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Thu, 26 Jun 2025 12:35:28 GMT Subject: RFR: 8360518: Docker tests do not work when asan is configured In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 14:09:42 GMT, Matthias Baesken wrote: > When the address sanitizer ASAN is configured, we run into errors in the docker tests. > Example hotspot/jtreg/containers/docker/DockerBasicTest.java : > > [STDOUT] > /jdk/bin/java: error while loading shared libraries: libasan.so.8: cannot open shared object file: No such file or directory > > Reason is that the asan-enabled binaries need additional dependencies and those are not available in the current docker/container setups. > Maybe we should skip those tests when asan is enabled. Seems fine to me. ------------- Marked as reviewed by sgehwolf (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25980#pullrequestreview-2961937524 From syan at openjdk.org Thu Jun 26 12:35:29 2025 From: syan at openjdk.org (SendaoYan) Date: Thu, 26 Jun 2025 12:35:29 GMT Subject: RFR: 8360518: Docker tests do not work when asan is configured In-Reply-To: References: Message-ID: <_SrsAyYvN9cq232gYiGFmkzUb5QMBqlkom7VrQsPBgc=.2c72fdf6-3ed5-48c4-8743-6f0602a69ada@github.com> On Wed, 25 Jun 2025 14:09:42 GMT, Matthias Baesken wrote: > When the address sanitizer ASAN is configured, we run into errors in the docker tests. 
> Example hotspot/jtreg/containers/docker/DockerBasicTest.java : > > [STDOUT] > /jdk/bin/java: error while loading shared libraries: libasan.so.8: cannot open shared object file: No such file or directory > > Reason is that the asan-enabled binaries need additional dependencies and those are not available in the current docker/container setups. > Maybe we should skip those tests when asan is enabled. Should we copy the dependent files, such as libasan.so.8, into the docker images when building them? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25980#issuecomment-3006718477 From sgehwolf at openjdk.org Thu Jun 26 12:40:27 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Thu, 26 Jun 2025 12:40:27 GMT Subject: RFR: 8360518: Docker tests do not work when asan is configured In-Reply-To: <_SrsAyYvN9cq232gYiGFmkzUb5QMBqlkom7VrQsPBgc=.2c72fdf6-3ed5-48c4-8743-6f0602a69ada@github.com> References: <_SrsAyYvN9cq232gYiGFmkzUb5QMBqlkom7VrQsPBgc=.2c72fdf6-3ed5-48c4-8743-6f0602a69ada@github.com> Message-ID: On Thu, 26 Jun 2025 01:26:38 GMT, SendaoYan wrote: > Should we copy the dependent files, such as libasan.so.8, into the docker images when building them? See: https://bugs.openjdk.org/browse/JDK-8333144?focusedId=14792939&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14792939 ------------- PR Comment: https://git.openjdk.org/jdk/pull/25980#issuecomment-3008347106 From iwalulya at openjdk.org Thu Jun 26 13:28:13 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 26 Jun 2025 13:28:13 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v5] In-Reply-To: References: Message-ID: > Hi all, > > Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and Application execution time.
G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. > > The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio. > > - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly. > > - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range. > > - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants. > > We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. > > Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. > > As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio.
While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. > > Testing: Mach5 Tier 1-7 Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: Reviews ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25832/files - new: https://git.openjdk.org/jdk/pull/25832/files/df4f7ce5..8781a113 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25832&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25832&range=03-04 Stats: 316 lines in 9 files changed: 112 ins; 65 del; 139 mod Patch: https://git.openjdk.org/jdk/pull/25832.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25832/head:pull/25832 PR: https://git.openjdk.org/jdk/pull/25832 From asemenov at openjdk.org Thu Jun 26 13:28:44 2025 From: asemenov at openjdk.org (Artem Semenov) Date: Thu, 26 Jun 2025 13:28:44 GMT Subject: RFR: 8360664: Null pointer dereference in src/hotspot/share/prims/jvmtiTagMap.cpp in IterateOverHeapObjectClosure::do_object() Message-ID: The defect has been detected and confirmed in the function ```IterateOverHeapObjectClosure::do_object()``` located in the file ```src/hotspot/share/prims/jvmtiTagMap.cpp``` with static code analysis. This defect can potentially lead to a null pointer dereference. The pointer ```oop o``` is passed to the constructor of the CallbackWrapper class, where it is dereferenced without a null check. Found by Linux Verification Center (linuxtesting.org) with SVACE. signed-off-by: Artem Semenov (savoptik at altlinux.org). 
------------- Commit messages: - 8360664 Null pointer dereference in src/hotspot/share/prims/jvmtiTagMap.cpp in IterateOverHeapObjectClosure::do_object() Changes: https://git.openjdk.org/jdk/pull/26002/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26002&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360664 Stats: 4 lines in 1 file changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/26002.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26002/head:pull/26002 PR: https://git.openjdk.org/jdk/pull/26002 From asemenov at openjdk.org Thu Jun 26 14:00:43 2025 From: asemenov at openjdk.org (Artem Semenov) Date: Thu, 26 Jun 2025 14:00:43 GMT Subject: RFR: 8360664: Null pointer dereference in src/hotspot/share/prims/jvmtiTagMap.cpp in IterateOverHeapObjectClosure::do_object() [v2] In-Reply-To: References: Message-ID: > The defect has been detected and confirmed in the function ```IterateOverHeapObjectClosure::do_object()``` located in the file ```src/hotspot/share/prims/jvmtiTagMap.cpp``` with static code analysis. This defect can potentially lead to a null pointer dereference. > > The pointer ```oop o``` is passed to the constructor of the CallbackWrapper class, where it is dereferenced without a null check. Artem Semenov has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: 8360664 Null pointer dereference in src/hotspot/share/prims/jvmtiTagMap.cpp in IterateOverHeapObjectClosure::do_object() Found by Linux Verification Center (linuxtesting.org) with SVACE. 
signed-off-by: Artem Semenov ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26002/files - new: https://git.openjdk.org/jdk/pull/26002/files/ee6a0ff7..e69c49c8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26002&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26002&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26002.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26002/head:pull/26002 PR: https://git.openjdk.org/jdk/pull/26002 From duke at openjdk.org Thu Jun 26 14:28:14 2025 From: duke at openjdk.org (Samuel Chee) Date: Thu, 26 Jun 2025 14:28:14 GMT Subject: RFR: 8356556: AArch64: No need for acquire fence in safepoint poll in FFM In-Reply-To: References: Message-ID: On Tue, 13 May 2025 12:53:56 GMT, Samuel Chee wrote: > Removes the unnecessary acquire to reduce the memory ordering requirement, since it is no longer needed as threads always disarm their own poll value. Waiting for OCA approval!!! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25211#issuecomment-2963222244 From duke at openjdk.org Thu Jun 26 14:28:14 2025 From: duke at openjdk.org (Samuel Chee) Date: Thu, 26 Jun 2025 14:28:14 GMT Subject: RFR: 8356556: AArch64: No need for acquire fence in safepoint poll in FFM Message-ID: Removes the unnecessary acquire to reduce the memory ordering requirement, since it is no longer needed as threads always disarm their own poll value.
------------- Commit messages: - 8356556: AArch64: No need for acquire fence in safepoint poll in FFM Changes: https://git.openjdk.org/jdk/pull/25211/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25211&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8356556 Stats: 19 lines in 8 files changed: 0 ins; 7 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/25211.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25211/head:pull/25211 PR: https://git.openjdk.org/jdk/pull/25211 From smonteith at openjdk.org Thu Jun 26 14:28:14 2025 From: smonteith at openjdk.org (Stuart Monteith) Date: Thu, 26 Jun 2025 14:28:14 GMT Subject: RFR: 8356556: AArch64: No need for acquire fence in safepoint poll in FFM In-Reply-To: References: Message-ID: <-HVkhEh8wIOdH6VAzGmE8_TcZFSS6dk040W4HFnyGUE=.dd813f8a-2ddd-4c4d-9e8c-6092a6950aea@github.com> On Wed, 11 Jun 2025 15:13:05 GMT, Samuel Chee wrote: >> Removes the unnecessary acquire to reduce the memory ordering requirement, since it is no longer needed as threads always disarm their own poll value. > > Waiting for OCA approval!!! Hello @spchee I've sent Dalibor a couple of emails. @robilad would you be able to add Samuel to the Arm OCA? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25211#issuecomment-2995900344 From kbarrett at openjdk.org Thu Jun 26 15:15:35 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 26 Jun 2025 15:15:35 GMT Subject: RFR: 8360458: Rename Deferred<> to DeferredStatic<> and improve usage description In-Reply-To: References: <3rcBLXCifQLIBxub4PEXZlyKvjZYw8sTKCpQhOoOJ3M=.b24f11cf-7273-4ee4-9f2e-2f6d2875c8a8@github.com> Message-ID: On Wed, 25 Jun 2025 08:08:51 GMT, Stefan Karlsson wrote: >> Please review this change that renames Deferred<> to DeferredStatic<>, to >> better reflect the intended usage. (This involves renaming the source file.) >> This change also revises the documentation comment for the class to better >> describe the intended usage.
>> >> In addition, there are a number of cleanups: >> >> (1) The include guard didn't get updated when the name was previously changed >> to Deferred. It's updated here to reflect the new name. >> >> (2) There were problems with the include block that are fixed here. >> >> (3) The changes from JDK-8359923 are backed out. They aren't useful with the >> intended usage model. >> >> (4) A gtest is added to test the class's functionality. >> >> Testing: mach5 tier1, including new gtest > > Marked as reviewed by stefank (Reviewer). Thanks for reviews @stefank , @jdksjolen , and @jsikstro ------------- PR Comment: https://git.openjdk.org/jdk/pull/25964#issuecomment-3008840069 From kbarrett at openjdk.org Thu Jun 26 15:15:38 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 26 Jun 2025 15:15:38 GMT Subject: Integrated: 8360458: Rename Deferred<> to DeferredStatic<> and improve usage description In-Reply-To: <3rcBLXCifQLIBxub4PEXZlyKvjZYw8sTKCpQhOoOJ3M=.b24f11cf-7273-4ee4-9f2e-2f6d2875c8a8@github.com> References: <3rcBLXCifQLIBxub4PEXZlyKvjZYw8sTKCpQhOoOJ3M=.b24f11cf-7273-4ee4-9f2e-2f6d2875c8a8@github.com> Message-ID: On Wed, 25 Jun 2025 00:36:44 GMT, Kim Barrett wrote: > Please review this change that renames Deferred<> to DeferredStatic<>, to > better reflect the intended usage. (This involves renaming the source file.) > This change also revises the documentation comment for the class to better > describe the intended usage. > > In addition, there are a number of cleanups: > > (1) The include guard didn't get updated when the name was previously changed > to Deferred. It's updated here to reflect the new name. > > (2) There were problems with the include block that are fixed here. > > (3) The changes from JDK-8359923 are backed out. They aren't useful with the > intended usage model. > > (4) A gtest is added to test the class's functionality. > > Testing: mach5 tier1, including new gtest This pull request has now been integrated. 
Changeset: 7f702cf4 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/7f702cf483018155a22a32736da8d80a11c9eca9 Stats: 306 lines in 7 files changed: 210 ins; 91 del; 5 mod 8360458: Rename Deferred<> to DeferredStatic<> and improve usage description Reviewed-by: jsikstro, jsjolen, stefank ------------- PR: https://git.openjdk.org/jdk/pull/25964 From iwalulya at openjdk.org Thu Jun 26 15:40:30 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 26 Jun 2025 15:40:30 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v16] In-Reply-To: <0FO5rCVeYxxJCB8J7GD-4WOtG7E6MA8gcja1YxGvWus=.22bd07e1-c3c3-40d1-b251-cc855ea48704@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> <0FO5rCVeYxxJCB8J7GD-4WOtG7E6MA8gcja1YxGvWus=.22bd07e1-c3c3-40d1-b251-cc855ea48704@github.com> Message-ID: On Thu, 26 Jun 2025 10:17:13 GMT, Albert Mingkun Yang wrote: >> This patch refines Parallel's sizing strategy to improve overall memory management and performance. >> >> The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. >> >> `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. 
>> >> GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. >> >> ## Performance evaluation >> >> - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). >> - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). >> - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. >> >> PS: I have opportunistically set the obsolete/expired version to ~~25/26~~ 26/27 for now. I will update them accordingly before merging. >> >> Test: tier1-8 > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > remove-young-resize-after-full-gc Changes requested by iwalulya (Reviewer). src/hotspot/share/gc/parallel/psYoungGen.cpp line 305: > 303: eden_space()->top(), > 304: sizeof(HeapWord)); > 305: if (word_size > available_word_size) { Would it be useful to `log_trace` this situation? src/hotspot/share/gc/parallel/psYoungGen.cpp line 321: > 319: } > 320: > 321: void PSYoungGen::compute_desired_sizes(bool is_survivor_overflowing, Probably subjective, but as suggested on a recent review, would it be easier to read if you returned the results in a Pair?
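For illustration only, here is a minimal sketch of the pair-returning shape suggested above. It is not the code in the pull request: the function name, the extra parameters, and the placeholder sizing rule are all hypothetical, and `std::pair` merely stands in for whatever pair type the HotSpot sources would actually use.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical stand-in for PSYoungGen::compute_desired_sizes; the real
// signature and sizing logic are not reproduced here.
// Returns {desired_eden_size, desired_survivor_size} as a value instead of
// writing the two results through out-parameters.
std::pair<std::size_t, std::size_t>
compute_desired_sizes_sketch(bool is_survivor_overflowing,
                             std::size_t current_eden_size,
                             std::size_t current_survivor_size) {
  // Placeholder policy, chosen only so the sketch is runnable:
  // double the survivor size when survivors overflowed, otherwise keep it.
  const std::size_t desired_survivor =
      is_survivor_overflowing ? current_survivor_size * 2
                              : current_survivor_size;
  return {current_eden_size, desired_survivor};
}
```

A caller would then read both results from one expression, for example `const auto sizes = compute_desired_sizes_sketch(overflowing, eden, survivor);` followed by `sizes.first` and `sizes.second`, which keeps inputs and outputs visibly separate at the call site.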
------------- PR Review: https://git.openjdk.org/jdk/pull/25000#pullrequestreview-2950871762 PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2169356093 PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2169350754 From iwalulya at openjdk.org Thu Jun 26 15:40:33 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 26 Jun 2025 15:40:33 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v14] In-Reply-To: <9oCyQapT5zkgtiWmLQoPBY10EUD6Q4LIEO4Sr6nyxXI=.963bc30b-a996-4c5a-9594-16c36c6c70db@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> <9oCyQapT5zkgtiWmLQoPBY10EUD6Q4LIEO4Sr6nyxXI=.963bc30b-a996-4c5a-9594-16c36c6c70db@github.com> Message-ID: On Mon, 23 Jun 2025 08:34:22 GMT, Albert Mingkun Yang wrote: >> This patch refines Parallel's sizing strategy to improve overall memory management and performance. >> >> The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. >> >> `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. >> >> GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. 
>> >> ## Performance evaluation >> >> - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). >> - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). >> - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. >> >> PS: I have opportunistically set the obsolete/expired version to ~~25/26~~ 26/27 for now. I will update them accordingly before merging. >> >> Test: tier1-8 > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 20 commits: > > - Merge branch 'master' into pgc-size-policy > - review > - Merge branch 'master' into pgc-size-policy > - merge > - version > - Merge branch 'master' into pgc-size-policy > - revert-aliases > - Merge branch 'master' into pgc-size-policy > - merge > - merge-fix > - ... and 10 more: https://git.openjdk.org/jdk/compare/516197f5...41027bdf src/hotspot/share/gc/parallel/psScavenge.cpp line 539: > 537: if (!young_gen->to_space()->is_empty()) { > 538: // To-space is not empty; should run full-gc instead. 
> 539: log_debug(gc, ergo)("non-empty to-space; full-gc instead"); "To-space is not empty; should run full-gc instead" seems like a better log string if we need logging here src/hotspot/share/gc/parallel/psVirtualspace.cpp line 66: > 64: _committed_high_addr += bytes; > 65: } else { > 66: log_warning(gc)("expand_by commit %zu bytes failed", bytes); Probably leftover from debugging; if not, then maybe improve the log string ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2162115342 PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2162111028 From cjplummer at openjdk.org Thu Jun 26 15:43:27 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 26 Jun 2025 15:43:27 GMT Subject: RFR: 8360664: Null pointer dereference in src/hotspot/share/prims/jvmtiTagMap.cpp in IterateOverHeapObjectClosure::do_object() [v2] In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 14:00:43 GMT, Artem Semenov wrote: >> The defect has been detected and confirmed in the function ```IterateOverHeapObjectClosure::do_object()``` located in the file ```src/hotspot/share/prims/jvmtiTagMap.cpp``` with static code analysis. This defect can potentially lead to a null pointer dereference. >> >> The pointer ```oop o``` is passed to the constructor of the CallbackWrapper class, where it is dereferenced without a null check. > > Artem Semenov has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > 8360664 Null pointer dereference in src/hotspot/share/prims/jvmtiTagMap.cpp in IterateOverHeapObjectClosure::do_object() > > Found by Linux Verification Center (linuxtesting.org) with SVACE. > signed-off-by: Artem Semenov It's concerning that we don't have test cases that uncover these bugs.
Perhaps it's not actually possible for NULL to be passed when constructing CallbackWrapper. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26002#issuecomment-3008932024 From liach at openjdk.org Thu Jun 26 16:11:16 2025 From: liach at openjdk.org (Chen Liang) Date: Thu, 26 Jun 2025 16:11:16 GMT Subject: RFR: 8356548: Avoid using ASM to parse latest class files in tests [v5] In-Reply-To: References: Message-ID: > For early eval; test by changing the ClassReader max accepted version of test ASM to 24 instead of 25 Chen Liang has updated the pull request incrementally with one additional commit since the last revision: Update test/hotspot/jtreg/compiler/calls/common/InvokeDynamicPatcher.java Co-authored-by: Andrew Haley ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25124/files - new: https://git.openjdk.org/jdk/pull/25124/files/4fc1b513..020659f3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25124&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25124&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25124.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25124/head:pull/25124 PR: https://git.openjdk.org/jdk/pull/25124 From dcubed at openjdk.org Thu Jun 26 16:38:36 2025 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 26 Jun 2025 16:38:36 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v11] In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 12:20:57 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. 
>> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > More comment grammar fixes, thank you for reading the comments again, Dan! Thanks for making the wording tweaks. ------------- Marked as reviewed by dcubed (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25267#pullrequestreview-2962742782 From dcubed at openjdk.org Thu Jun 26 16:38:37 2025 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Thu, 26 Jun 2025 16:38:37 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 12:08:04 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/mutexLocker.cpp line 236: >> >>> 234: MUTEX_DEFN(Notification_lock , PaddedMonitor, service); // used for notification thread operations >>> 235: >>> 236: MUTEX_DEFN(JmethodIdCreation_lock , PaddedMutex , nosafepoint-1); // used for creating jmethodIDs locks HandshakeState_lock >> >> Perhaps: >> >> MUTEX_DEFN(JmethodIdCreation_lock , PaddedMutex , nosafepoint-1); // used for creating jmethodIDs, can lock HandshakeState_lock which is at nosafepoint > > ugh that makes the line too long for me. How about just: s/locks/can also lock/ ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2169462310 From alanb at openjdk.org Thu Jun 26 16:41:28 2025 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 26 Jun 2025 16:41:28 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> <4hLF410lF9EFIriy-NuDlj97eLzkM2hclxro5Hf6xlo=.b96e68e9-83d6-4823-bac4-c3d944bc67dc@github.com> Message-ID: On Thu, 26 Jun 2025 12:20:03 GMT, Kevin Walls wrote: > > This seems like a separate discussion as the minimal VM doesn't have the M&M support > > Sure, it's maybe off topic. I now see that few tests handle this. I hit it as a minimal build showed me I had the wrong THROW macro, so in getting that right, I want to build and test such a VM. This test fails to load ManagementFactory with such a build (expected but confusing). It should skip. I think you drop the change to the test as testing minimal VM is a bigger discussion. The make target for the compact profiles was removed in JDK 9 so would require running jlink to create a run-time image to test. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2169467764 From iklam at openjdk.org Thu Jun 26 17:26:39 2025 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 26 Jun 2025 17:26:39 GMT Subject: RFR: 8344165: Trace exceptions with a complete call-stack [v13] In-Reply-To: References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Thu, 26 Jun 2025 11:43:33 GMT, Coleen Phillimore wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision: >> >> - Merge branch 'master' into 8358080-print-thread-stack-with-xlog-exceptions-trace >> - Fix crash in runtime/logging/RedefineClasses.java: cannot print stack trace in Exceptions::special_exception() >> - @coleenp comments: Removed dead code >> - @coleenp and @dholmes-ora comments >> - @coleenp comments; Also, use ProcessTools.executeProcess() to log the output in files >> - refactor >> - Reimplement -- print stack trace only when it is a known throwing site >> - @dholmes-ora comments -- removed printing of output.getStdout() from test >> - Print callstack for rethrown exceptions >> - @dholmes-ora comments - use JavaThread::current() instead >> - ... and 5 more: https://git.openjdk.org/jdk/compare/043d2636...3eb7c622 > > Looks good and it'll be really helpful. 
Thanks @coleenp and @dholmes-ora for the review Passed tiers1-4 and build-tiers-5 ------------- PR Comment: https://git.openjdk.org/jdk/pull/25522#issuecomment-3009211576 From iklam at openjdk.org Thu Jun 26 17:26:40 2025 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 26 Jun 2025 17:26:40 GMT Subject: Integrated: 8344165: Trace exceptions with a complete call-stack In-Reply-To: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> References: <8fNsLyVZgJS2eJ1G3aUpIZo0dU-vVOwcPHcjxVXT9U8=.1a1b5fbb-47eb-40d3-a5a5-0be29a8082cd@github.com> Message-ID: On Thu, 29 May 2025 16:39:44 GMT, Ioi Lam wrote: > This PR makes it easier to analyze exceptions without modifying the JVM or the app to print call stacks: > > Excerpt from the test case ExceptionsTest.java. > > > [0.042s][info][exceptions] Exception > [ ] thrown in interpreter method <{method} {0x000079e52c4005b0} 'bar' '()V' in 'ExceptionsTest$InternalClass'> > [ ] at bci 9 for thread 0x000079e58c02c7b0 (main) > [0.042s][info][exceptions,stacktrace] Exception > [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.bar(ExceptionsTest.java:110) > [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.foo(ExceptionsTest.java:105) > [0.042s][info][exceptions,stacktrace] at ExceptionsTest$InternalClass.main(ExceptionsTest.java:100) > [0.042s][info][exceptions ] Found matching handler for exception of type "java.lang.RuntimeException" in method "bar" at BCI: 10 > Exception 1 caught. > > > - Note that the old `[exceptions]` log is triggered by `InterpreterRuntime::exception_handler_for_exception()` and prints one level of the stack while the interpreter is looking for a handler. However, once a handler is found (inside `bar()`), the `[exceptions]` log terminates, and we do not know about the `foo()` or `main()` methods. > > - The `[exceptions,stacktrace]` log is printed only when an exception is thrown: > - By an `_athrow` bytecode. 
See comments around `Exceptions::log_exception_stacktrace(Handle exception, methodHandle method, int bci)` > - By native code in `Exceptions::special_exception()` and `Exceptions::_throw()`. > > **Concurrent Exceptions** > > Neither the old `[exceptions]` nor the new `[exceptions,stacktrace]` logs distinguish between multiple exceptions that are thrown and handled concurrently. Users should use something like `-Xlog:exceptions,exceptions+stack::tags,tid` to include the thread ID in the log output, and use an external tool (such as a script) to demultiplex the logs. This pull request has now been integrated. Changeset: 20e0055e Author: Ioi Lam URL: https://git.openjdk.org/jdk/commit/20e0055e202e523b40e8b066e2f71c21d8cc5ea9 Stats: 117 lines in 5 files changed: 99 ins; 0 del; 18 mod 8344165: Trace exceptions with a complete call-stack Reviewed-by: coleenp, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/25522 From coleenp at openjdk.org Thu Jun 26 18:01:58 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 26 Jun 2025 18:01:58 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v12] In-Reply-To: References: Message-ID: > This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. > > The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr.
Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. > > Tested with tier1-4, 5-7. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Improved comments, not too wide for my screen... ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25267/files - new: https://git.openjdk.org/jdk/pull/25267/files/cc0d6dda..8ef3f190 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25267&range=10-11 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25267.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25267/head:pull/25267 PR: https://git.openjdk.org/jdk/pull/25267 From coleenp at openjdk.org Thu Jun 26 18:01:59 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 26 Jun 2025 18:01:59 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v11] In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 12:20:57 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. 
>> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > More comment grammar fixes, thank you for reading the comments again, Dan! // This function must only be called when JVM/TI // CLASS_LOAD events have been enabled since agent startup. The enabled // event will cause the jmethodIDs to be allocated at class load time. // The jmethodIDs cannot be allocated in a signal handler because locks // cannot be grabbed in a signal handler safely. I do not see any code that creates jmethodIDs for all the methods in a ClassLoad JVMTI event. ASGCT does call trace->frames[count].method_id = method->find_jmethod_id_or_null(); so it won't create jmethodIDs from a signal handler. Keeping them live for the frames depends on when the native caller of ASGCT uses the trace that is returned. If the frames are still on the call stack, the methods cannot be unloaded because they're still on the call stack. If the frames returned by ASGCT are stored somewhere else, and accessed later, they could be null. I don't see any code that makes this work tbh.
Most methods do not have jmethodIDs. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-3009304966 From coleenp at openjdk.org Thu Jun 26 18:28:32 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 26 Jun 2025 18:28:32 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v10] In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 16:35:26 GMT, Daniel D. Daugherty wrote: >> ugh that makes the line too long for me. > > How about just: s/locks/can also lock/ Done - that's not too wide for my screen. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25267#discussion_r2169688031 From ayang at openjdk.org Thu Jun 26 19:17:11 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 26 Jun 2025 19:17:11 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v17] In-Reply-To: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: > This patch refines Parallel's sizing strategy to improve overall memory management and performance. > > The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. > > `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. 
> > GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. > > ## Performance evaluation > > - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). > - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). > - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. > > PS: I have opportunistically set the obsolete/expired version to ~~25/26~~ 26/27 for now. I will update them accordingly before merging. > > Test: tier1-8 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits: - Merge branch 'master' into pgc-size-policy - review - cast - remove-young-resize-after-full-gc - Merge branch 'master' into pgc-size-policy - Merge branch 'master' into pgc-size-policy - review - Merge branch 'master' into pgc-size-policy - merge - version - ... 
and 15 more: https://git.openjdk.org/jdk/compare/20e0055e...eeda1eb8 ------------- Changes: https://git.openjdk.org/jdk/pull/25000/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25000&range=16 Stats: 4362 lines in 31 files changed: 506 ins; 3470 del; 386 mod Patch: https://git.openjdk.org/jdk/pull/25000.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25000/head:pull/25000 PR: https://git.openjdk.org/jdk/pull/25000 From ayang at openjdk.org Thu Jun 26 19:25:32 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 26 Jun 2025 19:25:32 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v14] In-Reply-To: References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> <9oCyQapT5zkgtiWmLQoPBY10EUD6Q4LIEO4Sr6nyxXI=.963bc30b-a996-4c5a-9594-16c36c6c70db@github.com> Message-ID: <1VCKQ3DB7qOy2FYZRgJ1MsjgJn4V0HRjz4jb4qPfYus=.3d9d8374-c422-48b7-a058-cd27f20c9e76@github.com> On Mon, 23 Jun 2025 17:26:29 GMT, Ivan Walulya wrote: >> Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 20 commits: >> >> - Merge branch 'master' into pgc-size-policy >> - review >> - Merge branch 'master' into pgc-size-policy >> - merge >> - version >> - Merge branch 'master' into pgc-size-policy >> - revert-aliases >> - Merge branch 'master' into pgc-size-policy >> - merge >> - merge-fix >> - ... and 10 more: https://git.openjdk.org/jdk/compare/516197f5...41027bdf > > src/hotspot/share/gc/parallel/psVirtualspace.cpp line 66: > >> 64: _committed_high_addr += bytes; >> 65: } else { >> 66: log_warning(gc)("expand_by commit %zu bytes failed", bytes); > > probably leftover from debugging, if not, then maybe improve the log string Revised. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2169803226 From ayang at openjdk.org Thu Jun 26 19:25:34 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 26 Jun 2025 19:25:34 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v16] In-Reply-To: References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> <0FO5rCVeYxxJCB8J7GD-4WOtG7E6MA8gcja1YxGvWus=.22bd07e1-c3c3-40d1-b251-cc855ea48704@github.com> Message-ID: On Thu, 26 Jun 2025 15:35:25 GMT, Ivan Walulya wrote: >> Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: >> >> remove-young-resize-after-full-gc > > src/hotspot/share/gc/parallel/psYoungGen.cpp line 305: > >> 303: eden_space()->top(), >> 304: sizeof(HeapWord)); >> 305: if (word_size > available_word_size) { > > Would it be useful to `log_trace` this situation? You mean we are probably approaching OOM here? However, we can reach here from different calling contexts, so we don't know if we are near OOM or not. > src/hotspot/share/gc/parallel/psYoungGen.cpp line 321: > >> 319: } >> 320: >> 321: void PSYoungGen::compute_desired_sizes(bool is_survivor_overflowing, > > Probably subjective, but as suggested on a recent review, is it easier to read if you returned the results in a Pair? It probably doesn't make much difference. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2169802849 PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2169790986 From adinn at openjdk.org Thu Jun 26 21:17:07 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 26 Jun 2025 21:17:07 GMT Subject: RFR: 8360707: Globally enumerate all blobs, stubs and entries Message-ID: Use the blob, stub and entry declarations to generate a single global enumeration for all blobs, likewise for all stubs and all entries.
Modify stub generators in shared runtime, c1 runtime, c2 runtime and stub generator subsystems and their clients to use those enumerations consistently. ------------- Commit messages: - remove redundant doc comments - fix copy paste errors in riscv - fix copy paste errors in s390 - fix copy paste errors in zero - fix errors in x86 stub entry declarations - fix various copy paste errors - fix whitespace error - update other arches to use global ids - use global ids everywhere -- aarch64 only - support global blob/stub/entry ids Changes: https://git.openjdk.org/jdk/pull/26004/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26004&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360707 Stats: 4070 lines in 97 files changed: 1957 ins; 273 del; 1840 mod Patch: https://git.openjdk.org/jdk/pull/26004.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26004/head:pull/26004 PR: https://git.openjdk.org/jdk/pull/26004 From vpaprotski at openjdk.org Thu Jun 26 21:58:38 2025 From: vpaprotski at openjdk.org (Volodymyr Paprotski) Date: Thu, 26 Jun 2025 21:58:38 GMT Subject: RFR: 8359965: Enable paired pushp and popp instruction usage for APX enabled CPUs In-Reply-To: References: Message-ID: On Thu, 19 Jun 2025 06:39:52 GMT, David Holmes wrote: >> The goal of this PR is to enhance the existing x86 assembly stubs using PUSH and POP instructions with paired PUSHP/POPP instructions which are part of Intel APX technology. >> >> In Intel APX, the PUSHP and POPP instructions are modern, compact replacements for the legacy PUSH and POP, designed to work seamlessly with the expanded set of 32 general-purpose registers (R0-R31). Unlike their predecessors, they use the new APX (REX2-based) encoding, enabling more uniform and efficient instruction formats. These instructions improve code density, simplify register access, and are optimized for performance on APX-enabled CPUs.
>> >> Pairing PUSHP and POPP in Intel APX provides CPU-level benefits such as more efficient instruction decoding, better stack pointer tracking, and improved register dependency management. Their uniform encoding allows for streamlined execution, reduced pipeline stalls, and potential micro-op fusion, all of which enhance performance and power efficiency. This pairing helps the processor optimize speculative execution and register lifetimes, making code faster and more scalable on modern architectures. > > Just a drive-by comment as this isn't code I normally have much to do with but to me it would look a lot cleaner to define `push_paired`/`pop_paired` (maybe abbreviating directly to `pushp`/`popp`?) rather than passing the boolean. Like @dholmes-ora, I also prefer a new function (in MacroAssembler) instead of flags. Though I like the names `paired_push`/`paired_pop`.. The shorter `pushp`/`popp` might also be acceptable (better readability) though I think I like the longer name (I am more likely to look up the longer function definition to see what it does. The shorter, I might assume is just the regular push/pop.. 
but it could also fall under the category 'you are supposed to know that') PS: `sed -e "/is_pair/ s|pop(([^,]*), true /*is_pair*/)|paired_pop(\1)|" -e "/is_pair/ s|push(([^,]*), true /*is_pair*/)|paired_push(\1)|"` ------------- PR Comment: https://git.openjdk.org/jdk/pull/25889#issuecomment-3010260642 From dholmes at openjdk.org Thu Jun 26 23:50:38 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 26 Jun 2025 23:50:38 GMT Subject: RFR: 8360515: PROPERFMTARGS should always use size_t template specialization for unit [v2] In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 08:17:03 GMT, Joel Sikström wrote: >> src/hotspot/os/bsd/memMapPrinter_macosx.cpp line 239: >> >>> 237: st->print_cr(" rss: %llu (%llu%s)", vm_info.resident_size, byte_size_in_proper_unit(vm_info.resident_size), proper_unit_for_byte_size(vm_info.resident_size)); >>> 238: st->print_cr(" peak rss: %llu (%llu%s)", vm_info.resident_size_peak, byte_size_in_proper_unit(vm_info.resident_size_peak), proper_unit_for_byte_size(vm_info.resident_size_peak)); >>> 239: st->print_cr(" page size: %d (" PROPERFMT ")", vm_info.page_size, PROPERFMTARGS((size_t)vm_info.page_size)); >> >> Just to clarify something for the reader here, as it tripped me up, the `vm_info` fields are declared as `mach_vm_size_t`, which one might expect is some kind of `size_t` but alas no [1]: >> >> typedef uint64_t mach_vm_size_t; >> >> But given you cast `page_size` to size_t (it is `int32_t`) why not cast the others too and use `PROPERFMT`? >> >> [1] https://developer.apple.com/documentation/kernel/mach_vm_size_t > > I originally wanted this patch to be minimally invasive and just fix the mismatch between PROPERFMT and PROPERFMTARGS. The cast to size_t wasn't added by me, I just changed the format specifier to match the type in PROPERFMTARGS. I agree that it looks a bit weird to have both the expanded macro and PROPERFMTARGS next to each other, and I'm not 100% sure why vm_info.page_size is cast to size_t.
> > I'm fine with making the usage consistent in this patch, making all the prints use PROPERFMT+PROPERFMTARGS with casts to size_t, or the other way around using the expanded macro. Sorry for mis-attributing the cast to your change. This could go either way with no way clearly better than the other. Lets see if another reviewer has a strong opinion. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25975#discussion_r2170301537 From dholmes at openjdk.org Fri Jun 27 04:00:49 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 27 Jun 2025 04:00:49 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v12] In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 18:01:58 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. >> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. 
But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Improved comments, not too wide for my screen... Still fine ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25267#pullrequestreview-2964647360 From sspitsyn at openjdk.org Fri Jun 27 05:10:38 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 27 Jun 2025 05:10:38 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: On Wed, 25 Jun 2025 13:02:03 GMT, Kevin Walls wrote: >> ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. > > Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: > > - comment update > - comment update This looks mostly okay but I'm waiting for an update suggested by Dan. ------------- PR Review: https://git.openjdk.org/jdk/pull/25958#pullrequestreview-2964763745 From sspitsyn at openjdk.org Fri Jun 27 05:16:42 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 27 Jun 2025 05:16:42 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v12] In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 18:01:58 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. 
JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. >> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Improved comments, not too wide for my screen... Marked as reviewed by sspitsyn (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25267#pullrequestreview-2964776119 From dholmes at openjdk.org Fri Jun 27 05:19:46 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 27 Jun 2025 05:19:46 GMT Subject: RFR: 8360664: Null pointer dereference in src/hotspot/share/prims/jvmtiTagMap.cpp in IterateOverHeapObjectClosure::do_object() [v2] In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 14:00:43 GMT, Artem Semenov wrote: >> The defect has been detected and confirmed in the function ```IterateOverHeapObjectClosure::do_object()``` located in the file ```src/hotspot/share/prims/jvmtiTagMap.cpp``` with static code analysis. 
This defect can potentially lead to a null pointer dereference. >> >> The pointer ```oop o``` is passed to the constructor of the CallbackWrapper class, where it is dereferenced without a null check. > > Artem Semenov has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > 8360664 Null pointer dereference in src/hotspot/share/prims/jvmtiTagMap.cpp in IterateOverHeapObjectClosure::do_object() > > Found by Linux Verification Center (linuxtesting.org) with SVACE. > signed-off-by: Artem Semenov I think this is a false positive from the static code analyzer. If we are iterating over the heap then the closure is only ever passed actual oops, so it can't be null. At most I would add an assert, but generally my understanding is that the user of any closure has the responsibility of passing it valid input. ------------- PR Review: https://git.openjdk.org/jdk/pull/26002#pullrequestreview-2964779144 From sspitsyn at openjdk.org Fri Jun 27 05:24:39 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 27 Jun 2025 05:24:39 GMT Subject: RFR: 8360664: Null pointer dereference in src/hotspot/share/prims/jvmtiTagMap.cpp in IterateOverHeapObjectClosure::do_object() [v2] In-Reply-To: References: Message-ID: On Fri, 27 Jun 2025 05:16:32 GMT, David Holmes wrote: > At most I would add an assert, but generally my understanding is that the user of any closure has the responsibility of passing it valid input. Adding asserts sounds like a good suggestion. 
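The suggestion above, an assert instead of a runtime null check, could look roughly like the sketch below. This is only an illustration under stated assumptions: `Obj`, `ObjectClosure`, `CountingClosure` and `iterate` are hypothetical stand-ins rather than HotSpot's `oop`, closure or heap-iteration types, and plain `assert` stands in for HotSpot's own assert macro.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a heap object; not HotSpot's oop type.
struct Obj { int tag; };

// Hypothetical closure interface: the heap walker calls do_object() once per object.
struct ObjectClosure {
  virtual void do_object(Obj* o) = 0;
  virtual ~ObjectClosure() = default;
};

// Counts the objects that carry a non-zero tag.
struct CountingClosure : ObjectClosure {
  std::size_t visited = 0;
  void do_object(Obj* o) override {
    // States the caller's contract in debug builds; compiled out under NDEBUG.
    assert(o != nullptr && "heap iteration must only pass valid objects");
    if (o->tag != 0) visited++;
  }
};

// Hypothetical heap walk: it only ever hands real objects to the closure.
inline void iterate(Obj* objs, std::size_t n, ObjectClosure& cl) {
  for (std::size_t i = 0; i < n; i++) cl.do_object(&objs[i]);
}
```

The assert documents that supplying valid input is the iterator's responsibility, and it adds no cost to a product build.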
------------- PR Comment: https://git.openjdk.org/jdk/pull/26002#issuecomment-3011711961 From sspitsyn at openjdk.org Fri Jun 27 05:29:39 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 27 Jun 2025 05:29:39 GMT Subject: RFR: 8360664: Null pointer dereference in src/hotspot/share/prims/jvmtiTagMap.cpp in IterateOverHeapObjectClosure::do_object() [v2] In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 14:00:43 GMT, Artem Semenov wrote: >> The defect has been detected and confirmed in the function ```IterateOverHeapObjectClosure::do_object()``` located in the file ```src/hotspot/share/prims/jvmtiTagMap.cpp``` with static code analysis. This defect can potentially lead to a null pointer dereference. >> >> The pointer ```oop o``` is passed to the constructor of the CallbackWrapper class, where it is dereferenced without a null check. > > Artem Semenov has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > 8360664 Null pointer dereference in src/hotspot/share/prims/jvmtiTagMap.cpp in IterateOverHeapObjectClosure::do_object() > > Found by Linux Verification Center (linuxtesting.org) with SVACE. > signed-off-by: Artem Semenov I'm a little bit confused why we have two bugs for this issue. The bug JDK-8360670 seems to be a dup of: JDK-8360664: Null pointer dereference in src/hotspot/share/prims/jvmtiTagMap.cpp in IterateOverHeapObjectClosure::do_object() Should it be closed as a dup? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26002#issuecomment-3011718430 From mbaesken at openjdk.org Fri Jun 27 06:46:42 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 27 Jun 2025 06:46:42 GMT Subject: RFR: 8360518: Docker tests do not work when asan is configured In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 14:09:42 GMT, Matthias Baesken wrote: > When the address sanitizer ASAN is configured, we run into errors in the docker tests. > Example hotspot/jtreg/containers/docker/DockerBasicTest.java : > > [STDOUT] > /jdk/bin/java: error while loading shared libraries: libasan.so.8: cannot open shared object file: No such file or directory > > Reason is that the asan-enabled binaries need additional dependencies and those are not available in the current docker/container setups. > Maybe we should skip those tests when asan is enabled. Thanks for the review ! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25980#issuecomment-3011882972 From mbaesken at openjdk.org Fri Jun 27 06:46:43 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 27 Jun 2025 06:46:43 GMT Subject: Integrated: 8360518: Docker tests do not work when asan is configured In-Reply-To: References: Message-ID: <6ERwQeDA7fMWBFl59U3xpQbPkmmnO-geBGn33LrLzz0=.d05bece3-d6a8-4278-9f53-42efde8dae61@github.com> On Wed, 25 Jun 2025 14:09:42 GMT, Matthias Baesken wrote: > When the address sanitizer ASAN is configured, we run into errors in the docker tests. > Example hotspot/jtreg/containers/docker/DockerBasicTest.java : > > [STDOUT] > /jdk/bin/java: error while loading shared libraries: libasan.so.8: cannot open shared object file: No such file or directory > > Reason is that the asan-enabled binaries need additional dependencies and those are not available in the current docker/container setups. > Maybe we should skip those tests when asan is enabled. This pull request has now been integrated. 
Changeset: 01b15bc1 Author: Matthias Baesken URL: https://git.openjdk.org/jdk/commit/01b15bc1f961f43ae11db0c15f45763c4ec4180b Stats: 23 lines in 23 files changed: 23 ins; 0 del; 0 mod 8360518: Docker tests do not work when asan is configured Reviewed-by: sgehwolf ------------- PR: https://git.openjdk.org/jdk/pull/25980 From duke at openjdk.org Fri Jun 27 07:30:26 2025 From: duke at openjdk.org (Anton Artemov) Date: Fri, 27 Jun 2025 07:30:26 GMT Subject: RFR: 8357086: os::xxx functions returning memory size should return size_t [v13] In-Reply-To: References: Message-ID: > Hi, > > in this PR the output value types for functions which return memory sizes are changed, namely: > > > static julong available_memory(); --> static bool available_memory(size_t& value); > static julong used_memory(); --> static bool used_memory(size_t& value); > static julong free_memory(); --> static bool free_memory(size_t& value); > static jlong total_swap_space(); --> static bool total_swap_space(size_t& value); > static jlong free_swap_space(); --> static bool free_swap_space(size_t& value); > static julong physical_memory(); --> static bool physical_memory(size_t& value); > > > The boolean return value indicates success, whereas the actual value is assigned to the input argument. The following recommended usage pattern is introduced: where applicable, an unsuccessful call is logged. > > Later, the return value can be attributed with `[[nodiscard]]` to enforce the pattern. > > Tested in GHA and Tiers 1-5. Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: - 8357086: Fxied return value - 8357086: Fixed whitespaces - 8357086: Introduced usage pattern - 8357086: Fixed typo - 8357086: Refactored physical_memory in different OS - 8357086: Small fixes 2 - 8357086: Small fixes 1. - 8357086: Refactored physical_memory() - 8357086: Refactored free_swap_space() - 8357086: Refactored total_swap_space() - ... 
and 2 more: https://git.openjdk.org/jdk/compare/75ce44aa...e4698333 ------------- Changes: https://git.openjdk.org/jdk/pull/25450/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25450&range=12 Stats: 325 lines in 22 files changed: 164 ins; 2 del; 159 mod Patch: https://git.openjdk.org/jdk/pull/25450.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25450/head:pull/25450 PR: https://git.openjdk.org/jdk/pull/25450 From qxing at openjdk.org Fri Jun 27 07:55:39 2025 From: qxing at openjdk.org (Qizheng Xing) Date: Fri, 27 Jun 2025 07:55:39 GMT Subject: RFR: 8360474: Add missing include guards for some HotSpot headers [v2] In-Reply-To: References: Message-ID: <-1RcsgA5VF7PUtgbqLUuBzlEcubHPjgvql7ALfoXOkA=.175b5c52-7595-4a4e-9f06-e72144aa8908@github.com> On Wed, 25 Jun 2025 07:09:43 GMT, David Holmes wrote: >> Qizheng Xing has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove extra trailing new line > > That looks fine to me. > > Thanks @dholmes-ora @mhaessig @stefank @coleenp Thanks for the review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25968#issuecomment-3012077468 From duke at openjdk.org Fri Jun 27 07:55:40 2025 From: duke at openjdk.org (duke) Date: Fri, 27 Jun 2025 07:55:40 GMT Subject: RFR: 8360474: Add missing include guards for some HotSpot headers [v2] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 08:13:43 GMT, Qizheng Xing wrote: >> Some HotSpot header files are missing include guards, which may cause compilation errors if these files are included multiple times. This patch adds include guards for them. > > Qizheng Xing has updated the pull request incrementally with one additional commit since the last revision: > > Remove extra trailing new line @MaxXSoft Your change (at version 75d90ef0a0c1d2dae32ab232f3ba62789f83c56f) is now ready to be sponsored by a Committer. 
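For readers unfamiliar with the include-guard issue discussed in 8360474, here is a minimal sketch of what a guard does. The guard macro `SHARE_EXAMPLE_EXAMPLECOUNTER_HPP` and the `ExampleCounter` type are made up for illustration and are not files touched by that patch. The header body is pasted twice to simulate the same file being included twice in one translation unit.

```cpp
// First inclusion of the hypothetical header: the guard macro is not yet
// defined, so the body is compiled and the macro gets defined.
#ifndef SHARE_EXAMPLE_EXAMPLECOUNTER_HPP
#define SHARE_EXAMPLE_EXAMPLECOUNTER_HPP

struct ExampleCounter { int value; };

inline int bump(ExampleCounter& c) { return ++c.value; }

#endif // SHARE_EXAMPLE_EXAMPLECOUNTER_HPP

// Second inclusion of the same header: the macro is already defined, so the
// preprocessor skips the body and no redefinition error occurs.
#ifndef SHARE_EXAMPLE_EXAMPLECOUNTER_HPP
#define SHARE_EXAMPLE_EXAMPLECOUNTER_HPP

struct ExampleCounter { int value; };

inline int bump(ExampleCounter& c) { return ++c.value; }

#endif // SHARE_EXAMPLE_EXAMPLECOUNTER_HPP
```

Without the `#ifndef`/`#define` pair, the second copy would redefine `ExampleCounter` and the translation unit would fail to compile.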
------------- PR Comment: https://git.openjdk.org/jdk/pull/25968#issuecomment-3012078602 From stefank at openjdk.org Fri Jun 27 08:19:53 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 27 Jun 2025 08:19:53 GMT Subject: RFR: 8360474: Add missing include guards for some HotSpot headers [v2] In-Reply-To: References: Message-ID: <5Wq72BhUKnov6P_qLPqGgeTgdmkmdRSWxAVIf2gO51g=.b0a65116-aa24-4504-b916-596d6a3a0ec9@github.com> On Wed, 25 Jun 2025 08:13:43 GMT, Qizheng Xing wrote: >> Some HotSpot header files are missing include guards, which may cause compilation errors if these files are included multiple times. This patch adds include guards for them. > > Qizheng Xing has updated the pull request incrementally with one additional commit since the last revision: > > Remove extra trailing new line I don't think this needs to be backported and if it does we can backport the entire patch (or split out the necessary parts). I'll sponsor this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25968#issuecomment-3012143578 From qxing at openjdk.org Fri Jun 27 08:19:54 2025 From: qxing at openjdk.org (Qizheng Xing) Date: Fri, 27 Jun 2025 08:19:54 GMT Subject: Integrated: 8360474: Add missing include guards for some HotSpot headers In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 06:44:31 GMT, Qizheng Xing wrote: > Some HotSpot header files are missing include guards, which may cause compilation errors if these files are included multiple times. This patch adds include guards for them. This pull request has now been integrated. 
Changeset: aa26cede Author: Qizheng Xing Committer: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/aa26cede635011f5cc075cd528934ce8d8e8eef9 Stats: 20 lines in 4 files changed: 18 ins; 0 del; 2 mod 8360474: Add missing include guards for some HotSpot headers Reviewed-by: mhaessig, stefank, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/25968 From qxing at openjdk.org Fri Jun 27 08:26:47 2025 From: qxing at openjdk.org (Qizheng Xing) Date: Fri, 27 Jun 2025 08:26:47 GMT Subject: RFR: 8360474: Add missing include guards for some HotSpot headers [v2] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 08:13:43 GMT, Qizheng Xing wrote: >> Some HotSpot header files are missing include guards, which may cause compilation errors if these files are included multiple times. This patch adds include guards for them. > > Qizheng Xing has updated the pull request incrementally with one additional commit since the last revision: > > Remove extra trailing new line > I don't think this needs to be backported and if it does we can backport the entire patch (or split out the necessary parts). I'll sponsor this. /sponsor Got it, thank you! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25968#issuecomment-3012165762 From jbhateja at openjdk.org Fri Jun 27 08:54:23 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 27 Jun 2025 08:54:23 GMT Subject: RFR: 8360116: Add support for AVX10 floating point minmax instruction [v4] In-Reply-To: References: Message-ID: > Intel@ AVX10 ISA [1] extensions added new floating point MIN/MAX instructions which comply with definitions in IEEE-754-2019 standard section 9.6 and can directly emulate Math.min/max semantics without the need for any special handling for NaN, +0.0 or -0.0 detection. 
> > **The following pseudo-code describes the existing algorithm for min/max[FD]:** > > Move the non-negative value to the second operand; this will ensure that we correctly handle 0.0 and -0.0 values, if values being compared are both 0.0s (of either sign), the value in the second operand (source operand) is returned. Existing MINPS and MAXPS semantics only check for NaN as the second operand; hence, we need special handling to check for NaN at the first operand. > > btmp = (b < +0.0) ? a : b > atmp = (b < +0.0) ? b : a > Tmp = Max_Float(atmp , btmp) > Res = (atmp == NaN) ? atmp : Tmp > > For min[FD] we need a small tweak in the above algorithm, i.e., move the non-negative value to the first operand, this will ensure that we correctly select -0.0 if both the operands being compared are 0.0 or -0.0. > > btmp = (b < +0.0) ? b : a > atmp = (b < +0.0) ? a : b > Tmp = Min_Float(atmp , btmp) > Res = (atmp == NaN) ? atmp : Tmp > > Thus, we need additional special handling for NaNs and +/-0.0 to compute floating-point min/max values to comply with the semantics of Math.max/min APIs using existing MINPS / MAXPS instructions. AVX10.2 added a new instruction, VPMINMAX[SH,SS,SD]/[PH,PS,PD], which comprehensively handles special cases, thereby eliminating the need for special handling. > > Patch emits new instructions for reduction and non-reduction operations for single, double, and Float16 type. > > Kindly review and share your feedback. 
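The operand-swapping scheme in the pseudo-code above can be modelled in scalar C++ as a rough sketch. Several things here are assumptions and not taken from the patch: `max_legacy`/`min_legacy` are hypothetical models of the legacy MAXPS/MINPS rule (the second operand is returned for NaN inputs and for equal zeros), the test `b < +0.0` is read as a sign-bit test so that -0.0 counts as negative, and the min variant is assumed to use the min operation.

```cpp
#include <cmath>
#include <limits>

// Model of legacy MAXSS/MAXPS: returns the second operand when either input
// is NaN or when both inputs are zeros of either sign.
inline float max_legacy(float a, float b) { return (a > b) ? a : b; }

// Model of legacy MINSS/MINPS with the same second-operand rule.
inline float min_legacy(float a, float b) { return (a < b) ? a : b; }

// Math.max semantics: NaN propagates, and +0.0 is greater than -0.0.
inline float java_max(float a, float b) {
  bool neg = std::signbit(b);            // assumption: sign-bit test, true for -0.0
  float btmp = neg ? a : b;              // non-negative value goes second
  float atmp = neg ? b : a;
  float tmp = max_legacy(atmp, btmp);
  return std::isnan(atmp) ? atmp : tmp;  // catch a NaN in the first operand
}

// Math.min semantics: NaN propagates, and -0.0 is smaller than +0.0.
inline float java_min(float a, float b) {
  bool neg = std::signbit(b);
  float btmp = neg ? b : a;              // negative value goes second
  float atmp = neg ? a : b;
  float tmp = min_legacy(atmp, btmp);
  return std::isnan(atmp) ? atmp : tmp;
}
```

With these definitions, `java_max(0.0f, -0.0f)` yields +0.0 and `java_min(0.0f, -0.0f)` yields -0.0 regardless of argument order, and a NaN in either position propagates to the result.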
> > Best Regards, > Jatin > > [1] https://www.intel.com/content/www/us/en/content-details/856721/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html?wapkw=AVX10 Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Review resolutions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25914/files - new: https://git.openjdk.org/jdk/pull/25914/files/382c9b9e..89697983 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25914&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25914&range=02-03 Stats: 31 lines in 5 files changed: 14 ins; 0 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/25914.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25914/head:pull/25914 PR: https://git.openjdk.org/jdk/pull/25914 From jsjolen at openjdk.org Fri Jun 27 09:12:40 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 27 Jun 2025 09:12:40 GMT Subject: RFR: 8314488: Compiling the JDK with C++17 In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 02:01:55 GMT, Kim Barrett wrote: > I'm hijacking the PR mechanism as a way to discuss new C++17 features that can > be more easily structured and captured than bare email. Once discussion > settles down I'll turn the results into HotSpot Style Guide changes. I don't > intend to integrate any version of this document to the OpenJDK repository. Are we supposed to put our comments in this PR :-)? `constexpr if` statements would be nice to have. Seldom used, but the alternative is even more 'magical looking'. From Julian's linked PR, apparently AIX's C++ compiler is capable of compiling C++17. Do we have any other "obscure" platform which the community is interested in supporting? 
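To make the `constexpr if` remark concrete, here is a small sketch comparing C++17 `if constexpr` with the SFINAE overloads it can replace. The function names `to_bits` and `to_bits_sfinae` are invented for illustration and do not come from HotSpot or from the linked PR.

```cpp
#include <cstdint>
#include <type_traits>

// C++17: the branch that is not taken is discarded and never instantiated,
// so one template body can hold code that is only valid for some T.
template <typename T>
std::uint64_t to_bits(T v) {
  if constexpr (std::is_pointer_v<T>) {
    return reinterpret_cast<std::uintptr_t>(v);  // only compiled for pointer types
  } else {
    return static_cast<std::uint64_t>(v);        // only compiled for non-pointer types
  }
}

// Pre-C++17 alternative: two overloads, selected through SFINAE.
template <typename T, std::enable_if_t<std::is_pointer<T>::value, int> = 0>
std::uint64_t to_bits_sfinae(T v) {
  return reinterpret_cast<std::uintptr_t>(v);
}

template <typename T, std::enable_if_t<!std::is_pointer<T>::value, int> = 0>
std::uint64_t to_bits_sfinae(T v) {
  return static_cast<std::uint64_t>(v);
}
```

Both variants give the same results; the `if constexpr` form keeps the logic in one function body instead of spreading it over overloads with `enable_if` template parameters.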
------------- PR Comment: https://git.openjdk.org/jdk/pull/25992#issuecomment-3012289526 PR Comment: https://git.openjdk.org/jdk/pull/25992#issuecomment-3012292934 From kevinw at openjdk.org Fri Jun 27 09:37:41 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Fri, 27 Jun 2025 09:37:41 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: On Wed, 25 Jun 2025 13:02:03 GMT, Kevin Walls wrote: >> ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. > > Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: > > - comment update > - comment update I was reproducing this frequently, monitoring with asserts in a fastdebug build and problems started with ThreadSnapshotFactory::get_thread_snapshot() getting a null from JNIHandles::resolve(jthread) ...there are several different crashes in the product build. > But _thread_h() has already been used a number of times before we get here and if it were null we should have crashed long ago. ??? There can be some that don't cause a problem, like: java_lang_VirtualThread::is_instance(_thread_h()); (includes null check) ..and others are not called. Hmm maybe there are some that look like they should have crashed, e.g. 1290 _thread_name = OopHandle(oop_storage(), java_lang_Thread::name(_thread_h())); <-- name does: return java_thread->obj_field(_name_offset); ...I don't see why this didn't fault in the report from the JBS issue I was interpreting here (not my debug build). Reordered or something else happened, or just haven't understood enough. 
It is much easier to read an assert in get_thread_snapshot than letting it continue and crash in vframestream etc... But null from JNIHandles::resolve(jthread) is the earliest problem I found. I'm redoing with the cv_internal_thread_to_JavaThread usage... A little concerned that ThreadsListHandle::cv_internal_thread_to_JavaThread takes jobject jthread, our ref to a java.lang.Thread, and also calls 811 oop thread_oop = JNIHandles::resolve_non_null(jthread); ...which asserts if it contains null, but maybe I don't know all the ThreadsListHandle magic. I had a day yesterday where the problem would not reproduce at all, which made it hard to verify! Will update... ------------- PR Comment: https://git.openjdk.org/jdk/pull/25958#issuecomment-3012360012 From duke at openjdk.org Fri Jun 27 10:12:50 2025 From: duke at openjdk.org (Anton Artemov) Date: Fri, 27 Jun 2025 10:12:50 GMT Subject: RFR: 8284016: Normalize handshake closure names Message-ID: <3MZO_Y636hhLaFN_TEP5GIPHS3ZD3UVV14nzmWexwP0=.72af617d-aac2-46c9-b825-9b5704061a75@github.com> Hi, please consider the following changes: There are many classes inherited from the `HandshakeClosure` class, but they do not follow the same naming convention. In this PR we address this issue, all names are normalized in the following way: `XXXDummyClassNameClosure -> XXXDummyClassNameHandshakeClosure` or `XXXDummyClassNameHandshake -> XXXDummyClassNameHandshakeClosure` or `XXXStrangeClassName -> SomewhatSimilarNameHandshakeClosure` Tested in GHA and tiers 1 - 3. 
------------- Commit messages: - 8284016: Fixed ALotOfHandshakeClosure name - 8284016: Normalized names of classes derived from HandshakeClosure Changes: https://git.openjdk.org/jdk/pull/26014/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26014&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8284016 Stats: 165 lines in 25 files changed: 0 ins; 0 del; 165 mod Patch: https://git.openjdk.org/jdk/pull/26014.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26014/head:pull/26014 PR: https://git.openjdk.org/jdk/pull/26014 From shade at openjdk.org Fri Jun 27 10:34:50 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 27 Jun 2025 10:34:50 GMT Subject: RFR: 8360867: CTW: Disable inline cache verification Message-ID: <9mywTtljvOZuQZWkTaHipBEeiC_3EMxihOd790Uq79o=.de521785-183c-44f4-a799-9f695922d9e3@github.com> In CTW profiling, I noticed we spend a lot of time doing inline cache verification when nmethods are unloaded. Due to the nature of CTW, we unload _a lot_ of nmethods. Since the goal for CTW is to stress the compilers themselves, not inline caches in particular (I assume those are blank even, given almost no real code is executed), it makes sense to disable that verification for CTW. 
A taste of performance improvement, about 2%: $ time CONF=linux-x86_64-server-fastdebug make test TEST=applications/ctw/modules # Current real 5m1.616s user 79m41.398s sys 14m39.607s # No verify inline caches real 4m52.239s user 77m41.886s sys 14m25.352s Additional testing: - [x] Linux x86_64 server {fastdebug,release}, `applications/ctw/modules` ------------- Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/26016/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26016&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360867 Stats: 6 lines in 3 files changed: 6 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26016.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26016/head:pull/26016 PR: https://git.openjdk.org/jdk/pull/26016 From tschatzl at openjdk.org Fri Jun 27 11:02:40 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 27 Jun 2025 11:02:40 GMT Subject: RFR: 8342382: Implementation of JEP G1: Improve Application Throughput with a More Efficient Write-Barrier [v40] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that implements (currently Draft) JEP: G1: Improve Application Throughput with a More Efficient Write-Barrier. > > The reason for posting this early is that this is a large change, and the JEP process is already taking very long with no end in sight but we would like to have this ready by JDK 25. > > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble Parallel GC's as described in the JEP. The reason is that G1 lacks in throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers (dirty card queues - dcq) containing the location of dirtied cards. Refinement threads pick up their contents to re-refine. The barrier needs to enqueue card locations. 
> * For correctness dirty card updates requires fine-grained synchronization between mutator and refinement threads, > * Finally there is generic code to avoid dirtying cards altogether (filters), to avoid executing the synchronization and the enqueuing as much as possible. > > These tasks require the current barrier to look as follows for an assignment `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad; // synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total instructions, compared to three or four(!) for parallel and serial gc. > > The large size of the inlined barrier not only has a large code footprint, but also prevents some compiler optimizations like loop unrolling or inlining. > > There are several papers showing that this barrier alone can decrease throughput by 10-20% ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is corroborated by some benchmarks (see links). > > The main idea for this change is to not use fine-grained synchronization between refinement and mutator threads, but coarse grained based on atomically switching card tables. Mutators only work on the "primary" card table, refinement threads on a se... Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 56 commits: - Merge branch 'master' into 8342382-card-table-instead-of-dcq - Merge branch 'master' into 8342382-card-table-instead-of-dcq - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * ayang review: remove sweep_epoch - Merge branch 'master' into card-table-as-dcq-merge - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * ayang review (part 2 - yield duration changes) - * ayang review (part 1) - * indentation fix - * remove support for 32 bit x86 in the barrier generation code, following latest changes from @shade - ... and 46 more: https://git.openjdk.org/jdk/compare/aa26cede...750ed2d0 ------------- Changes: https://git.openjdk.org/jdk/pull/23739/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=39 Stats: 7085 lines in 111 files changed: 2568 ins; 3599 del; 918 mod Patch: https://git.openjdk.org/jdk/pull/23739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739 PR: https://git.openjdk.org/jdk/pull/23739 From coleenp at openjdk.org Fri Jun 27 11:24:59 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 27 Jun 2025 11:24:59 GMT Subject: RFR: 8268406: Deallocate jmethodID native memory [v12] In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 18:01:58 GMT, Coleen Phillimore wrote: >> This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. >> >> The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. 
With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. >> >> Tested with tier1-4, 5-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Improved comments, not too wide for my screen... Thank you for the reviews, Serguei, David, Dan, Axel and Erik. And all the discussion, and the idea itself. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25267#issuecomment-3012657567 From coleenp at openjdk.org Fri Jun 27 11:25:00 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 27 Jun 2025 11:25:00 GMT Subject: Integrated: 8268406: Deallocate jmethodID native memory In-Reply-To: References: Message-ID: On Fri, 16 May 2025 12:18:42 GMT, Coleen Phillimore wrote: > This change uses a ConcurrentHashTable to associate Method* with jmethodID, instead of an indirection. JNI is deprecated in favor of using Panama to call methods, so I don't think we're concerned about JNI performance going forward. JVMTI uses a lot of jmethodIDs but there aren't any performance tests for JVMTI, but running vmTestbase/nsk/jvmti with in product build with and without this change had no difference in time. > > The purpose of this change is to remove the memory leak when you unload classes: we were leaving the jmethodID memory just in case JVMTI code still had references to that jmethodID and instead of crashing, should get nullptr. 
With this change, if JVMTI looks up a jmethodID, we've removed it from the table and will return nullptr. Redefinition and the InstanceKlass::_jmethod_method_ids is somewhat complicated. When a method becomes "obsolete" in redefinition, which means that the code in the method is changed, afterward creating a jmethodID from an "obsolete" method will create a new entry in the InstanceKlass table. This mechanism increases the method_idnum to do this. In the future maybe we could throw NoSuchMethodError if you try to create a jmethodID out of an obsolete method and remove all this code. But that's not in this change. > > Tested with tier1-4, 5-7. This pull request has now been integrated. Changeset: d8f9b188 Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/d8f9b188fa488c9c6e343c62a148cfe9fc8a563b Stats: 695 lines in 16 files changed: 400 ins; 239 del; 56 mod 8268406: Deallocate jmethodID native memory Reviewed-by: dholmes, sspitsyn, dcubed, eosterlund, aboldtch ------------- PR: https://git.openjdk.org/jdk/pull/25267 From asemenov at openjdk.org Fri Jun 27 12:08:43 2025 From: asemenov at openjdk.org (Artem Semenov) Date: Fri, 27 Jun 2025 12:08:43 GMT Subject: RFR: 8360664: Null pointer dereference in src/hotspot/share/prims/jvmtiTagMap.cpp in IterateOverHeapObjectClosure::do_object() [v2] In-Reply-To: References: Message-ID: On Fri, 27 Jun 2025 05:22:00 GMT, Serguei Spitsyn wrote: > > At most I would add an assert, but generally my understanding is that the user of any closure has the responsibility of passing it valid input. > > Adding asserts sounds like a good suggestion. It seems to me that this won't be a big problem in this form. I've just moved the existing check higher up, where it will prevent dereferencing a null pointer. However, if you confirm that this is not acceptable, I will replace the check with an assert.
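The two options weighed in this exchange (a guard placed before the first dereference, or an assert that leaves validity to the caller) can be sketched as follows. This is a minimal illustration only: `Object` and `CountTaggedClosure` are hypothetical stand-ins, not the HotSpot types or the code in jvmtiTagMap.cpp.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a heap object; not a HotSpot type.
struct Object {
  bool tagged;
};

// Hypothetical closure that counts visited and tagged objects.
struct CountTaggedClosure {
  std::size_t visited = 0;
  std::size_t tagged = 0;

  // Variant A: defensive check placed before the first dereference.
  // A null input is silently skipped.
  void do_object_checked(const Object* o) {
    if (o == nullptr) {
      return;
    }
    visited++;
    if (o->tagged) {
      tagged++;
    }
  }

  // Variant B: the caller is responsible for passing valid input.
  // The assert documents the precondition and is compiled out when
  // NDEBUG is defined.
  void do_object_asserted(const Object* o) {
    assert(o != nullptr && "caller must not pass null");
    visited++;
    if (o->tagged) {
      tagged++;
    }
  }
};
```

Variant A keeps release builds safe at the cost of hiding a caller bug; variant B surfaces the bug in debug builds but offers no protection once asserts are disabled.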
------------- PR Comment: https://git.openjdk.org/jdk/pull/26002#issuecomment-3012790447 From kevinw at openjdk.org Fri Jun 27 12:22:21 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Fri, 27 Jun 2025 12:22:21 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v6] In-Reply-To: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: > ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: Test requires: permit linux debug testing ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25958/files - new: https://git.openjdk.org/jdk/pull/25958/files/d8143785..d14f5228 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25958&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25958&range=04-05 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25958.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25958/head:pull/25958 PR: https://git.openjdk.org/jdk/pull/25958 From alanb at openjdk.org Fri Jun 27 12:27:44 2025 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 27 Jun 2025 12:27:44 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v6] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: On Fri, 27 Jun 2025 12:22:21 GMT, Kevin Walls wrote: >> ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. 
This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. > > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > Test requires: permit linux debug testing test/jdk/com/sun/management/HotSpotDiagnosticMXBean/DumpThreadsWithEliminatedLock.java line 30: > 28: * an object that is scalar replaced > 29: * @requires vm.compMode != "Xcomp" > 30: * @requires !vm.debug | (os.family == "linux") Is this a left over from your local testing? There is nothing Linux specific here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2171915162 From mdoerr at openjdk.org Fri Jun 27 13:02:47 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 27 Jun 2025 13:02:47 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v9] In-Reply-To: <2hLKCRKzNs19ZW_ntM7yJ2ynW0Hj7SwBrN9hlcOSxTM=.555bb43a-8fb4-4157-9cdb-a18b28178932@github.com> References: <2hLKCRKzNs19ZW_ntM7yJ2ynW0Hj7SwBrN9hlcOSxTM=.555bb43a-8fb4-4157-9cdb-a18b28178932@github.com> Message-ID: On Tue, 24 Jun 2025 16:13:26 GMT, Dean Long wrote: >> Just FYI: My local tier1-3 test on linux-riscv64 is good. And I didn't witness an obvious change on specjbb performance with g1gc. > > Thanks @RealFYang. @dean-long: Are you planning to do a jdk25 backport? We still see the crashes, there. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-3013002469 From kbarrett at openjdk.org Fri Jun 27 14:06:43 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 27 Jun 2025 14:06:43 GMT Subject: RFR: 8314488: Compiling the JDK with C++17 In-Reply-To: References: Message-ID: On Fri, 27 Jun 2025 09:08:29 GMT, Johan Sjölen wrote: > Are we supposed to put our comments in this PR :-)? Yes. That's kind of the point of using a PR. > `constexpr if` statements would be nice to have.
Seldom used, but the alternative is even more 'magical looking'. See item 10. Compile-time If, which is currently in the permitted block. > From Julian's linked PR, apparently AIX's C++ compiler is capable of compiling C++17. Do we have any other "obscure" platform which the community is interested in supporting? None that I know about. That's part of the point of asking the question about whether we should throw the switch at all and start building with and using C++17. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25992#issuecomment-3013193894 PR Comment: https://git.openjdk.org/jdk/pull/25992#issuecomment-3013198408 From kvn at openjdk.org Fri Jun 27 15:02:19 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 27 Jun 2025 15:02:19 GMT Subject: RFR: 8360867: CTW: Disable inline cache verification In-Reply-To: <9mywTtljvOZuQZWkTaHipBEeiC_3EMxihOd790Uq79o=.de521785-183c-44f4-a799-9f695922d9e3@github.com> References: <9mywTtljvOZuQZWkTaHipBEeiC_3EMxihOd790Uq79o=.de521785-183c-44f4-a799-9f695922d9e3@github.com> Message-ID: <9PT2AJVcZVpkCZb5MIHbVz20thowb5XRp2KJn6-s2ok=.6c2b483e-6952-4750-8c33-a9a43cc249e5@github.com> On Fri, 27 Jun 2025 10:30:30 GMT, Aleksey Shipilev wrote: > In CTW profiling, I noticed we spend a lot of time doing inline cache verification when nmethods are unloaded. Due to the nature of CTW, we unload _a lot_ of nmethods. Since the goal for CTW is to stress the compilers themselves, not inline caches in particular (I assume those are blank even, given almost no real code is executed), it makes sense to disable that verification for CTW. > > A taste of performance improvement, about 2%: > > > $ time CONF=linux-x86_64-server-fastdebug make test TEST=applications/ctw/modules > > # Current > real 5m1.616s > user 79m41.398s > sys 14m39.607s > > # No verify inline caches > real 4m52.239s > user 77m41.886s > sys 14m25.352s > > > Additional testing: > - [x] Linux x86_64 server {fastdebug,release}, `applications/ctw/modules` Good. 
------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26016#pullrequestreview-2966883772 From kevinw at openjdk.org Fri Jun 27 15:02:19 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Fri, 27 Jun 2025 15:02:19 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v6] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: <3feC9uyBs_fTB_iNDZ1ParDIIxUH1-XJkrznAGrULKo=.221deff1-e3c4-4a49-bf5b-28bea75a88f5@github.com> On Fri, 27 Jun 2025 12:24:50 GMT, Alan Bateman wrote: >> Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: >> >> Test requires: permit linux debug testing > > test/jdk/com/sun/management/HotSpotDiagnosticMXBean/DumpThreadsWithEliminatedLock.java line 30: > >> 28: * an object that is scalar replaced >> 29: * @requires vm.compMode != "Xcomp" >> 30: * @requires !vm.debug | (os.family == "linux") > > Is this a left over from your local testing? There is nothing Linux specific here. I saw the test timeout in debug builds on win and mac (as expected). On Linux, fastdebug builds run OK, take about 1m30 in CI and could be useful. If further testing doesn't find a timeout issue, I'd like to leave this in, to not exclude linux debug builds. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2172202952 From alanb at openjdk.org Fri Jun 27 15:02:20 2025 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 27 Jun 2025 15:02:20 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v6] In-Reply-To: <3feC9uyBs_fTB_iNDZ1ParDIIxUH1-XJkrznAGrULKo=.221deff1-e3c4-4a49-bf5b-28bea75a88f5@github.com> References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> <3feC9uyBs_fTB_iNDZ1ParDIIxUH1-XJkrznAGrULKo=.221deff1-e3c4-4a49-bf5b-28bea75a88f5@github.com> Message-ID: On Fri, 27 Jun 2025 14:44:08 GMT, Kevin Walls wrote: >> test/jdk/com/sun/management/HotSpotDiagnosticMXBean/DumpThreadsWithEliminatedLock.java line 30: >> >>> 28: * an object that is scalar replaced >>> 29: * @requires vm.compMode != "Xcomp" >>> 30: * @requires !vm.debug | (os.family == "linux") >> >> Is this a left over from your local testing? There is nothing Linux specific here. > > I saw the test timeout in debug builds on win and mac (as expected). > > On Linux, fastdebug builds run OK, take about 1m30 in CI and could be useful. If further testing doesn't find a timeout issue, I'd like to leave this in, to not exclude linux debug builds. There is nothing Linux in this change or test so I think drop the change to the test and create a separate issue for the timeout you saw.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2172210930 From sgehwolf at openjdk.org Fri Jun 27 15:04:57 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Fri, 27 Jun 2025 15:04:57 GMT Subject: RFR: 8360651: Create OSContainer API for memory limit Message-ID: Please review this small addition to add a new `OSContainer::has_memory_limit()` API (Linux only - as with the entire OSContainer API) in preparation for [JDK-8350596](https://bugs.openjdk.org/browse/JDK-8350596) which proposes to increase the default `MaxRAMPercentage` when this new API returns true. The patch is pretty trivial. It's only the testing which amounts to the most lines in this patch. Testing: - [ ] GHA (still running) - [x] Hotspot container tests on x86_64 Linux on cgroup v1 and cgroup v2 (including the new tests). Thoughts? ------------- Commit messages: - MemoryLimitTest whitespace fixes. - TestContainerMemory whitespace fixes. - 8360651: Create OSContainer API for memory limit Changes: https://git.openjdk.org/jdk/pull/26020/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26020&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360651 Stats: 272 lines in 11 files changed: 269 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/26020.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26020/head:pull/26020 PR: https://git.openjdk.org/jdk/pull/26020 From tschatzl at openjdk.org Fri Jun 27 15:33:44 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 27 Jun 2025 15:33:44 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v5] In-Reply-To: References: Message-ID: <9v1jBPYDFtMgYobu23T98bBXDSXRks8QJg67QB-z7K4=.1cd915ad-2ada-4282-a14e-e44f04af751c@github.com> On Thu, 26 Jun 2025 13:28:13 GMT, Ivan Walulya wrote: >> Hi all, >> >> Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. 
The GCTimeRatio is intended to manage the balance between GC time and Application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. >> >> The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio. >> >> - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly. >> >> - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range. >> >> - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants. >> >> We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. >> >> Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs.
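The short-term policy described in the quoted text (a counter that moves up when GC CPU usage is above the target range, down when below, and triggers a resize once it crosses a threshold) might look roughly like the sketch below. All names, the symmetric threshold, the choice to leave the counter unchanged for in-range samples, and the reset after a trigger are assumptions made for illustration; this is not the code in g1HeapSizingPolicy.cpp.

```cpp
#include <cassert>

// Hypothetical result type; not part of G1.
enum class Resize { None, Expand, Shrink };

// Hypothetical short-term deviation counter.
class ShortTermResizeCounter {
  double _low;        // lower bound of the target GC CPU usage range
  double _high;       // upper bound of the target GC CPU usage range
  int _threshold;     // net out-of-range pauses needed to trigger a resize
  int _counter = 0;   // positive: too much GC work, negative: too little

public:
  ShortTermResizeCounter(double low, double high, int threshold)
    : _low(low), _high(high), _threshold(threshold) {}

  // Record the GC CPU usage observed for one pause and report whether
  // a resize should be considered.
  Resize record_pause(double gc_cpu_usage) {
    if (gc_cpu_usage > _high) {
      _counter++;
    } else if (gc_cpu_usage < _low) {
      _counter--;
    }
    // Assumption: samples inside the range leave the counter unchanged.

    if (_counter >= _threshold) {
      _counter = 0;  // assumption: start over after a trigger
      return Resize::Expand;
    }
    if (_counter <= -_threshold) {
      _counter = 0;
      return Resize::Shrink;
    }
    return Resize::None;
  }
};
```

Requiring several net out-of-range pauses before acting filters out short-lived spikes, which matters because resizing the heap has a cost of its own.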
>> >> As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. >> >> Testing: Mach5 ... > > Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: > > Reviews Changes requested by tschatzl (Reviewer). src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 104: > 102: } > 103: > 104: // Computes a smooth scaling factor based on the relative deviation of observed gc_cpu_usage Typically the code uses "actual" instead of "observed". There are also a few "current" `gc_cpu_usage`thrown in. If possible, it would be nice to harmonize usage in the documentation. src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 129: > 127: // > 128: // This helps avoid overreacting to small gc_cpu_usage deviations but respond appropriately > 129: // when necessary. This sentence seems to be a repeat of the one above ("This ensures appropriate heap resizing when deviations become significant, while avoiding overreacting to minor deviations.") I would remove the first occurrence (maybe keeping the first version). src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 151: > 149: size_t uncommitted_bytes = reserved_bytes - committed_bytes; > 150: size_t expand_bytes_via_pct = > 151: uncommitted_bytes * G1ExpandByPercentOfAvailable / 100; I think this linebreak is unnecessary, feel free to keep though. 
src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 190: > 188: uint target_regions_to_shrink = _g1h->num_free_regions(); > 189: > 190: uint reserve_regions = ceil(_g1h->num_committed_regions() * G1ReservePercent / 100.0); This is unused except for the log message. I think we once discussed this value, and removed its use because we did not know its reason. It should be removed completely, even from the log message. src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 268: > 266: > 267: log_debug(gc, ergo, heap)("Heap triggers: pauses-since-start: %u num-prev-pauses-for-heuristics: %u GC CPU usage deviation counter: %d", > 268: _recent_cpu_usage_deltas.num(), long_term_count_limit(), _gc_cpu_usage_deviation_counter); `pauses-since-start` is a misnomer, it's how many deltas were collected; the second is a maximum (maybe print that once as precious log among other relevant information for this kind of ergonomics?). src/hotspot/share/gc/g1/g1HeapSizingPolicy.hpp line 49: > 47: // If below that range, we decrement that counter, if above, we increment it. > 48: // The intent of this mechanism is to filter short term events because heap sizing has > 49: // some overhead. I think that sentence should move just before the full collection handling description. src/hotspot/share/gc/g1/g1HeapSizingPolicy.hpp line 54: > 52: // if that counter reaches -G1CPUUsageShrinkThreshold we consider shrinking the heap. > 53: // > 54: // While doing so, we accumulate the relative difference to the gc_cpu_usage_target `gc_cpu_usage_target` has not been defined before, although the text above mentions "the target GC CPU usage". It seems better to just use the words here too. src/hotspot/share/gc/g1/g1HeapSizingPolicy.hpp line 64: > 62: // Long term behavior is solely managed by regularly comparing actual long term > 63: // GC CPU usage with the boundaries of above range in regular long term intervals. > 64: // If current long term GC CPU usage is outside, expand or shrink respectively.
Suggestion: // If current long term GC CPU usage is different to the target, expand or shrink respectively. src/hotspot/share/gc/g1/jvmFlagConstraintsG1.cpp line 215: > 213: } > 214: > 215: JVMFlag::Error gc_cpu_usage_threshold_healper(JVMFlagsEnum flagid, Suggestion: JVMFlag::Error gc_cpu_usage_threshold_helper(JVMFlagsEnum flagid, ------------- PR Review: https://git.openjdk.org/jdk/pull/25832#pullrequestreview-2962629782 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2172058191 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2172065270 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2172066367 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2172071149 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2172080088 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2172040409 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2172043739 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2172045132 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2169389589 From kevinw at openjdk.org Fri Jun 27 15:35:04 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Fri, 27 Jun 2025 15:35:04 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v6] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> <3feC9uyBs_fTB_iNDZ1ParDIIxUH1-XJkrznAGrULKo=.221deff1-e3c4-4a49-bf5b-28bea75a88f5@github.com> Message-ID: On Fri, 27 Jun 2025 14:48:32 GMT, Alan Bateman wrote: >> I saw the test timeout in debug builds on win and mac (as expected). >> >> On Linux, fastdebug builds run OK, take about 1m30 in CI and could be useful. If further testing doesn't find a timeout issue, I'd like to leave this in, to not exclude linux debug builds. 
> > There is nothing Linux in this change or test so I think drop the change to the test and create a separate for the timeout you saw. Not sure I'm understanding. This test chooses to rule out running with all debug builds, because it causes timeouts. I only saw timeouts when I forced it to run in a debug build (win+mac), which are not currently run by choice. But I'm saying I'm having good results on Linux, so we could weaken that rule, and permit debug builds on Linux. I can add a comment in the test definition to explain. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2172263873 From alanb at openjdk.org Fri Jun 27 15:35:05 2025 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 27 Jun 2025 15:35:05 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v6] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> <3feC9uyBs_fTB_iNDZ1ParDIIxUH1-XJkrznAGrULKo=.221deff1-e3c4-4a49-bf5b-28bea75a88f5@github.com> Message-ID: On Fri, 27 Jun 2025 15:14:36 GMT, Kevin Walls wrote: >> There is nothing Linux in this change or test so I think drop the change to the test and create a separate for the timeout you saw. > > Not sure I'm understanding. > This test chooses to rule out running with all debug builds, because it causes timeouts. > I only saw timeouts when I forced it to run in a debug build (win+mac), which are not currently run by choice. > But I'm saying I'm having good results on Linux, so we could weaken that rule, and permit debug builds on Linux. > I can add a comment in the test definition to explain. We added this test that the thread dump can handle eliminated locks. It's not easy to test so deliberately limited to release builds and not -Xcomp. It would require more work on the test to have it be reliable on a wider set of configurations and builds. There is nothing linux specific. 
So I think drop the test change from this PR, and create a separate issue to work on a better test for this scenario. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2172294845 From kevinw at openjdk.org Fri Jun 27 16:00:56 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Fri, 27 Jun 2025 16:00:56 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v6] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> <3feC9uyBs_fTB_iNDZ1ParDIIxUH1-XJkrznAGrULKo=.221deff1-e3c4-4a49-bf5b-28bea75a88f5@github.com> Message-ID: On Fri, 27 Jun 2025 15:32:52 GMT, Alan Bateman wrote: >> Not sure I'm understanding. >> This test chooses to rule out running with all debug builds, because it causes timeouts. >> I only saw timeouts when I forced it to run in a debug build (win+mac), which are not currently run by choice. >> But I'm saying I'm having good results on Linux, so we could weaken that rule, and permit debug builds on Linux. >> I can add a comment in the test definition to explain. > > We added this test that the thread dump can handle eliminated locks. It's not easy to test so deliberately limited to release builds and not -Xcomp. It would require more work on the test to have it be reliable on a wider set of configurations and builds. There is nothing linux specific. So I think drop the test change from this PR, and create a separate issue to work on a better test for this scenario. Sure then yes it's a temporary leftover from local testing. 
8-) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2172314157 From iklam at openjdk.org Fri Jun 27 16:01:19 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 27 Jun 2025 16:01:19 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added [v2] In-Reply-To: References: Message-ID: > Background: when writing the string table in the AOT cache, we do this: > > 1. Find out the number of strings in the interned string table > 2. Allocate Java object arrays that are large enough to store these strings. These arrays are used by `StringTable::lookup_shared()` in the production run. > 3. Enter safepoint > 4. Copy the strings into the arrays > > This bug happened because: > > - Step 1 is not thread safe, so it may be reading a stale version of `_items_count` > - JIT compiler threads may create more interned strings after step 1 > > This PR attempts to fix both issues. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - Merge branch 'master' into 8358680-aot-cache-creation-fails-with-no-strings-should-have-been-added - @coleenp comment: change items_count() to items_count_acquire() - 8358680: AOT cache creation fails: no strings should have been added ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25816/files - new: https://git.openjdk.org/jdk/pull/25816/files/de7ce83f..94a64f97 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25816&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25816&range=00-01 Stats: 22476 lines in 718 files changed: 10749 ins; 7865 del; 3862 mod Patch: https://git.openjdk.org/jdk/pull/25816.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25816/head:pull/25816 PR: https://git.openjdk.org/jdk/pull/25816 From iklam at openjdk.org Fri Jun 27 16:01:20 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 27 Jun 2025 16:01:20 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added In-Reply-To: <1NT3dmBpabeSJ0HMglupev2ONGKaMH9XuMKZYiBwqZw=.32eb98c0-2ed1-46ef-b4b1-166e7d3f791d@github.com> References: <1NT3dmBpabeSJ0HMglupev2ONGKaMH9XuMKZYiBwqZw=.32eb98c0-2ed1-46ef-b4b1-166e7d3f791d@github.com> Message-ID: <4oWX4_7h0MpdAqosA8SnoT8wgaNEXs4oshRV6KETRVA=.90cd9aea-d4b5-4507-b24a-5367db8dc7a5@github.com> On Wed, 18 Jun 2025 17:27:29 GMT, Aleksey Shipilev wrote: > I still dislike hooking up to compiler infrastructure to figure out if something is adding interned strings. I really, really dislike the divergence we would introduce with JDK 25 -> JDK 26 once a variant of [JDK-8357473](https://bugs.openjdk.org/browse/JDK-8357473) lands in mainline. I cannot yet think of better solution though, let me think about it some more. At very least we need to get the sequencing of patches right... 
As we discussed off-line, I will push this to mainline and backport to 25, and then @shipilev can make his changes on top of this, so that they can be easily backported to 25 if necessary. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25816#issuecomment-3013515543 From iklam at openjdk.org Fri Jun 27 16:01:22 2025 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 27 Jun 2025 16:01:22 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added [v2] In-Reply-To: References: Message-ID: On Tue, 17 Jun 2025 17:03:48 GMT, Coleen Phillimore wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into 8358680-aot-cache-creation-fails-with-no-strings-should-have-been-added >> - @coleenp comment: change items_count() to items_count_acquire() >> - 8358680: AOT cache creation fails: no strings should have been added > > src/hotspot/share/classfile/stringTable.cpp line 351: > >> 349: } >> 350: >> 351: size_t StringTable::items_count() { > > I think there's a convention to make accessor functions that use acquire semantics to be named items_count_acquire(). I changed it to items_count_acquire(). > src/hotspot/share/classfile/stringTable.cpp line 970: > >> 968: // This flag will be cleared after intern table dumping has completed, so we can run the >> 969: // compiler again (for future AOT method compilation, etc). >> 970: DEBUG_ONLY(Atomic::release_store(&_disable_interning_during_cds_dump, 1)); > > I think atomics work with bool or is this a refcount ? I tried to change it to a `bool`, but Atomics doesn't like that. I got a compilation error.
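The naming convention and the integer flag discussed here can be illustrated with standard C++ atomics. This is a hedged sketch only: `std::atomic` stands in for HotSpot's `Atomic` class, `InternTable` and its members are hypothetical, and the safepoint protocol that the actual fix relies on is not reproduced. In the quoted patch the flag is debug-only; here it rejects insertions purely to make its effect observable.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical table; not HotSpot's StringTable.
class InternTable {
  std::atomic<std::size_t> _items_count{0};
  // An int rather than a bool, mirroring the flag type in the quoted code.
  std::atomic<int> _interning_disabled{0};

public:
  // Returns false if interning is currently disabled. The check and the
  // increment are not one atomic step; a real implementation would need
  // additional synchronization to close that window.
  bool add_item() {
    if (_interning_disabled.load(std::memory_order_acquire) != 0) {
      return false;
    }
    _items_count.fetch_add(1, std::memory_order_release);
    return true;
  }

  // The _acquire suffix tells callers that this load has acquire
  // semantics and pairs with the release in add_item().
  std::size_t items_count_acquire() const {
    return _items_count.load(std::memory_order_acquire);
  }

  void disable_interning() {
    _interning_disabled.store(1, std::memory_order_release);
  }

  void enable_interning() {
    _interning_disabled.store(0, std::memory_order_release);
  }
};
```

Putting the ordering in the accessor's name makes the memory-ordering contract visible at every call site, so a reader does not have to open the implementation to learn that a plain, possibly stale read is not what they are getting.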
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25816#discussion_r2172319083 PR Review Comment: https://git.openjdk.org/jdk/pull/25816#discussion_r2172318768 From sparasa at openjdk.org Fri Jun 27 16:34:11 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Fri, 27 Jun 2025 16:34:11 GMT Subject: RFR: 8360775: Fix Shenandoah GC test failures when APX is enabled Message-ID: <66o1iImVgzmTapY0AEGZeAg_VTj4ZbRc1MSFvgA8qYk=.ab11bc1a-3cb0-4832-82da-4e97ee8aaf9b@github.com> This PR fixes the test failures seen in many JTreg tests related to Shenandoah GC (`test/hotspot/jtreg/gc/shenandoah/`) with UseAPX. The issues were root caused to: 1. Higher band registers are not saved and restored in Shenandoah load_reference_barrier. 2. Pusha/Popa implementation using push2p/pop2p Both the issues are fixed in this PR. ------------- Commit messages: - 8360775: Fix Shenandoah GC test failures when APX is enabled Changes: https://git.openjdk.org/jdk/pull/26009/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26009&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360775 Stats: 109 lines in 2 files changed: 59 ins; 0 del; 50 mod Patch: https://git.openjdk.org/jdk/pull/26009.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26009/head:pull/26009 PR: https://git.openjdk.org/jdk/pull/26009 From coleenp at openjdk.org Fri Jun 27 16:55:39 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 27 Jun 2025 16:55:39 GMT Subject: RFR: 8284016: Normalize handshake closure names In-Reply-To: <3MZO_Y636hhLaFN_TEP5GIPHS3ZD3UVV14nzmWexwP0=.72af617d-aac2-46c9-b825-9b5704061a75@github.com> References: <3MZO_Y636hhLaFN_TEP5GIPHS3ZD3UVV14nzmWexwP0=.72af617d-aac2-46c9-b825-9b5704061a75@github.com> Message-ID: <7RFaowLX2qq3wCI-adTUMmf6QQt4AjF8ocLkwyNHVJg=.a4e02678-9114-4aef-a4a7-50508d60d736@github.com> On Fri, 27 Jun 2025 09:10:26 GMT, Anton Artemov wrote: > Hi, please consider the following changes: > > There are many classes inherited from the 
`HandshakeClosure` class, but they do not follow the same naming convention. In this PR we address this issue, all names are normalized in the following way: > > `XXXDummyClassNameClosure -> XXXDummyClassNameHandshakeClosure` > > or > > `XXXDummyClassNameHandshake -> XXXDummyClassNameHandshakeClosure` > > or > > `XXXStrangeClassName -> SomewhatSimilarNameHandshakeClosure` > > Tested in GHA and tiers 1 - 3. I found a couple of places to realign the parameters, but otherwise this looks good. I like the new naming conventions. We have a lot of handshakes now! Were you able to build shenandoah (not built by default, need to add --enable-jvm-feature-shenandoahgc to configure)? src/hotspot/share/prims/jvmtiEnvBase.hpp line 629: > 627: public: > 628: GetCurrentContendedMonitorHandshakeClosure(JvmtiEnv *env, > 629: JavaThread* calling_thread, Can you realign these parameters? src/hotspot/share/prims/jvmtiEnvBase.hpp line 650: > 648: public: > 649: GetStackTraceHandshakeClosure(JvmtiEnv *env, jint start_depth, jint max_count, > 650: jvmtiFrameInfo* frame_buffer, jint* count_ptr) realign parameters. src/hotspot/share/prims/jvmtiEnvBase.hpp line 760: > 758: public: > 759: GetSingleStackTraceHandshakeClosure(JvmtiEnv *env, JavaThread *calling_thread, > 760: jthread thread, jint max_frame_count) Also realign parameters. src/hotspot/share/prims/jvmtiEnvBase.hpp line 798: > 796: public: > 797: GetFrameLocationHandshakeClosure(JvmtiEnv *env, jint depth, > 798: jmethodID* method_ptr, jlocation* location_ptr) Also realign parameters. ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/26014#pullrequestreview-2967227483 PR Review Comment: https://git.openjdk.org/jdk/pull/26014#discussion_r2172435715 PR Review Comment: https://git.openjdk.org/jdk/pull/26014#discussion_r2172435267 PR Review Comment: https://git.openjdk.org/jdk/pull/26014#discussion_r2172434071 PR Review Comment: https://git.openjdk.org/jdk/pull/26014#discussion_r2172433356 From iwalulya at openjdk.org Fri Jun 27 16:56:17 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Fri, 27 Jun 2025 16:56:17 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v6] In-Reply-To: References: Message-ID: > Hi all, > > Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and Application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. > > The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio. > > - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly. > > - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range.
> > - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants. > > We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. > > Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. > > As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. 
> > Testing: Mach5 Tier 1-7 Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: Thomas Review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25832/files - new: https://git.openjdk.org/jdk/pull/25832/files/8781a113..1a566b0b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25832&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25832&range=04-05 Stats: 40 lines in 3 files changed: 5 ins; 12 del; 23 mod Patch: https://git.openjdk.org/jdk/pull/25832.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25832/head:pull/25832 PR: https://git.openjdk.org/jdk/pull/25832 From cjplummer at openjdk.org Fri Jun 27 17:21:44 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 27 Jun 2025 17:21:44 GMT Subject: RFR: 8360312: Serviceability Agent tests fail with JFR enabled due to unknown thread type JfrRecorderThread [v3] In-Reply-To: <0_4zYtZFNx5QA5h_4sQsF1dFV8Zr8dPZZHfKk-UuGRk=.acc9c10c-e720-4101-9e50-5a8edff6035b@github.com> References: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> <0_4zYtZFNx5QA5h_4sQsF1dFV8Zr8dPZZHfKk-UuGRk=.acc9c10c-e720-4101-9e50-5a8edff6035b@github.com> Message-ID: On Wed, 25 Jun 2025 17:31:47 GMT, Chris Plummer wrote: >> Update SA to know about JfrRecorderThread, which was made a JavaThread in JDK 25 by [JDK-8352251](https://bugs.openjdk.org/browse/JDK-8352251). >> >> I'm also fixing ClhsdbJstackWithConcurrentLock, which was also failing with JFR enabled, but for a different reason (specified heap size was too small). >> >> Testing (in progress): >> >> - [x] tier1 ci >> - [x] tier1 ci with -XX:StartFlightRecording >> - [x] tier5 ci > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > add missing space Thank you for the reviews Serguei, David, and Kevin! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25960#issuecomment-3013850087 From cjplummer at openjdk.org Fri Jun 27 17:21:44 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 27 Jun 2025 17:21:44 GMT Subject: Integrated: 8360312: Serviceability Agent tests fail with JFR enabled due to unknown thread type JfrRecorderThread In-Reply-To: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> References: <3XshRFVcxP5VLYu_LNt0qLkXWlxpJREbVreCHZis3Eo=.f826f8cb-8042-42b5-8dd8-e6d5c8f9be39@github.com> Message-ID: <-pEl3w5pxWYKayDEEwZL4NEVAcOnraU72Yh2X7oJoT4=.9d60ee34-688e-48d0-8e86-60afd89e6506@github.com> On Tue, 24 Jun 2025 21:15:06 GMT, Chris Plummer wrote: > Update SA to know about JfrRecorderThread, which was made a JavaThread in JDK 25 by [JDK-8352251](https://bugs.openjdk.org/browse/JDK-8352251). > > I'm also fixing ClhsdbJstackWithConcurrentLock, which was also failing with JFR enabled, but for a different reason (specified heap size was too small). > > Testing (in progress): > > - [x] tier1 ci > - [x] tier1 ci with -XX:StartFlightRecording > - [x] tier5 ci This pull request has now been integrated. 
Changeset: 712d866b Author: Chris Plummer URL: https://git.openjdk.org/jdk/commit/712d866b72b43c839c57c3303dfb215f94c0db3b Stats: 25 lines in 5 files changed: 13 ins; 9 del; 3 mod 8360312: Serviceability Agent tests fail with JFR enabled due to unknown thread type JfrRecorderThread Reviewed-by: sspitsyn, kevinw, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/25960 From cjplummer at openjdk.org Fri Jun 27 17:31:38 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 27 Jun 2025 17:31:38 GMT Subject: RFR: 8360664: Null pointer dereference in src/hotspot/share/prims/jvmtiTagMap.cpp in IterateOverHeapObjectClosure::do_object() [v2] In-Reply-To: References: Message-ID: <0UUsjnKpCkk0hAUgXZfHR4Vtl2r9-JqY2hAGyNXRz3I=.a05cac71-5193-497e-9f2f-e5b1c65f22ee@github.com> On Fri, 27 Jun 2025 12:06:21 GMT, Artem Semenov wrote: > > > At most I would add an assert, but generally my understanding is that the user of any closure has the responsibility of passing it valid input. > > > > > > Adding asserts sounds like a good suggestion. > > It seems to me that this won't be a big problem in this form. I've just moved the existing check higher up, where it will prevent dereferencing a null pointer. > > However, if you confirm that this is not acceptable, I will replace the check with assert. I think it is a matter of having the code accurately document the input requirements. Checking for null and returning makes it look like passing null is ok and might happen. That's not the case though. It should never happen and adding an assert properly documents this.
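A minimal sketch of the distinction drawn above, assuming hypothetical stand-in names (`Obj`, `CountTaggedClosure`, `count_tagged`); it is not the jvmtiTagMap.cpp code. The standard `assert` used here only approximates a debug-only check: it states the "never null" contract in the code rather than making null look like an accepted input.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a heap object; not a HotSpot type.
struct Obj { int tag; };

// Hypothetical closure. Contract: do_object() is never called with null.
struct CountTaggedClosure {
  int tagged = 0;

  void do_object(const Obj* o) {
    // Documents the input requirement instead of silently tolerating null.
    // The check is compiled out when NDEBUG is defined.
    assert(o != nullptr && "caller must pass a valid object");
    if (o->tag != 0) {
      tagged++;
    }
  }
};

// The user of the closure is responsible for passing valid input.
inline int count_tagged(const Obj* objs, std::size_t n) {
  CountTaggedClosure cl;
  for (std::size_t i = 0; i < n; i++) {
    cl.do_object(&objs[i]);
  }
  return cl.tagged;
}
```

An early `if (o == nullptr) return;` in `do_object` would behave the same for valid callers, but it would hide a caller bug that the assert makes visible in debug builds.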
------------- PR Comment: https://git.openjdk.org/jdk/pull/26002#issuecomment-3013875492 From mhaessig at openjdk.org Fri Jun 27 18:01:58 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Fri, 27 Jun 2025 18:01:58 GMT Subject: RFR: 8308094: Add a compilation timeout flag to catch long running compilations Message-ID: This PR adds `-XX:CompileTaskTimeout` on Linux to limit the amount of time a compilation task can run. The goal of this is initially to be able to find and investigate long-running compilations. The timeout is implemented using a POSIX timer that sends a `SIGALRM` to the compiler thread the compile task is running on. Each compiler thread registers a signal handler that triggers an assert upon receiving `SIGALRM`. This is currently only implemented for Linux, because it relies on `SIGEV_THREAD_ID` to get the signal delivered to the same thread that timed out. Since `SIGALRM` is now used, the test `runtime/signal/TestSigalrm.java` now requires `vm.flagless` so it will not interfere with the compiler thread signal handlers.
Testing: - [ ] Github Actions - [x] tier1, tier2 on all platforms - [x] tier3, tier4 and Oracle internal testing on Linux fastdebug - [x] tier1 through tier4 with `-XX:CompileTaskTimeout=60000` (one minute timeout) to see what fails (`compiler/codegen/TestAntiDependenciesHighMemUsage2.java`, `compiler/loopopts/TestMaxLoopOptsCountReached.java`, and `compiler/c2/TestScalarReplacementMaxLiveNodes.java` fail) ------------- Commit messages: - Fix SIGALRM test - Add timeout functionality to compiler threads Changes: https://git.openjdk.org/jdk/pull/26023/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26023&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308094 Stats: 141 lines in 5 files changed: 138 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/26023.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26023/head:pull/26023 PR: https://git.openjdk.org/jdk/pull/26023 From sspitsyn at openjdk.org Fri Jun 27 19:05:38 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 27 Jun 2025 19:05:38 GMT Subject: RFR: 8284016: Normalize handshake closure names In-Reply-To: <3MZO_Y636hhLaFN_TEP5GIPHS3ZD3UVV14nzmWexwP0=.72af617d-aac2-46c9-b825-9b5704061a75@github.com> References: <3MZO_Y636hhLaFN_TEP5GIPHS3ZD3UVV14nzmWexwP0=.72af617d-aac2-46c9-b825-9b5704061a75@github.com> Message-ID: On Fri, 27 Jun 2025 09:10:26 GMT, Anton Artemov wrote: > Hi, please consider the following changes: > > There are many classes inherited from the `HandshakeClosure` class, but they do not follow the same naming convention. In this PR we address this issue, all names are normalized in the following way: > > `XXXDummyClassNameClosure -> XXXDummyClassNameHandshakeClosure` > > or > > `XXXDummyClassNameHandshake -> XXXDummyClassNameHandshakeClosure` > > or > > `XXXStrangeClassName -> SomewhatSimilarNameHandshakeClosure` > > Tested in GHA and tiers 1 - 3. Changes requested by sspitsyn (Reviewer). 
src/hotspot/share/prims/jvmtiEnvBase.hpp line 511: > 509: }; > 510: > 511: class SetForceEarlyReturnHandshakeClosure : public JvmtiUnitedHandshakeClosure { I do not support this unification over JVMTI files. This make `HandshakeClosure` class names too long. The JVMTI has a consistent local naming convention to have the suffix `Closure` at the end instead of `HandshakeClosure`. And it is fine because normally there are no other kind of closures in JVMTI code. ------------- PR Review: https://git.openjdk.org/jdk/pull/26014#pullrequestreview-2967650296 PR Review Comment: https://git.openjdk.org/jdk/pull/26014#discussion_r2172691167 From dcubed at openjdk.org Fri Jun 27 19:50:39 2025 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 27 Jun 2025 19:50:39 GMT Subject: RFR: 8284016: Normalize handshake closure names In-Reply-To: References: <3MZO_Y636hhLaFN_TEP5GIPHS3ZD3UVV14nzmWexwP0=.72af617d-aac2-46c9-b825-9b5704061a75@github.com> Message-ID: On Fri, 27 Jun 2025 19:02:46 GMT, Serguei Spitsyn wrote: >> Hi, please consider the following changes: >> >> There are many classes inherited from the `HandshakeClosure` class, but they do not follow the same naming convention. In this PR we address this issue, all names are normalized in the following way: >> >> `XXXDummyClassNameClosure -> XXXDummyClassNameHandshakeClosure` >> >> or >> >> `XXXDummyClassNameHandshake -> XXXDummyClassNameHandshakeClosure` >> >> or >> >> `XXXStrangeClassName -> SomewhatSimilarNameHandshakeClosure` >> >> Tested in GHA and tiers 1 - 3. > > src/hotspot/share/prims/jvmtiEnvBase.hpp line 511: > >> 509: }; >> 510: >> 511: class SetForceEarlyReturnHandshakeClosure : public JvmtiUnitedHandshakeClosure { > > I do not support this unification over JVMTI files. This make `HandshakeClosure` class names too long. > The JVMTI has a consistent local naming convention to have the suffix `Closure` at the end instead of `HandshakeClosure`. 
And it is fine because normally there are no other kind of closures in JVMTI code. Aren't there closures in the JVM/TI tag processing code? I could be remembering wrong... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26014#discussion_r2172762775 From amenkov at openjdk.org Fri Jun 27 20:24:39 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 27 Jun 2025 20:24:39 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> <_qD_bLdpHkOQRTWkFV43-L_IGMMgOfosOn7zxb7I7gM=.e2afadc1-1ce1-45da-a695-1920686f8f5f@github.com> Message-ID: On Thu, 26 Jun 2025 00:38:17 GMT, David Holmes wrote: > > Line number info puts it in the _java_thread == null branch of: threadService.cpp > > 1317 vframeStream vfst(_java_thread != nullptr > > 1318 ? vframeStream(_java_thread, false, true, vthread_carrier) > > 1319 : vframeStream(java_lang_VirtualThread::continuation(_thread_h()))); <--- > > And it's looking inside the Handle _thread_h() within GetThreadSnapshotClosure which was setup by get_thread_snapshot, and it's a null pointer, > > But `_thread_h()` has already been used a number of times before we get here and if it were null we should have crashed long ago. ??? I believe null here is not the result of `_thread_h()`, but is returned by `java_lang_VirtualThread::continuation(...)` because `_thread_h` is a java.lang.Thread object and not a java.lang.VirtualThread.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25958#issuecomment-3014271659 From amenkov at openjdk.org Fri Jun 27 20:32:42 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 27 Jun 2025 20:32:42 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: On Fri, 27 Jun 2025 09:35:07 GMT, Kevin Walls wrote: > But null from JNIHandles::resolve(jthread) is the earliest problem I found. > > I'm redoing with the cv_internal_thread_to_JavaThread usage... > > A little concerned that ThreadsListHandle::cv_internal_thread_to_JavaThread takes jobject jthread, our ref to a java.lang.Thread, and uses also calls 811 oop thread_oop = JNIHandles::resolve_non_null(jthread); JNIHandles::resolve(jthread) can return null only if jthread == nullptr, this should not be possible ------------- PR Comment: https://git.openjdk.org/jdk/pull/25958#issuecomment-3014285704 From dlong at openjdk.org Fri Jun 27 21:22:57 2025 From: dlong at openjdk.org (Dean Long) Date: Fri, 27 Jun 2025 21:22:57 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v9] In-Reply-To: <2hLKCRKzNs19ZW_ntM7yJ2ynW0Hj7SwBrN9hlcOSxTM=.555bb43a-8fb4-4157-9cdb-a18b28178932@github.com> References: <2hLKCRKzNs19ZW_ntM7yJ2ynW0Hj7SwBrN9hlcOSxTM=.555bb43a-8fb4-4157-9cdb-a18b28178932@github.com> Message-ID: On Tue, 24 Jun 2025 16:13:26 GMT, Dean Long wrote: >> Just FYI: My local tier1-3 test on linux-riscv64 is good. And I didn't witness an obvious change on specjbb performance with g1gc. > > Thanks @RealFYang. > @dean-long: Are you planning to do a jdk25 backport? We still see the crashes, there. I was going to let it bake in jdk26 for a while before deciding. It seems a bit risky to me. I am leaning towards not backporting it to Oracle JDK. 
For OpenJDK 25, it might make more sense to do a PPC-specific fix like adding a NOP at the verified entry point. What do you think? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-3014307981 From mdoerr at openjdk.org Fri Jun 27 21:22:57 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 27 Jun 2025 21:22:57 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v9] In-Reply-To: References: <2hLKCRKzNs19ZW_ntM7yJ2ynW0Hj7SwBrN9hlcOSxTM=.555bb43a-8fb4-4157-9cdb-a18b28178932@github.com> Message-ID: <8NphJ347zBj0q5YG-K0Mq_vh9LPfhM_Jo-mcH94re3o=.0830a87e-7d9c-4907-86b2-c13a29893c7d@github.com> On Fri, 27 Jun 2025 20:43:30 GMT, Dean Long wrote: > > @dean-long: Are you planning to do a jdk25 backport? We still see the crashes, there. > > I was going to let it bake in jdk26 for a while before deciding. It seems a bit risky to me. I am leaning towards not backporting it to Oracle JDK. For OpenJDK 25, it might make more sense to do a PPC-specific fix like adding a NOP at the verified entry point. What do you think? I'm not convinced that only PPC64 is affected. [JDK-8258229](https://bugs.openjdk.org/browse/JDK-8258229) looks wrong for all platforms except x86 and may be even problematic on that platform as you had mentioned (due to NMethodState_lock). Would it make sense to backout JDK-8258229 in jdk25 and live with it? That issue doesn't look so critical. Or maybe guard the code with #ifdef x86? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-3014360978 From sparasa at openjdk.org Fri Jun 27 22:17:49 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Fri, 27 Jun 2025 22:17:49 GMT Subject: RFR: 8360776: Disable Intel APX by default and enable it only if requested by the user using -XX:+UnlockExperimentalVMOptions -XX:+UseAPX Message-ID: <0FxXpxiS5AsdMzFmEkhVGhWBPyB9H2n7N74iAoAmXdg=.5aca9c53-24d7-42ff-b1ed-072dd039ae1b@github.com> Currently, APX is not enabled consistently between product and debug builds. If the hardware supports Intel APX: 1) In product builds, APX is disabled by default, even if the user explicitly enables it using `-XX:+UnlockExperimentalVMOptions -XX:+UseAPX`. 2) In debug builds, APX is enabled by default regardless of whether the user explicitly enables it or not. **The goal of this PR is to enable APX for both product and debug builds if and only if the user explicitly enables it using `-XX:+UnlockExperimentalVMOptions -XX:+UseAPX`.** ------------- Commit messages: - 8360776: Disable Intel APX by default and enable it only if requested by the user using -XX:+UnlockExperimentalVMOptions -XX:+UseAPX Changes: https://git.openjdk.org/jdk/pull/26029/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26029&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360776 Stats: 14 lines in 4 files changed: 0 ins; 12 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/26029.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26029/head:pull/26029 PR: https://git.openjdk.org/jdk/pull/26029 From jbhateja at openjdk.org Sat Jun 28 08:05:42 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sat, 28 Jun 2025 08:05:42 GMT Subject: RFR: 8360776: Disable Intel APX by default and enable it only if requested by the user using -XX:+UnlockExperimentalVMOptions -XX:+UseAPX In-Reply-To: <0FxXpxiS5AsdMzFmEkhVGhWBPyB9H2n7N74iAoAmXdg=.5aca9c53-24d7-42ff-b1ed-072dd039ae1b@github.com> References: 
<0FxXpxiS5AsdMzFmEkhVGhWBPyB9H2n7N74iAoAmXdg=.5aca9c53-24d7-42ff-b1ed-072dd039ae1b@github.com> Message-ID: On Fri, 27 Jun 2025 22:13:47 GMT, Srinivas Vamsi Parasa wrote: > Currently, APX is not enabled consistently between product and debug builds. > > If the hardware supports Intel APX: > > 1) In product builds, APX is disabled by default, even if the user explicitly enables it using `-XX:+UnlockExperimentalVMOptions -XX:+UseAPX`. > > 2) In debug builds, APX is enabled by default regardless of whether the user explicitly enables it or not. > > **The goal of this PR is to enable APX for both product and debug builds if and only if the user explicitly enables it using `-XX:+UnlockExperimentalVMOptions -XX:+UseAPX`.** Please change the title as "8360776: Enable -XX+UseAPX as Experiminatal feature in all builds" ------------- PR Comment: https://git.openjdk.org/jdk/pull/26029#issuecomment-3015086119 From jbhateja at openjdk.org Sat Jun 28 08:09:38 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sat, 28 Jun 2025 08:09:38 GMT Subject: RFR: 8360776: Disable Intel APX by default and enable it only if requested by the user using -XX:+UnlockExperimentalVMOptions -XX:+UseAPX In-Reply-To: <0FxXpxiS5AsdMzFmEkhVGhWBPyB9H2n7N74iAoAmXdg=.5aca9c53-24d7-42ff-b1ed-072dd039ae1b@github.com> References: <0FxXpxiS5AsdMzFmEkhVGhWBPyB9H2n7N74iAoAmXdg=.5aca9c53-24d7-42ff-b1ed-072dd039ae1b@github.com> Message-ID: On Fri, 27 Jun 2025 22:13:47 GMT, Srinivas Vamsi Parasa wrote: > Currently, APX is not enabled consistently between product and debug builds. > > If the hardware supports Intel APX: > > 1) In product builds, APX is disabled by default, even if the user explicitly enables it using `-XX:+UnlockExperimentalVMOptions -XX:+UseAPX`. > > 2) In debug builds, APX is enabled by default regardless of whether the user explicitly enables it or not. 
> > **The goal of this PR is to enable APX for both product and debug builds if and only if the user explicitly enables it using `-XX:+UnlockExperimentalVMOptions -XX:+UseAPX`.** src/hotspot/os_cpu/bsd_x86/os_bsd_x86.cpp line 432: > 430: } > 431: > 432: #if defined(_LP64) Is it still required after the removal of the 32-bit port of x86? src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp line 258: > 256: } > 257: > 258: #if defined(_LP64) Do we still need this after removal of 32-bit port of x86 ? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26029#discussion_r2173167666 PR Review Comment: https://git.openjdk.org/jdk/pull/26029#discussion_r2173167885 From jbhateja at openjdk.org Sat Jun 28 08:38:42 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sat, 28 Jun 2025 08:38:42 GMT Subject: RFR: 8360776: Disable Intel APX by default and enable it only if requested by the user using -XX:+UnlockExperimentalVMOptions -XX:+UseAPX In-Reply-To: <0FxXpxiS5AsdMzFmEkhVGhWBPyB9H2n7N74iAoAmXdg=.5aca9c53-24d7-42ff-b1ed-072dd039ae1b@github.com> References: <0FxXpxiS5AsdMzFmEkhVGhWBPyB9H2n7N74iAoAmXdg=.5aca9c53-24d7-42ff-b1ed-072dd039ae1b@github.com> Message-ID: On Fri, 27 Jun 2025 22:13:47 GMT, Srinivas Vamsi Parasa wrote: > Currently, APX is not enabled consistently between product and debug builds. > > If the hardware supports Intel APX: > > 1) In product builds, APX is disabled by default, even if the user explicitly enables it using `-XX:+UnlockExperimentalVMOptions -XX:+UseAPX`. > > 2) In debug builds, APX is enabled by default regardless of whether the user explicitly enables it or not. 
> > **The goal of this PR is to enable APX for both product and debug builds if and only if the user explicitly enables it using `-XX:+UnlockExperimentalVMOptions -XX:+UseAPX`.** Verified patch with the following configurations, we now enable APX only on APX-capable targets with -XX:+UnlockExperimentalVMOptions CPROMPT>sde64 -dmr -ptr_raise -- java -XX:+PrintFlagsFinal -XX:+UnlockExperimentalVMOptions -XX:-UseAPX --version | grep UseAPX bool UseAPX = false {ARCH experimental} {command line} CPROMPT>sde64 -dmr -ptr_raise -- java -XX:+PrintFlagsFinal -XX:+UnlockExperimentalVMOptions -XX:+UseAPX --version | grep UseAPX bool UseAPX = true {ARCH experimental} {command line} CPROMPT>sde64 -dmr -ptr_raise -- java -XX:+PrintFlagsFinal -XX:+UnlockExperimentalVMOptions --version | grep UseAPX bool UseAPX = false {ARCH experimental} {default} CPROMPT>sde64 -gnr -ptr_raise -- java -XX:+PrintFlagsFinal -XX:+UnlockExperimentalVMOptions -XX:+UseAPX --version | grep UseAPX OpenJDK 64-Bit Server VM warning: UseAPX is not supported on this CPU, setting it to false bool UseAPX = false {ARCH experimental} {command line} CPROMPT>sde64 -gnr -ptr_raise -- java -XX:+PrintFlagsFinal -XX:+UnlockExperimentalVMOptions -XX:-UseAPX --version | grep UseAPX bool UseAPX = false {ARCH experimental} {command line} ------------- PR Comment: https://git.openjdk.org/jdk/pull/26029#issuecomment-3015108073 From lmesnik at openjdk.org Sun Jun 29 17:42:50 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Sun, 29 Jun 2025 17:42:50 GMT Subject: RFR: 8359366: RunThese30M.java EXCEPTION_ACCESS_VIOLATION in JvmtiBreakpoints::clearall_in_class_at_safepoint Message-ID: The segv/eav happens in the case if JvmtiBreakpoint::_method's class redefined old between getting the Method* from jmethodid in the JvmtiEnv::SetBreakpoint(Method* method, jlocation location) {..} and and actual setting breakpoint in the VM operation VM_ChangeBreakpoints. Here are details: The breakpoint is set in 2 steps. 
1) The method jvmti_SetBreakpoint(jvmtiEnv* env, jmethodID method, jlocation location) converts the jmethodID to a Method* and calls JvmtiEnv::SetBreakpoint(Method* method, jlocation location), where JvmtiBreakpoint bp(method, location); is created with this Method*. Note: this is done while the thread is in VM state, so the Method can't become is_old while this is done. 2) A VM operation is used to add the breakpoint to the list: VM_ChangeBreakpoints set_breakpoint(VM_ChangeBreakpoints::SET_BREAKPOINT, &bp); VMThread::execute(&set_breakpoint); This calls JvmtiBreakpoints::set_at_safepoint(), which can modify the JvmtiBreakpoints list and set the breakpoint at a safepoint without synchronization. So it is possible that a class redefinition operation (VM_RedefineClasses) that redefines the class with this breakpoint happens between steps 1) and 2). VM_RedefineClasses::redefine_single_class() clears all class-related breakpoints in the JvmtiBreakpoints; however, the "problematic" breakpoint is still in the VMThread queue, and thus we still continue with this operation. So in step 2) the JvmtiBreakpoint with the 'is_old' method is added to the JvmtiBreakpoints and the breakpoint is set. Old methods might then be purged at any time once they are no longer on the stack, and any access to this breakpoint could lead to use of a Method* _method pointing to deallocated metaspace. VM_RedefineClasses clears all breakpoints, so it is correct to simply not proceed with the current breakpoint either. This looks very unlikely, but it reproduces with a stress test after some time. Verified that the crash no longer reproduces with the corresponding test after the fix.
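The shape of the fix described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the jvmtiImpl.cpp change: `Method`, `MethodId`, `MethodTable`, `BreakpointRequest` and `Breakpoints` are hypothetical stand-ins for `Method*`, `jmethodID`, jmethodID resolution, the queued VM operation and the breakpoint list. The idea is to capture a stable id together with the raw pointer, re-resolve the id when the deferred operation runs, and drop the request if the two no longer agree.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins; none of these are HotSpot types.
struct Method { int version; };
using MethodId = int;

struct MethodTable {
  std::unordered_map<MethodId, Method*> current;

  Method* resolve(MethodId id) const {
    auto it = current.find(id);
    return it == current.end() ? nullptr : it->second;
  }

  // Models redefinition: the stable id now refers to the new method version.
  void redefine(MethodId id, Method* new_version) { current[id] = new_version; }
};

struct BreakpointRequest {
  Method*  method;        // raw pointer captured when the request was created
  MethodId preserved_id;  // stable id captured at the same time
};

struct Breakpoints {
  std::vector<Method*> installed;

  // Runs later, when the queued operation is executed.
  // Returns false when the request has become stale.
  bool apply(const MethodTable& table, const BreakpointRequest& req) {
    if (table.resolve(req.preserved_id) != req.method) {
      // The class was redefined in between; installing would keep a
      // pointer to an old method that may be purged later.
      return false;
    }
    installed.push_back(req.method);
    return true;
  }
};
```

In this model a request created before a redefinition and applied after it is silently dropped, which matches the reasoning that redefinition clears the class's breakpoints anyway.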
------------- Commit messages: - fix Changes: https://git.openjdk.org/jdk/pull/26031/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26031&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8359366 Stats: 10 lines in 2 files changed: 8 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/26031.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26031/head:pull/26031 PR: https://git.openjdk.org/jdk/pull/26031 From dholmes at openjdk.org Mon Jun 30 01:41:55 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 30 Jun 2025 01:41:55 GMT Subject: RFR: 8357601: Checked version of JNI ReleaseArrayElements needs to filter out known wrapped arrays Message-ID: The checked version of `Get`/`ReleaseArrayElements` uses `GuardedMemory` to perform error checking. When releasing the array the code needs to check for the known array tags from the other JNI APIs and report an error. We also expand `GuardedMemory` to allow for a second tag word so that we can discriminate additional allocation sites i.e. identifying use of `Get`/`SetPrimitiveArrayCritical`. And add further robustness to guard verification by using `SafeFetch`. 
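To make the tag check concrete, here is a rough, self-contained sketch of a guarded allocation whose header carries two tag words. It is an assumption-laden model, not HotSpot's `GuardedMemory`: the layout, `kGuardPattern`, `guarded_alloc`, `guarded_release` and `ReleaseResult` are all invented for illustration, and the `SafeFetch` hardening mentioned above is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical layout: [GuardHeader][user bytes][tail pattern].
struct GuardHeader {
  std::uint32_t pattern;    // head guard
  std::size_t   user_size;  // number of user bytes that follow
  const void*   tag;        // identifies the allocating API
  const void*   tag2;       // optional second discriminator
};

constexpr std::uint32_t kGuardPattern = 0xABABABABu;  // arbitrary value for this sketch

inline void* guarded_alloc(std::size_t size, const void* tag,
                           const void* tag2 = nullptr) {
  auto* raw = static_cast<unsigned char*>(
      std::malloc(sizeof(GuardHeader) + size + sizeof(kGuardPattern)));
  if (raw == nullptr) return nullptr;
  GuardHeader h{kGuardPattern, size, tag, tag2};
  std::memcpy(raw, &h, sizeof(h));
  std::memcpy(raw + sizeof(h) + size, &kGuardPattern, sizeof(kGuardPattern));
  return raw + sizeof(h);
}

enum class ReleaseResult { Ok, WrongTag, Corrupted };

// Verifies both guards, then the tag. The block is freed only on success so
// that a caller can still report on a mismatched or corrupted block.
inline ReleaseResult guarded_release(void* user_ptr, const void* expected_tag) {
  auto* raw = static_cast<unsigned char*>(user_ptr) - sizeof(GuardHeader);
  GuardHeader h;
  std::memcpy(&h, raw, sizeof(h));
  if (h.pattern != kGuardPattern) return ReleaseResult::Corrupted;
  std::uint32_t tail;
  std::memcpy(&tail, raw + sizeof(h) + h.user_size, sizeof(tail));
  if (tail != kGuardPattern) return ReleaseResult::Corrupted;
  if (h.tag != expected_tag) return ReleaseResult::WrongTag;
  std::free(raw);
  return ReleaseResult::Ok;
}
```

In this model, releasing a block through a function that expects a different tag is reported as `WrongTag` rather than freed, which is the kind of mismatch the checked release path is meant to catch.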
------------- Commit messages: - newlines at end-of-file - Merge branch 'master' into 8357601-jni - interim - fixed indent - Missing header include on Windows slowdebug - Improve robustness of guard verification by using SafeFetch - Merge branch 'master' into 8357601-jni - 8357601: Checked version of JNI ReleaseArrayElements needs to filter out known wrapped arrays Changes: https://git.openjdk.org/jdk/pull/25444/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25444&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8357601 Stats: 343 lines in 6 files changed: 321 ins; 6 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/25444.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25444/head:pull/25444 PR: https://git.openjdk.org/jdk/pull/25444 From coleenp at openjdk.org Mon Jun 30 01:41:55 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 30 Jun 2025 01:41:55 GMT Subject: RFR: 8357601: Checked version of JNI ReleaseArrayElements needs to filter out known wrapped arrays In-Reply-To: References: Message-ID: <7yKgOu_xQq36unKaBPVvMwJSwosu2nYN3DUwgFQtxqI=.6d563f83-b139-41ce-8dce-9d727132e82f@github.com> On Mon, 26 May 2025 08:56:09 GMT, David Holmes wrote: > The checked version of `Get`/`ReleaseArrayElements` uses `GuardedMemory` to perform error checking. When releasing the array the code needs to check for the known array tags from the other JNI APIs and report an error. > > We also expand `GuardedMemory` to allow for a second tag word so that we can discriminate additional allocation sites i.e. identifying use of `Get`/`SetPrimitiveArrayCritical`. And add further robustness to guard verification by using `SafeFetch`. So tag is STRING_TAG and STRING_UTF_TAG and the purpose of tag2 is CRITICAL_TAG? Maybe just call it critical_tag()? src/hotspot/share/memory/guardedMemory.hpp line 249: > 247: void* get_tag() const { return get_head_guard()->get_tag(); } > 248: > 249: /** Extra whitespace. Why these blocky comments? 
That say the same thing twice in 5 lines. src/hotspot/share/prims/jniCheck.cpp line 357: > 355: > 356: // Arbitrary (but well-known) tag for GetStringUTFChars > 357: const void* STRING_UTF_TAG = (void*) 0x48124812; Why is this well-known? This ending in 12 could be an address, do you not want to make this a possible address? ------------- PR Review: https://git.openjdk.org/jdk/pull/25444#pullrequestreview-2872732848 PR Review Comment: https://git.openjdk.org/jdk/pull/25444#discussion_r2110303265 PR Review Comment: https://git.openjdk.org/jdk/pull/25444#discussion_r2110306715 From dholmes at openjdk.org Mon Jun 30 01:41:55 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 30 Jun 2025 01:41:55 GMT Subject: RFR: 8357601: Checked version of JNI ReleaseArrayElements needs to filter out known wrapped arrays In-Reply-To: <7yKgOu_xQq36unKaBPVvMwJSwosu2nYN3DUwgFQtxqI=.6d563f83-b139-41ce-8dce-9d727132e82f@github.com> References: <7yKgOu_xQq36unKaBPVvMwJSwosu2nYN3DUwgFQtxqI=.6d563f83-b139-41ce-8dce-9d727132e82f@github.com> Message-ID: On Tue, 27 May 2025 21:41:15 GMT, Coleen Phillimore wrote: > So tag is STRING_TAG and STRING_UTF_TAG and the purpose of tag2 is CRITICAL_TAG? Maybe just call it critical_tag()? Both `tag` and `tag2` are general purpose from the perspective of the `GuardedMemory` class. It is the user of `GuardedMemory` that assigns a meaning to them. > src/hotspot/share/memory/guardedMemory.hpp line 249: > >> 247: void* get_tag() const { return get_head_guard()->get_tag(); } >> 248: >> 249: /** > > Extra whitespace. Why these blocky comments? That say the same thing twice in 5 lines. I am just copying the existing style in this code. It is based on old javadoc style for Java code. I have fixed the indentation error. Thanks > src/hotspot/share/prims/jniCheck.cpp line 357: > >> 355: >> 356: // Arbitrary (but well-known) tag for GetStringUTFChars >> 357: const void* STRING_UTF_TAG = (void*) 0x48124812; > > Why is this well-known? 
This ending in 12 could be an address, do you not want to make this a possible address? These are "well-known" by this code. Note I just moved these definitions, I did not invent them. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25444#issuecomment-2918183046 PR Review Comment: https://git.openjdk.org/jdk/pull/25444#discussion_r2113118037 PR Review Comment: https://git.openjdk.org/jdk/pull/25444#discussion_r2113115923 From dholmes at openjdk.org Mon Jun 30 05:00:39 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 30 Jun 2025 05:00:39 GMT Subject: RFR: 8359366: RunThese30M.java EXCEPTION_ACCESS_VIOLATION in JvmtiBreakpoints::clearall_in_class_at_safepoint In-Reply-To: References: Message-ID: On Sat, 28 Jun 2025 05:02:56 GMT, Leonid Mesnik wrote: > The segv/eav happens in the case if JvmtiBreakpoint::_method's class redefined old between getting the Method* from jmethodid in the > JvmtiEnv::SetBreakpoint(Method* method, jlocation location) {..} and > and actual setting breakpoint in the VM operation VM_ChangeBreakpoints. > > Here are details: > The breakpoint is set in 2 steps. > 1) method jvmti_SetBreakpoint(jvmtiEnv* env, jmethodID method, jlocation location) convert jmethodID to Method* and call > JvmtiEnv::SetBreakpoint(Method* method, jlocation location) > where > JvmtiBreakpoint bp(method, location); > is created with this Method* > Note: it is done while thread is in VM state, so Method can't become is_old while this is done. > > 2) The VMOp is used to add breakpoint into the list > VM_ChangeBreakpoints set_breakpoint(VM_ChangeBreakpoints::SET_BREAKPOINT, &bp); > VMThread::execute(&set_breakpoint); > to call JvmtiBreakpoints::set_at_safepoint() > that can modify JvmtiBreakpoints list and set breakpoint in safepoint without synchronization. 
> > So it might be possible that class redefinition VM_RedefineClasses operation that redefine the class with this breakpoint happens between steps 1) and 2) > VM_RedefineClasses::redefine_single_class() > clear all class-related breakpoints in the JvmtiBreakpoints, however the "problematic" breakpoint is in VMThread queue and thus we are still continue to do this operation. > So in the step 2) the the JvmtiBreakpoint with 'is_old' method is added to the JvmtiBreakpoints and breakpoint is set. > > Then old method mights be purged any time once they are not on the stack and any access to this breakpoint could lead to usage of Metthod* _method pointing to deallocated metaspace. > > The VM_RedefineClasses clear all breakpoints so it is correct just to don't proceed with current breakpoint also. > > Looks, like very unlikely but reproducing with stress test after some time. > Verified that the crash is not reproduced anymore with corresponding test after the fix. Approach seems reasonable but it is worrisome that we still have these kinds of issues with class redefinition! And why has this suddenly appeared? Did a recent code change introduce this bug? There are a few typos in the change. 
src/hotspot/share/prims/jvmtiImpl.cpp line 188: > 186: > 187: void VM_ChangeBreakpoints::doit() { > 188: if (_bp->method() != Method::resolve_jmethod_id(_preservred_method)) { Suggestion: if (_bp->method() != Method::resolve_jmethod_id(_preserved_method)) { src/hotspot/share/prims/jvmtiImpl.cpp line 189: > 187: void VM_ChangeBreakpoints::doit() { > 188: if (_bp->method() != Method::resolve_jmethod_id(_preservred_method)) { > 189: // the jmethod_id's method was updated if class redefintion happened for this class Suggestion: // the jmethod_id's method was updated if class redefinition happened for this class src/hotspot/share/prims/jvmtiImpl.cpp line 191: > 189: // the jmethod_id's method was updated if class redefintion happened for this class > 190: // after JvmtBreakpoint was created but before JVM_ChangeBreakpoints started > 191: // all class breakpoints are cleared during redefinition so don't set/clear this breakpoint So basically one thread is trying to change this particular BP and it races with another thread that performs redefinition. If the redefinition thread wins then we are turning this current change into a no-op on the basis that the redefinition cleared all BPs anyway so we should not now set this one (if that was requested). It is very unclear to me how the thread that requested the current change might respond to that request being ignored. Please add some punctuation to the comment block as it is very hard to read at present. 
Thanks src/hotspot/share/prims/jvmtiImpl.hpp line 144: > 142: int _operation; > 143: JvmtiBreakpoint* _bp; > 144: jmethodID _preservred_method; //needed to track class redefintion Suggestion: jmethodID _preserved_method; //needed to track class redefinition src/hotspot/share/prims/jvmtiImpl.hpp line 154: > 152: _operation = operation; > 153: assert(bp != nullptr, "bp != null"); > 154: _preservred_method = bp->method()->jmethod_id(); Suggestion: _preserved_method = bp->method()->jmethod_id(); ------------- Changes requested by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26031#pullrequestreview-2969873307 PR Review Comment: https://git.openjdk.org/jdk/pull/26031#discussion_r2174205095 PR Review Comment: https://git.openjdk.org/jdk/pull/26031#discussion_r2174206113 PR Review Comment: https://git.openjdk.org/jdk/pull/26031#discussion_r2174209649 PR Review Comment: https://git.openjdk.org/jdk/pull/26031#discussion_r2174201993 PR Review Comment: https://git.openjdk.org/jdk/pull/26031#discussion_r2174202853 From thartmann at openjdk.org Mon Jun 30 05:25:39 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 30 Jun 2025 05:25:39 GMT Subject: RFR: 8360867: CTW: Disable inline cache verification In-Reply-To: <9mywTtljvOZuQZWkTaHipBEeiC_3EMxihOd790Uq79o=.de521785-183c-44f4-a799-9f695922d9e3@github.com> References: <9mywTtljvOZuQZWkTaHipBEeiC_3EMxihOd790Uq79o=.de521785-183c-44f4-a799-9f695922d9e3@github.com> Message-ID: <6mNRPKtIp7wFgE7ietaDa5S0dy3cWKh9u0jlaJdL6NM=.821fb76a-87f6-47fd-8c67-e72af8879c03@github.com> On Fri, 27 Jun 2025 10:30:30 GMT, Aleksey Shipilev wrote: > In CTW profiling, I noticed we spend a lot of time doing inline cache verification when nmethods are unloaded. Due to the nature of CTW, we unload _a lot_ of nmethods. 
Since the goal for CTW is to stress the compilers themselves, not inline caches in particular (I assume those are blank even, given almost no real code is executed), it makes sense to disable that verification for CTW. > > A taste of performance improvement, about 2%: > > > $ time CONF=linux-x86_64-server-fastdebug make test TEST=applications/ctw/modules > > # Current > real 5m1.616s > user 79m41.398s > sys 14m39.607s > > # No verify inline caches > real 4m52.239s > user 77m41.886s > sys 14m25.352s > > > Additional testing: > - [x] Linux x86_64 server {fastdebug,release}, `applications/ctw/modules` Looks good to me too. ------------- Marked as reviewed by thartmann (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26016#pullrequestreview-2969934502 From jbhateja at openjdk.org Mon Jun 30 05:34:43 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 30 Jun 2025 05:34:43 GMT Subject: RFR: 8360116: Add support for AVX10 floating point minmax instruction [v2] In-Reply-To: References: Message-ID: On Wed, 25 Jun 2025 15:43:14 GMT, Manuel Hässig wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Update comments > > Thank you for implementing these new instructions! I had a look at your changes and have a few minor suggestions and questions. I am quite new to this part of the codebase, so feel free to disagree if I am way off base. > > How did you test these changes? > > Also, if you merge the current master branch, the Windows build failures in the Github Actions will be fixed. Hi @mhaessig , your comments have been addressed.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25914#issuecomment-3017856854 From lmesnik at openjdk.org Mon Jun 30 05:56:41 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Mon, 30 Jun 2025 05:56:41 GMT Subject: RFR: 8359366: RunThese30M.java EXCEPTION_ACCESS_VIOLATION in JvmtiBreakpoints::clearall_in_class_at_safepoint In-Reply-To: References: Message-ID: <_rEYQYLuKwA8su1Kc9Vi-m1kBUVieilE808P1RfkOj8=.7566a310-66d8-4db9-b4cf-7b6e7d742c56@github.com> On Mon, 30 Jun 2025 04:58:32 GMT, David Holmes wrote: > Approach seems reasonable but it is worrisome that we still have these kinds of issues with class redefinition! And why has this suddenly appeared? Did a recent code change introduce this bug? There are a few things to note here: 1) I recently updated the RunThese test to add more testing in this area. 2) Coleen, who helped me with class redefinition (many thanks to her!!!), pointed to a change from 2021 related to this problem: https://github.com/coleenp/jdk/commit/a05e8e24224b047584c3a273fa7b4fef66798dd6 However, that change did not introduce the problem; we would have had the same issue before that fix. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26031#issuecomment-3017892712 From lmesnik at openjdk.org Mon Jun 30 06:05:39 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Mon, 30 Jun 2025 06:05:39 GMT Subject: RFR: 8359366: RunThese30M.java EXCEPTION_ACCESS_VIOLATION in JvmtiBreakpoints::clearall_in_class_at_safepoint In-Reply-To: References: Message-ID: On Mon, 30 Jun 2025 04:42:59 GMT, David Holmes wrote: >> The segv/eav happens in the case if JvmtiBreakpoint::_method's class redefined old between getting the Method* from jmethodid in the >> JvmtiEnv::SetBreakpoint(Method* method, jlocation location) {..} and >> and actual setting breakpoint in the VM operation VM_ChangeBreakpoints. >> >> Here are details: >> The breakpoint is set in 2 steps.
>> 1) method jvmti_SetBreakpoint(jvmtiEnv* env, jmethodID method, jlocation location) convert jmethodID to Method* and call >> JvmtiEnv::SetBreakpoint(Method* method, jlocation location) >> where >> JvmtiBreakpoint bp(method, location); >> is created with this Method* >> Note: it is done while thread is in VM state, so Method can't become is_old while this is done. >> >> 2) The VMOp is used to add breakpoint into the list >> VM_ChangeBreakpoints set_breakpoint(VM_ChangeBreakpoints::SET_BREAKPOINT, &bp); >> VMThread::execute(&set_breakpoint); >> to call JvmtiBreakpoints::set_at_safepoint() >> that can modify JvmtiBreakpoints list and set breakpoint in safepoint without synchronization. >> >> So it might be possible that class redefinition VM_RedefineClasses operation that redefine the class with this breakpoint happens between steps 1) and 2) >> VM_RedefineClasses::redefine_single_class() >> clear all class-related breakpoints in the JvmtiBreakpoints, however the "problematic" breakpoint is in VMThread queue and thus we are still continue to do this operation. >> So in the step 2) the the JvmtiBreakpoint with 'is_old' method is added to the JvmtiBreakpoints and breakpoint is set. >> >> Then old method mights be purged any time once they are not on the stack and any access to this breakpoint could lead to usage of Metthod* _method pointing to deallocated metaspace. >> >> The VM_RedefineClasses clear all breakpoints so it is correct just to don't proceed with current breakpoint also. >> >> Looks, like very unlikely but reproducing with stress test after some time. >> Verified that the crash is not reproduced anymore with corresponding test after the fix. >> >> Many thanks to Coleen for detailed explanation of class redefinition. 
> > src/hotspot/share/prims/jvmtiImpl.cpp line 191: > >> 189: // the jmethod_id's method was updated if class redefintion happened for this class >> 190: // after JvmtBreakpoint was created but before JVM_ChangeBreakpoints started >> 191: // all class breakpoints are cleared during redefinition so don't set/clear this breakpoint > > So basically one thread is trying to change this particular BP and it races with another thread that performs redefinition. If the redefinition thread wins then we are turning this current change into a no-op on the basis that the redefinition cleared all BPs anyway so we should not now set this one (if that was requested). > > It is very unclear to me how the thread that requested the current change might respond to that request being ignored. > > Please add some punctuation to the comment block as it is very hard to read at present. Thanks Thanks for the feedback. > It is very unclear to me how the thread that requested the current change might respond to that request being ignored. > SetBreakpoint returns 'JVMTI_ERROR_NONE', just as if the breakpoint had been set and then removed later by class redefinition. BTW, the alternative would be to replace _method with the new version and set the breakpoint, as if the classes were redefined first and the breakpoint set afterwards. Both approaches look valid with respect to the specification, I think. > Please add some punctuation to the comment block as it is very hard to read at present. Thanks Will do this a little bit later.
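For readers following this thread, the guard being discussed can be modeled in a small standalone program. This is only a sketch under stated assumptions, not the HotSpot code: `Method`, `MethodHandleTable` (with `resolve` and `redefine`), and `ChangeBreakpointOp` are hypothetical stand-ins for the real `Method`, jmethodID resolution, class redefinition, and `VM_ChangeBreakpoints`. It shows the idea only: remember a stable handle when the request is created, and skip the operation if the handle no longer resolves to the method captured at that time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a method version; not the HotSpot type.
struct Method {
  int version;
};

// Hypothetical stand-in for the jmethodID indirection: a stable handle
// (an index) that always resolves to the current version of a method.
class MethodHandleTable {
 public:
  std::size_t add(Method* m) {
    _current.push_back(m);
    return _current.size() - 1;
  }
  Method* resolve(std::size_t id) const {
    assert(id < _current.size());
    return _current[id];
  }
  // Models class redefinition: the handle now resolves to the new version.
  void redefine(std::size_t id, Method* new_version) {
    assert(id < _current.size());
    _current[id] = new_version;
  }

 private:
  std::vector<Method*> _current;
};

// Hypothetical stand-in for the queued breakpoint operation.
class ChangeBreakpointOp {
 public:
  ChangeBreakpointOp(const MethodHandleTable& table, std::size_t id)
      : _table(table), _preserved_id(id), _method(table.resolve(id)) {}

  // Returns true if the change was applied, false if it was skipped
  // because the method was redefined after the request was created.
  bool doit() {
    if (_method != _table.resolve(_preserved_id)) {
      return false;  // the captured Method* is stale; do nothing
    }
    _applied_to = _method;
    return true;
  }

  Method* applied_to() const { return _applied_to; }

 private:
  const MethodHandleTable& _table;
  std::size_t _preserved_id;      // stable handle remembered at creation
  Method* _method;                // pointer captured at creation
  Method* _applied_to = nullptr;  // set only when the change is applied
};
```

In this model the requesting side cannot tell a skipped operation from an applied one unless it inspects the return value, which mirrors the question raised above about how the requester sees an ignored request.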
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26031#discussion_r2174287943 From amitkumar at openjdk.org Mon Jun 30 06:16:37 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 30 Jun 2025 06:16:37 GMT Subject: RFR: 8360707: Globally enumerate all blobs, stubs and entries In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 15:58:31 GMT, Andrew Dinn wrote: > Use the blob, stub and entry declarations to generate a single global enumeration for all blobs, likewise for all stubs and all entries. Modify stub generators in shared runtime, c1 runtime, c2 runtime and stub generator subsystems and their clients to use those enumerations consistently. @adinn I got one test failure on s390: `test/hotspot/jtreg/runtime/ErrorHandling/MachCodeFramesInErrorFile.java` java.lang.RuntimeException: 1 < 2 at jdk.test.lib.Asserts.fail(Asserts.java:715) at MachCodeFramesInErrorFile.run(MachCodeFramesInErrorFile.java:170) at MachCodeFramesInErrorFile.main(MachCodeFramesInErrorFile.java:108) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) at java.base/java.lang.reflect.Method.invoke(Method.java:565) at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:335) at java.base/java.lang.Thread.run(Thread.java:1474) I didn't hs_err even in full verbose. 
But attaching overall run in txt file: [26004_test_failure.txt](https://github.com/user-attachments/files/20973513/26004_test_failure.txt) ------------- PR Comment: https://git.openjdk.org/jdk/pull/26004#issuecomment-3017928879 From dholmes at openjdk.org Mon Jun 30 06:57:40 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 30 Jun 2025 06:57:40 GMT Subject: RFR: 8360776: Disable Intel APX by default and enable it only if requested by the user using -XX:+UnlockExperimentalVMOptions -XX:+UseAPX In-Reply-To: <0FxXpxiS5AsdMzFmEkhVGhWBPyB9H2n7N74iAoAmXdg=.5aca9c53-24d7-42ff-b1ed-072dd039ae1b@github.com> References: <0FxXpxiS5AsdMzFmEkhVGhWBPyB9H2n7N74iAoAmXdg=.5aca9c53-24d7-42ff-b1ed-072dd039ae1b@github.com> Message-ID: <36HotwhfTE2LRbNu1-KPRyspE4sNQB3hMxMo3eWmabY=.ac019cfe-5820-4afc-be53-3ed90a4381a6@github.com> On Fri, 27 Jun 2025 22:13:47 GMT, Srinivas Vamsi Parasa wrote: > Currently, APX is not enabled consistently between product and debug builds. > > If the hardware supports Intel APX: > > 1) In product builds, APX is disabled by default, even if the user explicitly enables it using `-XX:+UnlockExperimentalVMOptions -XX:+UseAPX`. > > 2) In debug builds, APX is enabled by default regardless of whether the user explicitly enables it or not. > > **The goal of this PR is to enable APX for both product and debug builds if and only if the user explicitly enables it using `-XX:+UnlockExperimentalVMOptions -XX:+UseAPX`.** src/hotspot/cpu/x86/vm_version_x86.cpp line 3155: > 3153: } > 3154: // Enable APX support for product builds after > 3155: // completion of planned features listed in JDK-8329030. So you have decided not to follow the original plan ( as JDK-8329030 is not complete) and instead go ahead and enable APX in product mode now. Why? Was this discussed anywhere? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26029#discussion_r2174355658 From duke at openjdk.org Mon Jun 30 07:43:21 2025 From: duke at openjdk.org (Anton Artemov) Date: Mon, 30 Jun 2025 07:43:21 GMT Subject: RFR: 8284016: Normalize handshake closure names [v2] In-Reply-To: <3MZO_Y636hhLaFN_TEP5GIPHS3ZD3UVV14nzmWexwP0=.72af617d-aac2-46c9-b825-9b5704061a75@github.com> References: <3MZO_Y636hhLaFN_TEP5GIPHS3ZD3UVV14nzmWexwP0=.72af617d-aac2-46c9-b825-9b5704061a75@github.com> Message-ID: > Hi, please consider the following changes: > > There are many classes inherited from the `HandshakeClosure` class, but they do not follow the same naming convention. In this PR we address this issue, all names are normalized in the following way: > > `XXXDummyClassNameClosure -> XXXDummyClassNameHandshakeClosure` > > or > > `XXXDummyClassNameHandshake -> XXXDummyClassNameHandshakeClosure` > > or > > `XXXStrangeClassName -> SomewhatSimilarNameHandshakeClosure` > > Tested in GHA and tiers 1 - 3. 
Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8284016: Realigned parameters ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26014/files - new: https://git.openjdk.org/jdk/pull/26014/files/fe991cc8..eeb302df Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26014&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26014&range=00-01 Stats: 5 lines in 1 file changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/26014.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26014/head:pull/26014 PR: https://git.openjdk.org/jdk/pull/26014 From duke at openjdk.org Mon Jun 30 07:43:22 2025 From: duke at openjdk.org (Anton Artemov) Date: Mon, 30 Jun 2025 07:43:22 GMT Subject: RFR: 8284016: Normalize handshake closure names [v2] In-Reply-To: <7RFaowLX2qq3wCI-adTUMmf6QQt4AjF8ocLkwyNHVJg=.a4e02678-9114-4aef-a4a7-50508d60d736@github.com> References: <3MZO_Y636hhLaFN_TEP5GIPHS3ZD3UVV14nzmWexwP0=.72af617d-aac2-46c9-b825-9b5704061a75@github.com> <7RFaowLX2qq3wCI-adTUMmf6QQt4AjF8ocLkwyNHVJg=.a4e02678-9114-4aef-a4a7-50508d60d736@github.com> Message-ID: On Fri, 27 Jun 2025 16:45:44 GMT, Coleen Phillimore wrote: > I found a couple of places to realign the parameters, but otherwise this looks good. I like the new naming conventions. We have a lot of handshakes now! Were you able to build shenandoah (not built by default, need to add --enable-jvm-feature-shenandoahgc to configure)? Yes, the code compiles fine with --enable-jvm-feature-shenandoahgc added. > src/hotspot/share/prims/jvmtiEnvBase.hpp line 629: > >> 627: public: >> 628: GetCurrentContendedMonitorHandshakeClosure(JvmtiEnv *env, >> 629: JavaThread* calling_thread, > > Can you realign these parameters? Addressed. 
> src/hotspot/share/prims/jvmtiEnvBase.hpp line 650: > >> 648: public: >> 649: GetStackTraceHandshakeClosure(JvmtiEnv *env, jint start_depth, jint max_count, >> 650: jvmtiFrameInfo* frame_buffer, jint* count_ptr) > > realign parameters. Addressed. > src/hotspot/share/prims/jvmtiEnvBase.hpp line 760: > >> 758: public: >> 759: GetSingleStackTraceHandshakeClosure(JvmtiEnv *env, JavaThread *calling_thread, >> 760: jthread thread, jint max_frame_count) > > Also realign parameters. Addressed. > src/hotspot/share/prims/jvmtiEnvBase.hpp line 798: > >> 796: public: >> 797: GetFrameLocationHandshakeClosure(JvmtiEnv *env, jint depth, >> 798: jmethodID* method_ptr, jlocation* location_ptr) > > Also realign parameters. Addressed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26014#issuecomment-3018119426 PR Review Comment: https://git.openjdk.org/jdk/pull/26014#discussion_r2174428232 PR Review Comment: https://git.openjdk.org/jdk/pull/26014#discussion_r2174428418 PR Review Comment: https://git.openjdk.org/jdk/pull/26014#discussion_r2174428575 PR Review Comment: https://git.openjdk.org/jdk/pull/26014#discussion_r2174429046 From duke at openjdk.org Mon Jun 30 07:43:22 2025 From: duke at openjdk.org (Anton Artemov) Date: Mon, 30 Jun 2025 07:43:22 GMT Subject: RFR: 8284016: Normalize handshake closure names [v2] In-Reply-To: References: <3MZO_Y636hhLaFN_TEP5GIPHS3ZD3UVV14nzmWexwP0=.72af617d-aac2-46c9-b825-9b5704061a75@github.com> Message-ID: On Fri, 27 Jun 2025 19:02:46 GMT, Serguei Spitsyn wrote: >> Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8284016: Realigned parameters > > src/hotspot/share/prims/jvmtiEnvBase.hpp line 511: > >> 509: }; >> 510: >> 511: class SetForceEarlyReturnHandshakeClosure : public JvmtiUnitedHandshakeClosure { > > I do not support this unification over JVMTI files. This make `HandshakeClosure` class names too long. 
> The JVMTI has a consistent local naming convention to have the suffix `Closure` at the end instead of `HandshakeClosure`. And it is fine because normally there are no other kind of closures in JVMTI code. @sspitsyn How about this: instead of xxxHandshakeClosure put xxxHSClosure? HS would stand for Handshake, but the length will increase by only 2 symbols. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26014#discussion_r2174425304 From mhaessig at openjdk.org Mon Jun 30 08:07:41 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Mon, 30 Jun 2025 08:07:41 GMT Subject: RFR: 8360116: Add support for AVX10 floating point minmax instruction [v4] In-Reply-To: References: Message-ID: On Fri, 27 Jun 2025 08:54:23 GMT, Jatin Bhateja wrote: >> Intel@ AVX10 ISA [1] extensions added new floating point MIN/MAX instructions which comply with definitions in IEEE-754-2019 standard section 9.6 and can directly emulate Math.min/max semantics without the need for any special handling for NaN, +0.0 or -0.0 detection. >> >> **The following pseudo-code describes the existing algorithm for min/max[FD]:** >> >> Move the non-negative value to the second operand; this will ensure that we correctly handle 0.0 and -0.0 values, if values being compared are both 0.0s (of either sign), the value in the second operand (source operand) is returned. Existing MINPS and MAXPS semantics only check for NaN as the second operand; hence, we need special handling to check for NaN at the first operand. >> >> btmp = (b < +0.0) ? a : b >> atmp = (b < +0.0) ? b : a >> Tmp = Max_Float(atmp , btmp) >> Res = (atmp == NaN) ? atmp : Tmp >> >> For min[FD] we need a small tweak in the above algorithm, i.e., move the non-negative value to the first operand, this will ensure that we correctly select -0.0 if both the operands being compared are 0.0 or -0.0. >> >> btmp = (b < +0.0) ? b : a >> atmp = (b < +0.0) ? a : b >> Tmp = Max_Float(atmp , btmp) >> Res = (atmp == NaN) ?
atmp : Tmp >> >> Thus, we need additional special handling for NaNs and +/-0.0 to compute floating-point min/max values to comply with the semantics of Math.max/min APIs using existing MINPS / MAXPS instructions. AVX10.2 added a new instruction, VPMINMAX[SH,SS,SD]/[PH,PS,PD], which comprehensively handles special cases, thereby eliminating the need for special handling. >> >> Patch emits new instructions for reduction and non-reduction operations for single, double, and Float16 type. >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin >> >> [1] https://www.intel.com/content/www/us/en/content-details/856721/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html?wapkw=AVX10 > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Review resolutions Hi @jatin-bhateja, thank you for addressing my feedback. It looks good to me now. src/hotspot/cpu/x86/x86_64.ad line 4529: > 4527: predicate(VM_Version::supports_avx10_2()); > 4528: match(Set dst (MinF a b)); > 4529: format %{ "maxF $dst, $a, $b" %} Suggestion: format %{ "minF $dst, $a, $b" %} The format should match the instruction, if I understand this correctly. src/hotspot/cpu/x86/x86_64.ad line 4565: > 4563: predicate(VM_Version::supports_avx10_2()); > 4564: match(Set dst (MinD a b)); > 4565: format %{ "maxD $dst, $a, $b" %} Suggestion: format %{ "minD $dst, $a, $b" %} ------------- Marked as reviewed by mhaessig (Committer).
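The min/max pseudo-code quoted in this message can be modeled in plain C++ to see why the operand swap and the extra NaN check are needed. This is a sketch under assumptions, not the HotSpot implementation: `legacy_max` and `legacy_min` are hypothetical scalar models of the legacy instructions (the second operand is returned when the comparison is false, which covers NaN inputs and two zeros), the `(b < +0.0)` test is read as a sign-bit test so that -0.0 counts as negative, and the min variant is assumed to use a min primitive in its third step.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical scalar models of the legacy max/min instructions: when the
// comparison is false (a NaN input, or two zeros), the second operand wins.
float legacy_max(float a, float b) { return a > b ? a : b; }
float legacy_min(float a, float b) { return a < b ? a : b; }

// Math.max-like result built on legacy_max, following the quoted steps.
// Assumption: "(b < +0.0)" is modeled as a sign-bit test on b.
float java_like_max(float a, float b) {
  bool b_negative = std::signbit(b);
  float btmp = b_negative ? a : b;  // non-negative value goes second
  float atmp = b_negative ? b : a;
  float tmp = legacy_max(atmp, btmp);
  return std::isnan(atmp) ? atmp : tmp;  // NaN in the first operand
}

// Math.min-like result; the blend directions are swapped relative to max.
// Assumption: the third step uses the min primitive.
float java_like_min(float a, float b) {
  bool b_negative = std::signbit(b);
  float btmp = b_negative ? b : a;  // negative value goes second
  float atmp = b_negative ? a : b;
  float tmp = legacy_min(atmp, btmp);
  return std::isnan(atmp) ? atmp : tmp;  // NaN in the first operand
}
```

The sketch assumes default floating-point compilation (no fast-math style flags), since those may change NaN and signed-zero behavior.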
PR Review: https://git.openjdk.org/jdk/pull/25914#pullrequestreview-2970276248 PR Review Comment: https://git.openjdk.org/jdk/pull/25914#discussion_r2174469497 PR Review Comment: https://git.openjdk.org/jdk/pull/25914#discussion_r2174467431 From jbhateja at openjdk.org Mon Jun 30 08:38:27 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 30 Jun 2025 08:38:27 GMT Subject: RFR: 8360116: Add support for AVX10 floating point minmax instruction [v5] In-Reply-To: References: Message-ID: > Intel@ AVX10 ISA [1] extensions added new floating point MIN/MAX instructions which comply with definitions in IEEE-754-2019 standard section 9.6 and can directly emulate Math.min/max semantics without the need for any special handling for NaN, +0.0 or -0.0 detection. > > **The following pseudo-code describes the existing algorithm for min/max[FD]:** > > Move the non-negative value to the second operand; this will ensure that we correctly handle 0.0 and -0.0 values, if values being compared are both 0.0s (of either sign), the value in the second operand (source operand) is returned. Existing MINPS and MAXPS semantics only check for NaN as the second operand; hence, we need special handling to check for NaN at the first operand. > > btmp = (b < +0.0) ? a : b > atmp = (b < +0.0) ? b : a > Tmp = Max_Float(atmp , btmp) > Res = (atmp == NaN) ? atmp : Tmp > > For min[FD] we need a small tweak in the above algorithm, i.e., move the non-negative value to the first operand, this will ensure that we correctly select -0.0 if both the operands being compared are 0.0 or -0.0. > > btmp = (b < +0.0) ? b : a > atmp = (b < +0.0) ? a : b > Tmp = Max_Float(atmp , btmp) > Res = (atmp == NaN) ? atmp : Tmp > > Thus, we need additional special handling for NaNs and +/-0.0 to compute floating-point min/max values to comply with the semantics of Math.max/min APIs using existing MINPS / MAXPS instructions. 
AVX10.2 added a new instruction, VPMINMAX[SH,SS,SD]/[PH,PS,PD], which comprehensively handles special cases, thereby eliminating the need for special handling. > > Patch emits new instructions for reduction and non-reduction operations for single, double, and Float16 type. > > Kindly review and share your feedback. > > Best Regards, > Jatin > > [1] https://www.intel.com/content/www/us/en/content-details/856721/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html?wapkw=AVX10 Jatin Bhateja has updated the pull request incrementally with two additional commits since the last revision: - Update src/hotspot/cpu/x86/x86_64.ad Co-authored-by: Manuel Hässig - Update src/hotspot/cpu/x86/x86_64.ad Co-authored-by: Manuel Hässig ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25914/files - new: https://git.openjdk.org/jdk/pull/25914/files/89697983..5597b615 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25914&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25914&range=03-04 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25914.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25914/head:pull/25914 PR: https://git.openjdk.org/jdk/pull/25914 From tschatzl at openjdk.org Mon Jun 30 08:58:40 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 30 Jun 2025 08:58:40 GMT Subject: RFR: 8338474: Parallel: Deprecate and obsolete PSChunkLargeArrays In-Reply-To: <1ZjYlg9V9HUV0H2Tk2222vuMl1rOAGJdSqFivaez1LU=.9f7213d9-34db-48c9-9faa-8e042eaadaf6@github.com> References: <1ZjYlg9V9HUV0H2Tk2222vuMl1rOAGJdSqFivaez1LU=.9f7213d9-34db-48c9-9faa-8e042eaadaf6@github.com> Message-ID: On Thu, 26 Jun 2025 09:55:46 GMT, Albert Mingkun Yang wrote: > Deprecating `PSChunkLargeArrays`, which is used only by Parallel and it is enabled by default. > > Disabling it offers little benefit, so removing it do reduce the number of commandline flags.
Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25997#pullrequestreview-2970441071 From jbhateja at openjdk.org Mon Jun 30 09:28:41 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 30 Jun 2025 09:28:41 GMT Subject: RFR: 8360775: Fix Shenandoah GC test failures when APX is enabled In-Reply-To: <66o1iImVgzmTapY0AEGZeAg_VTj4ZbRc1MSFvgA8qYk=.ab11bc1a-3cb0-4832-82da-4e97ee8aaf9b@github.com> References: <66o1iImVgzmTapY0AEGZeAg_VTj4ZbRc1MSFvgA8qYk=.ab11bc1a-3cb0-4832-82da-4e97ee8aaf9b@github.com> Message-ID: On Fri, 27 Jun 2025 00:02:08 GMT, Srinivas Vamsi Parasa wrote: > This PR fixes the test failures seen in many JTreg tests related to Shenandoah GC (`test/hotspot/jtreg/gc/shenandoah/`) with UseAPX. The issues were root caused to: > > 1. Higher band registers are not saved and restored in Shenandoah load_reference_barrier. > 2. Pusha/Popa implementation using push2p/pop2p > > Both the issues are fixed in this PR. src/hotspot/cpu/x86/assembler_x86.cpp line 15675: > 15673: void Assembler::pusha_uncached() { // 64bit > 15674: if (UseAPX) { > 15675: // Data being pushed by PUSH2 must be 16B-aligned on the stack, for this push rax upfront Hi @vamsi-parasa, the PUSHA / POPA assembler is agnostic to the use of hardcoded registers in the calling context, e.g. in the following line of code https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp#L495 If dst and tmp1 are RAX then we end up corrupting it, since RAX is used as a scratch register for stack alignment, and if RAX holds an oop pointer then we may see random crashes.
Such idioms are limited to GC barriers currently, and we have recently fixed one such issue in https://github.com/openjdk/jdk/pull/25351 While the instruction sequence of PUSHA / POPA with PPX hints is correct, do you think that, for the time being, we should limit the scope of this fix to the save_machine_state and restore_machine_state routines rather than making a generic fix in pusha/popa? I have tried it and it's working. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26009#discussion_r2174631978 From iwalulya at openjdk.org Mon Jun 30 09:45:43 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Mon, 30 Jun 2025 09:45:43 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v17] In-Reply-To: References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: On Thu, 26 Jun 2025 19:17:11 GMT, Albert Mingkun Yang wrote: >> This patch refines Parallel's sizing strategy to improve overall memory management and performance. >> >> The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. >> >> `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. >> >> GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`.
>> >> ## Performance evaluation >> >> - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). >> - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). >> - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. >> >> PS: I have opportunistically set the obsolete/expired version to ~~25/26~~ 26/27 for now. I will update them accordingly before merging. >> >> Test: tier1-8 > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits: > > - Merge branch 'master' into pgc-size-policy > - review > - cast > - remove-young-resize-after-full-gc > - Merge branch 'master' into pgc-size-policy > - Merge branch 'master' into pgc-size-policy > - review > - Merge branch 'master' into pgc-size-policy > - merge > - version > - ... and 15 more: https://git.openjdk.org/jdk/compare/20e0055e...eeda1eb8 Changes requested by iwalulya (Reviewer). 
src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 354: > 352: > 353: static bool check_gc_heap_free_limit(size_t free_bytes, size_t capacity_bytes) { > 354: return free_bytes * 100 / capacity_bytes < GCHeapFreeLimit; Suggestion: return (free_bytes * 100 / capacity_bytes) < GCHeapFreeLimit; src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 425: > 423: } > 424: > 425: if (check_gc_overhead_limit()) { What is the effect of calling this method twice? Line 400 above, and then again here on line 425. Does that increment `_gc_overhead_counter` twice? More reason why i think the name is confusing. src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 756: > 754: } > 755: > 756: static size_t calculate_free_from_free_ratio_flag(size_t live, uintx free_percent) { Why refer to the `free_ratio_flag` instead of just `calculate_free_from_free_percent`? ------------- PR Review: https://git.openjdk.org/jdk/pull/25000#pullrequestreview-2970469808 PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2174590762 PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2174637996 PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2174646534 From iwalulya at openjdk.org Mon Jun 30 09:45:44 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Mon, 30 Jun 2025 09:45:44 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v3] In-Reply-To: References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: <2kNEq2gNwn7YKpY5Wj5yAxyKS7J0I64hQ_snRWbPayQ=.b9e3d87e-cb04-423e-b56d-e7a722c16ed5@github.com> On Mon, 19 May 2025 06:00:09 GMT, Albert Mingkun Yang wrote: >> src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 363: >> >>> 361: } >>> 362: >>> 363: bool ParallelScavengeHeap::check_gc_overhead_limit() { >> >> In main-line code, the method `check_gc_overhead_limit` is invoked by `PSScavenge::invoke` and 
`PSParallelCompact::invoke_no_policy` so that we can do the check after all the GCs. But now you only use `check_gc_overhead_limit` in `ParallelScavengeHeap::satisfy_failed_allocation`. I suspect whether it can check the gc overhead limit accurately. > >> so that we can do the check after all the GCs > > Well, not really. In the old impl, `GCOverheadChecker::check_gc_overhead_limit` calls `set_gc_overhead_limit_exceeded` only for full-gc. > > > But now you only use check_gc_overhead_limit in ParallelScavengeHeap::satisfy_failed_allocation. I suspect whether it can check the gc overhead limit accurately. > > I believe so. In the old impl, we don't check gc-overhead for explicit gcs. Only allocation-failure caused gcs are interesting, which all go through `satisfy_failed_allocation`. > > > // Ignore explicit GC's. Exiting here does not set the flag and > // does not reset the count. > if (GCCause::is_user_requested_gc(gc_cause) || > GCCause::is_serviceability_requested_gc(gc_cause)) { > return; > } `check_gc_overhead_limit` does more than `check`, can we find a more appropriate method name? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2174596366 From ayang at openjdk.org Mon Jun 30 10:17:11 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 30 Jun 2025 10:17:11 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v18] In-Reply-To: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: <0RXvJz_kUMWD8_XusZ0XrlrLQj9960i-iur1w5gpwF4=.b6323262-211a-4ae2-bd2f-1350f6b390ca@github.com> > This patch refines Parallel's sizing strategy to improve overall memory management and performance. > > The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. 
This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. > > `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. > > GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. > > ## Performance evaluation > > - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). > - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). > - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. > > PS: I have opportunistically set the obsolete/expired version to ~~25/26~~ 26/27 for now. I will update them accordingly before merging. > > Test: tier1-8 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 27 commits: - review - Merge branch 'master' into pgc-size-policy - Merge branch 'master' into pgc-size-policy - review - cast - remove-young-resize-after-full-gc - Merge branch 'master' into pgc-size-policy - Merge branch 'master' into pgc-size-policy - review - Merge branch 'master' into pgc-size-policy - ... and 17 more: https://git.openjdk.org/jdk/compare/c2d76f98...ec2e3908 ------------- Changes: https://git.openjdk.org/jdk/pull/25000/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25000&range=17 Stats: 4362 lines in 31 files changed: 506 ins; 3470 del; 386 mod Patch: https://git.openjdk.org/jdk/pull/25000.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25000/head:pull/25000 PR: https://git.openjdk.org/jdk/pull/25000 From ayang at openjdk.org Mon Jun 30 10:17:11 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 30 Jun 2025 10:17:11 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v3] In-Reply-To: <2kNEq2gNwn7YKpY5Wj5yAxyKS7J0I64hQ_snRWbPayQ=.b9e3d87e-cb04-423e-b56d-e7a722c16ed5@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> <2kNEq2gNwn7YKpY5Wj5yAxyKS7J0I64hQ_snRWbPayQ=.b9e3d87e-cb04-423e-b56d-e7a722c16ed5@github.com> Message-ID: On Mon, 30 Jun 2025 09:08:28 GMT, Ivan Walulya wrote: >>> so that we can do the check after all the GCs >> >> Well, not really. In the old impl, `GCOverheadChecker::check_gc_overhead_limit` calls `set_gc_overhead_limit_exceeded` only for full-gc. >> >> > But now you only use check_gc_overhead_limit in ParallelScavengeHeap::satisfy_failed_allocation. I suspect whether it can check the gc overhead limit accurately. >> >> I believe so. In the old impl, we don't check gc-overhead for explicit gcs. Only allocation-failure caused gcs are interesting, which all go through `satisfy_failed_allocation`. >> >> >> // Ignore explicit GC's. 
Exiting here does not set the flag and >> // does not reset the count. >> if (GCCause::is_user_requested_gc(gc_cause) || >> GCCause::is_serviceability_requested_gc(gc_cause)) { >> return; >> } > > `check_gc_overhead_limit` does more than `check`, can we find a more appropriate method name? The only side-effect is mutating `_gc_overhead_counter`, which I believe is part of checking gc overhead limit. Do you have any names in mind? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2174712372 From ayang at openjdk.org Mon Jun 30 10:17:12 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 30 Jun 2025 10:17:12 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v17] In-Reply-To: References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: On Mon, 30 Jun 2025 09:29:28 GMT, Ivan Walulya wrote: >> Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits: >> >> - Merge branch 'master' into pgc-size-policy >> - review >> - cast >> - remove-young-resize-after-full-gc >> - Merge branch 'master' into pgc-size-policy >> - Merge branch 'master' into pgc-size-policy >> - review >> - Merge branch 'master' into pgc-size-policy >> - merge >> - version >> - ... and 15 more: https://git.openjdk.org/jdk/compare/20e0055e...eeda1eb8 > > src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 425: > >> 423: } >> 424: >> 425: if (check_gc_overhead_limit()) { > > What is the effect of calling this method twice? Line 400 above, and then again here on line 425. Does that increment `_gc_overhead_counter` twice? More reason why I think the name is confusing. This method checks the after-gc gc-ratio and memory limit, so it is meant to be called after every (young/old) gc. (Incrementing `_gc_overhead_counter` is an impl detail that is not visible at this abstraction level, IMO.)
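The naming concern raised above (a method called `check_...` that also mutates a counter) can be illustrated with a small standalone sketch. This is not the HotSpot implementation: `OverheadTracker`, `check_and_record`, and `is_limit_reached` are hypothetical names, and the thread does not show how the real method counts or resets. The sketch only assumes a counter that grows on consecutive over-threshold GCs, which is enough to show why a second call for the same GC would count twice.

```cpp
#include <cstddef>

// Hypothetical stand-in for a GC overhead checker; NOT HotSpot code.
class OverheadTracker {
public:
  explicit OverheadTracker(std::size_t limit) : _limit(limit), _counter(0) {}

  // A "check" that also mutates state: bumps the counter when the last GC
  // was over the overhead threshold and resets it otherwise.
  bool check_and_record(bool last_gc_over_threshold) {
    if (last_gc_over_threshold) {
      _counter++;
    } else {
      _counter = 0;
    }
    return _counter >= _limit;
  }

  // Pure query: reports the current state without touching the counter,
  // so it is safe to call any number of times.
  bool is_limit_reached() const { return _counter >= _limit; }

  std::size_t counter() const { return _counter; }

private:
  std::size_t _limit;    // consecutive over-threshold GCs tolerated
  std::size_t _counter;  // consecutive over-threshold GCs seen so far
};
```

With this split, a caller records once per GC and can query as often as needed; calling the mutating method twice after a single GC advances the counter by two.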
> src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 756: > >> 754: } >> 755: >> 756: static size_t calculate_free_from_free_ratio_flag(size_t live, uintx free_percent) { > > Why refer to the `free_ratio_flag` instead of just `calculate_free_from_free_percent`? It's mostly due to how the cmdline flag is named. See one of the callsite: `calculate_free_from_free_ratio_flag(old_gen_live_size, MinHeapFreeRatio);`. I think this method can be renamed after the cmdline flag is renamed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2174707957 PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2174709739 From ayang at openjdk.org Mon Jun 30 10:24:58 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 30 Jun 2025 10:24:58 GMT Subject: RFR: 8338977: Parallel: Improve heap resizing heuristics [v19] In-Reply-To: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> References: <9-QvRzQoMkyGxgiTAFpkizJOG8unI4JYBLYu7gigMMQ=.7257790b-1a27-4925-b88a-87c03b3ea536@github.com> Message-ID: > This patch refines Parallel's sizing strategy to improve overall memory management and performance. > > The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`. > > `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`. 
> > GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`. > > ## Performance evaluation > > - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)). > - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html). > - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing. > > PS: I have opportunistically set the obsolete/expired version to ~~25/26~~ 26/27 for now. I will update them accordingly before merging. 
> > Test: tier1-8 Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25000/files - new: https://git.openjdk.org/jdk/pull/25000/files/ec2e3908..d1ff874a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25000&range=18 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25000&range=17-18 Stats: 9 lines in 3 files changed: 2 ins; 2 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/25000.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25000/head:pull/25000 PR: https://git.openjdk.org/jdk/pull/25000 From adinn at openjdk.org Mon Jun 30 10:27:41 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Mon, 30 Jun 2025 10:27:41 GMT Subject: RFR: 8360707: Globally enumerate all blobs, stubs and entries In-Reply-To: References: Message-ID: On Mon, 30 Jun 2025 06:14:32 GMT, Amit Kumar wrote: >> Use the blob, stub and entry declarations to generate a single global enumeration for all blobs, likewise for all stubs and all entries. Modify stub generators in shared runtime, c1 runtime, c2 runtime and stub generator subsystems and their clients to use those enumerations consistently. > > @adinn I got one test failure on s390: `test/hotspot/jtreg/runtime/ErrorHandling/MachCodeFramesInErrorFile.java` > > > java.lang.RuntimeException: 1 < 2 > at jdk.test.lib.Asserts.fail(Asserts.java:715) > at MachCodeFramesInErrorFile.run(MachCodeFramesInErrorFile.java:170) > at MachCodeFramesInErrorFile.main(MachCodeFramesInErrorFile.java:108) > at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) > at java.base/java.lang.reflect.Method.invoke(Method.java:565) > at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:335) > at java.base/java.lang.Thread.run(Thread.java:1474) > > > I didn't hs_err even in full verbose. 
But attaching overall run in txt file: > [26004_test_failure.txt](https://github.com/user-attachments/files/20973513/26004_test_failure.txt) @offamitkumar Thanks for testing. I'm looking into the issue and will get back to you. @TheRealMDoerr @offamitkumar @RealFYang Would you be able to check this on ppc/riscv to ensure it builds (cross-compile is ok) and passes (at least) tier1? Thanks for whatever help you can provide. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26004#issuecomment-3018615494 From kevinw at openjdk.org Mon Jun 30 11:24:28 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Mon, 30 Jun 2025 11:24:28 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v7] In-Reply-To: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: > ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. Kevin Walls has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 13 additional commits since the last revision: - remove test definition changes - TLH: use cv_internal_thread_to_JavaThread() - Merge remote-tracking branch 'upstream/master' into 8359870_threadexited - Test requires: permit linux debug testing - comment update - comment update - newline - Test fails on minimal VM: require jvmti feature - Correct THROW macro - ThreadDumper thread count - ... 
and 3 more: https://git.openjdk.org/jdk/compare/80dc9acb...e2043438 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25958/files - new: https://git.openjdk.org/jdk/pull/25958/files/d14f5228..e2043438 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25958&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25958&range=05-06 Stats: 7800 lines in 367 files changed: 3791 ins; 2583 del; 1426 mod Patch: https://git.openjdk.org/jdk/pull/25958.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25958/head:pull/25958 PR: https://git.openjdk.org/jdk/pull/25958 From dholmes at openjdk.org Mon Jun 30 11:24:28 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 30 Jun 2025 11:24:28 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> <_qD_bLdpHkOQRTWkFV43-L_IGMMgOfosOn7zxb7I7gM=.e2afadc1-1ce1-45da-a695-1920686f8f5f@github.com> Message-ID: On Fri, 27 Jun 2025 20:22:12 GMT, Alex Menkov wrote: > I believe null here is not the result of `_thread_h()`, but is returned by `java_lang_VirtualThread::continuation(...)` because `_thread_h` is a java.lang.Thread object and not a java.lang.VirtualThread. That could only happen if we are dealing with a terminated regular thread - which we should never do here if the TLH is used correctly and we only ever pass live threads to `do_thread`, or else the null which means "unmounted virtual thread".
------------- PR Comment: https://git.openjdk.org/jdk/pull/25958#issuecomment-3017944031 From kevinw at openjdk.org Mon Jun 30 11:26:42 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Mon, 30 Jun 2025 11:26:42 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v5] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> <6sxeRs0jaGtjCoxcJLBksWiyacOvRkSn40GXcLNKEos=.e4736687-f826-4a11-977e-9b0cf765e046@github.com> Message-ID: <4yFDEaYCx6ooltPbt2hSAd2h2Khnf9ZbAQaZvcK8a8g=.259b71d9-1daa-421b-8226-7e928dfe0f77@github.com> On Thu, 26 Jun 2025 08:26:44 GMT, Kevin Walls wrote: >> Great catch Dan! I totally missed the TLH at the start of `get_thread_snapshot`. I knew something was off here but couldn't quite put my finger on it. > > Yes thanks Dan! Will update. Thanks for all the hints, updated to use the TLH. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25958#discussion_r2174844271 From tschatzl at openjdk.org Mon Jun 30 11:52:53 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 30 Jun 2025 11:52:53 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v7] In-Reply-To: References: Message-ID: <0jZZt3phd8r5aQfkOh6bz6za8YO9jXSBCWO2vWU6kqM=.0365799f-b476-4df3-854d-7a0c9b825b57@github.com> On Mon, 30 Jun 2025 11:49:52 GMT, Ivan Walulya wrote: >> Hi all, >> >> Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and Application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants.
>> >> The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio. >> >> - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly. >> >> - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range. >> >> - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants. >> >> We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. >> >> Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. >> >> As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. 
Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. >> >> Testing: Mach5 ... > > Ivan Walulya has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: > > - Merge remote-tracking branch 'upstream/master' into G1HeapResizePolicyV2 > - Thomas Review > - Reviews > - Albert suggestions > - Merge remote-tracking branch 'upstream/master' into G1HeapResizePolicyV2 > - remove unrequired changes - kim > - clean init Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25832#pullrequestreview-2970897593 From iwalulya at openjdk.org Mon Jun 30 11:52:53 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Mon, 30 Jun 2025 11:52:53 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v7] In-Reply-To: References: Message-ID: > Hi all, > > Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and Application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. > > The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio. > > - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio).
When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly. > > - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range. > > - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants. > > We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. > > Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. > > As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. > > Testing: Mach5 Tier 1-7 Ivan Walulya has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains seven commits: - Merge remote-tracking branch 'upstream/master' into G1HeapResizePolicyV2 - Thomas Review - Reviews - Albert suggestions - Merge remote-tracking branch 'upstream/master' into G1HeapResizePolicyV2 - remove unrequired changes - kim - clean init ------------- Changes: https://git.openjdk.org/jdk/pull/25832/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25832&range=06 Stats: 614 lines in 16 files changed: 391 ins; 89 del; 134 mod Patch: https://git.openjdk.org/jdk/pull/25832.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25832/head:pull/25832 PR: https://git.openjdk.org/jdk/pull/25832 From tschatzl at openjdk.org Mon Jun 30 11:52:53 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 30 Jun 2025 11:52:53 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v6] In-Reply-To: References: Message-ID: <5tEew3q15QHfXmaXXLkVBcvVb1n3lp33blutHPNnrfo=.a724ffff-bc4c-4d57-adc1-11ccc2896e5a@github.com> On Fri, 27 Jun 2025 16:56:17 GMT, Ivan Walulya wrote: >> Hi all, >> >> Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and Application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. >> >> The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio.
>> >> - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly. >> >> - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range. >> >> - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants. >> >> We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. >> >> Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. >> >> As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. >> >> Testing: Mach5 ... 
> > Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: > > Thomas Review src/hotspot/share/gc/g1/g1HeapSizingPolicy.hpp line 80: > 78: uint _long_term_count; > 79: > 80: // Clear GC CPU usage tracking data used by resize_amount(). Suggestion: // Clear GC CPU usage tracking data used by *resize_amount(). src/hotspot/share/gc/g1/g1_globals.hpp line 177: > 175: range(0, 100) \ > 176: \ > 177: product(uint, G1CPUUsageExpandThreshold, 4, DIAGNOSTIC, \ This and `G1CPUUsageShrinkThreshold` both need to have a range attached - if it is set to `0`, I think expansion won't work well initially. E.g. if set to 0, `_gc_cpu_usage_deviation_counter` will be initialized to 1, and since we compare with `==` in the code if ((_gc_cpu_usage_deviation_counter == (int)G1CPUUsageExpandThreshold) || (use_long_term_delta && (long_term_gc_cpu_usage > upper_threshold))) { the first term will not fire. Also, we increment `_gc_cpu_usage_deviation_counter` first, then compare. So a `range(1, MAX_UINT)` should be added to both. Or make the comparison a `>=` or `<=` (for the shrinking). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2174864384 PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2174884973 From coleenp at openjdk.org Mon Jun 30 11:56:41 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 30 Jun 2025 11:56:41 GMT Subject: RFR: 8359366: RunThese30M.java EXCEPTION_ACCESS_VIOLATION in JvmtiBreakpoints::clearall_in_class_at_safepoint In-Reply-To: References: Message-ID: On Mon, 30 Jun 2025 04:38:41 GMT, David Holmes wrote: >> The segv/eav happens if the class of JvmtiBreakpoint::_method is redefined (making the method old) between getting the Method* from the jmethodID in >> JvmtiEnv::SetBreakpoint(Method* method, jlocation location) {..} >> and actually setting the breakpoint in the VM operation VM_ChangeBreakpoints. >> >> Here are details: >> The breakpoint is set in 2 steps.
>> 1) The method jvmti_SetBreakpoint(jvmtiEnv* env, jmethodID method, jlocation location) converts the jmethodID to a Method* and calls >> JvmtiEnv::SetBreakpoint(Method* method, jlocation location) >> where >> JvmtiBreakpoint bp(method, location); >> is created with this Method* >> Note: this is done while the thread is in VM state, so the Method can't become is_old while this is done. >> >> 2) The VMOp is used to add the breakpoint to the list >> VM_ChangeBreakpoints set_breakpoint(VM_ChangeBreakpoints::SET_BREAKPOINT, &bp); >> VMThread::execute(&set_breakpoint); >> to call JvmtiBreakpoints::set_at_safepoint() >> that can modify the JvmtiBreakpoints list and set the breakpoint in a safepoint without synchronization. >> >> So it is possible that the class redefinition operation VM_RedefineClasses, which redefines the class with this breakpoint, happens between steps 1) and 2). >> VM_RedefineClasses::redefine_single_class() >> clears all class-related breakpoints in the JvmtiBreakpoints; however, the "problematic" breakpoint is still in the VMThread queue, and thus we still continue with this operation. >> So in step 2) the JvmtiBreakpoint with the 'is_old' method is added to the JvmtiBreakpoints and the breakpoint is set. >> >> Old methods might then be purged at any time once they are no longer on the stack, and any access to this breakpoint could lead to use of a Method* _method pointing to deallocated metaspace. >> >> VM_RedefineClasses clears all breakpoints, so it is correct simply not to proceed with the current breakpoint either. >> >> This looks very unlikely, but it reproduces with a stress test after some time. >> Verified that the crash no longer reproduces with the corresponding test after the fix. >> >> Many thanks to Coleen for the detailed explanation of class redefinition.
> > src/hotspot/share/prims/jvmtiImpl.cpp line 188: > >> 186: >> 187: void VM_ChangeBreakpoints::doit() { >> 188: if (_bp->method() != Method::resolve_jmethod_id(_preservred_method)) { > > Suggestion: > > if (_bp->method() != Method::resolve_jmethod_id(_preserved_method)) { I see what you're doing. You're checking if the methodID is changed by redefinition. Can you just check if the method->is_old() and skip the breakpoint then? Although the callers might want a breakpoint at the new method if it's emcp. Maybe it could call Method::get_new_method() if is_old and set the breakpoint on the new method? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26031#discussion_r2174895443 From duke at openjdk.org Mon Jun 30 11:57:23 2025 From: duke at openjdk.org (Anton Artemov) Date: Mon, 30 Jun 2025 11:57:23 GMT Subject: RFR: 8284016: Normalize handshake closure names [v3] In-Reply-To: <3MZO_Y636hhLaFN_TEP5GIPHS3ZD3UVV14nzmWexwP0=.72af617d-aac2-46c9-b825-9b5704061a75@github.com> References: <3MZO_Y636hhLaFN_TEP5GIPHS3ZD3UVV14nzmWexwP0=.72af617d-aac2-46c9-b825-9b5704061a75@github.com> Message-ID: <9DJF2VLI5jAPld-i8CiS7Dq-1cboTORtVdHzgNRLhpo=.fc678446-b706-49f9-9cdb-7fb97eeaf24d@github.com> > Hi, please consider the following changes: > > There are many classes inherited from the `HandshakeClosure` class, but they do not follow the same naming convention. In this PR we address this issue, all names are normalized in the following way: > > `XXXDummyClassNameClosure -> XXXDummyClassNameHandshakeClosure` > > or > > `XXXDummyClassNameHandshake -> XXXDummyClassNameHandshakeClosure` > > or > > `XXXStrangeClassName -> SomewhatSimilarNameHandshakeClosure` > > Tested in GHA and tiers 1 - 3. 
Anton Artemov has updated the pull request incrementally with one additional commit since the last revision: 8284016: Reverted closure names in JVMTI ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26014/files - new: https://git.openjdk.org/jdk/pull/26014/files/eeb302df..65b9ecc5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26014&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26014&range=01-02 Stats: 76 lines in 7 files changed: 0 ins; 0 del; 76 mod Patch: https://git.openjdk.org/jdk/pull/26014.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26014/head:pull/26014 PR: https://git.openjdk.org/jdk/pull/26014 From duke at openjdk.org Mon Jun 30 11:57:23 2025 From: duke at openjdk.org (Anton Artemov) Date: Mon, 30 Jun 2025 11:57:23 GMT Subject: RFR: 8284016: Normalize handshake closure names [v3] In-Reply-To: References: <3MZO_Y636hhLaFN_TEP5GIPHS3ZD3UVV14nzmWexwP0=.72af617d-aac2-46c9-b825-9b5704061a75@github.com> Message-ID: On Fri, 27 Jun 2025 19:48:14 GMT, Daniel D. Daugherty wrote: >> src/hotspot/share/prims/jvmtiEnvBase.hpp line 511: >> >>> 509: }; >>> 510: >>> 511: class SetForceEarlyReturnHandshakeClosure : public JvmtiUnitedHandshakeClosure { >> >> I do not support this unification over JVMTI files. This makes `HandshakeClosure` class names too long. >> The JVMTI has a consistent local naming convention to have the suffix `Closure` at the end instead of `HandshakeClosure`. And it is fine because normally there are no other kinds of closures in JVMTI code. > > Aren't there closures in the JVM/TI tag processing code? I could be remembering wrong... I reverted the closure name changes in JVMTI files.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26014#discussion_r2174895464 From coleenp at openjdk.org Mon Jun 30 11:59:38 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 30 Jun 2025 11:59:38 GMT Subject: RFR: 8359366: RunThese30M.java EXCEPTION_ACCESS_VIOLATION in JvmtiBreakpoints::clearall_in_class_at_safepoint In-Reply-To: References: Message-ID: On Mon, 30 Jun 2025 11:53:36 GMT, Coleen Phillimore wrote: >> src/hotspot/share/prims/jvmtiImpl.cpp line 188: >> >>> 186: >>> 187: void VM_ChangeBreakpoints::doit() { >>> 188: if (_bp->method() != Method::resolve_jmethod_id(_preservred_method)) { >> >> Suggestion: >> >> if (_bp->method() != Method::resolve_jmethod_id(_preserved_method)) { > > I see what you're doing. You're checking if the methodID is changed by redefinition. Can you just check if the method->is_old() and skip the breakpoint then? Although the callers might want a breakpoint at the new method if it's emcp. Maybe it could call Method::get_new_method() if is_old and set the breakpoint on the new method? Or as you say, the breakpoint will be cleared by the redefinition anyway if the VM_ChangeBreakpoints::doit() succeeded before it, so returning an error should be fine too. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26031#discussion_r2174902032 From iwalulya at openjdk.org Mon Jun 30 12:38:25 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Mon, 30 Jun 2025 12:38:25 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v8] In-Reply-To: References: Message-ID: > Hi all, > > Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and Application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. 
Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. > > The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio. > > - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly. > > - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range. > > - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants. > > We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. > > Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. > > As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio.
Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. > > Testing: Mach5 Tier 1-7 Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: Thomas Review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25832/files - new: https://git.openjdk.org/jdk/pull/25832/files/9340422a..5997940f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25832&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25832&range=06-07 Stats: 8 lines in 2 files changed: 1 ins; 1 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/25832.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25832/head:pull/25832 PR: https://git.openjdk.org/jdk/pull/25832 From iwalulya at openjdk.org Mon Jun 30 12:38:26 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Mon, 30 Jun 2025 12:38:26 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v6] In-Reply-To: <5tEew3q15QHfXmaXXLkVBcvVb1n3lp33blutHPNnrfo=.a724ffff-bc4c-4d57-adc1-11ccc2896e5a@github.com> References: <5tEew3q15QHfXmaXXLkVBcvVb1n3lp33blutHPNnrfo=.a724ffff-bc4c-4d57-adc1-11ccc2896e5a@github.com> Message-ID: <2OnR3JPN14mEAauGFmX_zUl4p8IDDQIosBZQDMbO6b4=.87abdef1-c9ca-47d0-9923-9d94ec505874@github.com> On Mon, 30 Jun 2025 11:47:57 GMT, Thomas Schatzl wrote: >> Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: >> >> Thomas Review > > src/hotspot/share/gc/g1/g1_globals.hpp line 177: > >> 175: range(0, 100) \ >> 176: \ >> 177: product(uint, G1CPUUsageExpandThreshold, 4, DIAGNOSTIC, \ > > This, and `G1CPUUsageShrinkThreshold` need to have a range attached to it - if it is set to `0`, I think expansion won't work well initially. > > E.g. 
if set to 0, `_gc_cpu_usage_deviation_counter` will be initialized to 1, and since we compare with `==` in the code > > > if ((_gc_cpu_usage_deviation_counter == (int)G1CPUUsageExpandThreshold) || > (use_long_term_delta && (long_term_gc_cpu_usage > upper_threshold))) { > > > the first term will not fire. Also, we increment `_gc_cpu_usage_deviation_counter` first, then compare. > > So a `range(1, MAX_UINT)` should be added to both. Or make the comparison a `>=` or `<=` (for the shrinking). Resolved by improving the constraint function and also changing the comparisons to `G1CPUUsageExpandThreshold` and `G1CPUUsageShrinkThreshold`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2174970288 From iwalulya at openjdk.org Mon Jun 30 12:42:11 2025 From: iwalulya at openjdk.org (Ivan Walulya) Date: Mon, 30 Jun 2025 12:42:11 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v9] In-Reply-To: References: Message-ID: <24IBqFegflzzJC1jB1t7LFUasQXD59KqTefhtGq71PU=.0dbe6333-bb7b-415f-aa68-b8eeede95627@github.com> > Hi all, > > Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and Application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. > > The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio.
> > - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly. > > - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range. > > - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants. > > We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. > > Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. > > As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. 
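The short-term counter described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the G1 code: `ShortTermResizePolicy`, `record` and `target_gc_cpu_usage` are hypothetical names, the real policy computes its target range from G1MinimumPercentOfGCTimeRatio, and the reset-to-zero after a decision is a simplification. The helper relies on the usual reading of GCTimeRatio = N as a GC share of 1 / (1 + N) of total time, so the default change from 12 to 24 moves the target from about 7.7% to 4%.

```cpp
#include <cassert>

// Hypothetical names throughout; this is not the G1 implementation.
enum class Resize { None, Expand, Shrink };

// GC share of total time implied by GCTimeRatio = N: 1 / (1 + N).
double target_gc_cpu_usage(unsigned gc_time_ratio) {
  return 1.0 / (1.0 + static_cast<double>(gc_time_ratio));
}

class ShortTermResizePolicy {
  double _lower;          // lower bound of the target range
  double _upper;          // upper bound of the target range
  int _expand_threshold;  // assumed >= 1
  int _shrink_threshold;  // assumed >= 1
  int _counter = 0;       // > 0: too much GC recently, < 0: too little

 public:
  ShortTermResizePolicy(double lower, double upper,
                        int expand_threshold, int shrink_threshold)
      : _lower(lower), _upper(upper),
        _expand_threshold(expand_threshold),
        _shrink_threshold(shrink_threshold) {
    assert(lower <= upper);
    assert(expand_threshold >= 1 && shrink_threshold >= 1);
  }

  // Record one GC's CPU usage and report the resize decision, if any.
  Resize record(double gc_cpu_usage) {
    if (gc_cpu_usage > _upper) {
      _counter++;
    } else if (gc_cpu_usage < _lower) {
      _counter--;
    }
    // Inequalities (not ==) so that a counter stepping past the threshold
    // still triggers, as suggested in the review of the threshold flags.
    if (_counter >= _expand_threshold) {
      _counter = 0;
      return Resize::Expand;
    }
    if (_counter <= -_shrink_threshold) {
      _counter = 0;
      return Resize::Shrink;
    }
    return Resize::None;
  }

  int counter() const { return _counter; }
};
```

A sample inside the target range leaves the counter unchanged, so isolated spikes do not resize the heap; only a run of deviations in one direction does.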
> > Testing: Mach5 Tier 1-7 Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: Improve comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25832/files - new: https://git.openjdk.org/jdk/pull/25832/files/5997940f..88434073 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25832&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25832&range=07-08 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25832.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25832/head:pull/25832 PR: https://git.openjdk.org/jdk/pull/25832 From shade at openjdk.org Mon Jun 30 12:58:52 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 30 Jun 2025 12:58:52 GMT Subject: RFR: 8360867: CTW: Disable inline cache verification In-Reply-To: <9mywTtljvOZuQZWkTaHipBEeiC_3EMxihOd790Uq79o=.de521785-183c-44f4-a799-9f695922d9e3@github.com> References: <9mywTtljvOZuQZWkTaHipBEeiC_3EMxihOd790Uq79o=.de521785-183c-44f4-a799-9f695922d9e3@github.com> Message-ID: <244l7ZPhia4PItuuNp8uVBU9Ht_cI3Qd9GxzkF3cdm4=.60b25673-8f55-45fb-9690-775ff351d3ed@github.com> On Fri, 27 Jun 2025 10:30:30 GMT, Aleksey Shipilev wrote: > In CTW profiling, I noticed we spend a lot of time doing inline cache verification when nmethods are unloaded. Due to the nature of CTW, we unload _a lot_ of nmethods. Since the goal for CTW is to stress the compilers themselves, not inline caches in particular (I assume those are blank even, given almost no real code is executed), it makes sense to disable that verification for CTW. 
> > A taste of performance improvement, about 2%: > > > $ time CONF=linux-x86_64-server-fastdebug make test TEST=applications/ctw/modules > > # Current > real 5m1.616s > user 79m41.398s > sys 14m39.607s > > # No verify inline caches > real 4m52.239s > user 77m41.886s > sys 14m25.352s > > > Additional testing: > - [x] Linux x86_64 server {fastdebug,release}, `applications/ctw/modules` Thank you! Integrating now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26016#issuecomment-3019046938 From shade at openjdk.org Mon Jun 30 12:58:52 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 30 Jun 2025 12:58:52 GMT Subject: Integrated: 8360867: CTW: Disable inline cache verification In-Reply-To: <9mywTtljvOZuQZWkTaHipBEeiC_3EMxihOd790Uq79o=.de521785-183c-44f4-a799-9f695922d9e3@github.com> References: <9mywTtljvOZuQZWkTaHipBEeiC_3EMxihOd790Uq79o=.de521785-183c-44f4-a799-9f695922d9e3@github.com> Message-ID: On Fri, 27 Jun 2025 10:30:30 GMT, Aleksey Shipilev wrote: > In CTW profiling, I noticed we spend a lot of time doing inline cache verification when nmethods are unloaded. Due to the nature of CTW, we unload _a lot_ of nmethods. Since the goal for CTW is to stress the compilers themselves, not inline caches in particular (I assume those are blank even, given almost no real code is executed), it makes sense to disable that verification for CTW. > > A taste of performance improvement, about 2%: > > > $ time CONF=linux-x86_64-server-fastdebug make test TEST=applications/ctw/modules > > # Current > real 5m1.616s > user 79m41.398s > sys 14m39.607s > > # No verify inline caches > real 4m52.239s > user 77m41.886s > sys 14m25.352s > > > Additional testing: > - [x] Linux x86_64 server {fastdebug,release}, `applications/ctw/modules` This pull request has now been integrated. 
Changeset: aa191119 Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/aa1911191cf8c2b855268a76baf0757909d66d1b Stats: 6 lines in 3 files changed: 6 ins; 0 del; 0 mod 8360867: CTW: Disable inline cache verification Reviewed-by: kvn, thartmann ------------- PR: https://git.openjdk.org/jdk/pull/26016 From asemenov at openjdk.org Mon Jun 30 13:03:23 2025 From: asemenov at openjdk.org (Artem Semenov) Date: Mon, 30 Jun 2025 13:03:23 GMT Subject: RFR: 8360664: Null pointer dereference in src/hotspot/share/prims/jvmtiTagMap.cpp in IterateOverHeapObjectClosure::do_object() [v3] In-Reply-To: References: Message-ID: > The defect has been detected and confirmed in the function ```IterateOverHeapObjectClosure::do_object()``` located in the file ```src/hotspot/share/prims/jvmtiTagMap.cpp``` with static code analysis. This defect can potentially lead to a null pointer dereference. > > The pointer ```oop o``` is passed to the constructor of the CallbackWrapper class, where it is dereferenced without a null check. 
Artem Semenov has updated the pull request incrementally with one additional commit since the last revision: changed if tu assert ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26002/files - new: https://git.openjdk.org/jdk/pull/26002/files/e69c49c8..88f1e494 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26002&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26002&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/26002.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26002/head:pull/26002 PR: https://git.openjdk.org/jdk/pull/26002 From tschatzl at openjdk.org Mon Jun 30 13:30:42 2025 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 30 Jun 2025 13:30:42 GMT Subject: RFR: 8238687: Investigate memory uncommit during young collections in G1 [v9] In-Reply-To: <24IBqFegflzzJC1jB1t7LFUasQXD59KqTefhtGq71PU=.0dbe6333-bb7b-415f-aa68-b8eeede95627@github.com> References: <24IBqFegflzzJC1jB1t7LFUasQXD59KqTefhtGq71PU=.0dbe6333-bb7b-415f-aa68-b8eeede95627@github.com> Message-ID: On Mon, 30 Jun 2025 12:42:11 GMT, Ivan Walulya wrote: >> Hi all, >> >> Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and Application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1's use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants. >> >> The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio.
>> >> - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly. >> >> - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range. >> >> - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants. >> >> We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead. >> >> Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs. >> >> As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio. >> >> Testing: Mach5 ... > > Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: > > Improve comment Marked as reviewed by tschatzl (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25832#pullrequestreview-2971240450 From yzheng at openjdk.org Mon Jun 30 13:55:27 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Mon, 30 Jun 2025 13:55:27 GMT Subject: RFR: 8357424: [JVMCI] Avoid incrementing decompilation count for hosted compiled nmethod [v8] In-Reply-To: References: Message-ID: > Hosted Truffle compilations are installed on the OptimizedCallTarget#profiledPERoot method. Any deoptimization contributes to its decompile count, which can easily exceed the PerMethodRecompilationCutoff threshold, permanently preventing highest tier compilation on this method. This PR exempts hosted compilations from this cutoff by ensuring their decompile count is not incremented for hosted compiled nmethods. Yudi Zheng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains nine additional commits since the last revision: - Merge tag 'jdk-26+3' into JDK-8357424 Added tag jdk-26+3 for changeset 08b1fa4c - Merge tag 'jdk-26+2' into JDK-8357424 Added tag jdk-26+2 for changeset d7aa3498 - fix compilation error - address comments - Merge remote-tracking branch 'upstream/master' into JDK-8357424 - address comments - address comments - update copyright - [JVMCI] Avoid incrementing decompilation count for hosted compiled nmethod ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25356/files - new: https://git.openjdk.org/jdk/pull/25356/files/f253c0a8..fb32a8c7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25356&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25356&range=06-07 Stats: 18539 lines in 374 files changed: 10948 ins; 5514 del; 2077 mod Patch: https://git.openjdk.org/jdk/pull/25356.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25356/head:pull/25356 PR: https://git.openjdk.org/jdk/pull/25356 From mdoerr at openjdk.org Mon Jun 
30 14:36:40 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 30 Jun 2025 14:36:40 GMT Subject: RFR: 8360707: Globally enumerate all blobs, stubs and entries In-Reply-To: References: Message-ID: <82BcN29PwKPICMZi2GDTyOfgtTyEim1vqPp6-fz6P_I=.6ba272c1-8b2f-4c47-8952-b8058c3a6828@github.com> On Mon, 30 Jun 2025 06:14:32 GMT, Amit Kumar wrote: >> Use the blob, stub and entry declarations to generate a single global enumeration for all blobs, likewise for all stubs and all entries. Modify stub generators in shared runtime, c1 runtime, c2 runtime and stub generator subsystems and their clients to use those enumerations consistently. > > @adinn I got one test failure on s390: `test/hotspot/jtreg/runtime/ErrorHandling/MachCodeFramesInErrorFile.java` > > > java.lang.RuntimeException: 1 < 2 > at jdk.test.lib.Asserts.fail(Asserts.java:715) > at MachCodeFramesInErrorFile.run(MachCodeFramesInErrorFile.java:170) > at MachCodeFramesInErrorFile.main(MachCodeFramesInErrorFile.java:108) > at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) > at java.base/java.lang.reflect.Method.invoke(Method.java:565) > at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:335) > at java.base/java.lang.Thread.run(Thread.java:1474) > > > I didn't hs_err even in full verbose. But attaching overall run in txt file: > [26004_test_failure.txt](https://github.com/user-attachments/files/20973513/26004_test_failure.txt) > @offamitkumar Thanks for testing. I'm looking into the issue and will get back to you. > > @TheRealMDoerr @offamitkumar @RealFYang Would you be able to check this on ppc/riscv to ensure it builds (cross-compile is ok) and passes (at least) tier1? > > Thanks for whatever help you can provide. hotspot tier1 has passed on PPC64. Thanks for the ping! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26004#issuecomment-3019409026 From adinn at openjdk.org Mon Jun 30 15:18:40 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Mon, 30 Jun 2025 15:18:40 GMT Subject: RFR: 8360707: Globally enumerate all blobs, stubs and entries In-Reply-To: <82BcN29PwKPICMZi2GDTyOfgtTyEim1vqPp6-fz6P_I=.6ba272c1-8b2f-4c47-8952-b8058c3a6828@github.com> References: <82BcN29PwKPICMZi2GDTyOfgtTyEim1vqPp6-fz6P_I=.6ba272c1-8b2f-4c47-8952-b8058c3a6828@github.com> Message-ID: On Mon, 30 Jun 2025 14:34:11 GMT, Martin Doerr wrote: >> @adinn I got one test failure on s390: `test/hotspot/jtreg/runtime/ErrorHandling/MachCodeFramesInErrorFile.java` >> >> >> java.lang.RuntimeException: 1 < 2 >> at jdk.test.lib.Asserts.fail(Asserts.java:715) >> at MachCodeFramesInErrorFile.run(MachCodeFramesInErrorFile.java:170) >> at MachCodeFramesInErrorFile.main(MachCodeFramesInErrorFile.java:108) >> at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) >> at java.base/java.lang.reflect.Method.invoke(Method.java:565) >> at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:335) >> at java.base/java.lang.Thread.run(Thread.java:1474) >> >> >> I didn't hs_err even in full verbose. But attaching overall run in txt file: >> [26004_test_failure.txt](https://github.com/user-attachments/files/20973513/26004_test_failure.txt) > >> @offamitkumar Thanks for testing. I'm looking into the issue and will get back to you. >> >> @TheRealMDoerr @offamitkumar @RealFYang Would you be able to check this on ppc/riscv to ensure it builds (cross-compile is ok) and passes (at least) tier1? >> >> Thanks for whatever help you can provide. > > hotspot tier1 has passed on PPC64. Thanks for the ping! @TheRealMDoerr Thanks for testing! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26004#issuecomment-3019572059 From adinn at openjdk.org Mon Jun 30 15:18:39 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Mon, 30 Jun 2025 15:18:39 GMT Subject: RFR: 8360707: Globally enumerate all blobs, stubs and entries In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 15:58:31 GMT, Andrew Dinn wrote: > Use the blob, stub and entry declarations to generate a single global enumeration for all blobs, likewise for all stubs and all entries. Modify stub generators in shared runtime, c1 runtime, c2 runtime and stub generator subsystems and their clients to use those enumerations consistently. @vnkozlov My fastdebug build has passed the following test suites on Linux/aarch64 and Linux/x86: tier1 jtreg/test/hotspot/jtreg/runtime jtreg/test/hotspot/jtreg/compiler Could you review the PR and maybe run it through internal testing? ------------- PR Comment: https://git.openjdk.org/jdk/pull/26004#issuecomment-3019570928 From adinn at openjdk.org Mon Jun 30 15:27:39 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Mon, 30 Jun 2025 15:27:39 GMT Subject: RFR: 8360707: Globally enumerate all blobs, stubs and entries In-Reply-To: References: Message-ID: On Mon, 30 Jun 2025 06:14:32 GMT, Amit Kumar wrote: >> Use the blob, stub and entry declarations to generate a single global enumeration for all blobs, likewise for all stubs and all entries. Modify stub generators in shared runtime, c1 runtime, c2 runtime and stub generator subsystems and their clients to use those enumerations consistently. 
> > @adinn I got one test failure on s390: `test/hotspot/jtreg/runtime/ErrorHandling/MachCodeFramesInErrorFile.java` > > > java.lang.RuntimeException: 1 < 2 > at jdk.test.lib.Asserts.fail(Asserts.java:715) > at MachCodeFramesInErrorFile.run(MachCodeFramesInErrorFile.java:170) > at MachCodeFramesInErrorFile.main(MachCodeFramesInErrorFile.java:108) > at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) > at java.base/java.lang.reflect.Method.invoke(Method.java:565) > at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:335) > at java.base/java.lang.Thread.run(Thread.java:1474) > > > I didn't hs_err even in full verbose. But attaching overall run in txt file: > [26004_test_failure.txt](https://github.com/user-attachments/files/20973513/26004_test_failure.txt) @offamitkumar I'm somewhat puzzled by your test result. It indicates that the hserr file generated during the test contained a stack trace for the crash point listing two compiled Java method frames but that only one [MachCode][/MachCode] listing was generated into the file. That seems a bit odd since the other arches seem to generate at least 2 listings. I am seeing if I can reproduce the error on a borrowed s390 machine. One thing you might be able to try meanwhile, to check what is going on, is to run the test again with the env var DEBUG set: $ DEBUG=debug make test TEST=test/hotspot/jtreg/runtime/ErrorHandling/MachCodeFramesInErrorFile.java This ought to force the contents of the hserr file to be written to System.err, which means they will be visible in file `build/linux-s390x-server-fastdebug/test-support/jtreg_test_hotspot_jtreg_runtime_ErrorHandling_MachCodeFramesInErrorFile_java/runtime/ErrorHandling/MachCodeFramesInErrorFile.jtr` The output may get truncated but even then it still might give us some idea of what is going wrong.
------------- PR Comment: https://git.openjdk.org/jdk/pull/26004#issuecomment-3019608888 From adinn at openjdk.org Mon Jun 30 16:17:42 2025 From: adinn at openjdk.org (Andrew Dinn) Date: Mon, 30 Jun 2025 16:17:42 GMT Subject: RFR: 8360707: Globally enumerate all blobs, stubs and entries In-Reply-To: References: Message-ID: On Mon, 30 Jun 2025 06:14:32 GMT, Amit Kumar wrote: >> Use the blob, stub and entry declarations to generate a single global enumeration for all blobs, likewise for all stubs and all entries. Modify stub generators in shared runtime, c1 runtime, c2 runtime and stub generator subsystems and their clients to use those enumerations consistently. > > @adinn I got one test failure on s390: `test/hotspot/jtreg/runtime/ErrorHandling/MachCodeFramesInErrorFile.java` > > > java.lang.RuntimeException: 1 < 2 > at jdk.test.lib.Asserts.fail(Asserts.java:715) > at MachCodeFramesInErrorFile.run(MachCodeFramesInErrorFile.java:170) > at MachCodeFramesInErrorFile.main(MachCodeFramesInErrorFile.java:108) > at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) > at java.base/java.lang.reflect.Method.invoke(Method.java:565) > at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:335) > at java.base/java.lang.Thread.run(Thread.java:1474) > > > I didn't hs_err even in full verbose. But attaching overall run in txt file: > [26004_test_failure.txt](https://github.com/user-attachments/files/20973513/26004_test_failure.txt) @offamitkumar I managed to run the test on an s390x build and it passed. I didn't get any hserr output in the jtr file when I set DEBUG on the command line. So, I modified the test to write the hserr file to System.err unconditionally (i.e. I changed the if at line 137 of MachCodeFramesInErrorFile.java to 'if (true)').
The System.err output in the jtr file contained [MachCode] sections for 4 compiled methods, including the two that appeared in the stack listing (crashInJava3 and crashInJava2). So, I'm not sure why it is failing. n.b. not all the System.err output ends up in the jtr file because the test harness truncates excessive output. However, I wonder if the problem you are seeing is because the original hserr file is being truncated. It includes a message to that effect starting "Output overflow: ..." If that is the case and if you can reproduce the problem after modifying the test to write the hserr contents unconditionally then you ought to be able to see all the file contents by setting system property javatest.maxOutputSize to a suitable value (default is 100000). ------------- PR Comment: https://git.openjdk.org/jdk/pull/26004#issuecomment-3019815902 From sviswanathan at openjdk.org Mon Jun 30 16:59:40 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Mon, 30 Jun 2025 16:59:40 GMT Subject: RFR: 8360776: Disable Intel APX by default and enable it only if requested by the user using -XX:+UnlockExperimentalVMOptions -XX:+UseAPX In-Reply-To: <36HotwhfTE2LRbNu1-KPRyspE4sNQB3hMxMo3eWmabY=.ac019cfe-5820-4afc-be53-3ed90a4381a6@github.com> References: <0FxXpxiS5AsdMzFmEkhVGhWBPyB9H2n7N74iAoAmXdg=.5aca9c53-24d7-42ff-b1ed-072dd039ae1b@github.com> <36HotwhfTE2LRbNu1-KPRyspE4sNQB3hMxMo3eWmabY=.ac019cfe-5820-4afc-be53-3ed90a4381a6@github.com> Message-ID: On Mon, 30 Jun 2025 06:54:42 GMT, David Holmes wrote: >> Currently, APX is not enabled consistently between product and debug builds. >> >> If the hardware supports Intel APX: >> >> 1) In product builds, APX is disabled by default, even if the user explicitly enables it using `-XX:+UnlockExperimentalVMOptions -XX:+UseAPX`. >> >> 2) In debug builds, APX is enabled by default regardless of whether the user explicitly enables it or not. 
>> >> **The goal of this PR is to enable APX for both product and debug builds if and only if the user explicitly enables it using `-XX:+UnlockExperimentalVMOptions -XX:+UseAPX`.** > > src/hotspot/cpu/x86/vm_version_x86.cpp line 3155: > >> 3153: } >> 3154: // Enable APX support for product builds after >> 3155: // completion of planned features listed in JDK-8329030. > > So you have decided not to follow the original plan ( as JDK-8329030 is not complete) and instead go ahead and enable APX in product mode now. Why? Was this discussed anywhere? @dholmes-ora The remaining items in JDK-8329030 are nice to have. I will update JDK-8329030 accordingly to reflect this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26029#discussion_r2175521518 From sviswanathan at openjdk.org Mon Jun 30 17:09:40 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Mon, 30 Jun 2025 17:09:40 GMT Subject: RFR: 8360775: Fix Shenandoah GC test failures when APX is enabled In-Reply-To: References: <66o1iImVgzmTapY0AEGZeAg_VTj4ZbRc1MSFvgA8qYk=.ab11bc1a-3cb0-4832-82da-4e97ee8aaf9b@github.com> Message-ID: <4JWUyNwX9neYkTIymwpEpaXMig0AGc5ylQvPlWjqLR0=.5ffb5edb-3bb6-4d36-b3a3-820b4cafd704@github.com> On Mon, 30 Jun 2025 09:26:17 GMT, Jatin Bhateja wrote: >> This PR fixes the test failures seen in many JTreg tests related to Shenandoah GC (`test/hotspot/jtreg/gc/shenandoah/`) with UseAPX. The issues were root caused to: >> >> 1. Higher band registers are not saved and restored in Shenandoah load_reference_barrier. >> 2. Pusha/Popa implementation using push2p/pop2p >> >> Both the issues are fixed in this PR. > > src/hotspot/cpu/x86/assembler_x86.cpp line 15675: > >> 15673: void Assembler::pusha_uncached() { // 64bit >> 15674: if (UseAPX) { >> 15675: // Data being pushed by PUSH2 must be 16B-aligned on the stack, for this push rax upfront > > Hi @vamsi-parasa , > > PUSHA / POPA assembler is agnostic to the use of hardcoded registers in calling context, e.g. 
in the following line of code > https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp#L495 > > If dst and tmp1 are RAX then we end up corrupting it, since RAX is used as a scratch register for stack alignment, and if RAX holds an oop pointer then we may see random crashes. Such idioms are currently limited to GC barriers, and we recently fixed one such issue in https://github.com/openjdk/jdk/pull/25351 > > While the instruction sequence of PUSHA/POPA with PPX hints is correct, do you think that for the time being we should limit the scope of this fix to the save_machine_state and restore_machine_state routines rather than making a generic fix in pusha/popa? > > I have tried it and it's working. @jatin-bhateja Pusha is not expected to change any registers. An inadvertent change of registers is very hard to debug. So in my view it is better to have a conservative implementation for now that doesn't change the RAX register. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26009#discussion_r2175537445 From sparasa at openjdk.org Mon Jun 30 18:30:24 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Mon, 30 Jun 2025 18:30:24 GMT Subject: RFR: 8360776: Enable -XX:+UseAPX as experimental feature in all builds [v2] In-Reply-To: <0FxXpxiS5AsdMzFmEkhVGhWBPyB9H2n7N74iAoAmXdg=.5aca9c53-24d7-42ff-b1ed-072dd039ae1b@github.com> References: <0FxXpxiS5AsdMzFmEkhVGhWBPyB9H2n7N74iAoAmXdg=.5aca9c53-24d7-42ff-b1ed-072dd039ae1b@github.com> Message-ID: <89lL3KxdJaMfke0irOMa24-3_ZbTifpXAXCDq1TGvNQ=.50ab6ae7-9d4a-426e-8832-a7692acbd921@github.com> > Currently, APX is not enabled consistently between product and debug builds. > > If the hardware supports Intel APX: > > 1) In product builds, APX is disabled by default, even if the user explicitly enables it using `-XX:+UnlockExperimentalVMOptions -XX:+UseAPX`. 
> > 2) In debug builds, APX is enabled by default regardless of whether the user explicitly enables it or not. > > **The goal of this PR is to enable APX for both product and debug builds if and only if the user explicitly enables it using `-XX:+UnlockExperimentalVMOptions -XX:+UseAPX`.** Srinivas Vamsi Parasa has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: 8360776: Enable -XX:+UseAPX as experimental feature in all builds ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26029/files - new: https://git.openjdk.org/jdk/pull/26029/files/fd3f346c..601ac00f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26029&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26029&range=00-01 Stats: 4 lines in 2 files changed: 0 ins; 4 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26029.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26029/head:pull/26029 PR: https://git.openjdk.org/jdk/pull/26029 From sparasa at openjdk.org Mon Jun 30 18:30:24 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Mon, 30 Jun 2025 18:30:24 GMT Subject: RFR: 8360776: Enable -XX:+UseAPX as experimental feature in all builds In-Reply-To: References: <0FxXpxiS5AsdMzFmEkhVGhWBPyB9H2n7N74iAoAmXdg=.5aca9c53-24d7-42ff-b1ed-072dd039ae1b@github.com> Message-ID: On Sat, 28 Jun 2025 08:03:30 GMT, Jatin Bhateja wrote: > **Please change the title as "8360776: Enable -XX+UseAPX as Experiminatal feature in all builds"** > > We should also backport it to JDK-25 before RDP2 Please see the updated title as suggested. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26029#issuecomment-3020271154 From sparasa at openjdk.org Mon Jun 30 18:30:24 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Mon, 30 Jun 2025 18:30:24 GMT Subject: RFR: 8360776: Enable -XX:+UseAPX as experimental feature in all builds [v2] In-Reply-To: References: <0FxXpxiS5AsdMzFmEkhVGhWBPyB9H2n7N74iAoAmXdg=.5aca9c53-24d7-42ff-b1ed-072dd039ae1b@github.com> Message-ID: On Sat, 28 Jun 2025 08:04:59 GMT, Jatin Bhateja wrote: >> Srinivas Vamsi Parasa has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> 8360776: Enable -XX:+UseAPX as experimental feature in all builds > > src/hotspot/os_cpu/bsd_x86/os_bsd_x86.cpp line 432: > >> 430: } >> 431: >> 432: #if defined(_LP64) > > Is it still required after the removal of the 32-bit port of x86? Pls see the `#if defined(_LP64)` check removed in the updated code. > src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp line 258: > >> 256: } >> 257: >> 258: #if defined(_LP64) > > Do we still need this after removal of 32-bit port of x86 ? Pls see the `#if defined(_LP64)` check removed in the updated code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26029#discussion_r2175653338 PR Review Comment: https://git.openjdk.org/jdk/pull/26029#discussion_r2175653164 From kvn at openjdk.org Mon Jun 30 18:30:43 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 30 Jun 2025 18:30:43 GMT Subject: RFR: 8360707: Globally enumerate all blobs, stubs and entries In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 15:58:31 GMT, Andrew Dinn wrote: > Use the blob, stub and entry declarations to generate a single global enumeration for all blobs, likewise for all stubs and all entries. 
Modify stub generators in shared runtime, c1 runtime, c2 runtime and stub generator subsystems and their clients to use those enumerations consistently. I submitted testing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26004#issuecomment-3020285461 From amenkov at openjdk.org Mon Jun 30 18:42:41 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Mon, 30 Jun 2025 18:42:41 GMT Subject: RFR: 8359870: JVM crashes in AccessInternal::PostRuntimeDispatch [v7] In-Reply-To: References: <0Q1PsFVW1d6lcQrKVhdLaNgHpLJXo2WusX6L9UOe5zo=.8903b8aa-57aa-4c5b-90c6-bd0f9c31de90@github.com> Message-ID: On Mon, 30 Jun 2025 11:24:28 GMT, Kevin Walls wrote: >> ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native VM JavaThread from a java.lang.Thread. This is hard to reproduce but a thread that has since terminated can provoke a crash. Recognise this and return a null ThreadSnapshot. > > Kevin Walls has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 13 additional commits since the last revision: > > - remove test definition changes > - TLH: use cv_internal_thread_to_JavaThread() > - Merge remote-tracking branch 'upstream/master' into 8359870_threadexited > - Test requires: permit linux debug testing > - comment update > - comment update > - newline > - Test fails on minimal VM: require jvmti feature > - Correct THROW macro > - ThreadDumper thread count > - ... and 3 more: https://git.openjdk.org/jdk/compare/9337a35b...e2043438 Marked as reviewed by amenkov (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25958#pullrequestreview-2972270226 From sviswanathan at openjdk.org Mon Jun 30 20:07:42 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Mon, 30 Jun 2025 20:07:42 GMT Subject: RFR: 8360776: Enable -XX:+UseAPX as experimental feature in all builds [v2] In-Reply-To: <89lL3KxdJaMfke0irOMa24-3_ZbTifpXAXCDq1TGvNQ=.50ab6ae7-9d4a-426e-8832-a7692acbd921@github.com> References: <0FxXpxiS5AsdMzFmEkhVGhWBPyB9H2n7N74iAoAmXdg=.5aca9c53-24d7-42ff-b1ed-072dd039ae1b@github.com> <89lL3KxdJaMfke0irOMa24-3_ZbTifpXAXCDq1TGvNQ=.50ab6ae7-9d4a-426e-8832-a7692acbd921@github.com> Message-ID: On Mon, 30 Jun 2025 18:30:24 GMT, Srinivas Vamsi Parasa wrote: >> Currently, APX is not enabled consistently between product and debug builds. >> >> If the hardware supports Intel APX: >> >> 1) In product builds, APX is disabled by default, even if the user explicitly enables it using `-XX:+UnlockExperimentalVMOptions -XX:+UseAPX`. >> >> 2) In debug builds, APX is enabled by default regardless of whether the user explicitly enables it or not. >> >> **The goal of this PR is to enable APX for both product and debug builds if and only if the user explicitly enables it using `-XX:+UnlockExperimentalVMOptions -XX:+UseAPX`.** > > Srinivas Vamsi Parasa has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > 8360776: Enable -XX:+UseAPX as experimental feature in all builds Looks good to me. ------------- Marked as reviewed by sviswanathan (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/26029#pullrequestreview-2972481212 From iklam at openjdk.org Mon Jun 30 20:09:40 2025 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 30 Jun 2025 20:09:40 GMT Subject: RFR: 8358680: AOT cache creation fails: no strings should have been added In-Reply-To: <1NT3dmBpabeSJ0HMglupev2ONGKaMH9XuMKZYiBwqZw=.32eb98c0-2ed1-46ef-b4b1-166e7d3f791d@github.com> References: <1NT3dmBpabeSJ0HMglupev2ONGKaMH9XuMKZYiBwqZw=.32eb98c0-2ed1-46ef-b4b1-166e7d3f791d@github.com> Message-ID: On Wed, 18 Jun 2025 17:27:29 GMT, Aleksey Shipilev wrote: >> Background: when writing the string table in the AOT cache, we do this: >> >> 1. Find out the number of strings in the interned string table >> 2. Allocate Java object arrays that are large enough to store these strings. These arrays are used by `StringTable::lookup_shared()` in the production run. >> 3. Enter safepoint >> 4. Copy the strings into the arrays >> >> This bug happened because: >> >> - Step 1 is not thread safe, so it may be reading a stale version of `_items_count` >> - JIT compiler threads may create more interned strings after step 1 >> >> This PR attempts to fix both issues. > > I still dislike hooking up to compiler infrastructure to figure out if something is adding interned strings. I really, really dislike the divergence we would introduce with JDK 25 -> JDK 26 once a variant of [JDK-8357473](https://bugs.openjdk.org/browse/JDK-8357473) lands in mainline. I cannot yet think of better solution though, let me think about it some more. At very least we need to get the sequencing of patches right... 
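A minimal sketch of the count-then-copy hazard listed in the background steps above, in plain single-threaded C++ rather than HotSpot code. `InternTable`, `snapshot_capacity` and `copy_at_safepoint` are hypothetical names introduced only for illustration, and the re-check shown is just one way to detect the mismatch; it is not claimed to be what the PR does.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the interned string table (single-threaded model).
struct InternTable {
  std::vector<std::string> items;
  std::atomic<std::size_t> items_count{0};

  void intern(const std::string& s) {
    items.push_back(s);
    items_count.fetch_add(1, std::memory_order_release);
  }
};

// Step 1: read the count that is used to size the arrays.
// The value can be out of date by the time the copy happens.
std::size_t snapshot_capacity(const InternTable& t) {
  return t.items_count.load(std::memory_order_acquire);
}

// Step 4: copy at the (simulated) safepoint. Report failure instead of
// overflowing when strings were added after the capacity was computed.
bool copy_at_safepoint(const InternTable& t, std::size_t capacity,
                       std::vector<std::string>& out) {
  if (t.items.size() > capacity) {
    return false;  // strings were added between step 1 and step 4
  }
  out.assign(t.items.begin(), t.items.end());
  return true;
}
```

Interning one more string between `snapshot_capacity` and `copy_at_safepoint` makes the copy report failure, which mirrors the "no strings should have been added" condition in the bug title.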
Ping @shipilev ------------- PR Comment: https://git.openjdk.org/jdk/pull/25816#issuecomment-3020538804 From kvn at openjdk.org Mon Jun 30 20:30:40 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 30 Jun 2025 20:30:40 GMT Subject: RFR: 8360707: Globally enumerate all blobs, stubs and entries In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 15:58:31 GMT, Andrew Dinn wrote: > Use the blob, stub and entry declarations to generate a single global enumeration for all blobs, likewise for all stubs and all entries. Modify stub generators in shared runtime, c1 runtime, c2 runtime and stub generator subsystems and their clients to use those enumerations consistently. src/hotspot/cpu/x86/stubDeclarations_x86.hpp line 121: > 119: vector_byte_shuffle_mask, vector_byte_shuffle_mask) \ > 120: do_stub(compiler, vector_short_shuffle_mask) \ > 121: do_arch_entry(x86, compiler, vector_short_shuffle_mask, \ Was this a bug? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26004#discussion_r2175837129 From sspitsyn at openjdk.org Mon Jun 30 20:42:39 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 30 Jun 2025 20:42:39 GMT Subject: RFR: 8359366: RunThese30M.java EXCEPTION_ACCESS_VIOLATION in JvmtiBreakpoints::clearall_in_class_at_safepoint In-Reply-To: <_rEYQYLuKwA8su1Kc9Vi-m1kBUVieilE808P1RfkOj8=.7566a310-66d8-4db9-b4cf-7b6e7d742c56@github.com> References: <_rEYQYLuKwA8su1Kc9Vi-m1kBUVieilE808P1RfkOj8=.7566a310-66d8-4db9-b4cf-7b6e7d742c56@github.com> Message-ID: On Mon, 30 Jun 2025 05:53:59 GMT, Leonid Mesnik wrote: > I recently updated the RunThese test to have more testing in this area. As I understand it, you've added some specific stress testing to reliably catch issues like this one. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26031#issuecomment-3020619772 From kvn at openjdk.org Mon Jun 30 20:49:39 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 30 Jun 2025 20:49:39 GMT Subject: RFR: 8360707: Globally enumerate all blobs, stubs and entries In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 15:58:31 GMT, Andrew Dinn wrote: > Use the blob, stub and entry declarations to generate a single global enumeration for all blobs, likewise for all stubs and all entries. Modify stub generators in shared runtime, c1 runtime, c2 runtime and stub generator subsystems and their clients to use those enumerations consistently. src/hotspot/share/runtime/stubCodeGenerator.hpp line 109: > 107: bool _print_code; > 108: BlobId _blob_id; > 109: protected: Please restore the spacing for the `protected:` line. src/hotspot/share/runtime/stubCodeGenerator.hpp line 118: > 116: > 117: MacroAssembler* assembler() const { return _masm; } > 118: BlobId blob_id() { return _blob_id; } Align the bodies of the methods. Maybe move them to the left - I don't see why we have such big spacing. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26004#discussion_r2175890898 PR Review Comment: https://git.openjdk.org/jdk/pull/26004#discussion_r2175887035 From sspitsyn at openjdk.org Mon Jun 30 20:50:38 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 30 Jun 2025 20:50:38 GMT Subject: RFR: 8359366: RunThese30M.java EXCEPTION_ACCESS_VIOLATION in JvmtiBreakpoints::clearall_in_class_at_safepoint In-Reply-To: References: Message-ID: <63esK88MOSe9tGzxAaYXMexIoPU5ojz0O0KyYiFchSc=.5c2e0cbc-8fc2-4fd2-b633-204ea8f1080c@github.com> On Sat, 28 Jun 2025 05:02:56 GMT, Leonid Mesnik wrote: > The segv/eav happens if the class of JvmtiBreakpoint::_method is redefined (so the method becomes old) between getting the Method* from the jmethodID in > JvmtiEnv::SetBreakpoint(Method* method, jlocation location) {..} > and actually setting the breakpoint in the VM operation VM_ChangeBreakpoints. > > Here are the details: > The breakpoint is set in 2 steps. > 1) The method jvmti_SetBreakpoint(jvmtiEnv* env, jmethodID method, jlocation location) converts the jmethodID to a Method* and calls > JvmtiEnv::SetBreakpoint(Method* method, jlocation location) > where > JvmtiBreakpoint bp(method, location); > is created with this Method* > Note: this is done while the thread is in VM state, so the Method can't become is_old while this is done. > > 2) A VM operation is used to add the breakpoint to the list > VM_ChangeBreakpoints set_breakpoint(VM_ChangeBreakpoints::SET_BREAKPOINT, &bp); > VMThread::execute(&set_breakpoint); > to call JvmtiBreakpoints::set_at_safepoint() > that can modify the JvmtiBreakpoints list and set the breakpoint at a safepoint without synchronization. 
> > So it is possible that a VM_RedefineClasses operation that redefines the class with this breakpoint happens between steps 1) and 2). > VM_RedefineClasses::redefine_single_class() > clears all class-related breakpoints in JvmtiBreakpoints; however, the "problematic" breakpoint request is already in the VMThread queue, so that operation still proceeds. > So in step 2) the JvmtiBreakpoint with the 'is_old' method is added to JvmtiBreakpoints and the breakpoint is set. > > The old methods might then be purged at any time once they are no longer on the stack, and any access to this breakpoint could lead to use of a Method* _method pointing to deallocated metaspace. > > VM_RedefineClasses clears all breakpoints, so it is correct to simply not proceed with the current breakpoint either. > > This looks very unlikely, but it reproduces with a stress test after some time. > Verified that the crash no longer reproduces with the corresponding test after the fix. > > Many thanks to Coleen for the detailed explanation of class redefinition. The issue itself is kind of artificial. The debugger should not set breakpoints concurrently with class redefinitions. But suppose this can really happen. Then I have a slight preference to fix this by setting a breakpoint after redefinition. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26031#issuecomment-3020644976 From kvn at openjdk.org Mon Jun 30 20:53:40 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 30 Jun 2025 20:53:40 GMT Subject: RFR: 8360707: Globally enumerate all blobs, stubs and entries In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 15:58:31 GMT, Andrew Dinn wrote: > Use the blob, stub and entry declarations to generate a single global enumeration for all blobs, likewise for all stubs and all entries. Modify stub generators in shared runtime, c1 runtime, c2 runtime and stub generator subsystems and their clients to use those enumerations consistently. 
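The idea in the quoted PR description, generating one global enumeration from a declaration list, can be sketched with a generic X-macro. This is only an illustration of the technique: `DEMO_STUBS_DO`, `DemoStubId`, `demo_stub_name` and the listed blob/stub pairs are hypothetical and are not the macros or ids used in the actual HotSpot sources.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical declaration list: each entry names a blob and a stub within it.
#define DEMO_STUBS_DO(do_stub)     \
  do_stub(shared, wrong_method)    \
  do_stub(c1, unwind_exception)    \
  do_stub(c2, uncommon_trap)

// Expand the list once into a single global enumeration...
#define DEMO_ENUM_ENTRY(blob, stub) blob##_##stub##_id,
enum class DemoStubId : int {
  DEMO_STUBS_DO(DEMO_ENUM_ENTRY)
  NUM_STUBIDS
};
#undef DEMO_ENUM_ENTRY

// ...and again into a parallel name table, so ids and names cannot drift apart.
#define DEMO_NAME_ENTRY(blob, stub) #blob "::" #stub,
static const char* const demo_stub_names[] = {
  DEMO_STUBS_DO(DEMO_NAME_ENTRY)
};
#undef DEMO_NAME_ENTRY

// Look up the printable name for an id generated from the same list.
inline const char* demo_stub_name(DemoStubId id) {
  return demo_stub_names[static_cast<int>(id)];
}
```

Because both expansions walk the same list, adding an entry to `DEMO_STUBS_DO` updates the enumeration and the name table together.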
src/hotspot/share/runtime/stubDeclarations.hpp line 1235: > 1233: do_arch_entry, do_arch_entry_init) \ > 1234: > 1235: This empty line is not needed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26004#discussion_r2175905951 From kvn at openjdk.org Mon Jun 30 21:01:41 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 30 Jun 2025 21:01:41 GMT Subject: RFR: 8360707: Globally enumerate all blobs, stubs and entries In-Reply-To: References: Message-ID: On Thu, 26 Jun 2025 15:58:31 GMT, Andrew Dinn wrote: > Use the blob, stub and entry declarations to generate a single global enumeration for all blobs, likewise for all stubs and all entries. Modify stub generators in shared runtime, c1 runtime, c2 runtime and stub generator subsystems and their clients to use those enumerations consistently. I have a few comments. src/hotspot/share/runtime/stubRoutines.cpp line 235: > 233: SharedRuntime::_jbyte_array_copy_ctr++; // Slow-path byte array copy > 234: #endif // !PRODUCT > 235: Copy::conjoint_jbytes_atomic(src, dest, count); Why did you remove the leading spaces here and in the following methods? src/hotspot/share/runtime/stubRoutines.cpp line 378: > 376: #define RETURN_STUB_PARM(xxx_arraycopy, parm) { \ > 377: name = parm ? #xxx_arraycopy "_uninit": #xxx_arraycopy; \ > 378: return StubRoutines::xxx_arraycopy(parm); } These spacing changes are not needed - it was fine. 
------------- PR Review: https://git.openjdk.org/jdk/pull/26004#pullrequestreview-2972679521 PR Review Comment: https://git.openjdk.org/jdk/pull/26004#discussion_r2175920811 PR Review Comment: https://git.openjdk.org/jdk/pull/26004#discussion_r2175923345 From lmesnik at openjdk.org Mon Jun 30 21:15:40 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Mon, 30 Jun 2025 21:15:40 GMT Subject: RFR: 8359366: RunThese30M.java EXCEPTION_ACCESS_VIOLATION in JvmtiBreakpoints::clearall_in_class_at_safepoint In-Reply-To: <63esK88MOSe9tGzxAaYXMexIoPU5ojz0O0KyYiFchSc=.5c2e0cbc-8fc2-4fd2-b633-204ea8f1080c@github.com> References: <63esK88MOSe9tGzxAaYXMexIoPU5ojz0O0KyYiFchSc=.5c2e0cbc-8fc2-4fd2-b633-204ea8f1080c@github.com> Message-ID: On Mon, 30 Jun 2025 20:47:35 GMT, Serguei Spitsyn wrote: > The issue itself is kind of artificial. The debugger should not set breakpoints concurrently with class redefinitions. But suppose this can really happen. Then I have a slight preference to fix this by setting a breakpoint after redefinition when it is possible. That might happen if another javaagent requests retransformation independently of the debugger agent. It is indeed quite rare, but it might happen. I planned to change the fix to use methodHandle instead of jmethodID. However, that means the breakpoint has to be reconstructed with the new method. I think that is too complicated and can't be tested well. Let me publish the second approach so we can decide what would be better to do with the breakpoint. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26031#issuecomment-3020763227 From lmesnik at openjdk.org Mon Jun 30 21:20:55 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Mon, 30 Jun 2025 21:20:55 GMT Subject: RFR: 8359366: RunThese30M.java EXCEPTION_ACCESS_VIOLATION in JvmtiBreakpoints::clearall_in_class_at_safepoint [v2] In-Reply-To: References: Message-ID: > The segv/eav happens if the class of JvmtiBreakpoint::_method is redefined (so the method becomes old) between getting the Method* from the jmethodID in > JvmtiEnv::SetBreakpoint(Method* method, jlocation location) {..} > and actually setting the breakpoint in the VM operation VM_ChangeBreakpoints. > > Here are the details: > The breakpoint is set in 2 steps. > 1) The method jvmti_SetBreakpoint(jvmtiEnv* env, jmethodID method, jlocation location) converts the jmethodID to a Method* and calls > JvmtiEnv::SetBreakpoint(Method* method, jlocation location) > where > JvmtiBreakpoint bp(method, location); > is created with this Method* > Note: this is done while the thread is in VM state, so the Method can't become is_old while this is done. > > 2) A VM operation is used to add the breakpoint to the list > VM_ChangeBreakpoints set_breakpoint(VM_ChangeBreakpoints::SET_BREAKPOINT, &bp); > VMThread::execute(&set_breakpoint); > to call JvmtiBreakpoints::set_at_safepoint() > that can modify the JvmtiBreakpoints list and set the breakpoint at a safepoint without synchronization. > > So it is possible that a VM_RedefineClasses operation that redefines the class with this breakpoint happens between steps 1) and 2). > VM_RedefineClasses::redefine_single_class() > clears all class-related breakpoints in JvmtiBreakpoints; however, the "problematic" breakpoint request is already in the VMThread queue, so that operation still proceeds. > So in step 2) the JvmtiBreakpoint with the 'is_old' method is added to JvmtiBreakpoints and the breakpoint is set. 
> > The old methods might then be purged at any time once they are no longer on the stack, and any access to this breakpoint could lead to use of a Method* _method pointing to deallocated metaspace. > > VM_RedefineClasses clears all breakpoints, so it is correct to simply not proceed with the current breakpoint either. > > This looks very unlikely, but it reproduces with a stress test after some time. > Verified that the crash no longer reproduces with the corresponding test after the fix. > > Many thanks to Coleen for the detailed explanation of class redefinition. Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: changed jmethodid to methodHandle. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26031/files - new: https://git.openjdk.org/jdk/pull/26031/files/a6a9f01c..e1524bbd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26031&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26031&range=00-01 Stats: 16 lines in 2 files changed: 8 ins; 2 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/26031.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26031/head:pull/26031 PR: https://git.openjdk.org/jdk/pull/26031 From lmesnik at openjdk.org Mon Jun 30 21:20:55 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Mon, 30 Jun 2025 21:20:55 GMT Subject: RFR: 8359366: RunThese30M.java EXCEPTION_ACCESS_VIOLATION in JvmtiBreakpoints::clearall_in_class_at_safepoint In-Reply-To: References: Message-ID: On Sat, 28 Jun 2025 05:02:56 GMT, Leonid Mesnik wrote: > The segv/eav happens if the class of JvmtiBreakpoint::_method is redefined (so the method becomes old) between getting the Method* from the jmethodID in > JvmtiEnv::SetBreakpoint(Method* method, jlocation location) {..} > and actually setting the breakpoint in the VM operation VM_ChangeBreakpoints. > > Here are the details: > The breakpoint is set in 2 steps. 
> 1) The method jvmti_SetBreakpoint(jvmtiEnv* env, jmethodID method, jlocation location) converts the jmethodID to a Method* and calls > JvmtiEnv::SetBreakpoint(Method* method, jlocation location) > where > JvmtiBreakpoint bp(method, location); > is created with this Method* > Note: this is done while the thread is in VM state, so the Method can't become is_old while this is done. > > 2) A VM operation is used to add the breakpoint to the list > VM_ChangeBreakpoints set_breakpoint(VM_ChangeBreakpoints::SET_BREAKPOINT, &bp); > VMThread::execute(&set_breakpoint); > to call JvmtiBreakpoints::set_at_safepoint() > that can modify the JvmtiBreakpoints list and set the breakpoint at a safepoint without synchronization. > > So it is possible that a VM_RedefineClasses operation that redefines the class with this breakpoint happens between steps 1) and 2). > VM_RedefineClasses::redefine_single_class() > clears all class-related breakpoints in JvmtiBreakpoints; however, the "problematic" breakpoint request is already in the VMThread queue, so that operation still proceeds. > So in step 2) the JvmtiBreakpoint with the 'is_old' method is added to JvmtiBreakpoints and the breakpoint is set. > > The old methods might then be purged at any time once they are no longer on the stack, and any access to this breakpoint could lead to use of a Method* _method pointing to deallocated metaspace. > > VM_RedefineClasses clears all breakpoints, so it is correct to simply not proceed with the current breakpoint either. > > This looks very unlikely, but it reproduces with a stress test after some time. > Verified that the crash no longer reproduces with the corresponding test after the fix. > > Many thanks to Coleen for the detailed explanation of class redefinition. The fix was updated to use methodHandle instead of jmethodID. 
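The two-step race in the quoted description can be modelled in a few lines of plain C++. This is a simplified sketch of the first approach mentioned there (do not proceed with a breakpoint whose method went stale), not the methodHandle variant and not HotSpot code; `DemoMethod`, `DemoBreakpoint` and `DemoBreakpoints` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical model of a method that class redefinition can mark as old.
struct DemoMethod {
  bool is_old = false;
};

// A queued breakpoint request; the shared_ptr stands in for the Method* it holds.
struct DemoBreakpoint {
  std::shared_ptr<DemoMethod> method;
  int location;
};

struct DemoBreakpoints {
  std::vector<DemoBreakpoint> installed;

  // Models the redefinition operation: mark the method old and
  // clear every installed breakpoint that refers to it.
  void redefine(const std::shared_ptr<DemoMethod>& m) {
    m->is_old = true;
    std::vector<DemoBreakpoint> kept;
    for (const DemoBreakpoint& bp : installed) {
      if (bp.method != m) kept.push_back(bp);
    }
    installed.swap(kept);
  }

  // Models step 2: the queued request runs later, so it must re-check
  // whether its method became old in the meantime and skip it if so.
  bool set_at_safepoint(const DemoBreakpoint& bp) {
    if (bp.method->is_old) return false;
    installed.push_back(bp);
    return true;
  }
};
```

If `redefine` runs between creating the request and calling `set_at_safepoint`, the request is dropped instead of installing a breakpoint on a stale method.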
------------- PR Comment: https://git.openjdk.org/jdk/pull/26031#issuecomment-3020794440 From dlong at openjdk.org Mon Jun 30 22:05:49 2025 From: dlong at openjdk.org (Dean Long) Date: Mon, 30 Jun 2025 22:05:49 GMT Subject: RFR: 8358821: patch_verified_entry causes problems, use nmethod entry barriers instead [v9] In-Reply-To: References: Message-ID: On Mon, 23 Jun 2025 19:26:11 GMT, Dean Long wrote: >> This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier. >> >> We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier. >> >> The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap. >> >> For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance. >> >> This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. 
The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels. > > Dean Long has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: > > - Merge branch 'master' into 8358821-patch-verified-entry > - 2nd try at arm fix > - rename arm_with to guard_with > - arm32 fix > - s390 fix courtesy of Amit Kumar > - remove is_sigill_not_entrant > - more cleanup > - more TheRealMDoerr suggestions > - TheRealMDoerr suggestions > - remove trailing space > - ... and 6 more: https://git.openjdk.org/jdk/compare/6df0f5e3...a39c458c I would be OK with JDK-8258229 being backed out. I sent a message to @mhaessig asking what he thinks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25764#issuecomment-3020951767 From liach at openjdk.org Mon Jun 30 22:06:25 2025 From: liach at openjdk.org (Chen Liang) Date: Mon, 30 Jun 2025 22:06:25 GMT Subject: RFR: 8360163: Create annotations to mark dumping method handles and runtime setup required classes [v2] In-Reply-To: References: Message-ID: > I have updated this patch to avoid a redundant `runtimeSetup` annotation - we have agreed that the requirement for setup is a side effect of initialization, and such methods in AOTCI classes must be automatically recognized. This latest revision implements that model. > > I intentionally avoided handling Class and ClassLoader `resetArchivedStates` and `MethodType::assemblySetup` - we talked about a generic `assemblyCleanup` method, but I did not find the best place to call such a method in the assembly phase. We can handle this in a subsequent patch. > > In particular, please review the new AOT.md design document - I split it from the AOTCI annotation to prevent jamming; we can put general AOT information there when we have more AOT-specific annotations. 
> > --- > > Old description: > Currently, the list of classes that have interdependencies and those that need runtimeSetup are maintained in a hardcoded list in CDS. This makes it risky for core library developers as they might introduce new interdependencies and observe CDS to fail. By moving the mechanism of these lists to core library annotations as a first step, we can gradually expose the AOT contracts as program semantics described by internal annotations, and also helps us to explore how we can expose these functionalities to the public later. Chen Liang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 14 additional commits since the last revision: - Mark AbstractMap due to CHM - Try fix cpp code - Merge branch 'master' of https://github.com/openjdk/jdk into exp/cds-mh-anno - Scan runtimeSetup - Merge branch 'master' of https://github.com/openjdk/jdk into exp/cds-mh-anno - Name this AOTCI - Rename MHArchived to AotInitializable - Years - Can this fix? - Seems redundant - ... 
and 4 more: https://git.openjdk.org/jdk/compare/a904e14d...984ff268 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25922/files - new: https://git.openjdk.org/jdk/pull/25922/files/5403f602..984ff268 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25922&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25922&range=00-01 Stats: 8107 lines in 401 files changed: 3916 ins; 2681 del; 1510 mod Patch: https://git.openjdk.org/jdk/pull/25922.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25922/head:pull/25922 PR: https://git.openjdk.org/jdk/pull/25922 From liach at openjdk.org Mon Jun 30 22:06:26 2025 From: liach at openjdk.org (Chen Liang) Date: Mon, 30 Jun 2025 22:06:26 GMT Subject: RFR: 8360163: Create annotations to mark dumping method handles and runtime setup required classes In-Reply-To: References: Message-ID: <-xuyZAOCNWJ9s7LID1d8L4qGP1f-Dw7bW0RULCfmr3g=.3b1603d1-3fe2-477d-b517-881d4d74fd1a@github.com> On Sat, 21 Jun 2025 00:03:26 GMT, Chen Liang wrote: > I have updated this patch to avoid a redundant `runtimeSetup` annotation - we have agreed that the requirement for setup is a side effect of initialization, and such methods in AOTCI classes must be automatically recognized. This latest revision implements that model. > > I intentionally avoided handling Class and ClassLoader `resetArchivedStates` and `MethodType::assemblySetup` - we talked about a generic `assemblyCleanup` method, but I did not find out where is the best place to call such a method in the assembly phase. We can handle this in a subsequent patch. > > In particular, please review the new AOT.md design document - I split it from the AOTCI annotation to prevent jamming; we can put general AOT information there when we have more AOT-specific annotations. > > --- > > Old description: > Currently, the list of classes that have interdependencies and those that need runtimeSetup are maintained in a hardcoded list in CDS.
This makes it risky for core library developers as they might introduce new interdependencies and observe CDS to fail. By moving the mechanism of these lists to core library annotations as a first step, we can gradually expose the AOT contracts as program semantics described by internal annotations, and also helps us to explore how we can expose these functionalities to the public later. I have updated this patch to implement a newer model - see the latest description. Tested locally on linux-x64-debug runtime/cds. Submitted for more CI tests. In particular, please review the AOT.md design document. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25922#issuecomment-3020946458 From liach at openjdk.org Mon Jun 30 22:11:57 2025 From: liach at openjdk.org (Chen Liang) Date: Mon, 30 Jun 2025 22:11:57 GMT Subject: RFR: 8360163: Create annotations to mark dumping method handles and runtime setup required classes [v3] In-Reply-To: References: Message-ID: > I have updated this patch to avoid a redundant `runtimeSetup` annotation - we have agreed that the requirement for setup is a side effect of initialization, and such methods in AOTCI classes must be automatically recognized. This latest revision implements that model. > > I intentionally avoided handling Class and ClassLoader `resetArchivedStates` and `MethodType::assemblySetup` - we talked about a generic `assemblyCleanup` method, but I did not find out where is the best place to call such a method in the assembly phase. We can handle this in a subsequent patch. > > In particular, please review the new AOT.md design document - I split it from the AOTCI annotation to prevent jamming; we can put general AOT information there when we have more AOT-specific annotations. > > --- > > Old description: > Currently, the list of classes that have interdependencies and those that need runtimeSetup are maintained in a hardcoded list in CDS.
This makes it risky for core library developers as they might introduce new interdependencies and observe CDS to fail. By moving the mechanism of these lists to core library annotations as a first step, we can gradually expose the AOT contracts as program semantics described by internal annotations, and also helps us to explore how we can expose these functionalities to the public later. Chen Liang has updated the pull request incrementally with one additional commit since the last revision: Missed comment updates ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25922/files - new: https://git.openjdk.org/jdk/pull/25922/files/984ff268..9ec3d2e3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25922&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25922&range=01-02 Stats: 7 lines in 3 files changed: 3 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25922.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25922/head:pull/25922 PR: https://git.openjdk.org/jdk/pull/25922 From sparasa at openjdk.org Mon Jun 30 22:19:18 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Mon, 30 Jun 2025 22:19:18 GMT Subject: RFR: 8360775: Fix Shenandoah GC test failures when APX is enabled [v2] In-Reply-To: <66o1iImVgzmTapY0AEGZeAg_VTj4ZbRc1MSFvgA8qYk=.ab11bc1a-3cb0-4832-82da-4e97ee8aaf9b@github.com> References: <66o1iImVgzmTapY0AEGZeAg_VTj4ZbRc1MSFvgA8qYk=.ab11bc1a-3cb0-4832-82da-4e97ee8aaf9b@github.com> Message-ID: > This PR fixes the test failures seen in many JTreg tests related to Shenandoah GC (`test/hotspot/jtreg/gc/shenandoah/`) with UseAPX. The issues were root caused to: > > 1. Higher band registers are not saved and restored in Shenandoah load_reference_barrier. > 2. Pusha/Popa implementation using push2p/pop2p does not restore the contents of rax. > > Both the issues are fixed in this PR. 
Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: restore the orginal contents of rax ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26009/files - new: https://git.openjdk.org/jdk/pull/26009/files/2dba6e3f..de7c373f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26009&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26009&range=00-01 Stats: 70 lines in 1 file changed: 0 ins; 19 del; 51 mod Patch: https://git.openjdk.org/jdk/pull/26009.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26009/head:pull/26009 PR: https://git.openjdk.org/jdk/pull/26009 From sparasa at openjdk.org Mon Jun 30 22:19:18 2025 From: sparasa at openjdk.org (Srinivas Vamsi Parasa) Date: Mon, 30 Jun 2025 22:19:18 GMT Subject: RFR: 8360775: Fix Shenandoah GC test failures when APX is enabled [v2] In-Reply-To: <4JWUyNwX9neYkTIymwpEpaXMig0AGc5ylQvPlWjqLR0=.5ffb5edb-3bb6-4d36-b3a3-820b4cafd704@github.com> References: <66o1iImVgzmTapY0AEGZeAg_VTj4ZbRc1MSFvgA8qYk=.ab11bc1a-3cb0-4832-82da-4e97ee8aaf9b@github.com> <4JWUyNwX9neYkTIymwpEpaXMig0AGc5ylQvPlWjqLR0=.5ffb5edb-3bb6-4d36-b3a3-820b4cafd704@github.com> Message-ID: On Mon, 30 Jun 2025 17:07:04 GMT, Sandhya Viswanathan wrote: >> src/hotspot/cpu/x86/assembler_x86.cpp line 15675: >> >>> 15673: void Assembler::pusha_uncached() { // 64bit >>> 15674: if (UseAPX) { >>> 15675: // Data being pushed by PUSH2 must be 16B-aligned on the stack, for this push rax upfront >> >> Hi @vamsi-parasa , >> >> The PUSHA / POPA assembler is agnostic to the use of hardcoded registers in the calling context, e.g. in the following line of code >> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp#L495 >> >> If dst and tmp1 are RAX then we end up corrupting it since RAX is used as a scratch register for stack alignment, and in case RAX holds an oop pointer then we may see random crashes.
Such idioms are limited to GC barriers currently, and we have recently fixed one such issue in https://github.com/openjdk/jdk/pull/25351 >> >> While the instruction sequence of PUSHA/POPA with PPX hints is correct, do you think for the time being we should limit the scope of this fix to the save_machine_state and restore_machine_state routines rather than making a generic fix in pusha/popa? >> >> I have tried it and it's working. > > @jatin-bhateja Pusha is not expected to change any registers. The inadvertent change of registers is very hard to debug. So in my view it is better to have a conservative implementation currently which doesn't change the RAX register. Please see the updated code which fixes the issue by restoring the contents of RAX. The tests are passing with this update. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26009#discussion_r2176059486 From lmesnik at openjdk.org Mon Jun 30 22:55:38 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Mon, 30 Jun 2025 22:55:38 GMT Subject: RFR: 8359359: AArch64: share trampolines between static calls to the same method In-Reply-To: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> References: <3mB1bU-08ZvsJkR52D-nNFObwsaysNHxBkF1L42lmIY=.459e1ba8-118e-4a3a-8049-415765d25553@github.com> Message-ID: On Tue, 24 Jun 2025 15:38:00 GMT, Mikhail Ablakatov wrote: > Modify the C2 compiler to share trampoline stubs between static calls that resolve to the same callee method. Since the relocation target for all static calls is initially set to the static call resolver stub, the call's target alone cannot be used to distinguish between different static method calls. Instead, trampoline stubs should be shared based on the actual callee. > > The `SharedTrampolineTest.java` was designed to verify the sharing of trampolines among static calls. However, due to imprecise log analysis, the test currently passes even when trampolines are not shared.
Additionally, comments within the test suggest ambiguity regarding whether it was intended to assess trampoline sharing for static calls or runtime calls. To address these issues and eliminate ambiguity, this patch renames and updates the existing test. Furthermore, a new test is introduced, using the existing one as a foundation, to accurately evaluate trampoline sharing for both static and runtime calls. > > This has passed tier1-3 and jcstress testing on AArch64. Changes requested by lmesnik (Reviewer). test/hotspot/jtreg/compiler/sharedstubs/SharedStaticCallTrampolineTest.java line 33: > 31: * > 32: * @requires vm.compiler2.enabled > 33: * @requires vm.opt.TieredCompilation == null I don't think @requires vm.opt.TieredCompilation == null is needed here. The test always overrides the TieredCompilation mode. The problem is that the test is going to be skipped if someone runs testing with -XX:-TieredCompilation to test C2 changes. Please just remove this line so that the test is executed whenever C2 is available. The renamed test 'SharedRuntimeCallTrampolineTest' seems to have this problem as well. ------------- PR Review: https://git.openjdk.org/jdk/pull/25954#pullrequestreview-2972999150 PR Review Comment: https://git.openjdk.org/jdk/pull/25954#discussion_r2176106769
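Editorial sketch for the 8359359 thread above: the PR description argues that static calls cannot be grouped by their initial relocation target, because every static call starts out pointing at the same resolver stub, and that sharing must be keyed on the actual callee instead. The toy model below illustrates only that keying argument. It is not HotSpot code; the class `TrampolineSharingSketch`, the `StaticCall` type, both `shareBy...` methods, and the stub name string are hypothetical names invented for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: all names here are hypothetical and do not exist in HotSpot.
public class TrampolineSharingSketch {
    // Assumption: stands in for the single resolver stub that every
    // unresolved static call initially targets.
    static final String RESOLVER_STUB = "static_call_resolver_stub";

    // A static call site: where it is in the code and which method it calls.
    static final class StaticCall {
        final int offset;
        final String callee;

        StaticCall(int offset, String callee) {
            this.offset = offset;
            this.callee = callee;
        }
    }

    // Keying on the initial target: every call maps to the same key, so
    // calls to different methods cannot be told apart.
    static Map<String, Integer> shareByInitialTarget(List<StaticCall> calls) {
        Map<String, Integer> stubs = new LinkedHashMap<>();
        for (StaticCall call : calls) {
            stubs.putIfAbsent(RESOLVER_STUB, stubs.size());
        }
        return stubs;
    }

    // Keying on the callee: one stub index per distinct callee, shared by
    // all call sites that invoke that callee.
    static Map<String, Integer> shareByCallee(List<StaticCall> calls) {
        Map<String, Integer> stubs = new LinkedHashMap<>();
        for (StaticCall call : calls) {
            stubs.putIfAbsent(call.callee, stubs.size());
        }
        return stubs;
    }

    public static void main(String[] args) {
        List<StaticCall> calls = List.of(
                new StaticCall(0x10, "A::foo"),
                new StaticCall(0x20, "A::bar"),
                new StaticCall(0x30, "A::foo"));
        System.out.println("keyed by initial target: "
                + shareByInitialTarget(calls).size() + " stub(s)");
        System.out.println("keyed by callee: "
                + shareByCallee(calls).size() + " stub(s)");
    }
}
```

With three call sites and two distinct callees, the target-keyed map collapses to a single entry while the callee-keyed map holds two, which is the distinction the description relies on. How C2 actually records and emits shared trampolines is not shown in this thread and is not modeled here.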