From dholmes at openjdk.org Tue Apr 1 06:03:47 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 1 Apr 2025 06:03:47 GMT Subject: RFR: 8323100: com/sun/tools/attach/StartManagementAgent.java failed with "WaitForSingleObject failed" Message-ID: See bug report for gory details. Short version: in the Windows version of `VirtualMachineImpl::execute`, if an exception occurred after we created the `SocketInputStreamImpl` (which is the test scenario of the failing test), we would close the native `HANDLE` to the pipe twice. But after the first close the `HANDLE` could be reassigned to another object (e.g. the `_ParkHandle` of the `StreamPumper` thread) and the second close would close that `HANDLE` resulting in the failure of `WaitForSingleObject`. (Other failure modes with different invalid handles have also been seen.) Fix: shorten the outer try/catch block so that we only directly close the pipe if the `IOException` happens before we create the `SocketStreamImpl` - after which the closing of the stream will close the pipe `HANDLE`. 
Testing:
- ran the com/sun/tools/attach test group 2500 times without failure
- tiers 3-5 as a sanity check (Windows only)

Thanks

-------------

Commit messages:
- 8323100: com/sun/tools/attach/StartManagementAgent.java failed with "WaitForSingleObject failed"

Changes: https://git.openjdk.org/jdk/pull/24346/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24346&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8323100
Stats: 26 lines in 1 file changed: 14 ins; 11 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/24346.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/24346/head:pull/24346

PR: https://git.openjdk.org/jdk/pull/24346

From kevinw at openjdk.org Tue Apr 1 09:11:21 2025
From: kevinw at openjdk.org (Kevin Walls)
Date: Tue, 1 Apr 2025 09:11:21 GMT
Subject: RFR: 8323100: com/sun/tools/attach/StartManagementAgent.java failed with "WaitForSingleObject failed"
In-Reply-To:
References:
Message-ID:

On Tue, 1 Apr 2025 05:57:55 GMT, David Holmes wrote:

> See bug report for gory details.
>
> Short version: in the Windows version of `VirtualMachineImpl::execute`, if an exception occurred after we created the `SocketInputStreamImpl` (which is the test scenario of the failing test), we would close the native `HANDLE` to the pipe twice. But after the first close the `HANDLE` could be reassigned to another object (e.g. the `_ParkHandle` of the `StreamPumper` thread) and the second close would close that `HANDLE` resulting in the failure of `WaitForSingleObject`. (Other failure modes with different invalid handles have also been seen.)
>
> Fix: shorten the outer try/catch block so that we only directly close the pipe if the `IOException` happens before we create the `SocketStreamImpl` - after which the closing of the stream will close the pipe `HANDLE`.
>
> Testing:
> - ran the com/sun/tools/attach test group 2500 times without failure
> - tiers 3-5 as a sanity check (Windows only)
>
> Thanks

Excellent to have this found, they just looked like "impossible" failures. 8-)

Also good lessons in naming and commenting around processCompletionStatus.

-------------

Marked as reviewed by kevinw (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24346#pullrequestreview-2732025593

From varadam at openjdk.org Tue Apr 1 10:30:56 2025
From: varadam at openjdk.org (Varada M)
Date: Tue, 1 Apr 2025 10:30:56 GMT
Subject: RFR: 8352392: AIX: implement attach API v2 and streaming output [v4]
In-Reply-To:
References:
Message-ID:

> AIX changes for attach API to support arbitrary length arguments and the streaming output support.
> serviceability/attach/AttachAPIv2/StreamingOutputTest.java test passes
>
> tier1, tier2 and tier3 testing is successful with fastdebug level
>
> JBS Issue : [JDK-8352392](https://bugs.openjdk.org/browse/JDK-8352392)

Varada M has updated the pull request incrementally with one additional commit since the last revision:

8352392: AIX: implement attach API v2 and streaming output

-------------

Changes:
- all: https://git.openjdk.org/jdk/pull/24177/files
- new: https://git.openjdk.org/jdk/pull/24177/files/d22f0ab9..234f6d17

Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=24177&range=03
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=24177&range=02-03

Stats: 20 lines in 2 files changed: 1 ins; 0 del; 19 mod
Patch: https://git.openjdk.org/jdk/pull/24177.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/24177/head:pull/24177

PR: https://git.openjdk.org/jdk/pull/24177

From varadam at openjdk.org Tue Apr 1 10:35:20 2025
From: varadam at openjdk.org (Varada M)
Date: Tue, 1 Apr 2025 10:35:20 GMT
Subject: RFR: 8352392: AIX: implement attach API v2 and streaming output [v3]
In-Reply-To:
References: <4PJ92vw13If5UwOGWdLxkM0uaUiOa4GqogzCB37xSLk=.2b0950bb-937f-408d-9c69-31dd06e391d3@github.com>
Message-ID:

On Mon, 31 Mar 2025 10:16:31 GMT, Martin Doerr wrote:

>> Varada M has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - updated copyright header
>> - removed StreamingOutputTest.java from problem list
>
> src/jdk.attach/aix/classes/sun/tools/attach/VirtualMachineImpl.java line 196:
>
>> 194: }
>> 195: }
>> 196:
>
> Same here.

Hi Martin, I have added the indentation fix and comment

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24177#discussion_r2022594632

From kevinw at openjdk.org Tue Apr 1 10:39:28 2025
From: kevinw at openjdk.org (Kevin Walls)
Date: Tue, 1 Apr 2025 10:39:28 GMT
Subject: RFR: 8353231: Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad still fails intermittently
Message-ID:

Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad still fails intermittently.

On failure, all 10 attempts, with sleep(200) each time, read only -1 from mbean.getProcessCpuLoad().
The method is documented to return -1 when info is not available, but we want to avoid the test accepting a -1 and masking real problems.

Test failures are happening when multiple CPU load reading tests run on the same host, at the same second.

Add a TEST.properties file containing:
exclusiveAccess.dirs=.
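
For readers who have not looked at the test, the retry pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual test source: the class `CpuLoadRetrySketch` and the helper `readCpuLoad` are hypothetical names, and only the 10 attempts, the sleep(200) and the call to `getProcessCpuLoad()` come from the description.

```java
import com.sun.management.OperatingSystemMXBean;
import java.lang.management.ManagementFactory;
import java.util.function.DoubleSupplier;

// Hypothetical illustration of the retry loop described in the message;
// the real test may be structured differently.
final class CpuLoadRetrySketch {

    /**
     * Polls the supplier up to maxAttempts times, sleeping after each
     * unavailable (negative) reading. Returns the first non-negative
     * reading, or -1.0 if every attempt reported "not available".
     */
    static double readCpuLoad(DoubleSupplier source, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            double load = source.getAsDouble();
            if (load >= 0.0) {
                return load;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt status and stop retrying.
                Thread.currentThread().interrupt();
                break;
            }
        }
        return -1.0;
    }

    public static void main(String[] args) {
        OperatingSystemMXBean mbean =
                ManagementFactory.getPlatformMXBean(OperatingSystemMXBean.class);
        // 10 attempts with a 200 ms pause, as in the description above.
        double load = readCpuLoad(mbean::getProcessCpuLoad, 10, 200);
        System.out.println("process CPU load: " + load);
    }
}
```

A loop like this still ends with -1 if every reading is unavailable, so it cannot tell a busy host from a real defect. That is presumably why the change serializes the tests in the directory rather than loosening the check.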

-------------

Commit messages:
- 8353231: Test com/sun/management/OperatingSystemMXBean cpuLoad still fails intermittently

Changes: https://git.openjdk.org/jdk/pull/24352/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24352&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8353231
Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/24352.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/24352/head:pull/24352

PR: https://git.openjdk.org/jdk/pull/24352

From ihse at openjdk.org Tue Apr 1 12:02:19 2025
From: ihse at openjdk.org (Magnus Ihse Bursie)
Date: Tue, 1 Apr 2025 12:02:19 GMT
Subject: RFR: 8349638: Build libjdwp with SIZE optimization
In-Reply-To:
References:
Message-ID:

On Tue, 11 Feb 2025 15:56:39 GMT, Matthias Baesken wrote:

> The libjdwp is currently built with LOW optimization level, it could be built with SIZE optimization to lower the lib size by ~ 10 % on UNIX.
> On Windows LOW and SIZE currently translate to the same O1 optimization flag so no difference there.
>
> On Linux x86_64 for example the lib shrinks from
> 300K to 268K and the debuginfo file shrinks from 1.9M to 1.7M .
>
> On Linux ppc64le for example the lib shrinks from
> 428K to 368K and the debuginfo file shrinks from 2.0M to 1.7M .

It would be interesting to also see how compilation time varies with optimization level. At least some kind of hint if HIGHEST is like 2x slower than LOW, or if SIZE is slower than LOW at all, etc.

The relative speed difference is interesting, but so is the absolute difference. If a library takes 0.5 seconds on LOW but 1.1 seconds on HIGH on a particular system, it is unlikely to matter much to overall build time anywhere. But if it goes from 15s to 30s on a fast machine, it might be a problem if such performance regressions stack up, especially on slower machines (which includes the ones running GHA).

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23563#issuecomment-2769121979

From dholmes at openjdk.org Tue Apr 1 12:03:23 2025
From: dholmes at openjdk.org (David Holmes)
Date: Tue, 1 Apr 2025 12:03:23 GMT
Subject: RFR: 8323100: com/sun/tools/attach/StartManagementAgent.java failed with "WaitForSingleObject failed"
In-Reply-To:
References:
Message-ID:

On Tue, 1 Apr 2025 09:08:06 GMT, Kevin Walls wrote:

>> See bug report for gory details.
>>
>> Short version: in the Windows version of `VirtualMachineImpl::execute`, if an exception occurred after we created the `SocketInputStreamImpl` (which is the test scenario of the failing test), we would close the native `HANDLE` to the pipe twice. But after the first close the `HANDLE` could be reassigned to another object (e.g. the `_ParkHandle` of the `StreamPumper` thread) and the second close would close that `HANDLE` resulting in the failure of `WaitForSingleObject`. (Other failure modes with different invalid handles have also been seen.)
>>
>> Fix: shorten the outer try/catch block so that we only directly close the pipe if the `IOException` happens before we create the `SocketStreamImpl` - after which the closing of the stream will close the pipe `HANDLE`.
>>
>> Testing:
>> - ran the com/sun/tools/attach test group 2500 times without failure
>> - tiers 3-5 as a sanity check (Windows only)
>>
>> Thanks
>
> Excellent to have this found, they just looked like "impossible" failures. 8-)
>
> Also good lessons in naming and commenting around processCompletionStatus.

Thanks for the review @kevinjwalls !
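
The ownership rule in the quoted fix can be illustrated with a small stand-alone sketch. It is only an analogy under stated assumptions: `FakeHandle`, `HandleInputStream` and `execute` are hypothetical names invented here, the "handle" is a plain Java object that counts closes, and none of this is the JDK's `VirtualMachineImpl` code. It shows the shape of the rule: close the handle directly only while no stream owns it, and afterwards let closing the stream be the single close.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of single-owner close semantics; not JDK code.
final class HandleOwnershipSketch {

    /** Stand-in for a native pipe HANDLE; counts how often it is closed. */
    static final class FakeHandle {
        final AtomicInteger closeCount = new AtomicInteger();

        void close() {
            closeCount.incrementAndGet();
        }
    }

    /** Once constructed, this stream is the only owner of the handle. */
    static final class HandleInputStream extends InputStream {
        private final FakeHandle handle;
        private boolean closed;

        HandleInputStream(FakeHandle handle) {
            this.handle = handle;
        }

        @Override
        public int read() throws IOException {
            throw new IOException("simulated failure while reading the reply");
        }

        @Override
        public void close() {
            if (!closed) {          // closing twice must not touch the handle again
                closed = true;
                handle.close();
            }
        }
    }

    /** Returns how many times the handle was closed for the given scenario. */
    static int execute(boolean failBeforeStream) {
        FakeHandle handle = new FakeHandle();
        try {
            if (failBeforeStream) {
                throw new IOException("simulated failure before the stream exists");
            }
        } catch (IOException e) {
            handle.close();         // no stream owns the handle yet: close it directly
            return handle.closeCount.get();
        }
        // From here on the stream owns the handle.
        try (HandleInputStream in = new HandleInputStream(handle)) {
            in.read();
        } catch (IOException e) {
            // The stream has already been closed by try-with-resources;
            // closing the handle here as well would be the double close.
        }
        return handle.closeCount.get();
    }

    public static void main(String[] args) {
        System.out.println("closes, failure before stream: " + execute(true));
        System.out.println("closes, failure after stream:  " + execute(false));
    }
}
```

In both scenarios the handle is closed exactly once. With a real native handle the second close matters more than it does here, because the value may already have been reassigned to an unrelated object, as the description explains.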
------------- PR Comment: https://git.openjdk.org/jdk/pull/24346#issuecomment-2769126946 From mdoerr at openjdk.org Tue Apr 1 12:22:20 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 1 Apr 2025 12:22:20 GMT Subject: RFR: 8352392: AIX: implement attach API v2 and streaming output [v3] In-Reply-To: References: <4PJ92vw13If5UwOGWdLxkM0uaUiOa4GqogzCB37xSLk=.2b0950bb-937f-408d-9c69-31dd06e391d3@github.com> Message-ID: On Tue, 1 Apr 2025 10:32:55 GMT, Varada M wrote: >> src/jdk.attach/aix/classes/sun/tools/attach/VirtualMachineImpl.java line 196: >> >>> 194: } >>> 195: } >>> 196: >> >> Same here. > > Hi Martin, I have added the indentation fix and comment Thanks! I think it's good. Let's wait for @JoKern65 's review. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24177#discussion_r2022743745 From sgehwolf at openjdk.org Tue Apr 1 14:59:37 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Tue, 1 Apr 2025 14:59:37 GMT Subject: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v17] In-Reply-To: References: Message-ID: > Please review this fix for cgroups-based metrics reporting in the `jdk.internal.platform` package. This fix is supposed to address wrong reporting of certain limits if the limits aren't set at the leaf nodes. > > For example, on cg v2, the memory limit interface file is `memory.max`. Consider a cgroup path of `/a/b/c/d`. The current code only reports the limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, some systems - like a systemd slice - sets those limits further up the hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report the actual limit value (e.g. `1048576000`). > > This patch addresses this issue by: > > 1. Refactoring the interface lookup code to relevant controllers for cpu/memory. The CgroupSubsystem classes then delegate to those for the lookup. 
This facilitates having an API for the lookup of an updated limit in step 2. > 2. Walking the full hierarchy of the cgroup path (if any), looking for a lower limit than at the leaf. Note that it's not possible to raise the limit set at a path closer to the root via the interface file at a further-to-the-leaf-level. The odd case out seems to be `max` values on some systems (which seems to be the default value). > > As an optimization this hierarchy walk is skipped on containerized systems (like K8S), where the limits are set in interface files at the leaf nodes of the hierarchy. Therefore there should be no change on those systems. > > This patch depends on the Hotspot change implementing the same for the JVM so that `Metrics.isContainerized()` works correctly on affected systems where `-XshowSettings:system` currently reports `System not containerized` due to the missing JVM fix. A test framework for such hierarchical systems has been added in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This patch adds a test using that framework among some simpler unit tests. > > Thoughts? > > Testing: > > - [x] GHA > - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems > - [x] Some manual testing using systemd slices Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 44 commits: - Merge branch 'master' into jdk-8336881-metrics-systemd-slice - JDK-8350103 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice - Merge branch 'master' into jdk-8336881-metrics-systemd-slice - Fix missing imports - Merge branch 'master' into jdk-8336881-metrics-systemd-slice - Merge branch 'master' into jdk-8336881-metrics-systemd-slice - Merge branch 'master' into jdk-8336881-metrics-systemd-slice - Merge branch 'master' into jdk-8336881-metrics-systemd-slice - Merge branch 'master' into jdk-8336881-metrics-systemd-slice - ... 
and 34 more: https://git.openjdk.org/jdk/compare/6801eb87...32960cd6

-------------

Changes: https://git.openjdk.org/jdk/pull/20280/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=16
Stats: 1621 lines in 27 files changed: 1373 ins; 152 del; 96 mod
Patch: https://git.openjdk.org/jdk/pull/20280.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/20280/head:pull/20280

PR: https://git.openjdk.org/jdk/pull/20280

From stefank at openjdk.org Tue Apr 1 15:32:17 2025
From: stefank at openjdk.org (Stefan Karlsson)
Date: Tue, 1 Apr 2025 15:32:17 GMT
Subject: RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock
In-Reply-To:
References:
Message-ID:

On Mon, 31 Mar 2025 14:09:22 GMT, Robert Toyonaga wrote:

> OK should I update this PR to do the following things:
>
> * Add comments explaining the asymmetrical locking and warning against patterns that lead to races

Sounds like a good idea.

>
> * swapping the order of `NmtVirtualMemoryLocker` and release/uncommit

I wonder if this should be done as new RFE after the change below. It might need a bit of investigation to make sure that the reasoning around this is correct.

>
> * Fail fatally if release/uncommit does not complete.

I think this would be a good, separate RFE to be done before we try to swap the order.

>
>
> Or does it make more sense to do that in a different issue/PR?
>
> Also, do we want to keep the new tests and the refactorings (see below)?
>
> ```
> if (MemTracker::enabled()) {
>   MemTracker::NmtVirtualMemoryLocker nvml;
>   result = pd_some_operation(addr, bytes);
>   if (result != nullptr) {
>     MemTracker::record_some_operation(addr, bytes);
>   }
> } else {
>   result = pd_unmap_memory(addr, bytes);
> }
> ```
>
> To:
>
> ```
> MemTracker::NmtVirtualMemoryLocker nvml;
> result = pd_unmap_memory(addr, bytes);
> MemTracker::record_some_operation(addr, bytes);
> ```

My thinking is that after you have done (2) above, then you will not need to expose the NMT lock to this level.
The code would be:

    MemTracker::record_some_operation(addr, bytes); // Lock confined inside this
    pd_unmap_memory(addr, bytes);

So, I would wait with this cleanup until we know more about (2).

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24084#issuecomment-2769766908

From egor.ushakov at jetbrains.com Tue Apr 1 17:14:43 2025
From: egor.ushakov at jetbrains.com (Egor Ushakov)
Date: Tue, 1 Apr 2025 19:14:43 +0200
Subject: Debugger overhead for virtual threads creation
Message-ID:

Hi everyone!

Is it expected that with the debugger attached creating virtual threads is much slower?
We're getting bugs like: https://youtrack.jetbrains.com/issue/IDEA-365900
And I can reproduce it easily with jdb...
Just attaching the debugger immediately slows down virtual threads creation significantly.

>java -agentlib:jdwp=transport=dt_shmem,server=y,suspend=n,address=8000 app
...
6808805 (1.2046688E7 threads per second)
...
after >jdb -attach 8000
...
30215 (95986.055 threads per second)
...

Thanks,
Egor

From amenkov at openjdk.org Tue Apr 1 18:38:33 2025
From: amenkov at openjdk.org (Alex Menkov)
Date: Tue, 1 Apr 2025 18:38:33 GMT
Subject: RFR: 8323100: com/sun/tools/attach/StartManagementAgent.java failed with "WaitForSingleObject failed"
In-Reply-To:
References:
Message-ID:

On Tue, 1 Apr 2025 05:57:55 GMT, David Holmes wrote:

> See bug report for gory details.
>
> Short version: in the Windows version of `VirtualMachineImpl::execute`, if an exception occurred after we created the `SocketInputStreamImpl` (which is the test scenario of the failing test), we would close the native `HANDLE` to the pipe twice. But after the first close the `HANDLE` could be reassigned to another object (e.g. the `_ParkHandle` of the `StreamPumper` thread) and the second close would close that `HANDLE` resulting in the failure of `WaitForSingleObject`. (Other failure modes with different invalid handles have also been seen.)
>
> Fix: shorten the outer try/catch block so that we only directly close the pipe if the `IOException` happens before we create the `SocketStreamImpl` - after which the closing of the stream will close the pipe `HANDLE`.
>
> Testing:
> - ran the com/sun/tools/attach test group 2500 times without failure
> - tiers 3-5 as a sanity check (Windows only)
>
> Thanks

Marked as reviewed by amenkov (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/24346#pullrequestreview-2733724000

From cjplummer at openjdk.org Tue Apr 1 20:00:22 2025
From: cjplummer at openjdk.org (Chris Plummer)
Date: Tue, 1 Apr 2025 20:00:22 GMT
Subject: RFR: 8352088: Call of com.sun.jdi.ThreadReference.threadGroups() can lock up target VM [v7]
In-Reply-To:
References:
Message-ID:

> Calling ThreadGroupReference.groups() from an event handler can cause a deadlock. Details in first comment. Tested with :jdk_lang on all supported platforms and tier1, tier2, tier3, and tier5 svc testing.

Chris Plummer has updated the pull request incrementally with one additional commit since the last revision:

minor comment update

-------------

Changes:
- all: https://git.openjdk.org/jdk/pull/24236/files
- new: https://git.openjdk.org/jdk/pull/24236/files/977ecf15..80f75c60

Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=24236&range=06
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=24236&range=05-06

Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/24236.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/24236/head:pull/24236

PR: https://git.openjdk.org/jdk/pull/24236

From sspitsyn at openjdk.org Tue Apr 1 20:36:25 2025
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Tue, 1 Apr 2025 20:36:25 GMT
Subject: RFR: 8352088: Call of com.sun.jdi.ThreadReference.threadGroups() can lock up target VM [v7]
In-Reply-To:
References:
Message-ID:

On Tue, 1 Apr 2025 20:00:22 GMT, Chris Plummer wrote:

>> Calling ThreadGroupReference.groups() from an
event handler can cause a deadlock. Details in first comment. Tested with :jdk_lang on all supported platforms and tier1, tier2, tier3, and tier5 svc testing. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > minor comment update Marked as reviewed by sspitsyn (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24236#pullrequestreview-2733985461 From amenkov at openjdk.org Tue Apr 1 21:32:35 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 1 Apr 2025 21:32:35 GMT Subject: RFR: 8353479: jcmd with streaming output breaks intendation Message-ID: `outputStream` implementations should call `update_position` from `write` to correctly handle indentation. The fix adds the call to `attachStream::write` testing: sanity tier1; in progress: tier2..4,hs-tier5-svc ------------- Commit messages: - fix Changes: https://git.openjdk.org/jdk/pull/24368/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24368&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8353479 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24368.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24368/head:pull/24368 PR: https://git.openjdk.org/jdk/pull/24368 From amenkov at openjdk.org Tue Apr 1 21:32:35 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 1 Apr 2025 21:32:35 GMT Subject: RFR: 8353479: jcmd with streaming output breaks intendation In-Reply-To: References: Message-ID: On Tue, 1 Apr 2025 21:23:01 GMT, Alex Menkov wrote: > `outputStream` implementations should call `update_position` from `write` to correctly handle indentation. 
> The fix adds the call to `attachStream::write` > > testing: sanity tier1; > in progress: tier2..4,hs-tier5-svc Example of output with and without streaming output: $ JAVA_TOOL_OPTIONS=-Djdk.attach.allowStreamingOutput=true jcmd 31544 VM.native_memory Picked up JAVA_TOOL_OPTIONS: -Djdk.attach.allowStreamingOutput=true 31544: Native Memory Tracking: (Omitting categories weighting less than 1KB) Total: reserved=9953148KB, committed=407888KB malloc: 57776KB #293244, peak=75433KB #212390 mmap: reserved=9895372KB, committed=350112KB - Java Heap (reserved=8298496KB, committed=270336KB) (mmap: reserved=8298496KB, committed=270336KB, peak=520192KB) - Class (reserved=1048938KB, committed=2922KB) (classes #3716) ( instance classes #3406, array classes #310) (malloc=362KB #7060 ) (peak=364KB #7068) (mmap: reserved=1048576KB, committed=2560KB, at peak) ( Metadata: ) ( reserved=65536KB, committed=24384KB ) ( used=24223KB) ( waste=161KB =0.66%) ( Class space:) ( reserved=1048576KB, committed=2560KB ) ( used=2423KB) ( waste=137KB =5.36%) $ jcmd 31544 VM.native_memory 31544: Native Memory Tracking: (Omitting categories weighting less than 1KB) Total: reserved=9953152KB, committed=407892KB malloc: 57780KB #293282, peak=75433KB #212390 mmap: reserved=9895372KB, committed=350112KB - Java Heap (reserved=8298496KB, committed=270336KB) (mmap: reserved=8298496KB, committed=270336KB, peak=520192KB) - Class (reserved=1048938KB, committed=2922KB) (classes #3716) ( instance classes #3406, array classes #310) (malloc=362KB #7060) (peak=364KB #7068) (mmap: reserved=1048576KB, committed=2560KB, at peak) ( Metadata: ) ( reserved=65536KB, committed=24384KB) ( used=24223KB) ( waste=161KB =0.66%) ( Class space:) ( reserved=1048576KB, committed=2560KB) ( used=2423KB) ( waste=137KB =5.36%) ------------- PR Comment: https://git.openjdk.org/jdk/pull/24368#issuecomment-2770724520 From chris.plummer at oracle.com Tue Apr 1 23:39:07 2025 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 1 Apr 
2025 16:39:07 -0700
Subject: Debugger overhead for virtual threads creation
In-Reply-To:
References:
Message-ID:

The short answer is yes. The debug agent needs to deal with JVMTI_EVENT_VIRTUAL_THREAD_START/END events for every virtual thread. What makes it worse is when there are a large number of virtual threads that are currently alive. They are tracked on a list of ThreadNodes that starts to slow down debug agent performance when it gets too long. I have a work in progress that proactively purges these ThreadNodes so the list does not get too big. I've been meaning to revive this project for quite some time. If you have a test case I'd be willing to experiment with these changes some more. I could not access the IDEA-365900 link you provided.

Note I think after the work is done to purge ThreadNodes proactively it might not be that hard of a step to move to not needing JVMTI_EVENT_VIRTUAL_THREAD_START/END events enabled, which will help performance a lot more.

Chris

On 4/1/25 10:14 AM, Egor Ushakov wrote:
> Hi everyone!
>
> Is it expected that with the debugger attached creating virtual
> threads is much slower?
> We're getting bugs like: https://youtrack.jetbrains.com/issue/IDEA-365900
> And I can reproduce it easily with jdb...
> Just attaching the debugger immediately slows down virtual threads
> creation significantly.
>
> >java
> -agentlib:jdwp=transport=dt_shmem,server=y,suspend=n,address=8000 app
> ...
> 6808805 (1.2046688E7 threads per second)
> ...
> after >jdb -attach 8000
> ...
> 30215 (95986.055 threads per second)
> ...
> > Thanks, > Egor From jpai at openjdk.org Wed Apr 2 01:40:31 2025 From: jpai at openjdk.org (Jaikiran Pai) Date: Wed, 2 Apr 2025 01:40:31 GMT Subject: RFR: 8352088: Call of com.sun.jdi.ThreadReference.threadGroups() can lock up target VM [v7] In-Reply-To: References: Message-ID: <5kETU88L5YqVVQgMSB7QDo8heC9-R4Ia_JhXt2ZkfVQ=.2ce9a39e-ad21-433d-b724-de1ceb3336f5@github.com> On Tue, 1 Apr 2025 20:00:22 GMT, Chris Plummer wrote: >> Calling ThreadGroupReference.groups() from an event handler can cause a deadlock. Details in first comment. Tested with :jdk_lang on all supported platforms and tier1, tier2, tier3, and tier5 svc testing. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > minor comment update Marked as reviewed by jpai (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24236#pullrequestreview-2734354863 From dholmes at openjdk.org Wed Apr 2 03:00:17 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 2 Apr 2025 03:00:17 GMT Subject: RFR: 8323100: com/sun/tools/attach/StartManagementAgent.java failed with "WaitForSingleObject failed" In-Reply-To: References: Message-ID: On Tue, 1 Apr 2025 18:35:36 GMT, Alex Menkov wrote: >> See bug report for gory details. >> >> Short version: in the Windows version of `VirtualMachineImpl::execute`, if an exception occurred after we created the `SocketInputStreamImpl` (which is the test scenario of the failing test), we would close the native `HANDLE` to the pipe twice. But after the first close the `HANDLE` could be reassigned to another object (e.g. the `_ParkHandle` of the `StreamPumper` thread) and the second close would close that `HANDLE` resulting in the failure of `WaitForSingleObject`. (Other failure modes with different invalid handles have also been seen.) 
>>
>> Fix: shorten the outer try/catch block so that we only directly close the pipe if the `IOException` happens before we create the `SocketStreamImpl` - after which the closing of the stream will close the pipe `HANDLE`.
>>
>> Testing:
>> - ran the com/sun/tools/attach test group 2500 times without failure
>> - tiers 3-5 as a sanity check (Windows only)
>>
>> Thanks
>
> Marked as reviewed by amenkov (Reviewer).

Thanks for the review @alexmenkov !

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24346#issuecomment-2771215407

From dholmes at openjdk.org Wed Apr 2 03:00:18 2025
From: dholmes at openjdk.org (David Holmes)
Date: Wed, 2 Apr 2025 03:00:18 GMT
Subject: Integrated: 8323100: com/sun/tools/attach/StartManagementAgent.java failed with "WaitForSingleObject failed"
In-Reply-To:
References:
Message-ID: <6XUN6YuzWuA_WbJMP4CzRUyfFJr3aeozRTUcXJe1pFo=.451b7623-8873-4d35-a826-1c22665f39db@github.com>

On Tue, 1 Apr 2025 05:57:55 GMT, David Holmes wrote:

> See bug report for gory details.
>
> Short version: in the Windows version of `VirtualMachineImpl::execute`, if an exception occurred after we created the `SocketInputStreamImpl` (which is the test scenario of the failing test), we would close the native `HANDLE` to the pipe twice. But after the first close the `HANDLE` could be reassigned to another object (e.g. the `_ParkHandle` of the `StreamPumper` thread) and the second close would close that `HANDLE` resulting in the failure of `WaitForSingleObject`. (Other failure modes with different invalid handles have also been seen.)
>
> Fix: shorten the outer try/catch block so that we only directly close the pipe if the `IOException` happens before we create the `SocketStreamImpl` - after which the closing of the stream will close the pipe `HANDLE`.
>
> Testing:
> - ran the com/sun/tools/attach test group 2500 times without failure
> - tiers 3-5 as a sanity check (Windows only)
>
> Thanks

This pull request has now been integrated.
Changeset: e6fe2490 Author: David Holmes URL: https://git.openjdk.org/jdk/commit/e6fe2490bc48acf01ccf81b38d578d20ed09f238 Stats: 26 lines in 1 file changed: 14 ins; 11 del; 1 mod 8323100: com/sun/tools/attach/StartManagementAgent.java failed with "WaitForSingleObject failed" Reviewed-by: kevinw, amenkov ------------- PR: https://git.openjdk.org/jdk/pull/24346 From alanb at openjdk.org Wed Apr 2 06:13:11 2025 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 2 Apr 2025 06:13:11 GMT Subject: RFR: 8352088: Call of com.sun.jdi.ThreadReference.threadGroups() can lock up target VM [v7] In-Reply-To: References: Message-ID: On Tue, 1 Apr 2025 20:00:22 GMT, Chris Plummer wrote: >> Calling ThreadGroupReference.groups() from an event handler can cause a deadlock. Details in first comment. Tested with :jdk_lang on all supported platforms and tier1, tier2, tier3, and tier5 svc testing. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > minor comment update Marked as reviewed by alanb (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24236#pullrequestreview-2734755558 From sspitsyn at openjdk.org Wed Apr 2 08:37:22 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 2 Apr 2025 08:37:22 GMT Subject: RFR: 8353479: jcmd with streaming output breaks intendation In-Reply-To: References: Message-ID: <284gTOn3RoFPKIgLIzsrEPF5KGoJ2Ci98oW2pb9o7nU=.7a6d5b14-a058-4f51-8693-571c07d05d1d@github.com> On Tue, 1 Apr 2025 21:23:01 GMT, Alex Menkov wrote: > `outputStream` implementations should call `update_position` from `write` to correctly handle indentation. > The fix adds the call to `attachStream::write` > > testing: sanity tier1; > in progress: tier2..4,hs-tier5-svc Looks okay to me. ------------- Marked as reviewed by sspitsyn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24368#pullrequestreview-2735241312 From kevinw at openjdk.org Wed Apr 2 08:43:22 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 2 Apr 2025 08:43:22 GMT Subject: RFR: 8344671: Few JFR streaming tests fail with application not alive error on MacOS 15 [v5] In-Reply-To: References: <3xUroXKNX4bBRb0L4r5WJ9V_TEJRbtS_hmdZ3AMCTFo=.86aaf7a8-d2c1-4f07-9f74-4e2cab2d0fa2@github.com> Message-ID: On Mon, 31 Mar 2025 18:02:12 GMT, Larry Cable wrote: >> on both Linux and MacOS libattach utilizes UNIX signal (QUIT) to cause a target JVM (attachee) to create the socket file used as transport for subsequent jcmds (and other attach based interactions) and to listen upon that for such. >> >> it should be noted that the default behavior for QUIT (if not blocked or caught) is to terminate the signalled process. >> >> during the early lifetime of a JVM, its signal handlers are not yet installed, and thus any signal such as QUIT will cause the >> default behavior to occur, in this case the JVM will be terminated. >> >> this is why some tests are failing with "not alive" >> >> the "fix" is similar in nature to that already implemented for linux (however using a different OS dependent mechanism to obtain the attachee JVM's signal masks: sysctl(2)). >> >> the method "checkCatchesAndSendQuitTo" will now obtain the "attachee" JVM signal masks and only kill(QUIT) if the >> current masks indicate that the JVM's signals are now being handled. >> >> the behavior in the success case is now identical to the previous implementation, however should the target JVM not >> become "ready" (signal handlers installed) prior to the attach "timeout" occurring the attach operation will throw an >> "AttachNotSupportedException" with a suitable error message. 
>> >> see also: https://bugs.openjdk.org/browse/JDK-8350766 > > Larry Cable has updated the pull request incrementally with two additional commits since the last revision: > > - Merge branch 'JDK-8344671' of github.com:larry-cable/jdk into JDK-8344671 > - JDK-8334671: minor changes requested by @dholmes Looks good. (I'm sure the Linux and MacOS VirtualMachineImpl() attach/timeout loops are almost identical now, not a problem but maybe one day we can de-duplicate them.) ------------- Marked as reviewed by kevinw (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24085#pullrequestreview-2735291287 From egor.ushakov at jetbrains.com Wed Apr 2 10:32:03 2025 From: egor.ushakov at jetbrains.com (Egor Ushakov) Date: Wed, 2 Apr 2025 12:32:03 +0200 Subject: Debugger overhead for virtual threads creation In-Reply-To: References: Message-ID: <5451248a-ded9-4b57-bc4d-23c336adca0f@jetbrains.com> Thanks Chris! I've made the bug https://youtrack.jetbrains.com/issue/IDEA-365900 visible, there's a reproducer there. Thanks, Egor On 02.04.2025 01:39, Chris Plummer wrote: > The short answer is yes. The debug agent needs to deal with > JVMTI_EVENT_VIRTUAL_THREAD_START/END events for every virtual thread. > What makes it worse is when there are a large number of virtual > threads that are currently alive. They are tracked on a list of > ThreadNodes that starts to slow down debug agent performance when it > gets too long. I have a work in progress that proactively purges these > ThreadNodes so the list does not get too big. I've been meaning to > revive this project for quite some time. If you have a test case I'd > be willing to experiment with these changes some more. I could not > access to the IDEA-365900 link you provided. > > Note I think after the work is done to purge ThreadNodes proactively > it might not be that hard of step to move to not needing > JVMTI_EVENT_VIRTUAL_THREAD_START/END events enabled, which will help > performance a lot more.
> > Chris > > On 4/1/25 10:14 AM, Egor Ushakov wrote: >> Hi everyone! >> >> Is it expected that with the debugger attached creating virtual >> threads is much slower? >> We're getting bugs like: >> https://youtrack.jetbrains.com/issue/IDEA-365900 >> And I can reproduce it easily with jdb... >> Just attaching the debugger immediately slows down virtual threads >> creation significantly. >> >> >java >> -agentlib:jdwp=transport=dt_shmem,server=y,suspend=n,address=8000 app >> ... >> 6808805 (1.2046688E7 threads per second) >> ... >> after >jdb -attach 8000 >> ... >> 30215 (95986.055 threads per second) >> ... >> >> Thanks, >> Egor From jkern at openjdk.org Wed Apr 2 11:05:01 2025 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 2 Apr 2025 11:05:01 GMT Subject: RFR: 8352392: AIX: implement attach API v2 and streaming output [v4] In-Reply-To: References: Message-ID: On Tue, 1 Apr 2025 10:30:56 GMT, Varada M wrote: >> AIX changes for attach API to support arbitrary length arguments and the streaming output support. >> serviceability/attach/AttachAPIv2/StreamingOutputTest.java test passes >> >> tier1, tier2 and tier3 testing is successful with fastdebug level >> >> JBS Issue : [JDK-8352392](https://bugs.openjdk.org/browse/JDK-8352392) > > Varada M has updated the pull request incrementally with one additional commit since the last revision: > > 8352392: AIX: implement attach API v2 and streaming output Hi Varada, I see that you've largely adapted the code to the POSIX version. Essentially, only the shutdown special handling remains. But the AIX special handling that you removed was introduced for some reason. Do you know why? Does this reasoning no longer apply? I have no idea and can't judge whether the different semantics have created holes in the code that could reappear under certain circumstances. Therefore, I would like an explanation from you as to why you can make this change now. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24177#issuecomment-2772207203 From jsjolen at openjdk.org Wed Apr 2 12:45:59 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Wed, 2 Apr 2025 12:45:59 GMT Subject: RFR: 8353479: jcmd with streaming output breaks intendation In-Reply-To: References: Message-ID: On Tue, 1 Apr 2025 21:23:01 GMT, Alex Menkov wrote: > `outputStream` implementations should call `update_position` from `write` to correctly handle indentation. > The fix adds the call to `attachStream::write` > > testing: sanity tier1; > in progress: tier2..4,hs-tier5-svc Thanks! ------------- Marked as reviewed by jsjolen (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24368#pullrequestreview-2736139795 From duke at openjdk.org Wed Apr 2 14:02:25 2025 From: duke at openjdk.org (Robert Toyonaga) Date: Wed, 2 Apr 2025 14:02:25 GMT Subject: RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock [v2] In-Reply-To: References: Message-ID: > ### Summary: > This PR makes memory operations atomic with NMT accounting. > > ### The problem: > In memory-related functions like `os::commit_memory` and `os::reserve_memory`, the OS memory operations are currently done before acquiring the NMT mutex. The virtual memory accounting is done later in `MemTracker`, after the lock has been acquired. Doing the memory operations outside of the lock scope can lead to races. > > 1.1 Thread_1 releases range_A. > 1.2 Thread_1 tells NMT "range_A has been released". > > 2.1 Thread_2 reserves (the now free) range_A. > 2.2 Thread_2 tells NMT "range_A is reserved". > > Since the sequence (1.1) (1.2) is not atomic, if Thread_2 begins operating after (1.1), we can have (1.1) (2.1) (2.2) (1.2). The OS sees two valid subsequent calls (release range_A, followed by map range_A). But NMT sees "reserve range_A", "release range_A" and is now out of sync with the OS.
> > ### Solution: > Where memory operations such as reserve, commit, or release virtual memory happen, I've expanded the scope of `NmtVirtualMemoryLocker` to protect both the NMT accounting and the memory operation itself. > > ### Other notes: > I also simplified this pattern found in many places: > > ``` > if (MemTracker::enabled()) { > MemTracker::NmtVirtualMemoryLocker nvml; > result = pd_some_operation(addr, bytes); > if (result != nullptr) { > MemTracker::record_some_operation(addr, bytes); > } > } else { > result = pd_some_operation(addr, bytes); > } > ``` > To: > > ``` > MemTracker::NmtVirtualMemoryLocker nvml; > result = pd_some_operation(addr, bytes); > MemTracker::record_some_operation(addr, bytes); > ``` > This is possible because `NmtVirtualMemoryLocker` now checks `MemTracker::enabled()`. `MemTracker::record_some_operation` already checks `MemTracker::enabled()` and checks against nullptr. This refactoring previously wasn't possible because `ThreadCritical` was used before https://github.com/openjdk/jdk/pull/22745 introduced `NmtVirtualMemoryLocker`. > > I considered moving the locking and NMT accounting down into platform specific code, e.g., lock around { munmap() + MemTracker::record }. The hope was that this would help reduce the size of the critical section. However, I found that the OS-specific "pd_" functions are already short and to-the-point, so doing this wasn't reducing the lock scope very much. Instead it just makes the code messier by having to maintain the locking and NMT accounting in each platform specific implementation. > > In many places I've done minor refactoring by relocating call... Robert Toyonaga has updated the pull request incrementally with two additional commits since the last revision: - tests and comments - Revert "make memory op and NMT accounting atomic" This reverts commit 86423d0b7e8e2b0b313a686a64c803028a5f2420.
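The interleaving described under "The problem" can be replayed in a small standalone sketch. This is not HotSpot code: `unlocked_interleaving_diverges` and `TrackedMemory` are hypothetical names, the two `std::set`s are toy stand-ins for the OS view and the NMT view of reserved ranges, and a plain `std::mutex` plays the role of the NMT lock. It only illustrates the idea of holding one lock across both the memory operation and its accounting.

```cpp
#include <cassert>
#include <mutex>
#include <set>

// Replays the interleaving (1.1) (2.1) (2.2) (1.2) from the description on one
// thread. `os` and `nmt` are toy sets of reserved ranges, not HotSpot types.
bool unlocked_interleaving_diverges() {
  const int range_A = 1;
  std::set<int> os{range_A};   // what the OS considers reserved
  std::set<int> nmt{range_A};  // what the tracker believes is reserved
  os.erase(range_A);    // 1.1 Thread_1 releases range_A
  os.insert(range_A);   // 2.1 Thread_2 reserves range_A
  nmt.insert(range_A);  // 2.2 Thread_2 records the reservation (already present)
  nmt.erase(range_A);   // 1.2 Thread_1 records the release
  return os != nmt;     // OS: reserved, tracker: released
}

// Hypothetical wrapper: one lock covers the operation and its accounting,
// so no other thread can slip in between the two steps.
class TrackedMemory {
 public:
  bool reserve(int range) {
    std::lock_guard<std::mutex> guard(_lock);
    bool ok = _os.insert(range).second;  // stand-in for the platform reserve call
    if (ok) _nmt.insert(range);          // accounting under the same lock
    return ok;
  }
  bool release(int range) {
    std::lock_guard<std::mutex> guard(_lock);
    bool ok = _os.erase(range) == 1;     // stand-in for the platform release call
    if (ok) _nmt.erase(range);
    return ok;
  }
  bool in_sync() {
    std::lock_guard<std::mutex> guard(_lock);
    return _os == _nmt;
  }
 private:
  std::mutex _lock;
  std::set<int> _os;
  std::set<int> _nmt;
};
```

The trade-off noted in the description still applies to such a scheme: the critical section now includes the memory operation itself, so it is longer than when only the accounting is locked.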
------------- Changes: - all: https://git.openjdk.org/jdk/pull/24084/files - new: https://git.openjdk.org/jdk/pull/24084/files/86423d0b..74f31202 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24084&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24084&range=00-01 Stats: 246 lines in 12 files changed: 60 ins; 123 del; 63 mod Patch: https://git.openjdk.org/jdk/pull/24084.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24084/head:pull/24084 PR: https://git.openjdk.org/jdk/pull/24084 From duke at openjdk.org Wed Apr 2 14:06:09 2025 From: duke at openjdk.org (Robert Toyonaga) Date: Wed, 2 Apr 2025 14:06:09 GMT Subject: RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock [v2] In-Reply-To: References: Message-ID: On Wed, 2 Apr 2025 14:02:25 GMT, Robert Toyonaga wrote: >> ### Summary: >> This PR makes memory operations atomic with NMT accounting. >> >> ### The problem: >> In memory related functions like `os::commit_memory` and `os::reserve_memory` the OS memory operations are currently done before acquiring the the NMT mutex. And the the virtual memory accounting is done later in `MemTracker`, after the lock has been acquired. Doing the memory operations outside of the lock scope can lead to races. >> >> 1.1 Thread_1 releases range_A. >> 1.2 Thread_1 tells NMT "range_A has been released". >> >> 2.1 Thread_2 reserves (the now free) range_A. >> 2.2 Thread_2 tells NMT "range_A is reserved". >> >> Since the sequence (1.1) (1.2) is not atomic, if Thread_2 begins operating after (1.1), we can have (1.1) (2.1) (2.2) (1.2). The OS sees two valid subsequent calls (release range_A, followed by map range_A). But NMT sees "reserve range_A", "release range_A" and is now out of sync with the OS. >> >> ### Solution: >> Where memory operations such as reserve, commit, or release virtual memory happen, I've expanded the scope of `NmtVirtualMemoryLocker` to protect both the NMT accounting and the memory operation itself. 
>> >> ### Other notes: >> I also simplified this pattern found in many places: >> >> if (MemTracker::enabled()) { >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_some_operation(addr, bytes); >> if (result != nullptr) { >> MemTracker::record_some_operation(addr, bytes); >> } >> } else { >> result = pd_unmap_memory(addr, bytes); >> } >> ``` >> To: >> >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_unmap_memory(addr, bytes); >> MemTracker::record_some_operation(addr, bytes); >> ``` >> This is possible because `NmtVirtualMemoryLocker` now checks `MemTracker::enabled()`. `MemTracker::record_some_operation` already checks `MemTracker::enabled()` and checks against nullptr. This refactoring previously wasn't possible because `ThreadCritical` was used before https://github.com/openjdk/jdk/pull/22745 introduced `NmtVirtualMemoryLocker`. >> >> I considered moving the locking and NMT accounting down into platform specific code: Ex. lock around { munmap() + MemTracker::record }. The hope was that this would help reduce the size of the critical section. However, I found that the OS-specific "pd_" functions are already short and to-the-point, so doing this wasn't reducing the lock scope very much. Instead it just makes the code more messy by having to maintain the locking and NMT accounting in each platform specific i... > > Robert Toyonaga has updated the pull request incrementally with two additional commits since the last revision: > > - tests and comments > - Revert "make memory op and NMT accounting atomic" > > This reverts commit 86423d0b7e8e2b0b313a686a64c803028a5f2420. OK I have reverted the original changes, added comments, and kept the new tests that are still relevant. Please have another look when you have time. I'll go ahead and open RFE's for the topics you suggested above. Thanks! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24084#issuecomment-2772669368 From duke at openjdk.org Wed Apr 2 16:03:04 2025 From: duke at openjdk.org (Larry Cable) Date: Wed, 2 Apr 2025 16:03:04 GMT Subject: Integrated: 8344671: Few JFR streaming tests fail with application not alive error on MacOS 15 In-Reply-To: <3xUroXKNX4bBRb0L4r5WJ9V_TEJRbtS_hmdZ3AMCTFo=.86aaf7a8-d2c1-4f07-9f74-4e2cab2d0fa2@github.com> References: <3xUroXKNX4bBRb0L4r5WJ9V_TEJRbtS_hmdZ3AMCTFo=.86aaf7a8-d2c1-4f07-9f74-4e2cab2d0fa2@github.com> Message-ID: On Mon, 17 Mar 2025 18:26:57 GMT, Larry Cable wrote: > on both Linux and MacOS libattach utilizes UNIX signal (QUIT) to cause a target JVM (attachee) to create the socket file used as transport for subsequent jcmds (and other attach based interactions) and to listen upon that for such. > > it should be noted that the default behavior for QUIT (if not blocked or caught) is to terminate the signalled process. > > during the early lifetime of a JVM, its signal handlers are not yet installed, and thus any signal such as QUIT will cause the > default behavior to occur, in this case the JVM will be terminated. > > this is why some tests are failing with "not alive" > > the "fix" is similar in nature to that already implemented for linux (however using a different OS dependent mechanism to obtain the attachee JVM's signal masks: sysctl(2)). > > the method "checkCatchesAndSendQuitTo" will now obtain the "attachee" JVM signal masks and only kill(QUIT) if the > current masks indicate that the JVM's signals are now being handled. > > the behavior in the success case is now identical to the previous implementation, however should the target JVM not > become "ready" (signal handlers installed) prior to the attach "timeout" occurring the attach operation will throw an > "AttachNotSupportedException" with a suitable error message. > > see also: https://bugs.openjdk.org/browse/JDK-8350766 This pull request has now been integrated. 
Changeset: d979bd85 Author: Larry Cable Committer: Kevin Walls URL: https://git.openjdk.org/jdk/commit/d979bd859215a16e6398ae627acfd40e8d71102c Stats: 60 lines in 3 files changed: 44 ins; 3 del; 13 mod 8344671: Few JFR streaming tests fail with application not alive error on MacOS 15 Reviewed-by: dholmes, kevinw ------------- PR: https://git.openjdk.org/jdk/pull/24085 From cjplummer at openjdk.org Wed Apr 2 17:07:02 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 2 Apr 2025 17:07:02 GMT Subject: RFR: 8352088: Call of com.sun.jdi.ThreadReference.threadGroups() can lock up target VM [v7] In-Reply-To: References: Message-ID: On Tue, 1 Apr 2025 20:00:22 GMT, Chris Plummer wrote: >> Calling ThreadGroupReference.groups() from an event handler can cause a deadlock. Details in first comment. Tested with :jdk_lang on all supported platforms and tier1, tier2, tier3, and tier5 svc testing. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > minor comment update Thanks for the reviews, Serguei, Jai, and Alan. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24236#issuecomment-2773202187 From cjplummer at openjdk.org Wed Apr 2 17:07:03 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 2 Apr 2025 17:07:03 GMT Subject: Integrated: 8352088: Call of com.sun.jdi.ThreadReference.threadGroups() can lock up target VM In-Reply-To: References: Message-ID: On Tue, 25 Mar 2025 20:36:28 GMT, Chris Plummer wrote: > Calling ThreadGroupReference.groups() from an event handler can cause a deadlock. Details in first comment. Tested with :jdk_lang on all supported platforms and tier1, tier2, tier3, and tier5 svc testing. This pull request has now been integrated.
Changeset: cc870d49 Author: Chris Plummer URL: https://git.openjdk.org/jdk/commit/cc870d4960b3e121afc76df546228cda4b600632 Stats: 130 lines in 2 files changed: 128 ins; 0 del; 2 mod 8352088: Call of com.sun.jdi.ThreadReference.threadGroups() can lock up target VM Reviewed-by: alanb, jpai, sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/24236 From duke at openjdk.org Thu Apr 3 02:56:54 2025 From: duke at openjdk.org (duke) Date: Thu, 3 Apr 2025 02:56:54 GMT Subject: Withdrawn: 8336017: Deprecate java.util.logging.LoggingMXBean, its implementation, and accessor method for removal In-Reply-To: References: Message-ID: On Thu, 23 Jan 2025 15:23:37 GMT, Kevin Walls wrote: > java.util.logging.LoggingMXBean and java.util.logging.LogManager::getLoggingMXBean are deprecated since JDK-8139982 in JDK 9. > > These deprecations should be uprated to state they are for future removal. > > java.util.logging.Logging (implements LoggingMXBean) should also be deprecated for removal. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/23271 From iklam at openjdk.org Thu Apr 3 04:09:28 2025 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 3 Apr 2025 04:09:28 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output Message-ID: Since [JEP 483: Ahead-of-Time Class Loading & Linking](https://openjdk.org/jeps/483), VM options such as `-XX:AOTCache `are implemented as aliases of "classical" CDS options such as `-XX:SharedArchiveFile`. In anticipation of the [JEP: Ahead-of-time Command Line Ergonomics](https://bugs.openjdk.org/browse/JDK-8350022), we should refactor the code that deals with the AOT options. 
Specifically, as we expect the JVM to be able to load from an "input AOT cache" and write to an "output AOT cache", we should clearly identify the input and output caches in separate APIs: const char* CDSConfig::input_static_archive_path(); const char* CDSConfig::input_dynamic_archive_path(); const char* CDSConfig::output_archive_path(); This PR also cleans up the code by: - renaming a few functions to reflect what they actually do - moving more "config" management code into cdsConfig.cpp There's also a behavioral bug fix: before this PR, `-XX:AOTCache` was handled by the `ergo_init_classic_archive_paths()` function, which allows two files to be specified, e.g., `java -XX:AOTCache=static.jsa:dynamic.jsa`. That's because `-XX:AOTCache` was implemented as an alias of `-XX:SharedArchiveFile`, and the latter allows this usage. However, this behavior is not specified in JEP 483. Allowing two files in `-XX:AOTCache` will cause unnecessary complexity when we implement [JDK-8353598: Allow AOT cache to be used in training run](https://bugs.openjdk.org/browse/JDK-8353598). Therefore, I disallowed the use of two files and added new test cases for this.
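A minimal sketch of what rejecting a two-file `-XX:AOTCache` value could look like, assuming the check is a simple scan for the path-list separator. `is_single_file_spec` is a hypothetical helper invented for illustration; it is not the code in cdsConfig.cpp, and the real check may work differently. The `':'` default matches the `static.jsa:dynamic.jsa` example above; other platforms may use a different separator, which is why it is a parameter here.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper (not a HotSpot function): returns true if `spec` names
// exactly one file, i.e. it is non-empty and contains no path-list separator.
bool is_single_file_spec(const char* spec, char separator = ':') {
  if (spec == nullptr || *spec == '\0') {
    return false;  // nothing was specified
  }
  // Any separator means the value lists more than one file.
  return std::strchr(spec, separator) == nullptr;
}
```

Under this assumption, `app.aot` would be accepted while `static.jsa:dynamic.jsa` would be rejected; the two-file form would remain available only through `-XX:SharedArchiveFile`.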
This also means that we don't need to modify the already over-complicated `ergo_init_classic_archive_paths()` for the AOT use cases ------------- Depends on: https://git.openjdk.org/jdk/pull/24272 Commit messages: - Minimized changes in ergo_init_classic_archive_paths() - Clean up CDS input/output path handling Changes: https://git.openjdk.org/jdk/pull/24401/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24401&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8353597 Stats: 304 lines in 15 files changed: 156 ins; 55 del; 93 mod Patch: https://git.openjdk.org/jdk/pull/24401.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24401/head:pull/24401 PR: https://git.openjdk.org/jdk/pull/24401 From iklam at openjdk.org Thu Apr 3 04:09:29 2025 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 3 Apr 2025 04:09:29 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output In-Reply-To: References: Message-ID: On Thu, 3 Apr 2025 04:00:59 GMT, Ioi Lam wrote: > Since [JEP 483: Ahead-of-Time Class Loading & Linking](https://openjdk.org/jeps/483), VM options such as `-XX:AOTCache `are implemented as aliases of "classical" CDS options such as `-XX:SharedArchiveFile`. > > In anticipation of the [JEP: Ahead-of-time Command Line Ergonomics](https://bugs.openjdk.org/browse/JDK-8350022), we should refactor the code that deals with the AOT options. 
Specifically, as we expect the JVM to be able to load from an "input AOT cache" and write to an "output AOT cache", we should clearly identify the input and output caches in separate APIs: > > > const char* CDSConfig::input_static_archive_path(); > const char* CDSConfig::input_dynamic_archive_path(); > const char* CDSConfig::output_archive_path(); > > > This PR also cleans up the code by: > - renaming a few function to reflect what they actually do > - moving more "config" management code into cdsConfig.cpp > > There's also a behavioral bug fix: before this PR, `-XX:AOTCache` was handled by the `ergo_init_classic_archive_paths()` function, which allows two files to be specified. E.g., `java -XX:AOTCache=static.jsa:dynamic.jsa`. That's because `-XX:AOTCache` was implemented as an alias of `-XX:SharedArchiveFile`, and the latter allows this usage. > > However, this behavior is not specified in JEP 483. Allowing two files in -XX:AOTCache will cause unnecessary complexity when we implement [JDK-8353598: Allow AOT cache to be used in training run](https://bugs.openjdk.org/browse/JDK-8353598). Therefore, I added new test cases to disallow the use of two files. 
This also means that we don't need to modify the already over-complicated `ergo_init_classic_archive_paths()` for the AOT use cases. src/hotspot/share/cds/dynamicArchive.cpp line 499: > 497: } > 498: } > 499: Moved to `CDSConfig::prepare_for_dumping()` src/hotspot/share/cds/metaspaceShared.cpp line 795: > 793: assert(CDSConfig::is_dumping_archive(), "sanity"); > 794: CDSConfig::check_unsupported_dumping_module_options(); > 795: } Moved to `CDSConfig::prepare_for_dumping()` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24401#discussion_r2026141716 PR Review Comment: https://git.openjdk.org/jdk/pull/24401#discussion_r2026142127 From varadam at openjdk.org Thu Apr 3 07:23:05 2025 From: varadam at openjdk.org (Varada M) Date: Thu, 3 Apr 2025 07:23:05 GMT Subject: RFR: 8352392: AIX: implement attach API v2 and streaming output [v4] In-Reply-To: References: Message-ID: On Tue, 1 Apr 2025 10:30:56 GMT, Varada M wrote: >> AIX changes for attach API to support arbitrary length arguments and the streaming output support. >> serviceability/attach/AttachAPIv2/StreamingOutputTest.java test passes >> >> tier1, tier2 and tier3 testing is successful with fastdebug level >> >> JBS Issue : [JDK-8352392](https://bugs.openjdk.org/browse/JDK-8352392) > > Varada M has updated the pull request incrementally with one additional commit since the last revision: > > 8352392: AIX: implement attach API v2 and streaming output With the Attach API protocol updated to version 2, arbitrary-length arguments are now supported, and shared code has been added in attachListener.cpp ([read_request](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/services/attachListener.cpp#L845), [write_fully](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/services/attachListener.cpp#L908)) to handle this across platforms. As a result, the AIX-specific code is no longer required.
I also compared the older changes in the POSIX implementation and found them to be the same as AIX ------------- PR Comment: https://git.openjdk.org/jdk/pull/24177#issuecomment-2774714653 From serguei.spitsyn at oracle.com Thu Apr 3 09:15:15 2025 From: serguei.spitsyn at oracle.com (Serguei Spitsyn) Date: Thu, 3 Apr 2025 09:15:15 +0000 Subject: Debugger overhead for virtual threads creation In-Reply-To: <5451248a-ded9-4b57-bc4d-23c336adca0f@jetbrains.com> References: <5451248a-ded9-4b57-bc4d-23c336adca0f@jetbrains.com> Message-ID: Hi Egor, Thank you for reporting this scalability issue when debugger is enabled. It looks like a JVMTI problem and we have some guesses but need some investigation to identify it better. Thanks, Serguei From: serviceability-dev on behalf of Egor Ushakov Date: Wednesday, April 2, 2025 at 3:32 AM To: Chris Plummer , serviceability-dev Subject: Re: Debugger overhead for virtual threads creation Thanks Chris! I've made the bug https://youtrack.jetbrains.com/issue/IDEA-365900 visible, there's a reproducer there. Thanks, Egor On 02.04.2025 01:39, Chris Plummer wrote: > The short answer is yes. The debug agent needs to deal with > JVMTI_EVENT_VIRTUAL_THREAD_START/END events for every virtual thread. > What makes it worse is when there are a large number of virtual > threads that are currently alive. They are tracked on a list of > ThreadNodes that starts to slow down debug agent performance when it > gets too long. I have a work in progress that proactively purges these > ThreadNodes so the list does not get too big. I've been meaning to > revive this project for quite some time. If you have a test case I'd > be willing to experiment with these changes some more. I could not > access to the IDEA-365900 link you provided.
> > Note I think after the work is done to purge ThreadNodes proactively > it might not be that hard of step to move to not needing > JVMTI_EVENT_VIRTUAL_THREAD_START/END events enabled, which will help > performance a lot more. > > Chris > > On 4/1/25 10:14 AM, Egor Ushakov wrote: >> Hi everyone! >> >> Is it expected that with the debugger attached creating virtual >> threads is much slower? >> We're getting bugs like: >> https://youtrack.jetbrains.com/issue/IDEA-365900 >> And I can reproduce it easily with jdb... >> Just attaching the debugger immediately slows down virtual threads >> creation significantly. >> >> >java >> -agentlib:jdwp=transport=dt_shmem,server=y,suspend=n,address=8000 app >> ... >> 6808805 (1.2046688E7 threads per second) >> ... >> after >jdb -attach 8000 >> ... >> 30215 (95986.055 threads per second) >> ... >> >> Thanks, >> Egor From jkern at openjdk.org Thu Apr 3 10:23:55 2025 From: jkern at openjdk.org (Joachim Kern) Date: Thu, 3 Apr 2025 10:23:55 GMT Subject: RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock [v2] In-Reply-To: References: Message-ID: On Wed, 2 Apr 2025 14:02:25 GMT, Robert Toyonaga wrote: >> ### Summary: >> This PR makes memory operations atomic with NMT accounting. >> >> ### The problem: >> In memory related functions like `os::commit_memory` and `os::reserve_memory` the OS memory operations are currently done before acquiring the the NMT mutex. And the the virtual memory accounting is done later in `MemTracker`, after the lock has been acquired. Doing the memory operations outside of the lock scope can lead to races. >> >> 1.1 Thread_1 releases range_A. >> 1.2 Thread_1 tells NMT "range_A has been released". >> >> 2.1 Thread_2 reserves (the now free) range_A. >> 2.2 Thread_2 tells NMT "range_A is reserved".
>> >> Since the sequence (1.1) (1.2) is not atomic, if Thread_2 begins operating after (1.1), we can have (1.1) (2.1) (2.2) (1.2). The OS sees two valid subsequent calls (release range_A, followed by map range_A). But NMT sees "reserve range_A", "release range_A" and is now out of sync with the OS. >> >> ### Solution: >> Where memory operations such as reserve, commit, or release virtual memory happen, I've expanded the scope of `NmtVirtualMemoryLocker` to protect both the NMT accounting and the memory operation itself. >> >> ### Other notes: >> I also simplified this pattern found in many places: >> >> if (MemTracker::enabled()) { >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_some_operation(addr, bytes); >> if (result != nullptr) { >> MemTracker::record_some_operation(addr, bytes); >> } >> } else { >> result = pd_unmap_memory(addr, bytes); >> } >> ``` >> To: >> >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_unmap_memory(addr, bytes); >> MemTracker::record_some_operation(addr, bytes); >> ``` >> This is possible because `NmtVirtualMemoryLocker` now checks `MemTracker::enabled()`. `MemTracker::record_some_operation` already checks `MemTracker::enabled()` and checks against nullptr. This refactoring previously wasn't possible because `ThreadCritical` was used before https://github.com/openjdk/jdk/pull/22745 introduced `NmtVirtualMemoryLocker`. >> >> I considered moving the locking and NMT accounting down into platform specific code: Ex. lock around { munmap() + MemTracker::record }. The hope was that this would help reduce the size of the critical section. However, I found that the OS-specific "pd_" functions are already short and to-the-point, so doing this wasn't reducing the lock scope very much. Instead it just makes the code more messy by having to maintain the locking and NMT accounting in each platform specific i... 
> > Robert Toyonaga has updated the pull request incrementally with two additional commits since the last revision: > > - tests and comments > - Revert "make memory op and NMT accounting atomic" > > This reverts commit 86423d0b7e8e2b0b313a686a64c803028a5f2420. Tonight we tested this PR on AIX and it failed in the gtest with Internal Error (os_aix.cpp:1917), pid=26476938, tid=258 Error: guarantee((vmi)) failed This will happen if a `os::pd_commit_memory()` or `os::pd_release_memory()` or `os::pd_uncommit_memory()` is called on memory not allocated with `os::pd_reserve_memory()` or `os::pd_attempt_map_memory_to_file_at()` or `os::pd_attempt_reserve_memory_at()` ------------- PR Comment: https://git.openjdk.org/jdk/pull/24084#issuecomment-2775248289 From jkern at openjdk.org Thu Apr 3 12:50:57 2025 From: jkern at openjdk.org (Joachim Kern) Date: Thu, 3 Apr 2025 12:50:57 GMT Subject: RFR: 8352392: AIX: implement attach API v2 and streaming output [v4] In-Reply-To: References: Message-ID: <-x81ydaT2ImEfbGOLisk3XiQDsbln91IUf3jDNA1NDk=.d035465a-cb77-4c88-99aa-f7fc171411d0@github.com> On Tue, 1 Apr 2025 10:30:56 GMT, Varada M wrote: >> AIX changes for attach API to support arbitrary length arguments and the streaming output support. >> serviceability/attach/AttachAPIv2/StreamingOutputTest.java test passes >> >> tier1, tier2 and tier3 testing is successful with fastdebug level >> >> JBS Issue : [JDK-8352392](https://bugs.openjdk.org/browse/JDK-8352392) > > Varada M has updated the pull request incrementally with one additional commit since the last revision: > > 8352392: AIX: implement attach API v2 and streaming output Reasonable change. ------------- Marked as reviewed by jkern (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/24177#pullrequestreview-2739729179 From mdoerr at openjdk.org Thu Apr 3 14:48:01 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 3 Apr 2025 14:48:01 GMT Subject: RFR: 8352392: AIX: implement attach API v2 and streaming output [v4] In-Reply-To: References: Message-ID: On Tue, 1 Apr 2025 10:30:56 GMT, Varada M wrote: >> AIX changes for attach API to support arbitrary length arguments and the streaming output support. >> serviceability/attach/AttachAPIv2/StreamingOutputTest.java test passes >> >> tier1, tier2 and tier3 testing is successful with fastdebug level >> >> JBS Issue : [JDK-8352392](https://bugs.openjdk.org/browse/JDK-8352392) > > Varada M has updated the pull request incrementally with one additional commit since the last revision: > > 8352392: AIX: implement attach API v2 and streaming output Marked as reviewed by mdoerr (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24177#pullrequestreview-2740165191 From duke at openjdk.org Thu Apr 3 15:27:15 2025 From: duke at openjdk.org (Robert Toyonaga) Date: Thu, 3 Apr 2025 15:27:15 GMT Subject: RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock [v3] In-Reply-To: References: Message-ID: > ### Summary: > This PR makes memory operations atomic with NMT accounting. > > ### The problem: > In memory-related functions like `os::commit_memory` and `os::reserve_memory`, the OS memory operations are currently done before acquiring the NMT mutex. The virtual memory accounting is done later in `MemTracker`, after the lock has been acquired. Doing the memory operations outside of the lock scope can lead to races. > > 1.1 Thread_1 releases range_A. > 1.2 Thread_1 tells NMT "range_A has been released". > > 2.1 Thread_2 reserves (the now free) range_A. > 2.2 Thread_2 tells NMT "range_A is reserved".
> > Since the sequence (1.1) (1.2) is not atomic, if Thread_2 begins operating after (1.1), we can have (1.1) (2.1) (2.2) (1.2). The OS sees two valid subsequent calls (release range_A, followed by map range_A). But NMT sees "reserve range_A", "release range_A" and is now out of sync with the OS. > > ### Solution: > Where memory operations such as reserve, commit, or release virtual memory happen, I've expanded the scope of `NmtVirtualMemoryLocker` to protect both the NMT accounting and the memory operation itself. > > ### Other notes: > I also simplified this pattern found in many places: > > if (MemTracker::enabled()) { > MemTracker::NmtVirtualMemoryLocker nvml; > result = pd_some_operation(addr, bytes); > if (result != nullptr) { > MemTracker::record_some_operation(addr, bytes); > } > } else { > result = pd_unmap_memory(addr, bytes); > } > ``` > To: > > MemTracker::NmtVirtualMemoryLocker nvml; > result = pd_unmap_memory(addr, bytes); > MemTracker::record_some_operation(addr, bytes); > ``` > This is possible because `NmtVirtualMemoryLocker` now checks `MemTracker::enabled()`. `MemTracker::record_some_operation` already checks `MemTracker::enabled()` and checks against nullptr. This refactoring previously wasn't possible because `ThreadCritical` was used before https://github.com/openjdk/jdk/pull/22745 introduced `NmtVirtualMemoryLocker`. > > I considered moving the locking and NMT accounting down into platform specific code: Ex. lock around { munmap() + MemTracker::record }. The hope was that this would help reduce the size of the critical section. However, I found that the OS-specific "pd_" functions are already short and to-the-point, so doing this wasn't reducing the lock scope very much. Instead it just makes the code more messy by having to maintain the locking and NMT accounting in each platform specific implementation. > > In many places I've done minor refactoring by relocating call... 
Robert Toyonaga has updated the pull request incrementally with one additional commit since the last revision: exclude file mapping tests on AIX. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24084/files - new: https://git.openjdk.org/jdk/pull/24084/files/74f31202..5c23a76a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24084&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24084&range=01-02 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24084.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24084/head:pull/24084 PR: https://git.openjdk.org/jdk/pull/24084 From duke at openjdk.org Thu Apr 3 15:27:16 2025 From: duke at openjdk.org (Robert Toyonaga) Date: Thu, 3 Apr 2025 15:27:16 GMT Subject: RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock [v2] In-Reply-To: References: Message-ID: On Thu, 3 Apr 2025 10:21:29 GMT, Joachim Kern wrote: > Internal Error (os_aix.cpp:1917), pid=26476938, tid=258 Error: guarantee((vmi)) failed > > This will happen if a `os::pd_commit_memory()` or `os::pd_release_memory()` or `os::pd_uncommit_memory()` is called on memory not allocated with `os::pd_reserve_memory()` or `os::pd_attempt_map_memory_to_file_at()` or `os::pd_attempt_reserve_memory_at()` Thank you for running the tests on AIX. I've excluded the file mapping tests that don't meet that criteria on AIX. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24084#issuecomment-2776162180 From iklam at openjdk.org Thu Apr 3 15:46:31 2025 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 3 Apr 2025 15:46:31 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output [v2] In-Reply-To: References: Message-ID: > Since [JEP 483: Ahead-of-Time Class Loading & Linking](https://openjdk.org/jeps/483), VM options such as `-XX:AOTCache `are implemented as aliases of "classical" CDS options such as `-XX:SharedArchiveFile`. 
> > In anticipation of the [JEP: Ahead-of-time Command Line Ergonomics](https://bugs.openjdk.org/browse/JDK-8350022), we should refactor the code that deals with the AOT options. Specifically, as we expect the JVM to be able to load from an "input AOT cache" and write to an "output AOT cache", we should clearly identify the input and output caches in separate APIs: > > > const char* CDSConfig::input_static_archive_path(); > const char* CDSConfig::input_dynamic_archive_path(); > const char* CDSConfig::output_archive_path(); > > > This PR also cleans up the code by: > - renaming a few function to reflect what they actually do > - moving more "config" management code into cdsConfig.cpp > > There's also a behavioral bug fix: before this PR, `-XX:AOTCache` was handled by the `ergo_init_classic_archive_paths()` function, which allows two files to be specified. E.g., `java -XX:AOTCache=static.jsa:dynamic.jsa`. That's because `-XX:AOTCache` was implemented as an alias of `-XX:SharedArchiveFile`, and the latter allows this usage. > > However, this behavior is not specified in JEP 483. Allowing two files in -XX:AOTCache will cause unnecessary complexity when we implement [JDK-8353598: Allow AOT cache to be used in training run](https://bugs.openjdk.org/browse/JDK-8353598). Therefore, I added new test cases to disallow the use of two files. 
This also means that we don't need to modify the already over-complicated `ergo_init_classic_archive_paths()` for the AOT use cases Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: more clean up ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24401/files - new: https://git.openjdk.org/jdk/pull/24401/files/9e17fedb..3ad42a3e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24401&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24401&range=00-01 Stats: 8 lines in 1 file changed: 4 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/24401.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24401/head:pull/24401 PR: https://git.openjdk.org/jdk/pull/24401 From lmesnik at openjdk.org Thu Apr 3 19:36:55 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 3 Apr 2025 19:36:55 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output [v2] In-Reply-To: References: Message-ID: On Thu, 3 Apr 2025 15:46:31 GMT, Ioi Lam wrote: >> Since [JEP 483: Ahead-of-Time Class Loading & Linking](https://openjdk.org/jeps/483), VM options such as `-XX:AOTCache `are implemented as aliases of "classical" CDS options such as `-XX:SharedArchiveFile`. >> >> In anticipation of the [JEP: Ahead-of-time Command Line Ergonomics](https://bugs.openjdk.org/browse/JDK-8350022), we should refactor the code that deals with the AOT options. 
Specifically, as we expect the JVM to be able to load from an "input AOT cache" and write to an "output AOT cache", we should clearly identify the input and output caches in separate APIs: >> >> >> const char* CDSConfig::input_static_archive_path(); >> const char* CDSConfig::input_dynamic_archive_path(); >> const char* CDSConfig::output_archive_path(); >> >> >> This PR also cleans up the code by: >> - renaming a few function to reflect what they actually do >> - moving more "config" management code into cdsConfig.cpp >> >> There's also a behavioral bug fix: before this PR, `-XX:AOTCache` was handled by the `ergo_init_classic_archive_paths()` function, which allows two files to be specified. E.g., `java -XX:AOTCache=static.jsa:dynamic.jsa`. That's because `-XX:AOTCache` was implemented as an alias of `-XX:SharedArchiveFile`, and the latter allows this usage. >> >> However, this behavior is not specified in JEP 483. Allowing two files in -XX:AOTCache will cause unnecessary complexity when we implement [JDK-8353598: Allow AOT cache to be used in training run](https://bugs.openjdk.org/browse/JDK-8353598). Therefore, I added new test cases to disallow the use of two files. This also means that we don't need to modify the already over-complicated `ergo_init_classic_archive_paths()` for the AOT use cases > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > more clean up Changes requested by lmesnik (Reviewer). test/hotspot/jtreg/runtime/cds/appcds/AOTFlags.java line 28: > 26: * @test > 27: * @summary "AOT" aliases for traditional CDS command-line options > 28: * @requires vm.cds & vm.compMode != "Xcomp" The test completely ignore external VM flags, so it should have `@requires vm.flagless ` and no Xcomp exclusion is required. 
------------- PR Review: https://git.openjdk.org/jdk/pull/24401#pullrequestreview-2740946504 PR Review Comment: https://git.openjdk.org/jdk/pull/24401#discussion_r2027621201 From iklam at openjdk.org Thu Apr 3 20:31:50 2025 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 3 Apr 2025 20:31:50 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output [v3] In-Reply-To: References: Message-ID: > Since [JEP 483: Ahead-of-Time Class Loading & Linking](https://openjdk.org/jeps/483), VM options such as `-XX:AOTCache `are implemented as aliases of "classical" CDS options such as `-XX:SharedArchiveFile`. > > In anticipation of the [JEP: Ahead-of-time Command Line Ergonomics](https://bugs.openjdk.org/browse/JDK-8350022), we should refactor the code that deals with the AOT options. Specifically, as we expect the JVM to be able to load from an "input AOT cache" and write to an "output AOT cache", we should clearly identify the input and output caches in separate APIs: > > > const char* CDSConfig::input_static_archive_path(); > const char* CDSConfig::input_dynamic_archive_path(); > const char* CDSConfig::output_archive_path(); > > > This PR also cleans up the code by: > - renaming a few function to reflect what they actually do > - moving more "config" management code into cdsConfig.cpp > > There's also a behavioral bug fix: before this PR, `-XX:AOTCache` was handled by the `ergo_init_classic_archive_paths()` function, which allows two files to be specified. E.g., `java -XX:AOTCache=static.jsa:dynamic.jsa`. That's because `-XX:AOTCache` was implemented as an alias of `-XX:SharedArchiveFile`, and the latter allows this usage. > > However, this behavior is not specified in JEP 483. Allowing two files in -XX:AOTCache will cause unnecessary complexity when we implement [JDK-8353598: Allow AOT cache to be used in training run](https://bugs.openjdk.org/browse/JDK-8353598). Therefore, I added new test cases to disallow the use of two files. 
This also means that we don't need to modify the already over-complicated `ergo_init_classic_archive_paths()` for the AOT use cases Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @lmesnik comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24401/files - new: https://git.openjdk.org/jdk/pull/24401/files/3ad42a3e..90b4b688 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24401&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24401&range=01-02 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24401.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24401/head:pull/24401 PR: https://git.openjdk.org/jdk/pull/24401 From chris.plummer at oracle.com Thu Apr 3 20:32:50 2025 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 3 Apr 2025 13:32:50 -0700 Subject: Debugger overhead for virtual threads creation In-Reply-To: References: <5451248a-ded9-4b57-bc4d-23c336adca0f@jetbrains.com> Message-ID: <3ade7875-b4c8-4bf0-82cf-71452caf8df1@oracle.com> I got my prototype ThreadNode cleanup code working again. It seems to be doing the job of keeping the list down to a minimal size, with usually at most a few ThreadNodes on the list. However, that didn't help performance. The reason is because the debug agent normally does not need to traverse the list. It maps jthread to ThreadNode by using JVMTI thread local data, not by searching the list, and the list is doubly linked, so ThreadNodes can be removed quickly. After the above results I talked with Serguei and he confirmed that JVMTI also has a linked list, and there are some occasions where it needs to be traversed, so that likely the bottleneck here. Regarding my suggestion that we may be able to disable JVMTI_EVENT_VIRTUAL_THREAD_START/END, I did experiment with that and when disabled there are no performance issues, so I will look into getting this to work properly. 
The events need to be enabled if there are any ThreadStartRequests or ThreadDeathRequests that do not have the PlatformThreadOnly filter enabled. https://docs.oracle.com/en/java/javase/24/docs/api/jdk.jdi/com/sun/jdi/request/ThreadStartRequest.html#addPlatformThreadsOnlyFilter() Jdb by default uses this filter, and I suspect that IDEA does also. If not, it would get a flood of ThreadStart and ThreadDeath events for all the virtual threads. Chris On 4/3/25 2:15 AM, Serguei Spitsyn wrote: > > Hi Egor, > > Thank you for reporting this scalability issue when debugger is enabled. > It looks like a JVMTI problem and we have some guesses but need some > investigation to identify it better. > > Thanks, > > Serguei > > *From: *serviceability-dev on > behalf of Egor Ushakov > *Date: *Wednesday, April 2, 2025 at 3:32 AM > *To: *Chris Plummer , serviceability-dev > > *Subject: *Re: Debugger overhead for virtual threads creation > > Thanks Chris! > > I've made the bug https://youtrack.jetbrains.com/issue/IDEA-365900 > visible, there's a reproducer there. > > Thanks, > Egor > > On 02.04.2025 01:39, Chris Plummer wrote: > > The short answer is yes. The debug agent needs to deal with > > JVMTI_EVENT_VIRTUAL_THREAD_START/END events for every virtual thread. > > What makes it worse is when there are a large number of virtual > > threads that are currently alive. They are tracked on a list of > > ThreadNodes that starts to slow down debug agent performance when it > > gets too long. I have a work in progress that proactively purges these > > ThreadNodes so the list does not get too big. I've been meaning to > > revive this project for quite some time. If you have a test case I'd > > be willing to experiment with these changes some more. I could not > > access the IDEA-365900 link you provided.
> > > > Note I think after the work is done to purge ThreadNodes proactively > > it might not be that hard of a step to move to not needing > > JVMTI_EVENT_VIRTUAL_THREAD_START/END events enabled, which will help > > performance a lot more. > > > > Chris > > > > On 4/1/25 10:14 AM, Egor Ushakov wrote: > >> Hi everyone! > >> > >> Is it expected that with the debugger attached creating virtual > >> threads is much slower? > >> We're getting bugs like: > >> https://youtrack.jetbrains.com/issue/IDEA-365900 > >> And I can reproduce it easily with jdb... > >> Just attaching the debugger immediately slows down virtual threads > >> creation significantly. > >> > >> >java > >> -agentlib:jdwp=transport=dt_shmem,server=y,suspend=n,address=8000 app > >> ... > >> 6808805 (1.2046688E7 threads per second) > >> ... > >> after >jdb -attach 8000 > >> ... > >> 30215 (95986.055 threads per second) > >> ... > >> > >> Thanks, > >> Egor > From iklam at openjdk.org Thu Apr 3 21:10:51 2025 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 3 Apr 2025 21:10:51 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output [v2] In-Reply-To: References: Message-ID: On Thu, 3 Apr 2025 19:33:12 GMT, Leonid Mesnik wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> more clean up > > test/hotspot/jtreg/runtime/cds/appcds/AOTFlags.java line 28: > >> 26: * @test >> 27: * @summary "AOT" aliases for traditional CDS command-line options >> 28: * @requires vm.cds & vm.compMode != "Xcomp" > > The test completely ignore external VM flags, so it should have > `@requires vm.flagless ` > and no Xcomp exclusion is required. Fixed.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24401#discussion_r2027743880 From iklam at openjdk.org Thu Apr 3 21:43:56 2025 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 3 Apr 2025 21:43:56 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output [v2] In-Reply-To: References: Message-ID: On Thu, 3 Apr 2025 15:56:02 GMT, Vladimir Kozlov wrote: > @iklam one annoying thing in current ergonomic setting for AOTCode flags in mainline is checking which phase we are executing. We agreed before that we should only save/load AOT code when `AOTClassLinking` is on because AOT code needs classes to be preloaded. > > I have to do next checks to enable AOTCode in `CDSConfig::check_vm_args_consistency()`: > > ``` > if (AOTClassLinking && is_using_archive() && !is_dumping_archive() && !FLAG_IS_DEFAULT(AOTCache)) { > FLAG_SET_ERGO_IF_DEFAULT(LoadAOTCode, true); > ... > if (AOTClassLinking && is_dumping_final_static_archive()) { > FLAG_SET_ERGO_IF_DEFAULT(StoreAOTCode, true); > ``` > > First, I am not sure these conditions are correct. > > Second, it would be nice to have simple checks instead: `is_dumping_aot_archive()` and `is_using_aot_archive()`. > > May be also consider it is error if both conditions are true (we don't support updating archive yet). There are a lot of dependencies between different AOT capabilities, and it's hard to control that using global variables. At the point of `CDSConfig::check_vm_args_consistency()`, we don't have complete knowledge whether the AOT cache exists, or whether the cache contains AOT code, or whether the GC compressed oops settings are compatible with the AOT code. In the handling of such "AOT capability flags", I have been using the following pattern: In `CDSConfig::check_vm_args_consistency()` we update the default values of the flags according to their dependencies on other flags. 
E.g., by specifying `-XX:AOTMode=create`, `AOTClassLinking` and `AOTInvokeDynamicLinking` are enabled by default. if (!FLAG_IS_DEFAULT(AOTMode)) { // Using any form of the new AOTMode switch enables enhanced optimizations. FLAG_SET_ERGO_IF_DEFAULT(AOTClassLinking, true); } if (AOTClassLinking) { // If AOTClassLinking is specified, enable all AOT optimizations by default. FLAG_SET_ERGO_IF_DEFAULT(AOTInvokeDynamicLinking, true); } else { // AOTInvokeDynamicLinking depends on AOTClassLinking. FLAG_SET_ERGO(AOTInvokeDynamicLinking, false); } However, the values of these flags are just advisory. Even if a flag is enabled, the underlying capability may be disabled. For example, `AOTClassLinking` requires the ability of dumping heap objects, which is not available if ZGC is used. Because the dependencies are complex, it's difficult to resolve them statically and set a global boolean variable for each capability. Instead, I have been expressing the dependencies programmatically using accessor functions: bool CDSConfig::is_dumping_aot_linked_classes() { if (is_dumping_preimage_static_archive()) { return false; } else if (is_dumping_dynamic_archive()) { return is_using_full_module_graph() && AOTClassLinking; } else if (is_dumping_static_archive()) { return is_dumping_full_module_graph() && AOTClassLinking; } else { return false; } } bool CDSConfig::is_dumping_invokedynamic() { // Requires is_dumping_aot_linked_classes(). Otherwise the classes of some archived heap // objects used by the archive indy callsites may be replaced at runtime. return AOTInvokeDynamicLinking && is_dumping_aot_linked_classes() && is_dumping_heap(); } I would suggest doing something like this for storing AOT code: bool CDSConfig::is_dumping_aot_code() { return StoreAOTCode && is_dumping_final_static_archive() && is_dumping_aot_linked_classes(); } For loading AOT code, it's simpler. We can do a definite check immediately after the AOT cache has been mapped. 
This also makes the run-time check efficient (whereas the assembly-time checks can take their time). if (LoadAOTCode && cache has AOT code && vm options are compatible) { CDSConfig::_is_using_aot_code = true; } else { CDSConfig::_is_using_aot_code = false; } inline bool CDSConfig::is_using_aot_code() { return CDSConfig::_is_using_aot_code; } ------------- PR Comment: https://git.openjdk.org/jdk/pull/24401#issuecomment-2776976568 From kvn at openjdk.org Thu Apr 3 22:04:54 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 3 Apr 2025 22:04:54 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output [v2] In-Reply-To: References: Message-ID: On Thu, 3 Apr 2025 21:40:48 GMT, Ioi Lam wrote: >> @iklam one annoying thing in current ergonomic setting for AOTCode flags in mainline is checking which phase we are executing. We agreed before that we should only save/load AOT code when `AOTClassLinking` is on because AOT code needs classes to be preloaded. >> >> I have to do next checks to enable AOTCode in `CDSConfig::check_vm_args_consistency()`: >> >> if (AOTClassLinking && is_using_archive() && !is_dumping_archive() && !FLAG_IS_DEFAULT(AOTCache)) { >> FLAG_SET_ERGO_IF_DEFAULT(LoadAOTCode, true); >> ... >> if (AOTClassLinking && is_dumping_final_static_archive()) { >> FLAG_SET_ERGO_IF_DEFAULT(StoreAOTCode, true); >> >> >> First, I am not sure these conditions are correct. >> >> Second, it would be nice to have simple checks instead: `is_dumping_aot_archive()` and `is_using_aot_archive()`. >> >> May be also consider it is error if both conditions are true (we don't support updating archive yet). > >> @iklam one annoying thing in current ergonomic setting for AOTCode flags in mainline is checking which phase we are executing. We agreed before that we should only save/load AOT code when `AOTClassLinking` is on because AOT code needs classes to be preloaded. 
>> >> I have to do next checks to enable AOTCode in `CDSConfig::check_vm_args_consistency()`: >> >> ``` >> if (AOTClassLinking && is_using_archive() && !is_dumping_archive() && !FLAG_IS_DEFAULT(AOTCache)) { >> FLAG_SET_ERGO_IF_DEFAULT(LoadAOTCode, true); >> ... >> if (AOTClassLinking && is_dumping_final_static_archive()) { >> FLAG_SET_ERGO_IF_DEFAULT(StoreAOTCode, true); >> ``` >> >> First, I am not sure these conditions are correct. >> >> Second, it would be nice to have simple checks instead: `is_dumping_aot_archive()` and `is_using_aot_archive()`. >> >> May be also consider it is error if both conditions are true (we don't support updating archive yet). > > There are a lot of dependencies between different AOT capabilities, and it's hard to control that using global variables. At the point of `CDSConfig::check_vm_args_consistency()`, we don't have complete knowledge whether the AOT cache exists, or whether the cache contains AOT code, or whether the GC compressed oops settings are compatible with the AOT code. > > In the handling of such "AOT capability flags", I have been using the following pattern: > > In `CDSConfig::check_vm_args_consistency()` we update the default values of the flags according to their dependencies on other flags. E.g., by specifying `-XX:AOTMode=create`, `AOTClassLinking` and `AOTInvokeDynamicLinking` are enabled by default. > > > if (!FLAG_IS_DEFAULT(AOTMode)) { > // Using any form of the new AOTMode switch enables enhanced optimizations. > FLAG_SET_ERGO_IF_DEFAULT(AOTClassLinking, true); > } > > if (AOTClassLinking) { > // If AOTClassLinking is specified, enable all AOT optimizations by default. > FLAG_SET_ERGO_IF_DEFAULT(AOTInvokeDynamicLinking, true); > } else { > // AOTInvokeDynamicLinking depends on AOTClassLinking. > FLAG_SET_ERGO(AOTInvokeDynamicLinking, false); > } > > > However, the values of these flags are just advisory. Even if a flag is enabled, the underlying capability may be disabled. 
For example, `AOTClassLinking` requires the ability of dumping heap objects, which is not available if ZGC is used. > > Because the dependencies are complex, it's difficult to resolve them statically and set a global boolean variable for each capability. Instead, I have been expres... Thank you @iklam for explanation. I can do final adjustment to `Store|LoadAOTCode` flags values in `StoreAOTCode::initialize()` which is called from `initialize_shared_spaces()`: MetaspaceShared::initialize_shared_spaces() { ... static_mapinfo->patch_heap_embedded_pointers(); ArchiveHeapLoader::finish_initialization(); Universe::load_archived_object_instances(); + AOTCodeCache::initialize(); The question: at this place are all CDS AOT flags are final (flags compatibility and cache presence are verified)? Note, `Store|LoadAOTCode` flags are diagnostic and disabled by default. I need to set them to `true` somewhere. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24401#issuecomment-2777090009 From sspitsyn at openjdk.org Fri Apr 4 06:23:52 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 4 Apr 2025 06:23:52 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v2] In-Reply-To: References: Message-ID: On Thu, 3 Apr 2025 17:58:30 GMT, Leonid Mesnik wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> some cleanup > > src/hotspot/share/prims/jvmtiEnv.cpp line 1078: > >> 1076: JvmtiEnv::ResumeThread(jthread thread) { >> 1077: // resume thread with handshake >> 1078: ResumeThreadClosure op(/* single_resume */ true); > > Could you please explain how thread is protected from racing with mounting<->unmounting operations with resume_thread operations? > It might be unlikely happens for suspended threads, but for alive threads the results are not defined. Thank you for the question. 
The `JvmtiHandshake::execute()` has a `JvmtiVTMSTransitionDisabler` installed: JvmtiHandshake::execute(JvmtiUnitedHandshakeClosure* hs_cl, jthread target) { JavaThread* current = JavaThread::current(); HandleMark hm(current); JvmtiVTMSTransitionDisabler disabler(target); <= !!!!!!! . . . > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1759: > >> 1757: Handle thread_h(current, thread_oop); >> 1758: bool is_virtual = java_lang_VirtualThread::is_instance(thread_h()); >> 1759: bool is_thread_carrying = is_thread_carrying_vthread(java_thread, thread_h()); > > I think that somewhere in this place should be an explanation of suspend<->resume synchronization. As I understand the handshake can't be executed and clear suspend state while suspend_thread is done for the same thread. How is it guaranteed that suspend_thread flag can't be updated? > It is not obvious and also puts some restrictions on the suspend_thread implementation to keep this behaviour. Thank you for reviewing and for this suggestion. Yes, you are right. I'll try to find a good place to add such a comment. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24269#discussion_r2028158825 PR Review Comment: https://git.openjdk.org/jdk/pull/24269#discussion_r2028161088 From varadam at openjdk.org Fri Apr 4 06:43:55 2025 From: varadam at openjdk.org (Varada M) Date: Fri, 4 Apr 2025 06:43:55 GMT Subject: RFR: 8352392: AIX: implement attach API v2 and streaming output [v4] In-Reply-To: References: Message-ID: On Tue, 1 Apr 2025 10:30:56 GMT, Varada M wrote: >> AIX changes for attach API to support arbitrary length arguments and the streaming output support.
>> serviceability/attach/AttachAPIv2/StreamingOutputTest.java test passes >> >> tier1, tier2 and tier3 testing is successful with fastdebug level >> >> JBS Issue : [JDK-8352392](https://bugs.openjdk.org/browse/JDK-8352392) > > Varada M has updated the pull request incrementally with one additional commit since the last revision: > > 8352392: AIX: implement attach API v2 and streaming output Thanks all, ------------- PR Comment: https://git.openjdk.org/jdk/pull/24177#issuecomment-2777687525 From varadam at openjdk.org Fri Apr 4 06:43:56 2025 From: varadam at openjdk.org (Varada M) Date: Fri, 4 Apr 2025 06:43:56 GMT Subject: Integrated: 8352392: AIX: implement attach API v2 and streaming output In-Reply-To: References: Message-ID: On Sun, 23 Mar 2025 14:33:36 GMT, Varada M wrote: > AIX changes for attach API to support arbitrary length arguments and the streaming output support. > serviceability/attach/AttachAPIv2/StreamingOutputTest.java test passes > > tier1, tier2 and tier3 testing is successful with fastdebug level > > JBS Issue : [JDK-8352392](https://bugs.openjdk.org/browse/JDK-8352392) This pull request has now been integrated. 
Changeset: 41d4a0d7 Author: Varada M URL: https://git.openjdk.org/jdk/commit/41d4a0d7bdda2a96af1e7f549c05d99d68c040dc Stats: 283 lines in 3 files changed: 64 ins; 203 del; 16 mod 8352392: AIX: implement attach API v2 and streaming output Reviewed-by: mdoerr, jkern, amenkov ------------- PR: https://git.openjdk.org/jdk/pull/24177 From serb at openjdk.org Fri Apr 4 09:29:56 2025 From: serb at openjdk.org (Sergey Bylokhov) Date: Fri, 4 Apr 2025 09:29:56 GMT Subject: RFR: 8344671: Few JFR streaming tests fail with application not alive error on MacOS 15 [v5] In-Reply-To: References: <3xUroXKNX4bBRb0L4r5WJ9V_TEJRbtS_hmdZ3AMCTFo=.86aaf7a8-d2c1-4f07-9f74-4e2cab2d0fa2@github.com> Message-ID: On Mon, 31 Mar 2025 18:02:12 GMT, Larry Cable wrote: >> on both Linux and MacOS libattach utilizes UNIX signal (QUIT) to cause a target JVM (attachee) to create the socket file used as transport for subsequent jcmds (and other attach based interactions) and to listen upon that for such. >> >> it should be noted that the default behavior for QUIT (if not blocked or caught) is to terminate the signalled process. >> >> during the early lifetime of a JVM, its signal handlers are not yet installed, and thus any signal such as QUIT will cause the >> default behavior to occur, in this case the JVM will be terminated. >> >> this is why some tests are failing with "not alive" >> >> the "fix" is similar in nature to that already implemented for linux (however using a different OS dependent mechanism to obtain the attachee JVM's signal masks: sysctl(2)). >> >> the method "checkCatchesAndSendQuitTo" will now obtain the "attachee" JVM signal masks and only kill(QUIT) if the >> current masks indicate that the JVM's signals are now being handled. 
>> >> the behavior in the success case is now identical to the previous implementation, however should the target JVM not >> become "ready" (signal handlers installed) prior to the attach "timeout" occurring the attach operation will throw an >> "AttachNotSupportedException" with a suitable error message. >> >> see also: https://bugs.openjdk.org/browse/JDK-8350766 > > Larry Cable has updated the pull request incrementally with two additional commits since the last revision: > > - Merge branch 'JDK-8344671' of github.com:larry-cable/jdk into JDK-8344671 > - JDK-8334671: minor changes requested by @dholmes src/jdk.attach/macosx/native/libattach/VirtualMachineImpl.c line 118: > 116: * Signature: (I)V > 117: */ > 118: JNIEXPORT jboolean JNICALL Java_sun_tools_attach_VirtualMachineImpl_checkCatchesAndSendQuitTo I'm just curious - why does this method return a boolean? It looks like the result is never actually checked, and the control flow is managed through exceptions or errors (on both linux and macos). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24085#discussion_r2028449674 From iklam at openjdk.org Sun Apr 6 21:30:49 2025 From: iklam at openjdk.org (Ioi Lam) Date: Sun, 6 Apr 2025 21:30:49 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output [v2] In-Reply-To: References: Message-ID: On Thu, 3 Apr 2025 21:40:48 GMT, Ioi Lam wrote: >> @iklam one annoying thing in current ergonomic setting for AOTCode flags in mainline is checking which phase we are executing. We agreed before that we should only save/load AOT code when `AOTClassLinking` is on because AOT code needs classes to be preloaded. >> >> I have to do next checks to enable AOTCode in `CDSConfig::check_vm_args_consistency()`: >> >> if (AOTClassLinking && is_using_archive() && !is_dumping_archive() && !FLAG_IS_DEFAULT(AOTCache)) { >> FLAG_SET_ERGO_IF_DEFAULT(LoadAOTCode, true); >> ...
>> if (AOTClassLinking && is_dumping_final_static_archive()) { >> FLAG_SET_ERGO_IF_DEFAULT(StoreAOTCode, true); >> >> >> First, I am not sure these conditions are correct. >> >> Second, it would be nice to have simple checks instead: `is_dumping_aot_archive()` and `is_using_aot_archive()`. >> >> May be also consider it is error if both conditions are true (we don't support updating archive yet). > >> @iklam one annoying thing in current ergonomic setting for AOTCode flags in mainline is checking which phase we are executing. We agreed before that we should only save/load AOT code when `AOTClassLinking` is on because AOT code needs classes to be preloaded. >> >> I have to do next checks to enable AOTCode in `CDSConfig::check_vm_args_consistency()`: >> >> ``` >> if (AOTClassLinking && is_using_archive() && !is_dumping_archive() && !FLAG_IS_DEFAULT(AOTCache)) { >> FLAG_SET_ERGO_IF_DEFAULT(LoadAOTCode, true); >> ... >> if (AOTClassLinking && is_dumping_final_static_archive()) { >> FLAG_SET_ERGO_IF_DEFAULT(StoreAOTCode, true); >> ``` >> >> First, I am not sure these conditions are correct. >> >> Second, it would be nice to have simple checks instead: `is_dumping_aot_archive()` and `is_using_aot_archive()`. >> >> May be also consider it is error if both conditions are true (we don't support updating archive yet). > > There are a lot of dependencies between different AOT capabilities, and it's hard to control that using global variables. At the point of `CDSConfig::check_vm_args_consistency()`, we don't have complete knowledge whether the AOT cache exists, or whether the cache contains AOT code, or whether the GC compressed oops settings are compatible with the AOT code. > > In the handling of such "AOT capability flags", I have been using the following pattern: > > In `CDSConfig::check_vm_args_consistency()` we update the default values of the flags according to their dependencies on other flags. 
E.g., by specifying `-XX:AOTMode=create`, `AOTClassLinking` and `AOTInvokeDynamicLinking` are enabled by default. > > > if (!FLAG_IS_DEFAULT(AOTMode)) { > // Using any form of the new AOTMode switch enables enhanced optimizations. > FLAG_SET_ERGO_IF_DEFAULT(AOTClassLinking, true); > } > > if (AOTClassLinking) { > // If AOTClassLinking is specified, enable all AOT optimizations by default. > FLAG_SET_ERGO_IF_DEFAULT(AOTInvokeDynamicLinking, true); > } else { > // AOTInvokeDynamicLinking depends on AOTClassLinking. > FLAG_SET_ERGO(AOTInvokeDynamicLinking, false); > } > > > However, the values of these flags are just advisory. Even if a flag is enabled, the underlying capability may be disabled. For example, `AOTClassLinking` requires the ability of dumping heap objects, which is not available if ZGC is used. > > Because the dependencies are complex, it's difficult to resolve them statically and set a global boolean variable for each capability. Instead, I have been expres... > Thank you @iklam for explanation. I can do final adjustment to `Store|LoadAOTCode` flags values in `StoreAOTCode::initialize()` which is called from `initialize_shared_spaces()`: > > ``` > MetaspaceShared::initialize_shared_spaces() { > ... > static_mapinfo->patch_heap_embedded_pointers(); > ArchiveHeapLoader::finish_initialization(); > Universe::load_archived_object_instances(); > + AOTCodeCache::initialize(); > ``` > > The question: at this place are all CDS AOT flags are final (flags compatibility and cache presence are verified)? > > Note, `Store|LoadAOTCode` flags are diagnostic and disabled by default. I need to set them to `true` somewhere. Yes, at this point all configuration related to AOT should be final. You can set the final values for the `Store|LoadAOTCode` flags here. `StoreAOTCode` should be true only if `CDSConfig::is_dumping_final_static_archive()` is true. `LoadAOTCode` should be true only if `CDSConfig::is_loading_archive()` is true and the archive contains AOT code. 
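
The "update defaults according to dependencies" pattern discussed in this thread can be sketched outside HotSpot roughly as follows. This is a toy model only: `BoolFlag`, `Origin`, `AotFlags`, `set_ergo_if_default`, `set_ergo` and `resolve_defaults` are hypothetical names invented for illustration and are not the real `JVMFlag` machinery or the `FLAG_SET_ERGO*` macros; the sketch assumes only the behavior described above (user-specified values are left alone, except where a dependency forces a flag off).

```cpp
#include <cassert>

// Toy stand-in for a VM flag; hypothetical, not HotSpot's JVMFlag.
enum class Origin { Default, CommandLine, Ergonomic };

struct BoolFlag {
  bool value = false;
  Origin origin = Origin::Default;
};

// In the spirit of FLAG_SET_ERGO_IF_DEFAULT: only touch flags the user left alone.
inline void set_ergo_if_default(BoolFlag& flag, bool v) {
  if (flag.origin == Origin::Default) {
    flag.value = v;
    flag.origin = Origin::Ergonomic;
  }
}

// In the spirit of FLAG_SET_ERGO: override unconditionally.
inline void set_ergo(BoolFlag& flag, bool v) {
  flag.value = v;
  flag.origin = Origin::Ergonomic;
}

// Hypothetical bundle of the flags mentioned in the thread.
struct AotFlags {
  bool aot_mode_specified = false;  // models !FLAG_IS_DEFAULT(AOTMode)
  BoolFlag class_linking;           // models AOTClassLinking
  BoolFlag indy_linking;            // models AOTInvokeDynamicLinking
};

inline void resolve_defaults(AotFlags& f) {
  if (f.aot_mode_specified) {
    // Any explicit AOT mode turns class linking on unless the user said otherwise.
    set_ergo_if_default(f.class_linking, true);
  }
  if (f.class_linking.value) {
    set_ergo_if_default(f.indy_linking, true);
  } else {
    // The dependent flag cannot stay on without its prerequisite.
    set_ergo(f.indy_linking, false);
  }
  // Dependency invariant: indy linking implies class linking.
  assert(f.class_linking.value || !f.indy_linking.value);
}
```

As the reply above notes, values resolved this way are still only advisory; whether a capability is actually available has to be decided later, once the archive has been mapped and validated.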
------------- PR Comment: https://git.openjdk.org/jdk/pull/24401#issuecomment-2781681413 From dholmes at openjdk.org Mon Apr 7 07:39:54 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 7 Apr 2025 07:39:54 GMT Subject: RFR: 8353231: Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad still fails intermittently In-Reply-To: References: Message-ID: On Tue, 1 Apr 2025 10:27:39 GMT, Kevin Walls wrote: > Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad still fails intermittently. > On failure, 10 attempts with sleep(200) each time, only read -1 from mbean.getProcessCpuLoad(). > The method is documented to return -1 when info is not available, but want to avoid the test accepting a -1 and masking real problems. > > Test failures are happening when multiple CPU load reading tests run on the same host, at the same second. > Add a TEST.properties file containing: exclusiveAccess.dirs=. Okay, let's give this a try. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24352#pullrequestreview-2745797954 From kevinw at openjdk.org Mon Apr 7 09:28:38 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Mon, 7 Apr 2025 09:28:38 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p Message-ID: This is a long-standing oversight: HeapDumpPath does not recognise %p for pid expansion. The default filename uses a pid (e.g. java_pid1676937.hprof) but HeapDumpPath does not. It has always done a manual "root plus pid plus extension" on the default filename only, and should move to using Argument::copy_expand_pid() like we do with other such filenames. We also assumed the default filename is not a directory (which is very very likely, but doesn't have to be true). 
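
For readers unfamiliar with the `%p` convention, a minimal standalone sketch of pid expansion with a length check might look like the following. It is not the HotSpot routine named above: `expand_pid`, its parameters, and the `capacity` convention (the result must leave room for a terminating NUL, as in a fixed C buffer) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: replaces every "%p" in `pattern` with the decimal pid.
// Returns false (leaving `out` untouched) if the expanded text would not fit
// in a buffer of `capacity` bytes including the terminating NUL.
inline bool expand_pid(const std::string& pattern, long pid,
                       std::size_t capacity, std::string& out) {
  assert(capacity > 0);
  const std::string pid_text = std::to_string(pid);
  std::string result;
  for (std::size_t i = 0; i < pattern.size(); ++i) {
    if (pattern[i] == '%' && i + 1 < pattern.size() && pattern[i + 1] == 'p') {
      result += pid_text;
      ++i;  // skip the 'p' that was just consumed
    } else {
      result += pattern[i];  // ordinary character (including a lone '%')
    }
    if (result.size() >= capacity) {
      return false;  // no room left for the terminator
    }
  }
  out = result;
  return true;
}
```

With such a helper, a setting like `-XX:HeapDumpPath=/tmp/dump_%p.hprof` would yield a per-process file name, and an over-long expansion is reported to the caller instead of being silently truncated.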
------------- Commit messages: - 8353727: HeapDumpPath doesn't expand %p Changes: https://git.openjdk.org/jdk/pull/24482/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24482&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8353727 Stats: 73 lines in 2 files changed: 26 ins; 23 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/24482.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24482/head:pull/24482 PR: https://git.openjdk.org/jdk/pull/24482 From jkern at openjdk.org Mon Apr 7 09:58:50 2025 From: jkern at openjdk.org (Joachim Kern) Date: Mon, 7 Apr 2025 09:58:50 GMT Subject: RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock [v3] In-Reply-To: References: Message-ID: On Thu, 3 Apr 2025 15:27:15 GMT, Robert Toyonaga wrote: >> ### Summary: >> This PR makes memory operations atomic with NMT accounting. >> >> ### The problem: >> In memory related functions like `os::commit_memory` and `os::reserve_memory` the OS memory operations are currently done before acquiring the NMT mutex, and the virtual memory accounting is done later in `MemTracker`, after the lock has been acquired. Doing the memory operations outside of the lock scope can lead to races. >> >> 1.1 Thread_1 releases range_A. >> 1.2 Thread_1 tells NMT "range_A has been released". >> >> 2.1 Thread_2 reserves (the now free) range_A. >> 2.2 Thread_2 tells NMT "range_A is reserved". >> >> Since the sequence (1.1) (1.2) is not atomic, if Thread_2 begins operating after (1.1), we can have (1.1) (2.1) (2.2) (1.2). The OS sees two valid subsequent calls (release range_A, followed by map range_A). But NMT sees "reserve range_A", "release range_A" and is now out of sync with the OS. >> >> ### Solution: >> Where memory operations such as reserve, commit, or release virtual memory happen, I've expanded the scope of `NmtVirtualMemoryLocker` to protect both the NMT accounting and the memory operation itself. 
>> >> ### Other notes: >> I also simplified this pattern found in many places: >> >> if (MemTracker::enabled()) { >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_some_operation(addr, bytes); >> if (result != nullptr) { >> MemTracker::record_some_operation(addr, bytes); >> } >> } else { >> result = pd_unmap_memory(addr, bytes); >> } >> ``` >> To: >> >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_unmap_memory(addr, bytes); >> MemTracker::record_some_operation(addr, bytes); >> ``` >> This is possible because `NmtVirtualMemoryLocker` now checks `MemTracker::enabled()`. `MemTracker::record_some_operation` already checks `MemTracker::enabled()` and checks against nullptr. This refactoring previously wasn't possible because `ThreadCritical` was used before https://github.com/openjdk/jdk/pull/22745 introduced `NmtVirtualMemoryLocker`. >> >> I considered moving the locking and NMT accounting down into platform specific code: Ex. lock around { munmap() + MemTracker::record }. The hope was that this would help reduce the size of the critical section. However, I found that the OS-specific "pd_" functions are already short and to-the-point, so doing this wasn't reducing the lock scope very much. Instead it just makes the code more messy by having to maintain the locking and NMT accounting in each platform specific i... > > Robert Toyonaga has updated the pull request incrementally with one additional commit since the last revision: > > exclude file mapping tests on AIX. I ran the tests over the weekend again and now they passed. 
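
The interleaving described in the quoted summary, (1.1) (2.1) (2.2) (1.2), can be replayed deterministically in a small model. This is a sketch under stated assumptions: `Model`, `kRangeA`, `racy_interleaving` and `locked_interleaving` are hypothetical names, address ranges are reduced to integer ids, and no real threads, OS calls or NMT code are involved.

```cpp
#include <cassert>
#include <set>

// Hypothetical model: one set for what the OS has mapped, one for NMT's books.
struct Model {
  std::set<int> os_reserved;
  std::set<int> nmt_reserved;

  void os_release(int range)  { os_reserved.erase(range); }
  void os_reserve(int range)  { os_reserved.insert(range); }
  void nmt_release(int range) { nmt_reserved.erase(range); }
  void nmt_reserve(int range) { nmt_reserved.insert(range); }

  bool in_sync() const { return os_reserved == nmt_reserved; }
};

constexpr int kRangeA = 1;  // stands in for "range_A"

// Steps interleave as (1.1) (2.1) (2.2) (1.2): accounting lags the OS call.
inline Model racy_interleaving() {
  Model m;
  m.os_reserve(kRangeA);
  m.nmt_reserve(kRangeA);   // initial state: Thread_1 owns range_A
  assert(m.in_sync());
  m.os_release(kRangeA);    // 1.1 Thread_1 releases range_A
  m.os_reserve(kRangeA);    // 2.1 Thread_2 reserves the now free range_A
  m.nmt_reserve(kRangeA);   // 2.2 Thread_2 tells NMT "reserved"
  m.nmt_release(kRangeA);   // 1.2 Thread_1 tells NMT "released" (too late)
  return m;
}

// Each OS call and its accounting happen as one unit, as if under one lock.
inline Model locked_interleaving() {
  Model m;
  m.os_reserve(kRangeA);
  m.nmt_reserve(kRangeA);
  assert(m.in_sync());
  m.os_release(kRangeA);    // 1.1
  m.nmt_release(kRangeA);   // 1.2
  m.os_reserve(kRangeA);    // 2.1
  m.nmt_reserve(kRangeA);   // 2.2
  return m;
}
```

In the racy ordering the OS still has the range mapped while the model's NMT set is empty; keeping each operation together with its bookkeeping restores agreement.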
------------- PR Comment: https://git.openjdk.org/jdk/pull/24084#issuecomment-2782761982 From kevinw at openjdk.org Mon Apr 7 11:36:55 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Mon, 7 Apr 2025 11:36:55 GMT Subject: RFR: 8353231: Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad still fails intermittently In-Reply-To: References: Message-ID: On Tue, 1 Apr 2025 10:27:39 GMT, Kevin Walls wrote: > Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad still fails intermittently. > On failure, 10 attempts with sleep(200) each time, only read -1 from mbean.getProcessCpuLoad(). > The method is documented to return -1 when info is not available, but want to avoid the test accepting a -1 and masking real problems. > > Test failures are happening when multiple CPU load reading tests run on the same host, at the same second. > Add a TEST.properties file containing: exclusiveAccess.dirs=. Thanks David! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24352#issuecomment-2783014959 From kevinw at openjdk.org Mon Apr 7 11:36:55 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Mon, 7 Apr 2025 11:36:55 GMT Subject: Integrated: 8353231: Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad still fails intermittently In-Reply-To: References: Message-ID: On Tue, 1 Apr 2025 10:27:39 GMT, Kevin Walls wrote: > Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad still fails intermittently. > On failure, 10 attempts with sleep(200) each time, only read -1 from mbean.getProcessCpuLoad(). > The method is documented to return -1 when info is not available, but want to avoid the test accepting a -1 and masking real problems. > > Test failures are happening when multiple CPU load reading tests run on the same host, at the same second. > Add a TEST.properties file containing: exclusiveAccess.dirs=. This pull request has now been integrated. 
Changeset: e8c9e5c6 Author: Kevin Walls URL: https://git.openjdk.org/jdk/commit/e8c9e5c6cd3c844765c27c068022a018914fdf4e Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8353231: Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad still fails intermittently Reviewed-by: dholmes ------------- PR: https://git.openjdk.org/jdk/pull/24352 From zgu at openjdk.org Mon Apr 7 12:35:53 2025 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 7 Apr 2025 12:35:53 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p In-Reply-To: References: Message-ID: On Mon, 7 Apr 2025 09:05:34 GMT, Kevin Walls wrote: > This is a long-standing oversight: HeapDumpPath does not recognise %p for pid expansion. > The default filename uses a pid (e.g. java_pid1676937.hprof) but HeapDumpPath does not. > It has always done a manual "root plus pid plus extension" on the default filename only, and > should move to using Argument::copy_expand_pid() like we do with other such filenames. > > > We also assumed the default filename is not a directory (which is very very likely, but doesn't have to be true). @kevinjwalls I have [JDK-8349083](https://bugs.openjdk.org/browse/JDK-8349083) to address similar issues. AFAICT, there are 3 separate code to handle filename expansion and logging has the most complete support, It will be nice to unify them. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24482#issuecomment-2783181347 From kevinw at openjdk.org Mon Apr 7 15:01:34 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Mon, 7 Apr 2025 15:01:34 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p In-Reply-To: References: Message-ID: On Mon, 7 Apr 2025 12:32:52 GMT, Zhengyu Gu wrote: > @kevinjwalls I have [JDK-8349083](https://bugs.openjdk.org/browse/JDK-8349083) to address similar issues. > > AFAICT, there are 3 separate code to handle filename expansion and logging has the most complete support, It will be nice to unify them. Hi, thanks for the pointer. 
Yes, we have some duplication in this area... This change is quite small, and removes one duplicate, the manual "base+pid+extension" creation of the filename in HeapDumper. I am looking at the other PR, maybe we can make them share more in future... ------------- PR Comment: https://git.openjdk.org/jdk/pull/24482#issuecomment-2783639159 From heidinga at openjdk.org Mon Apr 7 17:01:56 2025 From: heidinga at openjdk.org (Dan Heidinga) Date: Mon, 7 Apr 2025 17:01:56 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output [v3] In-Reply-To: References: Message-ID: <1ssLzAJVy1yrmNRCmTCz---JPV7_cLWOLwu0A3-yhZw=.02dbfc81-0a2e-4a4a-a7c1-bae8a93af621@github.com> On Thu, 3 Apr 2025 20:31:50 GMT, Ioi Lam wrote: >> Since [JEP 483: Ahead-of-Time Class Loading & Linking](https://openjdk.org/jeps/483), VM options such as `-XX:AOTCache` are implemented as aliases of "classical" CDS options such as `-XX:SharedArchiveFile`. >> >> In anticipation of the [JEP: Ahead-of-time Command Line Ergonomics](https://bugs.openjdk.org/browse/JDK-8350022), we should refactor the code that deals with the AOT options. Specifically, as we expect the JVM to be able to load from an "input AOT cache" and write to an "output AOT cache", we should clearly identify the input and output caches in separate APIs: >> >> >> const char* CDSConfig::input_static_archive_path(); >> const char* CDSConfig::input_dynamic_archive_path(); >> const char* CDSConfig::output_archive_path(); >> >> >> This PR also cleans up the code by: >> - renaming a few functions to reflect what they actually do >> - moving more "config" management code into cdsConfig.cpp >> >> There's also a behavioral bug fix: before this PR, `-XX:AOTCache` was handled by the `ergo_init_classic_archive_paths()` function, which allows two files to be specified. E.g., `java -XX:AOTCache=static.jsa:dynamic.jsa`. 
That's because `-XX:AOTCache` was implemented as an alias of `-XX:SharedArchiveFile`, and the latter allows this usage. >> >> However, this behavior is not specified in JEP 483. Allowing two files in -XX:AOTCache will cause unnecessary complexity when we implement [JDK-8353598: Allow AOT cache to be used in training run](https://bugs.openjdk.org/browse/JDK-8353598). Therefore, I added new test cases to disallow the use of two files. This also means that we don't need to modify the already over-complicated `ergo_init_classic_archive_paths()` for the AOT use cases > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @lmesnik comments src/hotspot/share/cds/cdsConfig.cpp line 598: > 596: // - SharedArchiveFile is not specified and the VM doesn't have a compatible default archive > 597: > 598: #define __THEMSG " is unsupported when base CDS archive is not loaded. Run with -Xlog:cds for more info." Do we want to start transitioning existing `-Xlog:cds` options to be `:aot` options? I think making the switch would match our long-term direction. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24401#discussion_r2031644463 From amenkov at openjdk.org Mon Apr 7 18:06:49 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Mon, 7 Apr 2025 18:06:49 GMT Subject: RFR: 8353485: Jcms should allow to specify streaming_output mode Message-ID: The fix adds `--streaming_output` jcmd option to manage attach command streaming output. 
Testing: tier1..tier4,hs-tier5-svc ------------- Commit messages: - jcmd_streaming Changes: https://git.openjdk.org/jdk/pull/24494/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24494&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8353485 Stats: 172 lines in 9 files changed: 144 ins; 7 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/24494.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24494/head:pull/24494 PR: https://git.openjdk.org/jdk/pull/24494 From cjplummer at openjdk.org Tue Apr 8 01:41:39 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 8 Apr 2025 01:41:39 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p In-Reply-To: References: Message-ID: On Mon, 7 Apr 2025 09:05:34 GMT, Kevin Walls wrote: > This is a long-standing oversight: HeapDumpPath does not recognise %p for pid expansion. > The default filename uses a pid (e.g. java_pid1676937.hprof) but HeapDumpPath does not. > It has always done a manual "root plus pid plus extension" on the default filename only, and > should move to using Argument::copy_expand_pid() like we do with other such filenames. > > > We also assumed the default filename is not a directory (which is very very likely, but doesn't have to be true). src/hotspot/share/services/heapDumper.cpp line 2772: > 2770: > 2771: // Set base path (name or directory, default or custom, without seq no), doing %p substitution. > 2772: const char *path_src = (HeapDumpPath && HeapDumpPath[0] != '\0') ? HeapDumpPath : dump_file_name; Should be `HeapDumpPath != nullptr`. src/hotspot/share/services/heapDumper.cpp line 2792: > 2790: // Path is a directory. Append the default name, with %p substitution. Use my_path temporarily. > 2791: if (!Arguments::copy_expand_pid(dump_file_name, strlen(dump_file_name), my_path, JVM_MAXPATHLEN)) { > 2792: warning("Cannot create heap dump file. HeapDumpPath is too long."); What is going to be the end result of this? A truncated file name? 
test/hotspot/jtreg/runtime/ErrorHandling/TestHeapDumpOnOutOfMemoryError.java line 101: > 99: File dump = new File(heapdumpFilename); > 100: Asserts.assertTrue(dump.exists() && dump.isFile(), "Could not find dump file " + dump.getAbsolutePath()); > 101: I think you can remove this empty line, especially since you don't have one in the similar code below. test/hotspot/jtreg/runtime/ErrorHandling/TestHeapDumpOnOutOfMemoryError.java line 113: > 111: TestHeapDumpOnOutOfMemoryError.class.getName(), type); > 112: > 113: Process proc = pb.start(); No need to differ from the above code here. You can just use OutputAnalyzer.pid() to get the pid in the code below. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24482#discussion_r2032238479 PR Review Comment: https://git.openjdk.org/jdk/pull/24482#discussion_r2032243030 PR Review Comment: https://git.openjdk.org/jdk/pull/24482#discussion_r2032233983 PR Review Comment: https://git.openjdk.org/jdk/pull/24482#discussion_r2032234654 From jiangli at openjdk.org Tue Apr 8 02:31:57 2025 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 8 Apr 2025 02:31:57 GMT Subject: RFR: 8353938: hotspot/jtreg/serviceability/dcmd/jvmti/LoadAgentDcmdTest.java fails on static JDK Message-ID: Please review this PR that changes `LoadAgentDcmdTest.getLibInstrumentPath()` to not locate `libinstrument` shared library if running on static JDK, instead just returns "libinstrument." directly. Both test case #1 and #2 in `LoadAgentDcmdTest.run()` run ok on static JDK with the change: Commands: Test case #1: `JVMTI.agent_load libinstrument.so agent.jar` Test case #2: `JVMTI.agent_load libinstrument.so "agent.jar=foo=bar"` Additional notes/considerations: 1. I notice [JDKToolFinder.getJDKTool()](https://github.com/openjdk/jdk/blob/80ff7b9c9406c7845ecb3bc40910e92ccdd23ff2/test/lib/jdk/test/lib/JDKToolFinder.java#L41) is used by the test to find the `jcmd` tool. 
`JDKToolFinder.getJDKTool()` looks for the requested tool in both `test.jdk` and `compile.jdk`. When running the jtreg test on static JDK, it's able to locate the `jcmd` from `compile.jdk` even though the static JDK binary (`test.jdk`) does not provide the tool. For tools used by jtreg tests at runtime, how about changing it to always do that? 2. From https://docs.oracle.com/en/java/javase/21/docs/specs/man/jcmd.html: JVMTI.agent_load [arguments] Loads JVMTI native agent. Impact: Low arguments: library path: Absolute path of the JVMTI agent to load. (STRING, no default value) agent option: (Optional) Option string to pass the agent. (STRING, no default value) The command spec requires the absolute path of the JVMTI agent for the `library path` argument. On static JDK, if the agent library is built-in (statically linked), passing the shared library name works and allows the VM to find the built-in agent. There would be no need to specify the absolute path. Please see https://bugs.openjdk.org/browse/JDK-8353938?focusedId=14767737&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14767737 for more details. ------------- Commit messages: - Replace "so" with Platform.sharedLibraryExt(). - Change LoadAgentDcmdTest.getLibInstrumentPath() to return "libinstrument.so" directly without locating the shared library if running on static JDK. 
Changes: https://git.openjdk.org/jdk/pull/24497/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24497&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8353938 Stats: 6 lines in 1 file changed: 6 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24497.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24497/head:pull/24497 PR: https://git.openjdk.org/jdk/pull/24497 From sspitsyn at openjdk.org Tue Apr 8 03:16:30 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 8 Apr 2025 03:16:30 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v3] In-Reply-To: References: Message-ID: > This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. > > Testing: Ran mach5 tiers 1-6. Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - Merge - some cleanup - 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24269/files - new: https://git.openjdk.org/jdk/pull/24269/files/18944347..4a92986a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24269&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24269&range=01-02 Stats: 68856 lines in 1040 files changed: 24634 ins; 41255 del; 2967 mod Patch: https://git.openjdk.org/jdk/pull/24269.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24269/head:pull/24269 PR: https://git.openjdk.org/jdk/pull/24269 From kevinw at openjdk.org Tue Apr 8 09:02:13 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Tue, 8 Apr 2025 09:02:13 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 01:37:21 GMT, Chris Plummer wrote: >> This is a long-standing oversight: HeapDumpPath does not recognise %p for pid expansion. >> The default filename uses a pid (e.g. java_pid1676937.hprof) but HeapDumpPath does not. >> It has always done a manual "root plus pid plus extension" on the default filename only, and >> should move to using Argument::copy_expand_pid() like we do with other such filenames. >> >> >> We also assumed the default filename is not a directory (which is very very likely, but doesn't have to be true). > > src/hotspot/share/services/heapDumper.cpp line 2792: > >> 2790: // Path is a directory. Append the default name, with %p substitution. Use my_path temporarily. >> 2791: if (!Arguments::copy_expand_pid(dump_file_name, strlen(dump_file_name), my_path, JVM_MAXPATHLEN)) { >> 2792: warning("Cannot create heap dump file. HeapDumpPath is too long."); > > What is going to be the end result of this? A truncated file name? Yes the other warnings return - thanks. 
They all return without incrementing dump_seq, so will hit the same failure each time. Setting a HeapDumpPath near to 4k in length is not an efficient thing to do! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24482#discussion_r2032733570 From rehn at openjdk.org Tue Apr 8 09:44:23 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 8 Apr 2025 09:44:23 GMT Subject: RFR: 8352730: RISC-V: Disable tests in qemu-user [v3] In-Reply-To: <5sujqD7L_cmLUyDwYb4PhgOlEeiFwlkAV7RJoVMFTrM=.223437cd-bbb2-4ef3-a6fe-b13ce402e14b@github.com> References: <5sujqD7L_cmLUyDwYb4PhgOlEeiFwlkAV7RJoVMFTrM=.223437cd-bbb2-4ef3-a6fe-b13ce402e14b@github.com> Message-ID: On Mon, 31 Mar 2025 10:45:54 GMT, Robbin Ehn wrote: >> Hi, for you to consider. >> >> These tests constantly fail in qemu-user. >> They require the host to be the same arch, either explicitly or implicitly (sysroot). >> E.g. "ptrace(PTRACE_ATTACH, ..) failed for 405157: Function not implemented" for SA tests. >> >> From bug: >>> qemu-user/rv64 sets uarch to "qemu" in /proc/cpuinfo (qemu-system does not do that). >>> We add this uarch to CPU feature string. >>> This means we can use jtreg 'require' with cpu string to filter out tests in qemu-user. >> >> Relevant qemu code: >> https://github.com/qemu/qemu/blob/170825d14d88a1ce7fae98d5a928480f2f329b22/linux-user/riscv/target_proc.h#L29 >> >> Relevant hotspot code: >> https://github.com/openjdk/jdk/blob/fa0b18bfde38ee2ffbab33a9eaac547fe8aa3c7c/src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp#L250 >> >> Tested that the require only filters out tests in qemu+riscv64. >> >> Thanks! >> >> /Robbin > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains seven additional commits since the last revision: > > - Merge branch 'master' into qemu-user-issues > - Revert > - Merge branch 'master' into qemu-user-issues > - Merge branch 'master' into qemu-user-issues > - more > - more > - native or very long Any takers? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24229#issuecomment-2785851072 From kevinw at openjdk.org Tue Apr 8 11:43:20 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Tue, 8 Apr 2025 11:43:20 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p [v2] In-Reply-To: References: Message-ID: > This is a long-standing oversight: HeapDumpPath does not recognise %p for pid expansion. > The default filename uses a pid (e.g. java_pid1676937.hprof) but HeapDumpPath does not. > It has always done a manual "root plus pid plus extension" on the default filename only, and > should move to using Argument::copy_expand_pid() like we do with other such filenames. > > > We also assumed the default filename is not a directory (which is very very likely, but doesn't have to be true). 
Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: - length checking update - Chris feedback ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24482/files - new: https://git.openjdk.org/jdk/pull/24482/files/ab82116e..c32e4ca4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24482&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24482&range=00-01 Stats: 23 lines in 2 files changed: 1 ins; 15 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/24482.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24482/head:pull/24482 PR: https://git.openjdk.org/jdk/pull/24482 From kevinw at openjdk.org Tue Apr 8 11:47:20 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Tue, 8 Apr 2025 11:47:20 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p [v2] In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 11:43:20 GMT, Kevin Walls wrote: >> This is a long-standing oversight: HeapDumpPath does not recognise %p for pid expansion. >> The default filename uses a pid (e.g. java_pid1676937.hprof) but HeapDumpPath does not. >> It has always done a manual "root plus pid plus extension" on the default filename only, and >> should move to using Argument::copy_expand_pid() like we do with other such filenames. >> >> >> We also assumed the default filename is not a directory (which is very very likely, but doesn't have to be true). > > Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: > > - length checking update > - Chris feedback Updated. Additionally, the total_length check at line 2760 is wrong now. But it is also redundant, we use copy_expand_pid to do our length checks on expansion. Use max_digit_chars to reduce buffer length in those copy_expand_pid calls, to leave room for possible later sequence numbers (this is very conservative). 
On longer path lengths, worth noting that using MAXPATHLEN (4k) is higher than outputStream::print_cr allows. This means we can get through all of HeapDumper::dump_heap(bool oome) and call HeapDumper::dump() which uses: 2606 out->print_cr("Dumping heap to %s ...", path); ..and will show a VM warning like "outputStream::do_vsnprintf output truncated" Again, very very long HeapDumpPaths are not efficient. 8-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/24482#issuecomment-2786165819 From stuefe at openjdk.org Tue Apr 8 11:54:21 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 8 Apr 2025 11:54:21 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p [v2] In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 11:44:52 GMT, Kevin Walls wrote: > Updated. Additionally, the total_length check at line 2760 is wrong now. But it is also redundant, we use copy_expand_pid to do our length checks on expansion. Use max_digit_chars to reduce buffer length in those copy_expand_pid calls, to leave room for possible later sequence numbers (this is very conservative). > > On longer path lengths, worth noting that using MAXPATHLEN (4k) is higher than outputStream::print_cr allows. This means we can get through all of HeapDumper::dump_heap(bool oome) and call HeapDumper::dump() which uses: 2606 out->print_cr("Dumping heap to %s ...", path); ..and will show a VM warning like "outputStream::do_vsnprintf output truncated" > > Again, very very long HeapDumpPaths are not efficient. 8-) I keep thinking that such a coding can benefit from using stringStream; it makes most of the associated buffer counting etc obsolete, supports dynamic buffers, or optionally can be laid over a fixed-sized buffer and then handles truncation. It does not yet provide a way to report truncation, but that can be added really easily. 
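
The alternative suggested here, a stream laid over a fixed-size buffer that records truncation rather than requiring manual length arithmetic, can be sketched as below. `BoundedSink` and `print_expand_pid` are hypothetical stand-ins written for this illustration; they are not HotSpot's `stringStream` or any existing `Arguments` method, and the truncation query is exactly the piece the comment above says does not exist yet.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sink over a caller-supplied buffer. It never writes past the
// end, always keeps the text NUL-terminated, and remembers if input was dropped.
class BoundedSink {
 public:
  BoundedSink(char* buffer, std::size_t capacity)
      : _buffer(buffer), _capacity(capacity) {
    assert(buffer != nullptr && capacity > 0);
    _buffer[0] = '\0';
  }

  void put(char c) {
    if (_length + 1 < _capacity) {
      _buffer[_length++] = c;
      _buffer[_length] = '\0';
    } else {
      _truncated = true;  // character dropped; text stays terminated
    }
  }

  void print(const char* text) {
    for (; *text != '\0'; ++text) {
      put(*text);
    }
  }

  const char* base() const { return _buffer; }
  std::size_t length() const { return _length; }
  bool was_truncated() const { return _truncated; }

 private:
  char* _buffer;
  std::size_t _capacity;
  std::size_t _length = 0;
  bool _truncated = false;
};

// Hypothetical helper: writes `input` to the sink, replacing each "%p" with pid.
inline void print_expand_pid(BoundedSink& sink, const char* input, long pid) {
  const std::string pid_text = std::to_string(pid);
  for (const char* p = input; *p != '\0'; ++p) {
    if (p[0] == '%' && p[1] == 'p') {
      sink.print(pid_text.c_str());
      ++p;  // skip the 'p'
    } else {
      sink.put(*p);
    }
  }
}
```

A caller would expand the configured path into the sink, check `was_truncated()`, append the default file name if the path turned out to be a directory, and check again, with no explicit reasoning about how many `%p` occurrences the input contains.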
------------- PR Comment: https://git.openjdk.org/jdk/pull/24482#issuecomment-2786180952 From kevinw at openjdk.org Tue Apr 8 12:12:11 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Tue, 8 Apr 2025 12:12:11 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p [v2] In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 11:51:18 GMT, Thomas Stuefe wrote: > I keep thinking that such a coding can benefit from using stringStream; it makes most of the associated buffer counting etc obsolete, supports dynamic buffers, or optionally can be laid over a fixed-sized buffer and then handles truncation. It does not yet provide a way to report truncation, but that can be added really easily. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24482#issuecomment-2786226321 From stuefe at openjdk.org Tue Apr 8 13:16:20 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 8 Apr 2025 13:16:20 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p [v2] In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 11:43:20 GMT, Kevin Walls wrote: >> This is a long-standing oversight: HeapDumpPath does not recognise %p for pid expansion. >> The default filename uses a pid (e.g. java_pid1676937.hprof) but HeapDumpPath does not. >> It has always done a manual "root plus pid plus extension" on the default filename only, and >> should move to using Argument::copy_expand_pid() like we do with other such filenames. >> >> >> We also assumed the default filename is not a directory (which is very very likely, but doesn't have to be true). > > Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: > > - length checking update > - Chris feedback src/hotspot/share/services/heapDumper.cpp line 2779: > 2777: } > 2778: // Then add the default name, with %p substitution. Use my_path temporarily. 
> 2779: if (!Arguments::copy_expand_pid(dump_file_name, strlen(dump_file_name), my_path, JVM_MAXPATHLEN - max_digit_chars)) { IIUC there is a pre-existing bug, and if I am right, one you should fix: this calculation assumes that there is only a single %p. There may be multiple. Many. E.g. as a malicious attempt to cause a buffer overflow. This is what I meant with stringStream. stringStream offers protection against stuff like that without the manual buffer counting headaches. I would give Arguments a method like this: print_expand_pid(outputStream* sink, const char* input); and in there print to sink, with print or putc. This would never truncate. Then use it like this: outputStream st(caller buffer, caller buffer size) if (have HeapDumpPath) { Arguments::print_expand_pid(st, HeapDumpPath); if (st->was_truncated()) return with warning // now st->base() is the expanded heap path. Test if it's a directory etc } // append file name Arguments::print_expand_pid(st, dump_file_name); if (st->was_truncated()) return with warning Just a rough sketch. And fine for followup PRs, though I think it may make your life easier if you do it now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24482#discussion_r2033167264 From kevinw at openjdk.org Tue Apr 8 13:50:19 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Tue, 8 Apr 2025 13:50:19 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p [v2] In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 13:13:25 GMT, Thomas Stuefe wrote: >> Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: >> >> - length checking update >> - Chris feedback > > src/hotspot/share/services/heapDumper.cpp line 2779: > >> 2777: } >> 2778: // Then add the default name, with %p substitution. Use my_path temporarily. 
>> 2779: if (!Arguments::copy_expand_pid(dump_file_name, strlen(dump_file_name), my_path, JVM_MAXPATHLEN - max_digit_chars)) { > > IIUC there is a pre-existing bug, and if I am right, one you should fix: this calculation assumes that there is only a single %p. There may be multiple. Many. E.g. as a malicious attempt to cause a buffer overflow. > > This is what I meant with stringStream. stringStream offers protection against stuff like that without the manual buffer counting headaches. I would give Arguments a method like this: > > print_expand_pid(outputStream* sink, const char* input); > > > and in there print to sink, with print or putc. This would never truncate. Then use it like this: > > > outputStream st(caller buffer, caller buffer size) > if (have HeapDumpPath) { > Arguments::print_expand_pid(st, HeapDumpPath); > if (st->was_truncated()) return with warning > // now st->base() is the expanded heap path. Test if it's a directory etc > } > // append file name > Arguments::print_expand_pid(st, dump_file_name); > if (st->was_truncated()) return with warning > > > Just a rough sketch. And fine for followup PRs, though I think it may make your life easier if you do it now. Thankfully copy_expand_pid does handle multiple %p replacements. It seems good to use that to check the buffer length, partly for that reason, as just knowing a max number of digits wasn't so flexible if many %p were present. Thanks for the other ideas! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24482#discussion_r2033234374 From stefank at openjdk.org Tue Apr 8 14:22:27 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 8 Apr 2025 14:22:27 GMT Subject: RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock [v3] In-Reply-To: References: Message-ID: On Thu, 3 Apr 2025 15:27:15 GMT, Robert Toyonaga wrote: >> ### Update: >> After some discussion it was decided it's not necessary to expand the lock scope for reserve/commit.
Instead, we are opting to add comments explaining the reasons for locking and the conditions to avoid, which could lead to races. Some of the new tests can be kept because they are general enough to be useful outside of this context. >> >> ### Summary: >> This PR makes memory operations atomic with NMT accounting. >> >> ### The problem: >> In memory related functions like `os::commit_memory` and `os::reserve_memory` the OS memory operations are currently done before acquiring the NMT mutex. And the virtual memory accounting is done later in `MemTracker`, after the lock has been acquired. Doing the memory operations outside of the lock scope can lead to races. >> >> 1.1 Thread_1 releases range_A. >> 1.2 Thread_1 tells NMT "range_A has been released". >> >> 2.1 Thread_2 reserves (the now free) range_A. >> 2.2 Thread_2 tells NMT "range_A is reserved". >> >> Since the sequence (1.1) (1.2) is not atomic, if Thread_2 begins operating after (1.1), we can have (1.1) (2.1) (2.2) (1.2). The OS sees two valid subsequent calls (release range_A, followed by map range_A). But NMT sees "reserve range_A", "release range_A" and is now out of sync with the OS. >> >> ### Solution: >> Where memory operations such as reserve, commit, or release virtual memory happen, I've expanded the scope of `NmtVirtualMemoryLocker` to protect both the NMT accounting and the memory operation itself. >> >> ### Other notes: >> I also simplified this pattern found in many places: >> >> if (MemTracker::enabled()) { >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_some_operation(addr, bytes); >> if (result != nullptr) { >> MemTracker::record_some_operation(addr, bytes); >> } >> } else { >> result = pd_unmap_memory(addr, bytes); >> } >> >> To: >> >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_unmap_memory(addr, bytes); >> MemTracker::record_some_operation(addr, bytes); >> >> This is possible because `NmtVirtualMemoryLocker` now checks `MemTracker::enabled()`.
`MemTracker::record_some_operation` already checks `MemTracker::enabled()` and checks against nullptr. This refactoring previously wasn't possible because `ThreadCritical` was used before https://github.com/openjdk/jdk/pull/22745 introduced `NmtVirtualMemoryLocker`. >> >> I considered moving the locking and NMT accounting down into platform specific code: Ex. lock around { munmap() + MemTracker:... > > Robert Toyonaga has updated the pull request incrementally with one additional commit since the last revision: > > exclude file mapping tests on AIX. I think this looks good to me, but please seek feedback from others as well. I've added a couple of suggestions. None of them are required, but I think they would be nice to do. src/hotspot/share/runtime/os.cpp line 2206: > 2204: // when it is actually committed. The opposite scenario is not guarded against. pd_commit_memory and > 2205: // record_virtual_memory_commit do not happen atomically. We assume that there is some external synchronization > 2206: // that prevents a region from being uncommitted before it is finished being committed. It's not a requirement, but you get kudos from me if you keep comment lines below 80 columns. I typically don't like code to be limited to 80 columns, but comments tend to be nicer if they are. test/hotspot/gtest/runtime/test_os.cpp line 1123: > 1121: > 1122: char* base = os::reserve_memory(size, false, mtTest); > 1123: ASSERT_NE(base, (char*) nullptr); Suggestion: ASSERT_NOT_NULL(base); And the same in other places. test/hotspot/gtest/runtime/test_os.cpp line 1133: > 1131: } > 1132: > 1133: #if !defined(_AIX) Suggestion: #if !defined(_AIX) I suggest a blank line here because this ifdef spans multiple tests and not only the nearest test. Having a blank line makes it clearer that this is a large ifdef that is not only related to the test case that it is bunched up against.
test/hotspot/gtest/runtime/test_os.cpp line 1145: > 1143: EXPECT_TRUE(result != nullptr); > 1144: > 1145: EXPECT_TRUE(strcmp(letters, result)==0); Suggestion: EXPECT_TRUE(strcmp(letters, result) == 0); but probably even better: Suggestion: EXPECT_EQ(strcmp(letters, result), 0); test/hotspot/gtest/runtime/test_os.cpp line 1184: > 1182: ::close(fd); > 1183: } > 1184: #endif Suggestion: #endif // !defined(_AIX) I suggest a blank line and a matching comment. I know some HotSpots devs tend to appreciate those comments. ------------- Marked as reviewed by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24084#pullrequestreview-2750137709 PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2033287481 PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2033292443 PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2033303666 PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2033294266 PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2033307030 From stefank at openjdk.org Tue Apr 8 14:22:27 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 8 Apr 2025 14:22:27 GMT Subject: RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock [v3] In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 14:11:21 GMT, Stefan Karlsson wrote: >> Robert Toyonaga has updated the pull request incrementally with one additional commit since the last revision: >> >> exclude file mapping tests on AIX. > > test/hotspot/gtest/runtime/test_os.cpp line 1145: > >> 1143: EXPECT_TRUE(result != nullptr); >> 1144: >> 1145: EXPECT_TRUE(strcmp(letters, result)==0); > > Suggestion: > > EXPECT_TRUE(strcmp(letters, result) == 0); > > but probably even better: > Suggestion: > > EXPECT_EQ(strcmp(letters, result), 0); There are more places like this. 
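For readers following this thread, the interleaving (1.1) (2.1) (2.2) (1.2) from the PR summary quoted above can be replayed deterministically. This is only a toy model under stated assumptions: `Model`, its two sets, and both replay functions are hypothetical names invented for illustration, and nothing here is HotSpot or NMT code.

```cpp
#include <set>

// Hypothetical model: one set stands in for what the OS has mapped,
// the other for what the tracker (NMT in the thread) believes is reserved.
struct Model {
  std::set<int> os_view;
  std::set<int> nmt_view;
  void os_reserve(int r)  { os_view.insert(r); }
  void os_release(int r)  { os_view.erase(r); }
  void nmt_reserve(int r) { nmt_view.insert(r); }
  void nmt_release(int r) { nmt_view.erase(r); }
  bool in_sync() const { return os_view == nmt_view; }
};

// Replays (1.1) (2.1) (2.2) (1.2): Thread_2's reserve slips in between
// Thread_1's release and Thread_1's accounting update.
bool racy_interleaving_stays_in_sync() {
  const int range_a = 1;
  Model m;
  m.os_reserve(range_a);   // initial state: range_A is reserved...
  m.nmt_reserve(range_a);  // ...and the tracker knows about it
  m.os_release(range_a);   // 1.1
  m.os_reserve(range_a);   // 2.1
  m.nmt_reserve(range_a);  // 2.2
  m.nmt_release(range_a);  // 1.2 arrives last and wipes the live reservation
  return m.in_sync();
}

// Replays (1.1) (1.2) (2.1) (2.2): each operation and its accounting
// happen as one unit, as they would under a shared lock.
bool atomic_ordering_stays_in_sync() {
  const int range_a = 1;
  Model m;
  m.os_reserve(range_a);
  m.nmt_reserve(range_a);
  m.os_release(range_a);   // 1.1
  m.nmt_release(range_a);  // 1.2
  m.os_reserve(range_a);   // 2.1
  m.nmt_reserve(range_a);  // 2.2
  return m.in_sync();
}
```

In this model the racy ordering leaves the tracker believing range_A is free while the "OS" has it mapped. Note that the update quoted at the top of the PR description says the wider lock scope was not adopted for reserve/commit, so the second function illustrates the originally proposed ordering, not the final change.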
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2033296961 From alanb at openjdk.org Tue Apr 8 14:29:50 2025 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 8 Apr 2025 14:29:50 GMT Subject: RFR: 8351927: Change VirtualThread implementation to use use FJP delayed task handling Message-ID: <7vE97S4zy2S1vuRwEapE4k9ZScS6yTJBQWyKymjdc0g=.5551d6d6-68ad-446e-84c7-70fd57fa57a6@github.com> Follow up to JDK-8319447 to change the VirtualThread implementation to use FJP's delayed task handling. The SPTE based implementation is not removed. It will continue to be used by tests. If custom schedulers are exposed in the future then they will use this implementation. For timed-Object.wait, waitTimeoutExpired is changed to use lazySubmit to avoid signalling and increase the chance that the unparked virtual thread will continue on the current carrier. For timed-park, the timeout task is changed to reduced form of unpark that also uses lazySubmit, for the same reason. `jcmd Thread.vthread_scheduler` is changed to no longer print the delay schedulers. Instead, the delayed task count will appear in the default scheduler output. ------------- Commit messages: - Merge branch 'master' into JDK-8351927 - Merge branch 'master' into JDK-8351927 - Merge branch 'master' into JDK-8351927 - Merge - Merge branch 'pull/23702' into JDK-8351927 - Typo - Address review comments - Merge branch 'openjdk:master' into JDK-8319447 - Address review comments - Update - ... 
and 51 more: https://git.openjdk.org/jdk/compare/867a0301...30eba776 Changes: https://git.openjdk.org/jdk/pull/24030/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24030&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8351927 Stats: 405 lines in 8 files changed: 332 ins; 41 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/24030.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24030/head:pull/24030 PR: https://git.openjdk.org/jdk/pull/24030 From mbaesken at openjdk.org Tue Apr 8 14:46:30 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 8 Apr 2025 14:46:30 GMT Subject: RFR: 8349638: Build libjdwp with SIZE optimization In-Reply-To: References: Message-ID: <-ut2L3mbuwzJxlZJGiGyhMl7_HCpd1QX-kuFOX8m1Lc=.38fc5800-855a-49cd-9f3e-3ccadd19885b@github.com> On Tue, 1 Apr 2025 11:59:07 GMT, Magnus Ihse Bursie wrote: > It would be interesting to also see how compilation times varies with optimization level. At least some kind of hint if HIGHEST is like 2x slower than LOW, or if SIZE is slower than LOW at all, etc. The relative speed difference is interesting, but so is it in absolute terms. If a library takes 0.5 seconds on LOW but 1.1 seconds on HIGH on a particular system, it is unlikely to matter much to overall build time anywhere. But if it goes from 15s to 30s on a fast machine, it might be a problem if such performance regressions stack up, especially on slower machines (which includes the ones running GHA). This is what I got from my Linux x86_64 system using gcc 13.2.0 devkit (opt build). Note that the build operates on a relatively slow filer, this will slow the build time somewhat but that is true for all opt-levels. 
rm -rf ./support/modules_libs/jdk.jdwp.agent/libjdwp.so ./jdk/lib/libjdwp.so ./support/native/jdk.jdwp.agent/libjdwp time make jdk.jdwp.agent-libs-only JOBS=1 gave me these times **default (LOW)** real 0m15.661s user 0m8.763s sys 0m2.012s **HIGHEST** real 0m15.201s user 0m9.005s sys 0m2.003s **SIZE** real 0m14.263s user 0m7.905s sys 0m1.891s So it looks like SIZE is a little faster than the others, and LOW and HIGHEST are rather similar. LOW is `-O2` on Linux x86_64 and HIGHEST is `-O3`, so those are maybe rather similar (LOW is a bit misleading because `-O2` is not really that 'low'; the gcc documentation says about it: 'Optimize even more. GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff.'). ------------- PR Comment: https://git.openjdk.org/jdk/pull/23563#issuecomment-2786694168 From stuefe at openjdk.org Tue Apr 8 15:28:25 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 8 Apr 2025 15:28:25 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p [v2] In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 13:47:10 GMT, Kevin Walls wrote: >> src/hotspot/share/services/heapDumper.cpp line 2779: >> >>> 2777: } >>> 2778: // Then add the default name, with %p substitution. Use my_path temporarily. >>> 2779: if (!Arguments::copy_expand_pid(dump_file_name, strlen(dump_file_name), my_path, JVM_MAXPATHLEN - max_digit_chars)) { >> >> IIUC there is a pre-existing bug, and if I am right, one you should fix: this calculation assumes that there is only a single %p. There may be multiple. Many. E.g. as a malicious attempt to cause a buffer overflow. >> >> This is what I meant with stringStream. stringStream offers protection against stuff like that without the manual buffer counting headaches. I would give Arguments a method like this: >> >> print_expand_pid(outputStream* sink, const char* input); >> >> >> and in there print to sink, with print or putc. This would never truncate.
Then use it like this: >> >> >> outputStream st(caller buffer, caller buffer size) >> if (have HeapDumpPath) { >> Arguments::print_expand_pid(st, HeapDumpPath); >> if (st->was_truncated()) return with warning >> // now st->base() is the expanded heap path. Test if it's a directory etc >> } >> // append file name >> Arguments::print_expand_pid(st, dump_file_name); >> if (st->was_truncated()) return with warning >> >> >> Just a rough sketch. And fine for followup PRs, though I think it may make your life easier if you do it now. > > Thankfully copy_expand_pid does handle multiple %p replacements. It seems good to use that to check the buffer length, partly for that reason, as just knowing a max number of digits wasn't so flexible if many %p were present. > > Thanks for the other ideas! Ah okay, it checks for overflow. Okay, please disregard half of what I have written :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24482#discussion_r2033452540 From kvn at openjdk.org Tue Apr 8 16:00:21 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 8 Apr 2025 16:00:21 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output [v3] In-Reply-To: <1ssLzAJVy1yrmNRCmTCz---JPV7_cLWOLwu0A3-yhZw=.02dbfc81-0a2e-4a4a-a7c1-bae8a93af621@github.com> References: <1ssLzAJVy1yrmNRCmTCz---JPV7_cLWOLwu0A3-yhZw=.02dbfc81-0a2e-4a4a-a7c1-bae8a93af621@github.com> Message-ID: On Mon, 7 Apr 2025 16:54:37 GMT, Dan Heidinga wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> @lmesnik comments > > src/hotspot/share/cds/cdsConfig.cpp line 598: > >> 596: // - SharedArchiveFile is not specified and the VM doesn't have a compatible default archive >> 597: >> 598: #define __THEMSG " is unsupported when base CDS archive is not loaded. Run with -Xlog:cds for more info." > > Do we want to start transitioning existing `-Xlog:cds` options to be `:aot` options?
I think making the switch would match our long term direction Yes, but I think we should do it only if `AOTClassLinking` is enabled. For legacy CDS we should continue to use `-Xlog:cds`. I am using `-Xlog:aot+codecache` in AOT code caching. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24401#discussion_r2033524434 From iklam at openjdk.org Tue Apr 8 16:29:21 2025 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 8 Apr 2025 16:29:21 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output [v3] In-Reply-To: References: <1ssLzAJVy1yrmNRCmTCz---JPV7_cLWOLwu0A3-yhZw=.02dbfc81-0a2e-4a4a-a7c1-bae8a93af621@github.com> Message-ID: On Tue, 8 Apr 2025 15:57:46 GMT, Vladimir Kozlov wrote: >> src/hotspot/share/cds/cdsConfig.cpp line 598: >> >>> 596: // - SharedArchiveFile is not specified and the VM doesn't have a compatible default archive >>> 597: >>> 598: #define __THEMSG " is unsupported when base CDS archive is not loaded. Run with -Xlog:cds for more info." >> >> Do we want to start transitioning existing `-Xlog:cds` options to be `:aot` options? I think making the switch would match our long term direction > > Yes, but I think we should do it only if `AOTClassLinking` is enabled. For legacy CDS we should continue to use `-Xlog:cds`. > I am using `-Xlog:aot+codecache` in AOT code caching. I created [JDK-8354055 - Change "cds" logging tag to "aot"](https://bugs.openjdk.org/browse/JDK-8354055). There are documentation/compatibility issues so we need to do some planning. This particular block of code is moved from dynamicArchive.cpp to cdsConfig.cpp and I kept the logging tag the same.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24401#discussion_r2033582112 From heidinga at openjdk.org Tue Apr 8 16:56:17 2025 From: heidinga at openjdk.org (Dan Heidinga) Date: Tue, 8 Apr 2025 16:56:17 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output [v3] In-Reply-To: References: <1ssLzAJVy1yrmNRCmTCz---JPV7_cLWOLwu0A3-yhZw=.02dbfc81-0a2e-4a4a-a7c1-bae8a93af621@github.com> Message-ID: On Tue, 8 Apr 2025 16:26:28 GMT, Ioi Lam wrote: >> Yes, but I think we should do it only if `AOTClassLinking` is enabled. For legacy CDS we should continue use `-Xlog:cds`. >> I am using `-Xlog:aot+codecache` in AOT code caching. > > I created [JDK-8354055 - Change "cds" logging tag to "aot"](https://bugs.openjdk.org/browse/JDK-8354055). There are documentation/compatibility issues so we need to do some planning. > > This particular block of code is moved from dynamicArchive.cpp to cdsConfig.cpp and I kept the logging tag the same. Thanks @iklam. I agree with the approach of doing this in a separate issue. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24401#discussion_r2033640450 From kvn at openjdk.org Tue Apr 8 17:28:26 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 8 Apr 2025 17:28:26 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output [v3] In-Reply-To: References: Message-ID: <2ZMtE0qPWzJ1jSC31oFURQ2b7w3l7d8CJsXgmWauyiY=.e4a74cdf-d76a-4862-8966-eb26a0b57dac@github.com> On Thu, 3 Apr 2025 20:31:50 GMT, Ioi Lam wrote: >> Since [JEP 483: Ahead-of-Time Class Loading & Linking](https://openjdk.org/jeps/483), VM options such as `-XX:AOTCache `are implemented as aliases of "classical" CDS options such as `-XX:SharedArchiveFile`. >> >> In anticipation of the [JEP: Ahead-of-time Command Line Ergonomics](https://bugs.openjdk.org/browse/JDK-8350022), we should refactor the code that deals with the AOT options. 
Specifically, as we expect the JVM to be able to load from an "input AOT cache" and write to an "output AOT cache", we should clearly identify the input and output caches in separate APIs: >> >> >> const char* CDSConfig::input_static_archive_path(); >> const char* CDSConfig::input_dynamic_archive_path(); >> const char* CDSConfig::output_archive_path(); >> >> >> This PR also cleans up the code by: >> - renaming a few function to reflect what they actually do >> - moving more "config" management code into cdsConfig.cpp >> >> There's also a behavioral bug fix: before this PR, `-XX:AOTCache` was handled by the `ergo_init_classic_archive_paths()` function, which allows two files to be specified. E.g., `java -XX:AOTCache=static.jsa:dynamic.jsa`. That's because `-XX:AOTCache` was implemented as an alias of `-XX:SharedArchiveFile`, and the latter allows this usage. >> >> However, this behavior is not specified in JEP 483. Allowing two files in -XX:AOTCache will cause unnecessary complexity when we implement [JDK-8353598: Allow AOT cache to be used in training run](https://bugs.openjdk.org/browse/JDK-8353598). Therefore, I added new test cases to disallow the use of two files. This also means that we don't need to modify the already over-complicated `ergo_init_classic_archive_paths()` for the AOT use cases > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @lmesnik comments Looks good to me. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24401#pullrequestreview-2750854771 From jiangli at openjdk.org Tue Apr 8 18:55:21 2025 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 8 Apr 2025 18:55:21 GMT Subject: RFR: 8353938: hotspot/jtreg/serviceability/dcmd/jvmti/LoadAgentDcmdTest.java fails on static JDK In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 02:25:58 GMT, Jiangli Zhou wrote: > 2. 
From https://docs.oracle.com/en/java/javase/21/docs/specs/man/jcmd.html: > > ``` > JVMTI.agent_load [arguments] > Loads JVMTI native agent. > Impact: Low > arguments: > library path: Absolute path of the JVMTI agent to load. (STRING, no default value) > agent option: (Optional) Option string to pass the agent. (STRING, no default value) > ``` > > The command spec requires the absolute path of the JVMTI agent for the `library path` argument. On static JDK, if the agent library is built-in (statically linked), passing the shared library name works and allows the VM to find the built-in agent. There would be no need to specify the absolute path. Please see https://bugs.openjdk.org/browse/JDK-8353938?focusedId=14767737&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14767737 for more details. I discussed the `JVMTI.agent_load` issue on static JDK support with @AlanBateman this morning as part of the hermetic Java meeting. @AlanBateman suggested considering adding an alternative diagnostic command or argument for the static (built-in) agent load support. We also discussed the use case of statically-linked dynamic (attached) native agents loaded by the `jcmd` tool on static JDK, and considerations for resolving (or clarifying) the `JVMTI.agent_load` static support at a later point when usage requirements are clearer. Outside the jtreg tests, I haven't run into any such usages yet. Based on the discussion, I'll update this PR to skip the `LoadAgentDcmdTest` on static JDK for now. I'll file a separate bug for the `JVMTI.agent_load` issue on static JDK support so we can revisit that when things are clearer.
------------- PR Comment: https://git.openjdk.org/jdk/pull/24497#issuecomment-2787388528 From jiangli at openjdk.org Tue Apr 8 19:09:02 2025 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 8 Apr 2025 19:09:02 GMT Subject: RFR: 8353938: hotspot/jtreg/serviceability/dcmd/jvmti/LoadAgentDcmdTest.java fails on static JDK [v2] In-Reply-To: References: Message-ID: > Please review this PR that changes `LoadAgentDcmdTest.getLibInstrumentPath()` to not locate `libinstrument` shared library if running on static JDK, instead just returns "libinstrument." directly. Both test case #1 and #2 in `LoadAgentDcmdTest.run()` run ok on static JDK with the change: > > Commands: > Test case #1: `JVMTI.agent_load libinstrument.so agent.jar` > Test case #2: `JVMTI.agent_load libinstrument.so "agent.jar=foo=bar"` > > Additional notes/considerations: > > 1. I notice [JDKToolFinder.getJDKTool()](https://github.com/openjdk/jdk/blob/80ff7b9c9406c7845ecb3bc40910e92ccdd23ff2/test/lib/jdk/test/lib/JDKToolFinder.java#L41) is used by the test to find the `jcmd` tool. `JDKToolFinder.getJDKTool()` looks for the requested tool in both `test.jdk` and `compile.jdk`. When running the jtreg test on static JDK, it's able to locate the `jcmd` from `compile.jdk` even though the static JDK binary (`test.jdk`) does not provide the tool. For tools used by jtreg tests at runtime, how about change to always do that? > > 2. From https://docs.oracle.com/en/java/javase/21/docs/specs/man/jcmd.html: > > JVMTI.agent_load [arguments] > Loads JVMTI native agent. > Impact: Low > arguments: > library path: Absolute path of the JVMTI agent to load. (STRING, no default value) > agent option: (Optional) Option string to pass the agent. (STRING, no default value) > > The command spec requires the absolute path of the JVMTI agent for the `library path` argument. On static JDK, if the agent library is built-in (statically linked), passing the shared library name works and allows the VM to find the built-in agent. 
There would be no need to specify the absolute path. Please see https://bugs.openjdk.org/browse/JDK-8353938?focusedId=14767737&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14767737 for more details. Jiangli Zhou has updated the pull request incrementally with one additional commit since the last revision: Skip LoadAgentDcmdTest on static JDK. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24497/files - new: https://git.openjdk.org/jdk/pull/24497/files/1e61a0dd..3b4ef50c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24497&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24497&range=00-01 Stats: 7 lines in 1 file changed: 1 ins; 6 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24497.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24497/head:pull/24497 PR: https://git.openjdk.org/jdk/pull/24497 From jiangli at openjdk.org Tue Apr 8 19:35:23 2025 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 8 Apr 2025 19:35:23 GMT Subject: RFR: 8353938: hotspot/jtreg/serviceability/dcmd/jvmti/LoadAgentDcmdTest.java fails on static JDK [v2] In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 19:09:02 GMT, Jiangli Zhou wrote: >> Please review this PR that changes `LoadAgentDcmdTest.getLibInstrumentPath()` to not locate `libinstrument` shared library if running on static JDK, instead just returns "libinstrument." directly. Both test case #1 and #2 in `LoadAgentDcmdTest.run()` run ok on static JDK with the change: >> >> Commands: >> Test case #1: `JVMTI.agent_load libinstrument.so agent.jar` >> Test case #2: `JVMTI.agent_load libinstrument.so "agent.jar=foo=bar"` >> >> Additional notes/considerations: >> >> 1. I notice [JDKToolFinder.getJDKTool()](https://github.com/openjdk/jdk/blob/80ff7b9c9406c7845ecb3bc40910e92ccdd23ff2/test/lib/jdk/test/lib/JDKToolFinder.java#L41) is used by the test to find the `jcmd` tool.
`JDKToolFinder.getJDKTool()` looks for the requested tool in both `test.jdk` and `compile.jdk`. When running the jtreg test on static JDK, it's able to locate the `jcmd` from `compile.jdk` even though the static JDK binary (`test.jdk`) does not provide the tool. For tools used by jtreg tests at runtime, how about changing to always do that? >> >> 2. From https://docs.oracle.com/en/java/javase/21/docs/specs/man/jcmd.html: >> >> JVMTI.agent_load [arguments] >> Loads JVMTI native agent. >> Impact: Low >> arguments: >> library path: Absolute path of the JVMTI agent to load. (STRING, no default value) >> agent option: (Optional) Option string to pass the agent. (STRING, no default value) >> >> The command spec requires the absolute path of the JVMTI agent for the `library path` argument. On static JDK, if the agent library is built-in (statically linked), passing the shared library name works and allows the VM to find the built-in agent. There would be no need to specify the absolute path. Please see https://bugs.openjdk.org/browse/JDK-8353938?focusedId=14767737&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14767737 for more details. > > Jiangli Zhou has updated the pull request incrementally with one additional commit since the last revision: > > Skip LoadAgentDcmdTest on static JDK. I filed https://bugs.openjdk.org/browse/JDK-8354069.
------------- PR Comment: https://git.openjdk.org/jdk/pull/24497#issuecomment-2787473905 From vklang at openjdk.org Tue Apr 8 20:09:25 2025 From: vklang at openjdk.org (Viktor Klang) Date: Tue, 8 Apr 2025 20:09:25 GMT Subject: RFR: 8351927: Change VirtualThread implementation to use use FJP delayed task handling In-Reply-To: <7vE97S4zy2S1vuRwEapE4k9ZScS6yTJBQWyKymjdc0g=.5551d6d6-68ad-446e-84c7-70fd57fa57a6@github.com> References: <7vE97S4zy2S1vuRwEapE4k9ZScS6yTJBQWyKymjdc0g=.5551d6d6-68ad-446e-84c7-70fd57fa57a6@github.com> Message-ID: <64xe_dZIfIScyB_InbjJJn19DjAY_7qxmsB7FKKKIH4=.0c4ad76a-cbf8-475d-b3e8-6c2336ec1f29@github.com> On Thu, 13 Mar 2025 10:48:14 GMT, Alan Bateman wrote: > Follow up to JDK-8319447 to change the VirtualThread implementation to use FJP's delayed task handling. > > The SPTE based implementation is not removed. It will continue to be used by tests. If custom schedulers are exposed in the future then they will use this implementation. > > For timed-Object.wait, waitTimeoutExpired is changed to use lazySubmit to avoid signalling and increase the chance that the unparked virtual thread will continue on the current carrier. For timed-park, the timeout task is changed to reduced form of unpark that also uses lazySubmit, for the same reason. > > `jcmd Thread.vthread_scheduler` is changed to no longer print the delay schedulers. Instead, the delayed task count will appear in the default scheduler output. src/java.base/share/classes/java/lang/VirtualThread.java line 889: > 887: private void parkTimeoutExpired() { > 888: assert !VirtualThread.currentThread().isVirtual(); > 889: if (!getAndSetParkPermit(true) @AlanBateman Would it make sense to test whether the park-permit is false before the LOCK XCHG? 
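The "test before the exchange" idea raised in the comment above can be sketched as follows. This is written in C++ with `std::atomic` rather than the Java code under review, and `ParkPermit` and `get_and_set` are hypothetical names chosen for illustration; it makes no claim about what `getAndSetParkPermit` actually does.

```cpp
#include <atomic>

// Hypothetical stand-in for a park permit; not the VirtualThread code.
class ParkPermit {
  std::atomic<bool> permit_{false};

 public:
  // Sets the permit to new_value and returns the previous value.
  // A plain load runs first, so the atomic exchange (a locked
  // read-modify-write on most hardware) is skipped when the permit
  // already holds new_value.
  bool get_and_set(bool new_value) {
    if (permit_.load() == new_value) {
      return new_value;  // nothing to change; previous value equals new_value
    }
    return permit_.exchange(new_value);
  }
};
```

The trade-off, as far as this sketch goes: the fast path avoids a write to a possibly contended cache line, but it also no longer performs the read-modify-write in that case, so any code relying on the exchange as an ordering point would need to be checked separately.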
src/java.base/share/classes/java/lang/VirtualThread.java line 1455: > 1453: return pool.schedule(command, delay, unit); > 1454: } else { > 1455: return DelayedTaskSchedulers.schedule(command, delay, unit); @AlanBateman Would it make sense to test if the Scheduler implements ScheduledExecutorService? src/java.base/share/classes/java/lang/VirtualThread.java line 1462: > 1460: * Supports scheduling a runnable task to run after a delay. It uses a number > 1461: * of ScheduledThreadPoolExecutor instances to reduce contention on the delayed > 1462: * work queue used. This class is used when using a custom scheduler. @AlanBateman It might make sense to instead require a custom Scheduler to implement ScheduledExecutorService? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24030#discussion_r2033942539 PR Review Comment: https://git.openjdk.org/jdk/pull/24030#discussion_r2033943663 PR Review Comment: https://git.openjdk.org/jdk/pull/24030#discussion_r2033944680 From lmesnik at openjdk.org Wed Apr 9 00:06:41 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 9 Apr 2025 00:06:41 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p [v2] In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 11:43:20 GMT, Kevin Walls wrote: >> This is a long-standing oversight: HeapDumpPath does not recognise %p for pid expansion. >> The default filename uses a pid (e.g. java_pid1676937.hprof) but HeapDumpPath does not. >> It has always done a manual "root plus pid plus extension" on the default filename only, and >> should move to using Arguments::copy_expand_pid() like we do with other such filenames. >> >> >> We also assumed the default filename is not a directory (which is very very likely, but doesn't have to be true). > > Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: > > - length checking update > - Chris feedback Marked as reviewed by lmesnik (Reviewer).
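As an illustration of the %p handling discussed in this thread (several `%p` occurrences, with a length check instead of a fixed digit allowance), here is a minimal standalone sketch. `expand_pid` is a hypothetical helper written for this note; it is not `Arguments::copy_expand_pid`, whose real signature and behaviour (for example any `%%` escape) are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper, not the HotSpot function. Copies src into buf,
// replacing every "%p" with pid in decimal. Returns false, leaving buf as
// an empty string, if the result plus its terminating NUL does not fit in
// buflen bytes. No other '%' sequences are interpreted.
bool expand_pid(const char* src, char* buf, size_t buflen, long pid) {
  assert(src != nullptr && buf != nullptr);
  if (buflen == 0) {
    return false;
  }
  char digits[32];
  int n = snprintf(digits, sizeof(digits), "%ld", pid);
  if (n < 0 || static_cast<size_t>(n) >= sizeof(digits)) {
    buf[0] = '\0';
    return false;
  }
  const size_t pid_len = static_cast<size_t>(n);
  size_t used = 0;  // invariant: used < buflen
  for (const char* p = src; *p != '\0';) {
    const char* chunk;
    size_t chunk_len;
    if (p[0] == '%' && p[1] == 'p') {
      chunk = digits;   // substitute the pid
      chunk_len = pid_len;
      p += 2;
    } else {
      chunk = p;        // copy one ordinary character
      chunk_len = 1;
      p += 1;
    }
    // Each chunk must leave at least one byte for the terminating NUL.
    if (chunk_len >= buflen - used) {
      buf[0] = '\0';
      return false;
    }
    memcpy(buf + used, chunk, chunk_len);
    used += chunk_len;
  }
  buf[used] = '\0';
  return true;
}
```

Because the remaining space is checked on every chunk, the number of `%p` occurrences in the input does not matter: an input with many of them simply fails once the buffer is full instead of overrunning it.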
test/hotspot/jtreg/runtime/ErrorHandling/TestHeapDumpOnOutOfMemoryError.java line 100: > 98: output.shouldContain("Dumping heap to " + type + ".hprof"); > 99: File dump = new File(heapdumpFilename); > 100: Asserts.assertTrue(dump.exists() && dump.isFile(), "Could not find dump file " + dump.getAbsolutePath()); I think you could just update the test to use heapdumpFilename = type + ".%p.hprof"; we don't need to test twice, it is quite expensive. test/hotspot/jtreg/runtime/ErrorHandling/TestHeapDumpOnOutOfMemoryError.java line 115: > 113: output.stdoutShouldNotBeEmpty(); > 114: String actualHeapdumpFilename = type + "." + output.pid() + ".hprof"; > 115: output.shouldContain("Dumping heap to " + actualHeapdumpFilename); This would be better as something like expectedHeapdumpFilename and "Expected heap dump file". Not very important, but it makes the log cleaner. ------------- PR Review: https://git.openjdk.org/jdk/pull/24482#pullrequestreview-2751644078 PR Review Comment: https://git.openjdk.org/jdk/pull/24482#discussion_r2034187020 PR Review Comment: https://git.openjdk.org/jdk/pull/24482#discussion_r2034189152 From iklam at openjdk.org Wed Apr 9 02:18:41 2025 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 9 Apr 2025 02:18:41 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output [v4] In-Reply-To: References: Message-ID: > Since [JEP 483: Ahead-of-Time Class Loading & Linking](https://openjdk.org/jeps/483), VM options such as `-XX:AOTCache` are implemented as aliases of "classical" CDS options such as `-XX:SharedArchiveFile`. > > In anticipation of the [JEP: Ahead-of-time Command Line Ergonomics](https://bugs.openjdk.org/browse/JDK-8350022), we should refactor the code that deals with the AOT options.
Specifically, as we expect the JVM to be able to load from an "input AOT cache" and write to an "output AOT cache", we should clearly identify the input and output caches in separate APIs: > > > const char* CDSConfig::input_static_archive_path(); > const char* CDSConfig::input_dynamic_archive_path(); > const char* CDSConfig::output_archive_path(); > > > This PR also cleans up the code by: > - renaming a few function to reflect what they actually do > - moving more "config" management code into cdsConfig.cpp > > There's also a behavioral bug fix: before this PR, `-XX:AOTCache` was handled by the `ergo_init_classic_archive_paths()` function, which allows two files to be specified. E.g., `java -XX:AOTCache=static.jsa:dynamic.jsa`. That's because `-XX:AOTCache` was implemented as an alias of `-XX:SharedArchiveFile`, and the latter allows this usage. > > However, this behavior is not specified in JEP 483. Allowing two files in -XX:AOTCache will cause unnecessary complexity when we implement [JDK-8353598: Allow AOT cache to be used in training run](https://bugs.openjdk.org/browse/JDK-8353598). Therefore, I added new test cases to disallow the use of two files. This also means that we don't need to modify the already over-complicated `ergo_init_classic_archive_paths()` for the AOT use cases Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains nine commits: - Merge branch 'master' into 8353597-refactor-aot-cache-input-output - @lmesnik comments - more clean up - Minimized changes in ergo_init_classic_archive_paths() - Clean up CDS input/output path handling - Refactored CollectClassesForLinking for simplification - Merge branch 'master' into 8353014-exclude-tooling-classes-from-aot-cache - Reverted some fixes in systemDictionaryShared.cpp that causes test failures - 8353014: Exclude AOT tooling classes from AOT cache ------------- Changes: https://git.openjdk.org/jdk/pull/24401/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24401&range=03 Stats: 309 lines in 15 files changed: 161 ins; 55 del; 93 mod Patch: https://git.openjdk.org/jdk/pull/24401.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24401/head:pull/24401 PR: https://git.openjdk.org/jdk/pull/24401 From fyang at openjdk.org Wed Apr 9 04:02:29 2025 From: fyang at openjdk.org (Fei Yang) Date: Wed, 9 Apr 2025 04:02:29 GMT Subject: RFR: 8352730: RISC-V: Disable tests in qemu-user [v2] In-Reply-To: <1pa1FDH5Z2quR3fE7o4qfZKwRrz8nXHbMSirSyiqhTw=.9c37d2a9-5b93-40dd-8b5a-a5822030ef48@github.com> References: <1pa1FDH5Z2quR3fE7o4qfZKwRrz8nXHbMSirSyiqhTw=.9c37d2a9-5b93-40dd-8b5a-a5822030ef48@github.com> Message-ID: <9R7U8cL4aSOayHQzaXoTGx0nXSXqdkO4ZomONZnM0Ao=.c1c90e1c-3085-4656-911d-23c407cff74d@github.com> On Fri, 28 Mar 2025 06:53:15 GMT, Robbin Ehn wrote: > It's not some intermittently failure. The majority of them can't work as they use pstack, open core files, use PerfData, etc.. and expected it to be rv64. But core files, pstack are in host arch as we are running qemu-user. I can remove tests which timeouts and only keep test which simply can't work in qemu-user environment in this PR. Seems good? Hi, That make sense to me. And it doesn't seem to me to be riscv-specific issue, but rather one with qemu-user. Maybe we should update the title and changes to reflect that? 
I sometimes see people testing with qemu for other CPU platforms as well like ppc, s390, etc. Guess they might be helped with this too. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24229#issuecomment-2788216103 From asmehra at openjdk.org Wed Apr 9 04:06:30 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Wed, 9 Apr 2025 04:06:30 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output [v4] In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 02:18:41 GMT, Ioi Lam wrote: >> Since [JEP 483: Ahead-of-Time Class Loading & Linking](https://openjdk.org/jeps/483), VM options such as `-XX:AOTCache `are implemented as aliases of "classical" CDS options such as `-XX:SharedArchiveFile`. >> >> In anticipation of the [JEP: Ahead-of-time Command Line Ergonomics](https://bugs.openjdk.org/browse/JDK-8350022), we should refactor the code that deals with the AOT options. Specifically, as we expect the JVM to be able to load from an "input AOT cache" and write to an "output AOT cache", we should clearly identify the input and output caches in separate APIs: >> >> >> const char* CDSConfig::input_static_archive_path(); >> const char* CDSConfig::input_dynamic_archive_path(); >> const char* CDSConfig::output_archive_path(); >> >> >> This PR also cleans up the code by: >> - renaming a few function to reflect what they actually do >> - moving more "config" management code into cdsConfig.cpp >> >> There's also a behavioral bug fix: before this PR, `-XX:AOTCache` was handled by the `ergo_init_classic_archive_paths()` function, which allows two files to be specified. E.g., `java -XX:AOTCache=static.jsa:dynamic.jsa`. That's because `-XX:AOTCache` was implemented as an alias of `-XX:SharedArchiveFile`, and the latter allows this usage. >> >> However, this behavior is not specified in JEP 483. 
Allowing two files in -XX:AOTCache will cause unnecessary complexity when we implement [JDK-8353598: Allow AOT cache to be used in training run](https://bugs.openjdk.org/browse/JDK-8353598). Therefore, I added new test cases to disallow the use of two files. This also means that we don't need to modify the already over-complicated `ergo_init_classic_archive_paths()` for the AOT use cases > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains nine commits: > > - Merge branch 'master' into 8353597-refactor-aot-cache-input-output > - @lmesnik comments > - more clean up > - Minimized changes in ergo_init_classic_archive_paths() > - Clean up CDS input/output path handling > - Refactored CollectClassesForLinking for simplification > - Merge branch 'master' into 8353014-exclude-tooling-classes-from-aot-cache > - Reverted some fixes in systemDictionaryShared.cpp that causes test failures > - 8353014: Exclude AOT tooling classes from AOT cache lgtm ------------- Marked as reviewed by asmehra (Committer). PR Review: https://git.openjdk.org/jdk/pull/24401#pullrequestreview-2751975775 From alanb at openjdk.org Wed Apr 9 05:59:32 2025 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 9 Apr 2025 05:59:32 GMT Subject: RFR: 8351927: Change VirtualThread implementation to use use FJP delayed task handling In-Reply-To: <64xe_dZIfIScyB_InbjJJn19DjAY_7qxmsB7FKKKIH4=.0c4ad76a-cbf8-475d-b3e8-6c2336ec1f29@github.com> References: <7vE97S4zy2S1vuRwEapE4k9ZScS6yTJBQWyKymjdc0g=.5551d6d6-68ad-446e-84c7-70fd57fa57a6@github.com> <64xe_dZIfIScyB_InbjJJn19DjAY_7qxmsB7FKKKIH4=.0c4ad76a-cbf8-475d-b3e8-6c2336ec1f29@github.com> Message-ID: On Tue, 8 Apr 2025 20:05:26 GMT, Viktor Klang wrote: >> Follow up to JDK-8319447 to change the VirtualThread implementation to use FJP's delayed task handling. >> >> The SPTE based implementation is not removed. It will continue to be used by tests. 
If custom schedulers are exposed in the future then they will use this implementation. >> >> For timed-Object.wait, waitTimeoutExpired is changed to use lazySubmit to avoid signalling and increase the chance that the unparked virtual thread will continue on the current carrier. For timed-park, the timeout task is changed to reduced form of unpark that also uses lazySubmit, for the same reason. >> >> `jcmd Thread.vthread_scheduler` is changed to no longer print the delay schedulers. Instead, the delayed task count will appear in the default scheduler output. > > src/java.base/share/classes/java/lang/VirtualThread.java line 889: > >> 887: private void parkTimeoutExpired() { >> 888: assert !VirtualThread.currentThread().isVirtual(); >> 889: if (!getAndSetParkPermit(true) > > @AlanBateman Would it make sense to test whether the park-permit is false before the LOCK XCHG? It already does, no CAS if the current value is the new value. > src/java.base/share/classes/java/lang/VirtualThread.java line 1455: > >> 1453: return pool.schedule(command, delay, unit); >> 1454: } else { >> 1455: return DelayedTaskSchedulers.schedule(command, delay, unit); > > @AlanBateman Would it make sense to test if the Scheduler implements ScheduledExecutorService? Not for now. If a custom scheduler feature is exposed some time then we can think about this topic, it may or may be that the custom scheduler supports delayed tasks. 
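For readers following the park-permit exchange above, the "no CAS if the current value is the new value" behaviour can be pictured with a small sketch. This is not the `VirtualThread` source: `ParkPermitSketch` is a hypothetical class, and it uses `AtomicBoolean` purely for illustration. The idea is a plain volatile read first, with the atomic exchange issued only when the value would actually change.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the permit handling discussed above; not JDK code.
final class ParkPermitSketch {
    private final AtomicBoolean permit = new AtomicBoolean(false);

    // Returns the previous permit value. The atomic exchange is skipped
    // when the permit already holds newValue, so an already-granted permit
    // costs only a volatile read.
    boolean getAndSetParkPermit(boolean newValue) {
        if (permit.get() != newValue) {
            return permit.getAndSet(newValue);
        }
        return newValue;
    }
}
```

Under this shape, a second `getAndSetParkPermit(true)` on an already-granted permit returns `true` without writing, which matches the description that the call "doesn't do anything" in that case.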
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24030#discussion_r2034500634 PR Review Comment: https://git.openjdk.org/jdk/pull/24030#discussion_r2034500533 From rehn at openjdk.org Wed Apr 9 06:34:30 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 9 Apr 2025 06:34:30 GMT Subject: RFR: 8352730: RISC-V: Disable tests in qemu-user [v2] In-Reply-To: <9R7U8cL4aSOayHQzaXoTGx0nXSXqdkO4ZomONZnM0Ao=.c1c90e1c-3085-4656-911d-23c407cff74d@github.com> References: <1pa1FDH5Z2quR3fE7o4qfZKwRrz8nXHbMSirSyiqhTw=.9c37d2a9-5b93-40dd-8b5a-a5822030ef48@github.com> <9R7U8cL4aSOayHQzaXoTGx0nXSXqdkO4ZomONZnM0Ao=.c1c90e1c-3085-4656-911d-23c407cff74d@github.com> Message-ID: On Wed, 9 Apr 2025 03:57:10 GMT, Fei Yang wrote: > > It's not some intermittently failure. The majority of them can't work as they use pstack, open core files, use PerfData, etc.. and expected it to be rv64. But core files, pstack are in host arch as we are running qemu-user. I can remove tests which timeouts and only keep test which simply can't work in qemu-user environment in this PR. Seems good? > > Hi, That make sense to me. And it doesn't seem to me to be riscv-specific issue, but rather one with qemu-user. Maybe we should update the title and changes to reflect that? I sometimes see people testing with qemu for other CPU platforms as well like ppc, s390, etc. Guess they might be helped with this too. Hey, thanks for considering. The default qemu /proc/cpu do not contain any information about this being qemu. And there is no standard way to find this out AFIAK. Some platforms have target specific /proc/cpu and put qemu in there, but it have no standard format. The whole proc -> uarch string -> jvm cpu string -> jtreg require is qemu/linux-user/riscv specific. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24229#issuecomment-2788434073 From stuefe at openjdk.org Wed Apr 9 06:46:35 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 9 Apr 2025 06:46:35 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p [v2] In-Reply-To: References: Message-ID: <1hCrzMxrXFxiNGtP2T6tvgVhRyuCLRQpX5wsQNwCzNU=.a558b333-1057-4902-97f8-c8509ad85471@github.com> On Tue, 8 Apr 2025 11:43:20 GMT, Kevin Walls wrote: >> This is a long-standing oversight: HeapDumpPath does not recognise %p for pid expansion. >> The default filename uses a pid (e.g. java_pid1676937.hprof) but HeapDumpPath does not. >> It has always done a manual "root plus pid plus extension" on the default filename only, and >> should move to using Argument::copy_expand_pid() like we do with other such filenames. >> >> >> We also assumed the default filename is not a directory (which is very very likely, but doesn't have to be true). > > Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: > > - length checking update > - Chris feedback src/hotspot/share/services/heapDumper.cpp line 2760: > 2758: if (dump_file_seq == 0) { // first time in, we initialize base_path > 2759: // Set base path (name or directory, default or custom, without seq no), doing %p substitution. > 2760: const char *path_src = (HeapDumpPath != nullptr && HeapDumpPath[0] != '\0') ? HeapDumpPath : dump_file_name; Why do you expand the dump file name here? If you want to minimize the expand calls, you could: - append the unexpanded dump_file_name to the unexpanded HeapDumpPath - expand - create dir (extract the directory name by temporarily setting the last '/' to '\0'; create dir; restore '/'), and now you are done.
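The three steps suggested above (append, expand once, create the directory) can be sketched as follows. This is an illustration in Java rather than the HotSpot C++ under review. `HeapDumpPathSketch`, `expandPid` and `resolveDumpFile` are hypothetical names, `expandPid` is a simplified stand-in for the `copy_expand_pid()` routine named in the PR description (it only handles `%p`), and the sketch assumes that HeapDumpPath names a directory and that the default file name is written as a `%p` template.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of "append, then expand once, then create dir".
final class HeapDumpPathSketch {

    // Simplified expander: replaces every %p with the pid. The real VM
    // routine may treat other '%' sequences differently.
    static String expandPid(String template, long pid) {
        return template.replace("%p", Long.toString(pid));
    }

    // heapDumpDir and fileNameTemplate may both contain %p.
    static Path resolveDumpFile(String heapDumpDir, String fileNameTemplate, long pid) {
        // Step 1: append the unexpanded file name to the unexpanded path.
        String unexpanded = heapDumpDir + "/" + fileNameTemplate;
        // Step 2: a single expansion covers directory and file name alike.
        Path dumpFile = Path.of(expandPid(unexpanded, pid));
        // Step 3: create the directory part (everything before the last '/').
        Path dir = dumpFile.getParent();
        try {
            if (dir != null) {
                Files.createDirectories(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return dumpFile;
    }
}
```

Expanding after concatenation means a `%p` in the directory component is honoured by the same call that handles the file name, which is the case the follow-up reply on this thread raises.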
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24482#discussion_r2034576102 From fyang at openjdk.org Wed Apr 9 07:35:40 2025 From: fyang at openjdk.org (Fei Yang) Date: Wed, 9 Apr 2025 07:35:40 GMT Subject: RFR: 8352730: RISC-V: Disable tests in qemu-user [v2] In-Reply-To: References: <1pa1FDH5Z2quR3fE7o4qfZKwRrz8nXHbMSirSyiqhTw=.9c37d2a9-5b93-40dd-8b5a-a5822030ef48@github.com> <9R7U8cL4aSOayHQzaXoTGx0nXSXqdkO4ZomONZnM0Ao=.c1c90e1c-3085-4656-911d-23c407cff74d@github.com> Message-ID: On Wed, 9 Apr 2025 06:31:55 GMT, Robbin Ehn wrote: > > > It's not some intermittently failure. The majority of them can't work as they use pstack, open core files, use PerfData, etc.. and expected it to be rv64. But core files, pstack are in host arch as we are running qemu-user. I can remove tests which timeouts and only keep test which simply can't work in qemu-user environment in this PR. Seems good? > > > > > > Hi, That make sense to me. And it doesn't seem to me to be riscv-specific issue, but rather one with qemu-user. Maybe we should update the title and changes to reflect that? I sometimes see people testing with qemu for other CPU platforms as well like ppc, s390, etc. Guess they might be helped with this too. > > Hey, thanks for considering. The default qemu /proc/cpu do not contain any information about this being qemu. And there is no standard way to find this out AFIAK. Some platforms have target specific /proc/cpu and put qemu in there, but it have no standard format. The whole proc -> uarch string -> jvm cpu string -> jtreg require is qemu/linux-user/riscv specific. Ah, I see. But I guess it won't bite us if we can't parse `qemu` in /proc/cpuinfo? I am not familiar with how qemu-user works. Can I expect this to work at least for some other CPUs supported by the JVM? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24229#issuecomment-2788626578 From vklang at openjdk.org Wed Apr 9 08:06:33 2025 From: vklang at openjdk.org (Viktor Klang) Date: Wed, 9 Apr 2025 08:06:33 GMT Subject: RFR: 8351927: Change VirtualThread implementation to use use FJP delayed task handling In-Reply-To: References: <7vE97S4zy2S1vuRwEapE4k9ZScS6yTJBQWyKymjdc0g=.5551d6d6-68ad-446e-84c7-70fd57fa57a6@github.com> <64xe_dZIfIScyB_InbjJJn19DjAY_7qxmsB7FKKKIH4=.0c4ad76a-cbf8-475d-b3e8-6c2336ec1f29@github.com> Message-ID: On Wed, 9 Apr 2025 05:56:28 GMT, Alan Bateman wrote: >> src/java.base/share/classes/java/lang/VirtualThread.java line 889: >> >>> 887: private void parkTimeoutExpired() { >>> 888: assert !VirtualThread.currentThread().isVirtual(); >>> 889: if (!getAndSetParkPermit(true) >> >> @AlanBateman Would it make sense to test whether the park-permit is false before the LOCK XCHG? > > It already does, no CAS if the current value is the new value. I meant the park permit. :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24030#discussion_r2034736102 From rehn at openjdk.org Wed Apr 9 08:09:42 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 9 Apr 2025 08:09:42 GMT Subject: RFR: 8352730: RISC-V: Disable tests in qemu-user [v2] In-Reply-To: References: <1pa1FDH5Z2quR3fE7o4qfZKwRrz8nXHbMSirSyiqhTw=.9c37d2a9-5b93-40dd-8b5a-a5822030ef48@github.com> <9R7U8cL4aSOayHQzaXoTGx0nXSXqdkO4ZomONZnM0Ao=.c1c90e1c-3085-4656-911d-23c407cff74d@github.com> Message-ID: On Wed, 9 Apr 2025 07:29:01 GMT, Fei Yang wrote: > > > > It's not some intermittently failure. The majority of them can't work as they use pstack, open core files, use PerfData, etc.. and expected it to be rv64. But core files, pstack are in host arch as we are running qemu-user. I can remove tests which timeouts and only keep test which simply can't work in qemu-user environment in this PR. Seems good? > > > > > > > > > Hi, That make sense to me. 
And it doesn't seem to me to be riscv-specific issue, but rather one with qemu-user. Maybe we should update the title and changes to reflect that? I sometimes see people testing with qemu for other CPU platforms as well like ppc, s390, etc. Guess they might be helped with this too. > > > > > > Hey, thanks for considering. The default qemu /proc/cpu do not contain any information about this being qemu. And there is no standard way to find this out AFIAK. Some platforms have target specific /proc/cpu and put qemu in there, but it have no standard format. The whole proc -> uarch string -> jvm cpu string -> jtreg require is qemu/linux-user/riscv specific. > > Ah, I see. But I guess it won't bite us if we can't parse `qemu` in /proc/cpuinfo? I am not familiar with how qemu-user works. Can I expect this to work at least for some other CPUs supported by the JVM? With qemu-user, "uarch: qemu" is in cpuinfo: `[0.084s][info ][os,cpu] CPU: total 28 (initial active 28) qemu rv64 rvi rvm rva rvf rvd rvc rvv zba zbb zbs zfh zfhmin zvbc zvfh zicond` Hence we know this is qemu-user (only qemu-user sets uarch to qemu on riscv). When `/proc/cpuinfo` does not contain uarch: `[0.053s][info ][os,cpu] CPU: total 8 (initial active 8) rv64 rvi rvm rva rvf rvd rvc zba zbb zbs zfh zfhmin zvfh zicond` we have no clue whether this is emulated or running on real hardware, so the tests will be executed. Tests are only excluded if we know it's qemu-user. Did that answer your question?
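As a rough illustration of the rule described above (exclude only when the CPU string positively says qemu, otherwise run the tests), here is a small sketch that inspects an `[os,cpu]` log line of the shape quoted in this message. `QemuUserSketch` and `looksLikeQemuUser` are hypothetical names. How the PR actually wires the uarch string into a jtreg requirement is not shown in this thread, so none of this should be read as that implementation.

```java
// Hypothetical helper: decides from a "CPU: total N (initial active N) ..."
// log line whether the feature list carries the "qemu" uarch token.
final class QemuUserSketch {

    static boolean looksLikeQemuUser(String cpuLogLine) {
        // The feature list follows the closing parenthesis of "(initial active N)".
        int close = cpuLogLine.indexOf(')');
        if (close < 0) {
            return false; // unknown format: treat as "not known to be qemu"
        }
        String features = cpuLogLine.substring(close + 1).trim();
        for (String token : features.split("\\s+")) {
            if (token.equals("qemu")) {
                return true;
            }
        }
        // No uarch token: could be real hardware or an emulator, so tests run.
        return false;
    }
}
```

The default answer is deliberately `false`: a missing or unparseable uarch never excludes a test, which mirrors the statement that tests are only excluded when qemu-user is known.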
------------- PR Comment: https://git.openjdk.org/jdk/pull/24229#issuecomment-2788717608 From alanb at openjdk.org Wed Apr 9 08:19:36 2025 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 9 Apr 2025 08:19:36 GMT Subject: RFR: 8351927: Change VirtualThread implementation to use use FJP delayed task handling In-Reply-To: References: <7vE97S4zy2S1vuRwEapE4k9ZScS6yTJBQWyKymjdc0g=.5551d6d6-68ad-446e-84c7-70fd57fa57a6@github.com> <64xe_dZIfIScyB_InbjJJn19DjAY_7qxmsB7FKKKIH4=.0c4ad76a-cbf8-475d-b3e8-6c2336ec1f29@github.com> Message-ID: On Wed, 9 Apr 2025 08:03:44 GMT, Viktor Klang wrote: >> It already does, no CAS if the current value is the new value. > > I meant the park permit. :) getAndSetParkPermit isn't changed in this PR. If the park permit is already granted then getAndSetParkPermit(true) doesn't do anything. More generally, parkTimeoutExpired is just a reduced unpark, only added so that we can do a lazy rather than internal submit for the less common case that the timeout expires. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24030#discussion_r2034760820 From sspitsyn at openjdk.org Wed Apr 9 08:20:34 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 9 Apr 2025 08:20:34 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls Message-ID: As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. The `ClassPrepare` events are important for the debug agent. 
So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. Some specific implementation details can be added to the first PR comment. Testing: - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): - the assert described above is fired if the fix of JDK-8352088 is removed - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed - Ran mach5 tiers 1-6 ------------- Commit messages: - 8352773: JVMTI should disable events during java upcalls Changes: https://git.openjdk.org/jdk/pull/24539/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24539&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8352773 Stats: 32 lines in 6 files changed: 30 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/24539.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24539/head:pull/24539 PR: https://git.openjdk.org/jdk/pull/24539 From alanb at openjdk.org Wed Apr 9 08:41:24 2025 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 9 Apr 2025 08:41:24 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 08:14:04 GMT, Serguei Spitsyn wrote: > As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. > The `ClassPrepare` events are important for the debug agent. 
So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. > Some specific implementation details can be added to the first PR comment. > > Testing: > - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): > - the assert described above is fired if the fix of JDK-8352088 is removed > - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed > - Ran mach5 tiers 1-6 If this goes ahead then it allows for a discussion about changing JVMTI InterruptThread to invoke Thread.interrupt when the target is a platform thread. As you know, there is a long standing issue here where threads blocked on interruptible channels not being awakened by JVMTI InterruptThread. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24539#issuecomment-2788806930 From vklang at openjdk.org Wed Apr 9 08:46:36 2025 From: vklang at openjdk.org (Viktor Klang) Date: Wed, 9 Apr 2025 08:46:36 GMT Subject: RFR: 8351927: Change VirtualThread implementation to use use FJP delayed task handling In-Reply-To: <7vE97S4zy2S1vuRwEapE4k9ZScS6yTJBQWyKymjdc0g=.5551d6d6-68ad-446e-84c7-70fd57fa57a6@github.com> References: <7vE97S4zy2S1vuRwEapE4k9ZScS6yTJBQWyKymjdc0g=.5551d6d6-68ad-446e-84c7-70fd57fa57a6@github.com> Message-ID: On Thu, 13 Mar 2025 10:48:14 GMT, Alan Bateman wrote: > Follow up to JDK-8319447 to change the VirtualThread implementation to use FJP's delayed task handling. > > The SPTE based implementation is not removed. It will continue to be used by tests. If custom schedulers are exposed in the future then they will use this implementation. > > For timed-Object.wait, waitTimeoutExpired is changed to use lazySubmit to avoid signalling and increase the chance that the unparked virtual thread will continue on the current carrier. 
For timed-park, the timeout task is changed to reduced form of unpark that also uses lazySubmit, for the same reason. > > `jcmd Thread.vthread_scheduler` is changed to no longer print the delay schedulers. Instead, the delayed task count will appear in the default scheduler output. Marked as reviewed by vklang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24030#pullrequestreview-2752606260 From kevinw at openjdk.org Wed Apr 9 08:52:31 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 9 Apr 2025 08:52:31 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p [v2] In-Reply-To: <1hCrzMxrXFxiNGtP2T6tvgVhRyuCLRQpX5wsQNwCzNU=.a558b333-1057-4902-97f8-c8509ad85471@github.com> References: <1hCrzMxrXFxiNGtP2T6tvgVhRyuCLRQpX5wsQNwCzNU=.a558b333-1057-4902-97f8-c8509ad85471@github.com> Message-ID: On Wed, 9 Apr 2025 06:43:20 GMT, Thomas Stuefe wrote: >> Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: >> >> - length checking update >> - Chris feedback > > src/hotspot/share/services/heapDumper.cpp line 2760: > >> 2758: if (dump_file_seq == 0) { // first time in, we initialize base_path >> 2759: // Set base path (name or directory, default or custom, without seq no), doing %p substitution. >> 2760: const char *path_src = (HeapDumpPath != nullptr && HeapDumpPath[0] != '\0') ? HeapDumpPath : dump_file_name; > > Why do you expand the dump file name here? > > If you want to minimize the expand calls, you could: > > - append the unexpanded dump_file_name to the unexpanded HeapDumpPath > - expand > - create dir (extract the directory name by temporarily setting the the last '/' to '\0; create dir; restore '/') > now you are done. It concerned me that we presume there is no %p in the directory/path. It might be tricky (get the pid, create a directory, before the first heap dump), but it's possible. 
Trying to use an existing directory with a literal %p in it will fail, but we can't have this both ways, and should either honour the %p substitution fully, or not. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24482#discussion_r2034820138 From kevinw at openjdk.org Wed Apr 9 09:34:02 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 9 Apr 2025 09:34:02 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p [v3] In-Reply-To: References: Message-ID: > This is a long-standing oversight: HeapDumpPath does not recognise %p for pid expansion. > The default filename uses a pid (e.g. java_pid1676937.hprof) but HeapDumpPath does not. > It has always done a manual "root plus pid plus extension" on the default filename only, and > should move to using Argument::copy_expand_pid() like we do with other such filenames. > > > We also assumed the default filename is not a directory (which is very very likely, but doesn't have to be true). Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: test feedback ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24482/files - new: https://git.openjdk.org/jdk/pull/24482/files/c32e4ca4..40c67a0c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24482&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24482&range=01-02 Stats: 24 lines in 1 file changed: 1 ins; 17 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/24482.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24482/head:pull/24482 PR: https://git.openjdk.org/jdk/pull/24482 From kevinw at openjdk.org Wed Apr 9 09:34:02 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 9 Apr 2025 09:34:02 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p [v2] In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 00:00:28 GMT, Leonid Mesnik wrote: >> Kevin Walls has updated the pull request incrementally with two additional commits since the last revision: >> >> - 
length checking update >> - Chris feedback > > test/hotspot/jtreg/runtime/ErrorHandling/TestHeapDumpOnOutOfMemoryError.java line 100: > >> 98: output.shouldContain("Dumping heap to " + type + ".hprof"); >> 99: File dump = new File(heapdumpFilename); >> 100: Asserts.assertTrue(dump.exists() && dump.isFile(), "Could not find dump file " + dump.getAbsolutePath()); > > I. think you could just update the test to use > heapdumpFilename = type + ".%p.hprof"; > we don't need test twice, it is quite expensive. sure, it's a few seconds but we can save that time 8-) > test/hotspot/jtreg/runtime/ErrorHandling/TestHeapDumpOnOutOfMemoryError.java line 115: > >> 113: output.stdoutShouldNotBeEmpty(); >> 114: String actualHeapdumpFilename = type + "." + output.pid() + ".hprof"; >> 115: output.shouldContain("Dumping heap to " + actualHeapdumpFilename); > > This better to be something like expectedlHeapdumpFilename and > "Expected heap dump file". > > Not very important, but make log cleaner. sure, done! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24482#discussion_r2034931003 PR Review Comment: https://git.openjdk.org/jdk/pull/24482#discussion_r2034931425 From sspitsyn at openjdk.org Wed Apr 9 09:56:29 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 9 Apr 2025 09:56:29 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 08:39:14 GMT, Alan Bateman wrote: > If this goes ahead then it allows for a discussion about changing JVMTI InterruptThread to invoke Thread.interrupt when the target is a platform thread. As you know, there is a long standing issue here where threads blocked on interruptible channels not being awakened by JVMTI InterruptThread. Yes, this fix is already including an update for interrupts. I'll add a comment with the fix details tomorrow. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24539#issuecomment-2789083462 From ihse at openjdk.org Wed Apr 9 10:31:39 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 9 Apr 2025 10:31:39 GMT Subject: RFR: 8349638: Build libjdwp with SIZE optimization In-Reply-To: References: Message-ID: On Tue, 11 Feb 2025 15:56:39 GMT, Matthias Baesken wrote: > The libjdwp is currently built with LOW optimization level, it could be built with SIZE optimization to lower the lib size by ~ 10 % on UNIX. > On Windows LOW and SIZE currently translate to the same O1 optimization flag so no difference there. > > On Linux x86_64 for example the lib shrinks from > 300K to 268K and the debuginfo file shrinks from 1.9M to 1.7M . > > On Linux ppc64le for example the lib shrinks from > 428K to 368K and the debuginfo file shrinks from 2.0M to 1.7M . Wait, `LOW` is `-O2`? ? I thought it was like no optimization at all. I'm sooo confused with these levels. So maybe going from `LOW` to `SIZE` will actually lose more optimization than I thought. *sigh* ------------- PR Comment: https://git.openjdk.org/jdk/pull/23563#issuecomment-2789200299 From mbaesken at openjdk.org Wed Apr 9 10:44:39 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 9 Apr 2025 10:44:39 GMT Subject: RFR: 8349638: Build libjdwp with SIZE optimization In-Reply-To: References: Message-ID: On Tue, 11 Feb 2025 15:56:39 GMT, Matthias Baesken wrote: > The libjdwp is currently built with LOW optimization level, it could be built with SIZE optimization to lower the lib size by ~ 10 % on UNIX. > On Windows LOW and SIZE currently translate to the same O1 optimization flag so no difference there. > > On Linux x86_64 for example the lib shrinks from > 300K to 268K and the debuginfo file shrinks from 1.9M to 1.7M . > > On Linux ppc64le for example the lib shrinks from > 428K to 368K and the debuginfo file shrinks from 2.0M to 1.7M . > Wait, `LOW` is `-O2`? ? 
I thought it was like no optimization at all. I'm sooo confused with these levels. So maybe going from `LOW` to `SIZE` will actually lose more optimization than I thought. _sigh_ Yes see https://github.com/openjdk/jdk/blob/master/make/autoconf/flags-cflags.m4#L310 on gcc/clang `C_O_FLAG_NORM="-O2"` and for LOW optimization level of a lib, we use the FLAG_NORM , see https://github.com/openjdk/jdk/blob/master/make/common/native/Flags.gmk#L46 else ifeq ($$($1_OPTIMIZATION), LOW) $1_OPT_CFLAGS := $(C_O_FLAG_NORM) $1_OPT_CXXFLAGS := $(CXX_O_FLAG_NORM) ------------- PR Comment: https://git.openjdk.org/jdk/pull/23563#issuecomment-2789230263 From stuefe at openjdk.org Wed Apr 9 10:45:32 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 9 Apr 2025 10:45:32 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p [v3] In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 09:34:02 GMT, Kevin Walls wrote: >> This is a long-standing oversight: HeapDumpPath does not recognise %p for pid expansion. >> The default filename uses a pid (e.g. java_pid1676937.hprof) but HeapDumpPath does not. >> It has always done a manual "root plus pid plus extension" on the default filename only, and >> should move to using Argument::copy_expand_pid() like we do with other such filenames. >> >> >> We also assumed the default filename is not a directory (which is very very likely, but doesn't have to be true). > > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > test feedback Looks good to me. ------------- Marked as reviewed by stuefe (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24482#pullrequestreview-2753028819 From alanb at openjdk.org Wed Apr 9 11:03:33 2025 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 9 Apr 2025 11:03:33 GMT Subject: RFR: 8353938: hotspot/jtreg/serviceability/dcmd/jvmti/LoadAgentDcmdTest.java fails on static JDK [v2] In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 19:09:02 GMT, Jiangli Zhou wrote: >> Please review this PR that changes `LoadAgentDcmdTest.getLibInstrumentPath()` to not locate `libinstrument` shared library if running on static JDK, instead just returns "libinstrument." directly. Both test case #1 and #2 in `LoadAgentDcmdTest.run()` run ok on static JDK with the change: >> >> Commands: >> Test case #1: `JVMTI.agent_load libinstrument.so agent.jar` >> Test case #2: `JVMTI.agent_load libinstrument.so "agent.jar=foo=bar"` >> >> Additional notes/considerations: >> >> 1. I notice [JDKToolFinder.getJDKTool()](https://github.com/openjdk/jdk/blob/80ff7b9c9406c7845ecb3bc40910e92ccdd23ff2/test/lib/jdk/test/lib/JDKToolFinder.java#L41) is used by the test to find the `jcmd` tool. `JDKToolFinder.getJDKTool()` looks for the requested tool in both `test.jdk` and `compile.jdk`. When running the jtreg test on static JDK, it's able to locate the `jcmd` from `compile.jdk` even though the static JDK binary (`test.jdk`) does not provide the tool. For tools used by jtreg tests at runtime, how about change to always do that? >> >> 2. From https://docs.oracle.com/en/java/javase/21/docs/specs/man/jcmd.html: >> >> JVMTI.agent_load [arguments] >> Loads JVMTI native agent. >> Impact: Low >> arguments: >> library path: Absolute path of the JVMTI agent to load. (STRING, no default value) >> agent option: (Optional) Option string to pass the agent. (STRING, no default value) >> >> The command spec requires the absolute path of the JVMTI agent for the `library path` argument. 
On static JDK, if the agent library is built-in (statically linked), passing the shared library name works and allows the VM to find the built-in agent. There would be no need to specify the absolute path. Please see https://bugs.openjdk.org/browse/JDK-8353938?focusedId=14767737&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14767737 for more details. > > Jiangli Zhou has updated the pull request incrementally with one additional commit since the last revision: > > Skip LoadAgentDcmdTest on static JDK. I assume you'll bump the copyright header before integrating. ------------- Marked as reviewed by alanb (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24497#pullrequestreview-2753076032 From kevinw at openjdk.org Wed Apr 9 11:04:26 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 9 Apr 2025 11:04:26 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p [v4] In-Reply-To: References: Message-ID: > This is a long-standing oversight: HeapDumpPath does not recognise %p for pid expansion. > The default filename uses a pid (e.g. java_pid1676937.hprof) but HeapDumpPath does not. > It has always done a manual "root plus pid plus extension" on the default filename only, and > should move to using Argument::copy_expand_pid() like we do with other such filenames. > > > We also assumed the default filename is not a directory (which is very very likely, but doesn't have to be true). Kevin Walls has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains five additional commits since the last revision: - Merge remote-tracking branch 'upstream/master' into HeapDumpPath_pid - test feedback - length checking update - Chris feedback - 8353727: HeapDumpPath doesn't expand %p ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24482/files - new: https://git.openjdk.org/jdk/pull/24482/files/40c67a0c..09f66542 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24482&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24482&range=02-03 Stats: 11976 lines in 335 files changed: 8329 ins; 2524 del; 1123 mod Patch: https://git.openjdk.org/jdk/pull/24482.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24482/head:pull/24482 PR: https://git.openjdk.org/jdk/pull/24482 From ihse at openjdk.org Wed Apr 9 12:29:40 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 9 Apr 2025 12:29:40 GMT Subject: RFR: 8349638: Build libjdwp with SIZE optimization In-Reply-To: References: Message-ID: On Tue, 11 Feb 2025 15:56:39 GMT, Matthias Baesken wrote: > The libjdwp is currently built with LOW optimization level, it could be built with SIZE optimization to lower the lib size by ~ 10 % on UNIX. > On Windows LOW and SIZE currently translate to the same O1 optimization flag so no difference there. > > On Linux x86_64 for example the lib shrinks from > 300K to 268K and the debuginfo file shrinks from 1.9M to 1.7M . > > On Linux ppc64le for example the lib shrinks from > 428K to 368K and the debuginfo file shrinks from 2.0M to 1.7M . O.M.G. This is sooo messed up. I am thinking about making a special build where I set everything to -O3 and another where I set everything to -Os and just see what difference it makes in size and performance tests, and then suggest just going for one of them as the general rule (with exceptions for individual libraries, if needed).
------------- PR Comment: https://git.openjdk.org/jdk/pull/23563#issuecomment-2789534753 From ihse at openjdk.org Wed Apr 9 12:33:35 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 9 Apr 2025 12:33:35 GMT Subject: RFR: 8349638: Build libjdwp with SIZE optimization In-Reply-To: References: Message-ID: On Tue, 11 Feb 2025 15:56:39 GMT, Matthias Baesken wrote: > The libjdwp is currently built with LOW optimization level, it could be built with SIZE optimization to lower the lib size by ~ 10 % on UNIX. > On Windows LOW and SIZE currently translate to the same O1 optimization flag so no difference there. > > On Linux x86_64 for example the lib shrinks from > 300K to 268K and the debuginfo file shrinks from 1.9M to 1.7M . > > On Linux ppc64le for example the lib shrinks from > 428K to 368K and the debuginfo file shrinks from 2.0M to 1.7M . We also really must skip the dichotomy between `C_O_FLAG` and `OPTIMIZATION`, it just makes no sense whatsoever. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23563#issuecomment-2789545415 From ihse at openjdk.org Wed Apr 9 12:36:37 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 9 Apr 2025 12:36:37 GMT Subject: RFR: 8349638: Build libjdwp with SIZE optimization In-Reply-To: References: Message-ID: On Tue, 11 Feb 2025 15:56:39 GMT, Matthias Baesken wrote: > The libjdwp is currently built with LOW optimization level, it could be built with SIZE optimization to lower the lib size by ~ 10 % on UNIX. > On Windows LOW and SIZE currently translate to the same O1 optimization flag so no difference there. > > On Linux x86_64 for example the lib shrinks from > 300K to 268K and the debuginfo file shrinks from 1.9M to 1.7M . > > On Linux ppc64le for example the lib shrinks from > 428K to 368K and the debuginfo file shrinks from 2.0M to 1.7M . 
I'd like to move to a world where we basically have just two optimization levels, "size" and "speed", and libraries do not in general have the level specified, so it falls back on a default, which could then be set by configure. For individual libraries we might need to override the default value, if we know that certain compilers make a mess of certain optimization levels, or if some libraries are especially performance sensitive. (Making hotspot `-Os` would certainly never make any sense, for example.) ------------- PR Comment: https://git.openjdk.org/jdk/pull/23563#issuecomment-2789552092 From alanb at openjdk.org Wed Apr 9 12:39:48 2025 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 9 Apr 2025 12:39:48 GMT Subject: Integrated: 8351927: Change VirtualThread implementation to use use FJP delayed task handling In-Reply-To: <7vE97S4zy2S1vuRwEapE4k9ZScS6yTJBQWyKymjdc0g=.5551d6d6-68ad-446e-84c7-70fd57fa57a6@github.com> References: <7vE97S4zy2S1vuRwEapE4k9ZScS6yTJBQWyKymjdc0g=.5551d6d6-68ad-446e-84c7-70fd57fa57a6@github.com> Message-ID: On Thu, 13 Mar 2025 10:48:14 GMT, Alan Bateman wrote: > Follow up to JDK-8319447 to change the VirtualThread implementation to use FJP's delayed task handling. > > The SPTE based implementation is not removed. It will continue to be used by tests. If custom schedulers are exposed in the future then they will use this implementation. > > For timed-Object.wait, waitTimeoutExpired is changed to use lazySubmit to avoid signalling and increase the chance that the unparked virtual thread will continue on the current carrier. For timed-park, the timeout task is changed to reduced form of unpark that also uses lazySubmit, for the same reason. > > `jcmd Thread.vthread_scheduler` is changed to no longer print the delay schedulers. Instead, the delayed task count will appear in the default scheduler output. This pull request has now been integrated. 
Changeset: 6c93ad42 Author: Alan Bateman URL: https://git.openjdk.org/jdk/commit/6c93ad42f38b49ea96155340c4b6bbedfcef2a90 Stats: 405 lines in 8 files changed: 332 ins; 41 del; 32 mod 8351927: Change VirtualThread implementation to use use FJP delayed task handling Reviewed-by: vklang ------------- PR: https://git.openjdk.org/jdk/pull/24030 From duke at openjdk.org Wed Apr 9 13:24:33 2025 From: duke at openjdk.org (Robert Toyonaga) Date: Wed, 9 Apr 2025 13:24:33 GMT Subject: RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock [v4] In-Reply-To: References: Message-ID: <5MzHrfRc-pu8J5vQdmeeuk4HxP7uumhhVfELVs_VRwU=.4d86b907-a8e8-4bbd-98cb-892491b5c2ad@github.com> > ### Update: > After some discussion it was decided it's not necessary to expand the lock scope for reserve/commit. Instead, we are opting to add comments explaining the reasons for locking and the conditions to avoid which could lead to races. Some of the new tests can be kept because they are general enough to be useful outside of this context. > > ### Summary: > This PR makes memory operations atomic with NMT accounting. > > ### The problem: > In memory related functions like `os::commit_memory` and `os::reserve_memory` the OS memory operations are currently done before acquiring the NMT mutex. And the virtual memory accounting is done later in `MemTracker`, after the lock has been acquired. Doing the memory operations outside of the lock scope can lead to races. > > 1.1 Thread_1 releases range_A. > 1.2 Thread_1 tells NMT "range_A has been released". > > 2.1 Thread_2 reserves (the now free) range_A. > 2.2 Thread_2 tells NMT "range_A is reserved". > > Since the sequence (1.1) (1.2) is not atomic, if Thread_2 begins operating after (1.1), we can have (1.1) (2.1) (2.2) (1.2). The OS sees two valid subsequent calls (release range_A, followed by map range_A). But NMT sees "reserve range_A", "release range_A" and is now out of sync with the OS.
> > ### Solution: > Where memory operations such as reserve, commit, or release virtual memory happen, I've expanded the scope of `NmtVirtualMemoryLocker` to protect both the NMT accounting and the memory operation itself. > > ### Other notes: > I also simplified this pattern found in many places: > > if (MemTracker::enabled()) { > MemTracker::NmtVirtualMemoryLocker nvml; > result = pd_some_operation(addr, bytes); > if (result != nullptr) { > MemTracker::record_some_operation(addr, bytes); > } > } else { > result = pd_unmap_memory(addr, bytes); > } > ``` > To: > > MemTracker::NmtVirtualMemoryLocker nvml; > result = pd_unmap_memory(addr, bytes); > MemTracker::record_some_operation(addr, bytes); > ``` > This is possible because `NmtVirtualMemoryLocker` now checks `MemTracker::enabled()`. `MemTracker::record_some_operation` already checks `MemTracker::enabled()` and checks against nullptr. This refactoring previously wasn't possible because `ThreadCritical` was used before https://github.com/openjdk/jdk/pull/22745 introduced `NmtVirtualMemoryLocker`. > > I considered moving the locking and NMT accounting down into platform specific code: Ex. lock around { munmap() + MemTracker::record }. The hope was that this would help reduce the size of the critical section... 
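The interleaving described in "The problem" above can be replayed deterministically. The following is only a sketch under stated assumptions: `Model`, its members, and the integer range ids are hypothetical stand-ins and are not HotSpot or NMT code. It replays the four steps in the racy order (1.1) (2.1) (2.2) (1.2) and in the order that remains when each memory operation and its accounting share one critical section.

```cpp
#include <cassert>
#include <set>

// Hypothetical model, not HotSpot code: an int id stands in for an
// address range such as range_A.
struct Model {
  std::set<int> os_mapped;     // ranges the "OS" considers mapped
  std::set<int> nmt_reserved;  // ranges the tracker considers reserved

  void os_release(int r)  { os_mapped.erase(r); }      // memory operation
  void nmt_release(int r) { nmt_reserved.erase(r); }   // accounting
  void os_reserve(int r)  { os_mapped.insert(r); }     // memory operation
  void nmt_reserve(int r) { nmt_reserved.insert(r); }  // accounting

  bool in_sync() const { return os_mapped == nmt_reserved; }
};

// Start state: range 0 is mapped and tracked.
Model initial_state() {
  Model m;
  m.os_reserve(0);
  m.nmt_reserve(0);
  return m;
}

// Order (1.1) (2.1) (2.2) (1.2): possible when the lock only covers
// the accounting step.
Model racy_interleaving() {
  Model m = initial_state();
  m.os_release(0);   // 1.1 Thread_1 releases the range
  m.os_reserve(0);   // 2.1 Thread_2 reserves the now free range
  m.nmt_reserve(0);  // 2.2 Thread_2 records the reservation
  m.nmt_release(0);  // 1.2 Thread_1 records the release, too late
  return m;
}

// Order (1.1) (1.2) (2.1) (2.2): what remains when each operation and
// its accounting happen under the same lock.
Model atomic_interleaving() {
  Model m = initial_state();
  m.os_release(0);   // 1.1
  m.nmt_release(0);  // 1.2
  m.os_reserve(0);   // 2.1
  m.nmt_reserve(0);  // 2.2
  return m;
}
```

In the racy order the model ends with the range mapped but untracked, which matches the "out of sync with the OS" outcome in the description. Note that, per the "Update" section, the PR ultimately documents the locking conditions rather than widening the lock scope for reserve/commit.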
Robert Toyonaga has updated the pull request incrementally with four additional commits since the last revision: - Update test/hotspot/gtest/runtime/test_os.cpp Co-authored-by: Stefan Karlsson - Update test/hotspot/gtest/runtime/test_os.cpp Co-authored-by: Stefan Karlsson - Update test/hotspot/gtest/runtime/test_os.cpp Co-authored-by: Stefan Karlsson - Update test/hotspot/gtest/runtime/test_os.cpp Co-authored-by: Stefan Karlsson ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24084/files - new: https://git.openjdk.org/jdk/pull/24084/files/5c23a76a..813a1e49 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24084&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24084&range=02-03 Stats: 5 lines in 1 file changed: 2 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/24084.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24084/head:pull/24084 PR: https://git.openjdk.org/jdk/pull/24084 From duke at openjdk.org Wed Apr 9 13:24:35 2025 From: duke at openjdk.org (Robert Toyonaga) Date: Wed, 9 Apr 2025 13:24:35 GMT Subject: RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock [v3] In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 14:08:23 GMT, Stefan Karlsson wrote: >> Robert Toyonaga has updated the pull request incrementally with one additional commit since the last revision: >> >> exclude file mapping tests on AIX. > > src/hotspot/share/runtime/os.cpp line 2206: > >> 2204: // when it is actually committed. The opposite scenario is not guarded against. pd_commit_memory and >> 2205: // record_virtual_memory_commit do not happen atomically. We assume that there is some external synchronization >> 2206: // that prevents a region from being uncommitted before it is finished being committed. > > It's not a requirement, but you get kudos from me if you keep comments lines below 80 lines. I typically don't like code to be 80 lines, but comments tend to be nicer if they are. 
Ok I'll try to shorten these comments a bit ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2035360006 From duke at openjdk.org Wed Apr 9 13:43:01 2025 From: duke at openjdk.org (Robert Toyonaga) Date: Wed, 9 Apr 2025 13:43:01 GMT Subject: RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock [v5] In-Reply-To: References: Message-ID: > ### Update: > After some discussion it was decided it's not necessary to expand the lock scope for reserve/commit. Instead, we are opting to add comments explaining the reasons for locking and the conditions to avoid which could lead to races. Some of the new tests can be kept because they are general enough to be useful outside of this context. > > ### Summary: > This PR makes memory operations atomic with NMT accounting. > > ### The problem: > In memory related functions like `os::commit_memory` and `os::reserve_memory` the OS memory operations are currently done before acquiring the NMT mutex. And the virtual memory accounting is done later in `MemTracker`, after the lock has been acquired. Doing the memory operations outside of the lock scope can lead to races. > > 1.1 Thread_1 releases range_A. > 1.2 Thread_1 tells NMT "range_A has been released". > > 2.1 Thread_2 reserves (the now free) range_A. > 2.2 Thread_2 tells NMT "range_A is reserved". > > Since the sequence (1.1) (1.2) is not atomic, if Thread_2 begins operating after (1.1), we can have (1.1) (2.1) (2.2) (1.2). The OS sees two valid subsequent calls (release range_A, followed by map range_A). But NMT sees "reserve range_A", "release range_A" and is now out of sync with the OS. > > ### Solution: > Where memory operations such as reserve, commit, or release virtual memory happen, I've expanded the scope of `NmtVirtualMemoryLocker` to protect both the NMT accounting and the memory operation itself.
> > ### Other notes: > I also simplified this pattern found in many places: > > if (MemTracker::enabled()) { > MemTracker::NmtVirtualMemoryLocker nvml; > result = pd_some_operation(addr, bytes); > if (result != nullptr) { > MemTracker::record_some_operation(addr, bytes); > } > } else { > result = pd_unmap_memory(addr, bytes); > } > ``` > To: > > MemTracker::NmtVirtualMemoryLocker nvml; > result = pd_unmap_memory(addr, bytes); > MemTracker::record_some_operation(addr, bytes); > ``` > This is possible because `NmtVirtualMemoryLocker` now checks `MemTracker::enabled()`. `MemTracker::record_some_operation` already checks `MemTracker::enabled()` and checks against nullptr. This refactoring previously wasn't possible because `ThreadCritical` was used before https://github.com/openjdk/jdk/pull/22745 introduced `NmtVirtualMemoryLocker`. > > I considered moving the locking and NMT accounting down into platform specific code: Ex. lock around { munmap() + MemTracker::record }. The hope was that this would help reduce the size of the critical section... 
Robert Toyonaga has updated the pull request incrementally with one additional commit since the last revision: improve tests and comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24084/files - new: https://git.openjdk.org/jdk/pull/24084/files/813a1e49..7b7263b2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24084&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24084&range=03-04 Stats: 23 lines in 2 files changed: 1 ins; 8 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/24084.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24084/head:pull/24084 PR: https://git.openjdk.org/jdk/pull/24084 From jsikstro at openjdk.org Wed Apr 9 13:56:53 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Wed, 9 Apr 2025 13:56:53 GMT Subject: RFR: 8350441: ZGC: Overhaul Page Allocation Message-ID: > Note that any reference to pages from here on out refers to the concept of a heap region in ZGC, not pages in the operating system (OS), unless stated otherwise. # Background This PR addresses fragmentation by introducing a Mapped Cache that replaces the Page Cache in ZGC. The largest limitation of the Page Cache is that it is constrained by the abstraction of what a page is. The proposed Mapped Cache removes this limitation by decoupling memory from pages, allowing it to merge and split memory in ways that the Page Cache is not suited for. To facilitate the transition, much of the Page Allocator has been redesigned to work with the Mapped Cache. In addition to fighting fragmentation, the new approach improves NUMA-support and simplifies memory unmapping. Combined, these changes lay the foundation for even more improvements in ZGC, like replacing multi-mapped memory with anonymous memory. # Why a Mapped Cache? The main benefit of the Mapped Cache is that adjacent virtual memory ranges in the cache can be merged to create larger ranges, enabling larger allocation requests to succeed more easily.
Most notably, it allows allocations to succeed more often without "harvesting" smaller, discontiguous ranges. Harvesting negatively impacts both fragmentation and latency, as it requires remapping memory into a new contiguous virtual address range. Fragmentation becomes especially problematic in long-running programs and in environments with limited address space, where finding large contiguous regions can be difficult and may lead to premature Out Of Memory Errors (OOME). The Mapped Cache uses a self-balancing binary search tree to store memory ranges. Since the ranges are unused when inside the cache, the tree can use this memory to store metadata about itself, referred to as intrusive storage. This approach eliminates the need for dynamic memory allocation (e.g., malloc), which could otherwise introduce a latency overhead. # Fragmentation Currently, ZGC has multiple strategies for dealing with fragmentation. In some edge cases, these strategies are not as efficient as we would like. By addressing fragmentation differently with the Mapped Cache, ZGC is in a better position to avoid edge cases, which are bad even if they occur only once. This is especially impactful for programs running with a large heap. ## Virtual Memory Shuffling In addition to the Mapped Cache, we have made some adjustments in how ZGC deals with virtual memory. When harvesting memory, which needs to be remapped, new contiguous virtual memory must first be claimed. We have now added a feature in which the harvested memory can be re-used to improve the likelihood of finding a contiguous range. Additionally, we have re-designed the defragmentation policy so that Large pages are always defragmented upon being freed. When freed, they are broken down and remapped into lower address space, in the hopes of "filling holes" and creating more contiguous ranges. 
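The merge-and-split behaviour attributed to the Mapped Cache above can be sketched as follows. This is an illustration under stated assumptions, not ZGC's implementation: `RangeCache`, `insert`, `remove_contiguous` and `range_count` are hypothetical names, and a `std::map` keyed by start address is swapped in for the intrusive red-black tree the description mentions, so this version does allocate tree nodes dynamically.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical sketch, not ZGC code: a cache of unused address ranges
// that merges neighbours on insert and splits ranges on removal.
class RangeCache {
 public:
  // Insert [start, start + size), merging with adjacent cached ranges.
  void insert(std::uintptr_t start, std::size_t size) {
    auto next = _ranges.lower_bound(start);
    // Merge with the predecessor if it ends exactly where we start.
    if (next != _ranges.begin()) {
      auto prev = std::prev(next);
      if (prev->first + prev->second == start) {
        start = prev->first;
        size += prev->second;
        _ranges.erase(prev);  // leaves 'next' valid for std::map
      }
    }
    // Merge with the successor if it starts exactly where we end.
    if (next != _ranges.end() && start + size == next->first) {
      size += next->second;
      _ranges.erase(next);
    }
    _ranges[start] = size;
  }

  // Remove one contiguous range of 'size' bytes (first fit), splitting
  // a larger cached range if needed. Returns false if nothing fits.
  bool remove_contiguous(std::size_t size, std::uintptr_t* out_start) {
    for (auto it = _ranges.begin(); it != _ranges.end(); ++it) {
      if (it->second >= size) {
        const std::uintptr_t start = it->first;
        const std::size_t remaining = it->second - size;
        _ranges.erase(it);
        if (remaining > 0) {
          _ranges[start + size] = remaining;  // keep the tail cached
        }
        *out_start = start;
        return true;
      }
    }
    return false;
  }

  std::size_t range_count() const { return _ranges.size(); }

 private:
  std::map<std::uintptr_t, std::size_t> _ranges;  // start -> size
};
```

For example, with 2MB granules cached at offsets 0 and 4MB, a 4MB request cannot be served from a single range; once the granule at 2MB is returned to the cache, the three entries merge into one 6MB range and the request succeeds without gathering discontiguous pieces.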
# NUMA and Partitions In the current policy, ZGC interleaves memory across all NUMA nodes with a granularity of ZGranuleSize (2MB), which is the same size as a Small page. As a result, Small pages will end up on a single, preferably local, NUMA node, whilst larger allocations will (likely) end up on multiple NUMA nodes. In the new design, the policy is to prefer allocating *all* allocation sizes to the local NUMA node whenever possible. As an effect, ZGC may be able to extract better performance from NUMA systems. To support local NUMA allocations, the Page Allocator, and in turn the Java heap, has been split up into what we refer to as Partitions. A partition keeps track of its own heap size and Mapped Cache, allowing it to only handle memory that is associated with its own share of the heap. The number of partitions is currently the same as the number of NUMA nodes. On non-NUMA systems, only a single partition is kept track of. The introduction of partitions also establishes a foundation for more fine-grained control over the heap, paving the way for future enhancements, both NUMA possibilities and new features, such as Thread-Local GC. # Defragmentation (Unmapping Memory) Up until now, ZGC has unmapped memory asynchronously in a separate thread. The benefit of this is that other threads do not need to take a latency hit when unmapping memory. The main dependency on asynchronous unmapping is when harvesting, especially from a mutator thread, where synchronous unmapping could lead to unwanted latency. With the introduction of the Mapped Cache, and by moving defragmentation away from mutator threads to the GC, asynchronous unmapping is no longer necessary to meet our latency goals. Instead, memory is now unmapped synchronously. The number of times memory is defragmented for page allocations has been reduced significantly. However, memory for Small pages never needs to be defragmented at all. 
For Large pages, memory defragmentation has little effect on the total latency, as they are costly to allocate anyways. For Medium pages, we have plans for future enhancements where memory is defragmented even less, or not at all. For clarity: with the removal of asynchronous unmapping, we have removed the ZUnmapper thread and ZUnmap JFR event. # Multi-Mapped Memory Asynchronous unmapping has so far been possible because ZGC is backed by shared memory (on Linux), which allows memory to be multi-mapped. This is an artifact from non-generational ZGC, which used multi-mapping in its core design (See [this](https://wiki.openjdk.org/display/zgc/Pointer+Metadata+using+Multi-Mapped+memory) resource for more info). A goal we have in ZGC is to move from shared memory to anonymous memory. There are multiple benefits with anonymous memory, one of them being easier configuration for Transparent Huge Pages (OS pages). Anonymous memory doesn't support multi-mapped memory, and would be blocked by the asynchronous unmapping feature. However, with the removal of asynchronous unmapping, we are now better prepared for transitioning to anonymous memory. # Additional Notes This RFE comes with our own implementation of a red-black tree for the Mapped Cache. Another red-black tree was recently introduced by C. Norrbin in [JDK-8345314](https://bugs.openjdk.org/browse/JDK-8345314) (and enhanced in [JDK-8349211](https://bugs.openjdk.org/browse/JDK-8349211)). Our goal is to initially integrate with our implementation, but remove our implementation in favor of Norrbin's tree in a future RFE. The reason we have our own tree implementation is because Norrbin's tree was not finished during the time we were developing and testing this RFE. Some new additions have been made to keep the current functionality in the Serviceability Agent (SA). 
# Testing * Oracle's tiers 1-8 * We have added a small set of new tests, both gtests and jtreg tests, to test new functionality # Performance * Improvements in tail latency in SPECjbb2015. * Improvements when using small OS pages in combination with NUMA. * Small increase in the time it takes to run a GC. This is because some work has been moved from mutator threads to only be done in GC threads. This should not affect the total run-time of a program as the total work remains the same, but mutator latency is improved. * Other suitable benchmarks show no significant improvements or regressions. ------------- Commit messages: - Whitespace fix in zunittest.hpp - Copyright years - 8350441: ZGC: Overhaul Page Allocation Changes: https://git.openjdk.org/jdk/pull/24547/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24547&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8350441 Stats: 12052 lines in 118 files changed: 7936 ins; 3218 del; 898 mod Patch: https://git.openjdk.org/jdk/pull/24547.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24547/head:pull/24547 PR: https://git.openjdk.org/jdk/pull/24547 From kvn at openjdk.org Wed Apr 9 14:01:44 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 9 Apr 2025 14:01:44 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output [v4] In-Reply-To: References: Message-ID: <9g5m1aRjX_bOUqojs3da5pKPapimb1lnYxqdA92l8jk=.03e0aa78-bc7b-419a-814a-3104f300bf13@github.com> On Wed, 9 Apr 2025 02:18:41 GMT, Ioi Lam wrote: >> Since [JEP 483: Ahead-of-Time Class Loading & Linking](https://openjdk.org/jeps/483), VM options such as `-XX:AOTCache `are implemented as aliases of "classical" CDS options such as `-XX:SharedArchiveFile`. >> >> In anticipation of the [JEP: Ahead-of-time Command Line Ergonomics](https://bugs.openjdk.org/browse/JDK-8350022), we should refactor the code that deals with the AOT options. 
Specifically, as we expect the JVM to be able to load from an "input AOT cache" and write to an "output AOT cache", we should clearly identify the input and output caches in separate APIs: >> >> >> const char* CDSConfig::input_static_archive_path(); >> const char* CDSConfig::input_dynamic_archive_path(); >> const char* CDSConfig::output_archive_path(); >> >> >> This PR also cleans up the code by: >> - renaming a few function to reflect what they actually do >> - moving more "config" management code into cdsConfig.cpp >> >> There's also a behavioral bug fix: before this PR, `-XX:AOTCache` was handled by the `ergo_init_classic_archive_paths()` function, which allows two files to be specified. E.g., `java -XX:AOTCache=static.jsa:dynamic.jsa`. That's because `-XX:AOTCache` was implemented as an alias of `-XX:SharedArchiveFile`, and the latter allows this usage. >> >> However, this behavior is not specified in JEP 483. Allowing two files in -XX:AOTCache will cause unnecessary complexity when we implement [JDK-8353598: Allow AOT cache to be used in training run](https://bugs.openjdk.org/browse/JDK-8353598). Therefore, I added new test cases to disallow the use of two files. This also means that we don't need to modify the already over-complicated `ergo_init_classic_archive_paths()` for the AOT use cases > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains nine commits: > > - Merge branch 'master' into 8353597-refactor-aot-cache-input-output > - @lmesnik comments > - more clean up > - Minimized changes in ergo_init_classic_archive_paths() > - Clean up CDS input/output path handling > - Refactored CollectClassesForLinking for simplification > - Merge branch 'master' into 8353014-exclude-tooling-classes-from-aot-cache > - Reverted some fixes in systemDictionaryShared.cpp that causes test failures > - 8353014: Exclude AOT tooling classes from AOT cache Re-approved. 
------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24401#pullrequestreview-2753597060 From mbaesken at openjdk.org Wed Apr 9 14:20:40 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 9 Apr 2025 14:20:40 GMT Subject: RFR: 8349638: Build libjdwp with SIZE optimization In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 12:33:26 GMT, Magnus Ihse Bursie wrote: > Making hotspot -Os would certainly never make any sense, for example We even have this (jvm feature opt-size) https://github.com/openjdk/jdk/blob/master/make/hotspot/lib/JvmFeatures.gmk#L197 but it is not widely used afaik. I think in the past it was used for certain small deployments (and a few very performance sensitive files , see the list from the link, still use 'speed optimization' ). ------------- PR Comment: https://git.openjdk.org/jdk/pull/23563#issuecomment-2789899406 From kevinw at openjdk.org Wed Apr 9 14:51:48 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 9 Apr 2025 14:51:48 GMT Subject: RFR: 8353727: HeapDumpPath doesn't expand %p [v4] In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 11:04:26 GMT, Kevin Walls wrote: >> This is a long-standing oversight: HeapDumpPath does not recognise %p for pid expansion. >> The default filename uses a pid (e.g. java_pid1676937.hprof) but HeapDumpPath does not. >> It has always done a manual "root plus pid plus extension" on the default filename only, and >> should move to using Argument::copy_expand_pid() like we do with other such filenames. >> >> >> We also assumed the default filename is not a directory (which is very very likely, but doesn't have to be true). > > Kevin Walls has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains five additional commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into HeapDumpPath_pid > - test feedback > - length checking update > - Chris feedback > - 8353727: HeapDumpPath doesn't expand %p Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24482#issuecomment-2789998024 From kevinw at openjdk.org Wed Apr 9 14:51:48 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 9 Apr 2025 14:51:48 GMT Subject: Integrated: 8353727: HeapDumpPath doesn't expand %p In-Reply-To: References: Message-ID: On Mon, 7 Apr 2025 09:05:34 GMT, Kevin Walls wrote: > This is a long-standing oversight: HeapDumpPath does not recognise %p for pid expansion. > The default filename uses a pid (e.g. java_pid1676937.hprof) but HeapDumpPath does not. > It has always done a manual "root plus pid plus extension" on the default filename only, and > should move to using Argument::copy_expand_pid() like we do with other such filenames. > > > We also assumed the default filename is not a directory (which is very very likely, but doesn't have to be true). This pull request has now been integrated. Changeset: 7a7b9ed7 Author: Kevin Walls URL: https://git.openjdk.org/jdk/commit/7a7b9ed7fe4a10bca155b0877c3e731f9d343b92 Stats: 71 lines in 2 files changed: 10 ins; 37 del; 24 mod 8353727: HeapDumpPath doesn't expand %p Reviewed-by: stuefe, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/24482 From iklam at openjdk.org Wed Apr 9 15:03:44 2025 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 9 Apr 2025 15:03:44 GMT Subject: RFR: 8353597: Refactor handling VM options for AOT cache input and output [v2] In-Reply-To: References: Message-ID: On Thu, 3 Apr 2025 22:02:01 GMT, Vladimir Kozlov wrote: >>> @iklam one annoying thing in current ergonomic setting for AOTCode flags in mainline is checking which phase we are executing. 
We agreed before that we should only save/load AOT code when `AOTClassLinking` is on because AOT code needs classes to be preloaded. >>> >>> I have to do next checks to enable AOTCode in `CDSConfig::check_vm_args_consistency()`: >>> >>> ``` >>> if (AOTClassLinking && is_using_archive() && !is_dumping_archive() && !FLAG_IS_DEFAULT(AOTCache)) { >>> FLAG_SET_ERGO_IF_DEFAULT(LoadAOTCode, true); >>> ... >>> if (AOTClassLinking && is_dumping_final_static_archive()) { >>> FLAG_SET_ERGO_IF_DEFAULT(StoreAOTCode, true); >>> ``` >>> >>> First, I am not sure these conditions are correct. >>> >>> Second, it would be nice to have simple checks instead: `is_dumping_aot_archive()` and `is_using_aot_archive()`. >>> >>> May be also consider it is error if both conditions are true (we don't support updating archive yet). >> >> There are a lot of dependencies between different AOT capabilities, and it's hard to control that using global variables. At the point of `CDSConfig::check_vm_args_consistency()`, we don't have complete knowledge whether the AOT cache exists, or whether the cache contains AOT code, or whether the GC compressed oops settings are compatible with the AOT code. >> >> In the handling of such "AOT capability flags", I have been using the following pattern: >> >> In `CDSConfig::check_vm_args_consistency()` we update the default values of the flags according to their dependencies on other flags. E.g., by specifying `-XX:AOTMode=create`, `AOTClassLinking` and `AOTInvokeDynamicLinking` are enabled by default. >> >> >> if (!FLAG_IS_DEFAULT(AOTMode)) { >> // Using any form of the new AOTMode switch enables enhanced optimizations. >> FLAG_SET_ERGO_IF_DEFAULT(AOTClassLinking, true); >> } >> >> if (AOTClassLinking) { >> // If AOTClassLinking is specified, enable all AOT optimizations by default. >> FLAG_SET_ERGO_IF_DEFAULT(AOTInvokeDynamicLinking, true); >> } else { >> // AOTInvokeDynamicLinking depends on AOTClassLinking. 
>> FLAG_SET_ERGO(AOTInvokeDynamicLinking, false); >> } >> >> >> However, the values of these flags are just advisory. Even if a flag is enabled, the underlying capability may be disabled. For example, `AOTClassLinking` requires the ability of dumping heap objects, which is not available if ZGC is used. >> >> Because the dependencies are complex, it's difficult to resolve them statically and set a globa... > > Thank you @iklam for explanation. I can do final adjustment to `Store|LoadAOTCode` flags values in `StoreAOTCode::initialize()` which is called from `initialize_shared_spaces()`: > > > MetaspaceShared::initialize_shared_spaces() { > ... > static_mapinfo->patch_heap_embedded_pointers(); > ArchiveHeapLoader::finish_initialization(); > Universe::load_archived_object_instances(); > + AOTCodeCache::initialize(); > > > The question: at this place are all CDS AOT flags are final (flags compatibility and cache presence are verified)? > > Note, `Store|LoadAOTCode` flags are diagnostic and disabled by default. I need to set them to `true` somewhere. Thanks @vnkozlov @ashu-mehra @DanHeidinga @lmesnik for the review ------------- PR Comment: https://git.openjdk.org/jdk/pull/24401#issuecomment-2790037592 From iklam at openjdk.org Wed Apr 9 15:07:14 2025 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 9 Apr 2025 15:07:14 GMT Subject: Integrated: 8353597: Refactor handling VM options for AOT cache input and output In-Reply-To: References: Message-ID: <5DDGpghIYxMiMRk8aXwnHqcBgepkaDIZJT0gxtje8JI=.0aad39c1-5513-4638-99f2-8a96f447595c@github.com> On Thu, 3 Apr 2025 04:00:59 GMT, Ioi Lam wrote: > Since [JEP 483: Ahead-of-Time Class Loading & Linking](https://openjdk.org/jeps/483), VM options such as `-XX:AOTCache `are implemented as aliases of "classical" CDS options such as `-XX:SharedArchiveFile`. > > In anticipation of the [JEP: Ahead-of-time Command Line Ergonomics](https://bugs.openjdk.org/browse/JDK-8350022), we should refactor the code that deals with the AOT options. 
Specifically, as we expect the JVM to be able to load from an "input AOT cache" and write to an "output AOT cache", we should clearly identify the input and output caches in separate APIs: > > > const char* CDSConfig::input_static_archive_path(); > const char* CDSConfig::input_dynamic_archive_path(); > const char* CDSConfig::output_archive_path(); > > > This PR also cleans up the code by: > - renaming a few function to reflect what they actually do > - moving more "config" management code into cdsConfig.cpp > > There's also a behavioral bug fix: before this PR, `-XX:AOTCache` was handled by the `ergo_init_classic_archive_paths()` function, which allows two files to be specified. E.g., `java -XX:AOTCache=static.jsa:dynamic.jsa`. That's because `-XX:AOTCache` was implemented as an alias of `-XX:SharedArchiveFile`, and the latter allows this usage. > > However, this behavior is not specified in JEP 483. Allowing two files in -XX:AOTCache will cause unnecessary complexity when we implement [JDK-8353598: Allow AOT cache to be used in training run](https://bugs.openjdk.org/browse/JDK-8353598). Therefore, I added new test cases to disallow the use of two files. This also means that we don't need to modify the already over-complicated `ergo_init_classic_archive_paths()` for the AOT use cases This pull request has now been integrated. 
Changeset: 567c6885 Author: Ioi Lam URL: https://git.openjdk.org/jdk/commit/567c6885a377e5641deef9cd3498f79c5346cd6a Stats: 309 lines in 15 files changed: 161 ins; 55 del; 93 mod 8353597: Refactor handling VM options for AOT cache input and output Reviewed-by: kvn, asmehra ------------- PR: https://git.openjdk.org/jdk/pull/24401 From ihse at openjdk.org Wed Apr 9 15:09:58 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 9 Apr 2025 15:09:58 GMT Subject: RFR: 8301991: Convert l10n properties resource bundles to UTF-8 native [v6] In-Reply-To: References: <0MB7FLFNfaGEWssr9X54UJ_iZNFWBJkxQ1yusP7fsuY=.3f9f3de5-fe84-48e6-9449-626cac42da0b@github.com> Message-ID: <_YOUyzMbSEXFduCKVgyis37kwTlGSjBbP8VlFu3xQpU=.9b668e2a-8f91-476d-8914-13dc33a0b9e5@github.com> On Thu, 11 May 2023 20:21:57 GMT, Justin Lu wrote: >> This PR converts Unicode sequences to UTF-8 native in .properties file. (Excluding the Unicode space and tab sequence). The conversion was done using native2ascii. >> >> In addition, the build logic is adjusted to support reading in the .properties files as UTF-8 during the conversion from .properties file to .java ListResourceBundle file. > > Justin Lu has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: > > - Convert the merged master changes to UTF-8 > - Merge master and fix conflicts > - Close streams when finished loading into props > - Adjust CF test to read in with UTF-8 to fix failing test > - Reconvert CS.properties to UTF-8 > - Revert all changes to CurrencySymbols.properties > - Bug6204853 should not be converted > - Copyright year for CompileProperties > - Redo translation for CS.properties > - Spot convert CurrencySymbols.properties > - ... 
and 6 more: https://git.openjdk.org/jdk/compare/4386d42d...f15b373a src/java.xml/share/classes/com/sun/org/apache/xml/internal/serializer/Encodings.properties line 22: > 20: # Peter Smolik > 21: Cp1250 WINDOWS-1250 0x00FF > 22: # Patch attributed to havardw at underdusken.no (H?vard Wigtil) This does not seem to have been a correct conversion. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12726#discussion_r2035582242 From duke at openjdk.org Wed Apr 9 16:23:45 2025 From: duke at openjdk.org (Robert Toyonaga) Date: Wed, 9 Apr 2025 16:23:45 GMT Subject: RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock In-Reply-To: References: Message-ID: On Tue, 1 Apr 2025 15:29:55 GMT, Stefan Karlsson wrote: >> OK should I update this PR to do the following things: >> - Add comments explaining the asymmetrical locking and warning against patterns that lead to races >> - swapping the order of `NmtVirtualMemoryLocker` and release/uncommit >> - Fail fatally if release/uncommit does not complete. >> >> Or does it make more sense to do that in a different issue/PR? >> >> Also, do we want to keep the new tests and the refactorings (see below)? >> >> if (MemTracker::enabled()) { >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_some_operation(addr, bytes); >> if (result != nullptr) { >> MemTracker::record_some_operation(addr, bytes); >> } >> } else { >> result = pd_unmap_memory(addr, bytes); >> } >> >> To: >> >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_unmap_memory(addr, bytes); >> MemTracker::record_some_operation(addr, bytes); > >> OK should I update this PR to do the following things: >> >> * Add comments explaining the asymmetrical locking and warning against patterns that lead to races > > Sounds like a good idea. > >> >> * swapping the order of `NmtVirtualMemoryLocker` and release/uncommit > > I wonder if this should be done as new RFE after the change below. 
It might need a bit of investigation to make sure that the reasoning around this is correct. > >> >> * Fail fatally if release/uncommit does not complete. > > I think this would be a good, separate RFE to be done before we try to swap the order. > >> >> >> Or does it make more sense to do that in a different issue/PR? >> >> Also, do we want to keep the new tests and the refactorings (see below)? >> >> ``` >> if (MemTracker::enabled()) { >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_some_operation(addr, bytes); >> if (result != nullptr) { >> MemTracker::record_some_operation(addr, bytes); >> } >> } else { >> result = pd_unmap_memory(addr, bytes); >> } >> ``` >> >> To: >> >> ``` >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_unmap_memory(addr, bytes); >> MemTracker::record_some_operation(addr, bytes); >> ``` > > My thinking is that after you done (2) above, then you will not need to expose the NMT lock to this level. The code would be: > > MemTracker::record_some_operation(addr, bytes); // Lock confined inside this > > pd_unmap_memory(addr, bytes); > > > So, I would wait with this cleanup until we know more about (2). Thank you @stefank for the feedback. I've applied your suggestions. @tstuefe, when you have time, can you please have another look at this? Based on the discussion above, I've reverted the changes to the locking scope in favor of new comments explaining the asymmetrical locking and warning against patterns that lead to races. The new tests that are still relevant are kept. 
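For readers following the locking discussion, the "lock, operate, record" shape quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: `tracker_lock`, `platform_release`, `release_and_record`, `record_reserve` and `tracked_count` are hypothetical names, not HotSpot or NMT APIs, and a `std::mutex` stands in for `NmtVirtualMemoryLocker`.

```cpp
#include <cstddef>
#include <map>
#include <mutex>

// Hypothetical stand-ins; none of these names exist in HotSpot.
static std::mutex tracker_lock;                      // plays the role of the NMT lock
static std::map<const void*, std::size_t> tracked;   // address -> tracked size

// Pretend platform call; it always succeeds in this sketch.
static bool platform_release(const void* /*addr*/, std::size_t /*bytes*/) {
  return true;
}

// Record a reservation under the lock.
void record_reserve(const void* addr, std::size_t bytes) {
  std::lock_guard<std::mutex> guard(tracker_lock);
  tracked[addr] = bytes;
}

// Do the platform operation and the bookkeeping under one lock, so no other
// thread can record a new mapping for the same range between the two steps.
bool release_and_record(const void* addr, std::size_t bytes) {
  std::lock_guard<std::mutex> guard(tracker_lock);
  bool ok = platform_release(addr, bytes);
  if (ok) {
    tracked.erase(addr);
  }
  return ok;
}

// Number of ranges currently tracked.
std::size_t tracked_count() {
  std::lock_guard<std::mutex> guard(tracker_lock);
  return tracked.size();
}
```

Holding the lock across both steps is the simple option; the alternative raised in the thread (record first, then release, with the lock confined to the recording call) would only be safe once a failed release is treated as fatal.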
------------- PR Comment: https://git.openjdk.org/jdk/pull/24084#issuecomment-2790269355 From jiangli at openjdk.org Wed Apr 9 17:14:27 2025 From: jiangli at openjdk.org (Jiangli Zhou) Date: Wed, 9 Apr 2025 17:14:27 GMT Subject: RFR: 8353938: hotspot/jtreg/serviceability/dcmd/jvmti/LoadAgentDcmdTest.java fails on static JDK [v3] In-Reply-To: References: Message-ID: > Please review this PR that changes `LoadAgentDcmdTest.getLibInstrumentPath()` to not locate `libinstrument` shared library if running on static JDK, instead just returns "libinstrument." directly. Both test case #1 and #2 in `LoadAgentDcmdTest.run()` run ok on static JDK with the change: > > Commands: > Test case #1: `JVMTI.agent_load libinstrument.so agent.jar` > Test case #2: `JVMTI.agent_load libinstrument.so "agent.jar=foo=bar"` > > Additional notes/considerations: > > 1. I notice [JDKToolFinder.getJDKTool()](https://github.com/openjdk/jdk/blob/80ff7b9c9406c7845ecb3bc40910e92ccdd23ff2/test/lib/jdk/test/lib/JDKToolFinder.java#L41) is used by the test to find the `jcmd` tool. `JDKToolFinder.getJDKTool()` looks for the requested tool in both `test.jdk` and `compile.jdk`. When running the jtreg test on static JDK, it's able to locate the `jcmd` from `compile.jdk` even though the static JDK binary (`test.jdk`) does not provide the tool. For tools used by jtreg tests at runtime, how about change to always do that? > > 2. From https://docs.oracle.com/en/java/javase/21/docs/specs/man/jcmd.html: > > JVMTI.agent_load [arguments] > Loads JVMTI native agent. > Impact: Low > arguments: > library path: Absolute path of the JVMTI agent to load. (STRING, no default value) > agent option: (Optional) Option string to pass the agent. (STRING, no default value) > > The command spec requires the absolute path of the JVMTI agent for the `library path` argument. On static JDK, if the agent library is built-in (statically linked), passing the shared library name works and allows the VM to find the built-in agent. 
There would be no need to specify the absolute path. Please see https://bugs.openjdk.org/browse/JDK-8353938?focusedId=14767737&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14767737 for more details. Jiangli Zhou has updated the pull request incrementally with one additional commit since the last revision: Address @AlanBateman's comment: - Updated copyright header ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24497/files - new: https://git.openjdk.org/jdk/pull/24497/files/3b4ef50c..a0d3f627 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24497&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24497&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24497.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24497/head:pull/24497 PR: https://git.openjdk.org/jdk/pull/24497 From jiangli at openjdk.org Wed Apr 9 17:16:31 2025 From: jiangli at openjdk.org (Jiangli Zhou) Date: Wed, 9 Apr 2025 17:16:31 GMT Subject: RFR: 8353938: hotspot/jtreg/serviceability/dcmd/jvmti/LoadAgentDcmdTest.java fails on static JDK [v2] In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 11:00:40 GMT, Alan Bateman wrote: > I assume you've bump the copyright header before integrating. Updated, thanks! @AlanBateman Please help re-approve the PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24497#issuecomment-2790427078 From alanb at openjdk.org Wed Apr 9 17:40:32 2025 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 9 Apr 2025 17:40:32 GMT Subject: RFR: 8353938: hotspot/jtreg/serviceability/dcmd/jvmti/LoadAgentDcmdTest.java fails on static JDK [v3] In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 17:14:27 GMT, Jiangli Zhou wrote: >> Please review this PR that changes `LoadAgentDcmdTest.getLibInstrumentPath()` to not locate `libinstrument` shared library if running on static JDK, instead just returns "libinstrument." directly.
Both test case #1 and #2 in `LoadAgentDcmdTest.run()` run ok on static JDK with the change: >> >> Commands: >> Test case #1: `JVMTI.agent_load libinstrument.so agent.jar` >> Test case #2: `JVMTI.agent_load libinstrument.so "agent.jar=foo=bar"` >> >> Additional notes/considerations: >> >> 1. I notice [JDKToolFinder.getJDKTool()](https://github.com/openjdk/jdk/blob/80ff7b9c9406c7845ecb3bc40910e92ccdd23ff2/test/lib/jdk/test/lib/JDKToolFinder.java#L41) is used by the test to find the `jcmd` tool. `JDKToolFinder.getJDKTool()` looks for the requested tool in both `test.jdk` and `compile.jdk`. When running the jtreg test on static JDK, it's able to locate the `jcmd` from `compile.jdk` even though the static JDK binary (`test.jdk`) does not provide the tool. For tools used by jtreg tests at runtime, how about change to always do that? >> >> 2. From https://docs.oracle.com/en/java/javase/21/docs/specs/man/jcmd.html: >> >> JVMTI.agent_load [arguments] >> Loads JVMTI native agent. >> Impact: Low >> arguments: >> library path: Absolute path of the JVMTI agent to load. (STRING, no default value) >> agent option: (Optional) Option string to pass the agent. (STRING, no default value) >> >> The command spec requires the absolute path of the JVMTI agent for the `library path` argument. On static JDK, if the agent library is built-in (statically linked), passing the shared library name works and allows the VM to find the built-in agent. There would be no need to specify the absolute path. Please see https://bugs.openjdk.org/browse/JDK-8353938?focusedId=14767737&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14767737 for more details. > > Jiangli Zhou has updated the pull request incrementally with one additional commit since the last revision: > > Address @AlanBateman's comment: > - Updated copyright header Marked as reviewed by alanb (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/24497#pullrequestreview-2754244716 From jiangli at openjdk.org Wed Apr 9 17:50:42 2025 From: jiangli at openjdk.org (Jiangli Zhou) Date: Wed, 9 Apr 2025 17:50:42 GMT Subject: Integrated: 8353938: hotspot/jtreg/serviceability/dcmd/jvmti/LoadAgentDcmdTest.java fails on static JDK In-Reply-To: References: Message-ID: On Tue, 8 Apr 2025 02:25:58 GMT, Jiangli Zhou wrote: > Please review this PR that changes `LoadAgentDcmdTest.getLibInstrumentPath()` to not locate `libinstrument` shared library if running on static JDK, instead just returns "libinstrument." directly. Both test case #1 and #2 in `LoadAgentDcmdTest.run()` run ok on static JDK with the change: > > Commands: > Test case #1: `JVMTI.agent_load libinstrument.so agent.jar` > Test case #2: `JVMTI.agent_load libinstrument.so "agent.jar=foo=bar"` > > Additional notes/considerations: > > 1. I notice [JDKToolFinder.getJDKTool()](https://github.com/openjdk/jdk/blob/80ff7b9c9406c7845ecb3bc40910e92ccdd23ff2/test/lib/jdk/test/lib/JDKToolFinder.java#L41) is used by the test to find the `jcmd` tool. `JDKToolFinder.getJDKTool()` looks for the requested tool in both `test.jdk` and `compile.jdk`. When running the jtreg test on static JDK, it's able to locate the `jcmd` from `compile.jdk` even though the static JDK binary (`test.jdk`) does not provide the tool. For tools used by jtreg tests at runtime, how about change to always do that? > > 2. From https://docs.oracle.com/en/java/javase/21/docs/specs/man/jcmd.html: > > JVMTI.agent_load [arguments] > Loads JVMTI native agent. > Impact: Low > arguments: > library path: Absolute path of the JVMTI agent to load. (STRING, no default value) > agent option: (Optional) Option string to pass the agent. (STRING, no default value) > > The command spec requires the absolute path of the JVMTI agent for the `library path` argument. 
On static JDK, if the agent library is built-in (statically linked), passing the shared library name works and allows the VM to find the built-in agent. There would be no need to specify the absolute path. Please see https://bugs.openjdk.org/browse/JDK-8353938?focusedId=14767737&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14767737 for more details. This pull request has now been integrated. Changeset: faacbd96 Author: Jiangli Zhou URL: https://git.openjdk.org/jdk/commit/faacbd96a3dc1116f3af590439585844ff8048a1 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod 8353938: hotspot/jtreg/serviceability/dcmd/jvmti/LoadAgentDcmdTest.java fails on static JDK Reviewed-by: alanb ------------- PR: https://git.openjdk.org/jdk/pull/24497 From coleenp at openjdk.org Wed Apr 9 19:24:25 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 9 Apr 2025 19:24:25 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 08:14:04 GMT, Serguei Spitsyn wrote: > As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a `ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is safer to disable all JVMTI events during debugger-related upcalls originated by JVMTI. > The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. > Some specific implementation details can be added to the first PR comment.
> > Testing: > - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): > - the assert described above is fired if the fix of JDK-8352088 is removed > - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed > - Ran mach5 tiers 1-6 So this is another case where you have to ignore JVMTI event like in VTMS transitions? It looks like a good way to fix this in general. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24539#pullrequestreview-2754548627 From lmesnik at openjdk.org Wed Apr 9 19:41:28 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 9 Apr 2025 19:41:28 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 08:14:04 GMT, Serguei Spitsyn wrote: > As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. > The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. > Some specific implementation details can be added to the first PR comment. 
> > Testing: > - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): > - the assert described above is fired if the fix of JDK-8352088 is removed > - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed > - Ran mach5 tiers 1-6 Looks good. ------------- Marked as reviewed by lmesnik (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24539#pullrequestreview-2754601842 From sgehwolf at openjdk.org Wed Apr 9 19:53:36 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Wed, 9 Apr 2025 19:53:36 GMT Subject: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v17] In-Reply-To: References: Message-ID: <8J81yHp6pB8F9BtQfyXPziqyCCrQP5NSasjiu5HzGWg=.db593ac0-7259-4cda-832d-1e872114db44@github.com> On Tue, 1 Apr 2025 14:59:37 GMT, Severin Gehwolf wrote: >> Please review this fix for cgroups-based metrics reporting in the `jdk.internal.platform` package. This fix is supposed to address wrong reporting of certain limits if the limits aren't set at the leaf nodes. >> >> For example, on cg v2, the memory limit interface file is `memory.max`. Consider a cgroup path of `/a/b/c/d`. The current code only reports the limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, some systems - like a systemd slice - sets those limits further up the hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report the actual limit value (e.g. `1048576000`). >> >> This patch addresses this issue by: >> >> 1. Refactoring the interface lookup code to relevant controllers for cpu/memory. The CgroupSubsystem classes then delegate to those for the lookup. This facilitates having an API for the lookup of an updated limit in step 2. >> 2. 
Walking the full hierarchy of the cgroup path (if any), looking for a lower limit than at the leaf. Note that it's not possible to raise the limit set at a path closer to the root via the interface file at a further-to-the-leaf-level. The odd case out seems to be `max` values on some systems (which seems to be the default value). >> >> As an optimization this hierarchy walk is skipped on containerized systems (like K8S), where the limits are set in interface files at the leaf nodes of the hierarchy. Therefore there should be no change on those systems. >> >> This patch depends on the Hotspot change implementing the same for the JVM so that `Metrics.isContainerized()` works correctly on affected systems where `-XshowSettings:system` currently reports `System not containerized` due to the missing JVM fix. A test framework for such hierarchical systems has been added in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This patch adds a test using that framework among some simpler unit tests. >> >> Thoughts? >> >> Testing: >> >> - [x] GHA >> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems >> - [x] Some manual testing using systemd slices > > Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 44 commits: > > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - JDK-8350103 > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - Fix missing imports > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - ... and 34 more: https://git.openjdk.org/jdk/compare/6801eb87...32960cd6 Keep open, please. 
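The hierarchy walk described in step 2 can be sketched as below. The actual patch is Java code in `jdk.internal.platform`; this is a C++ rendering only for consistency with the HotSpot snippets elsewhere in the thread, and it makes several assumptions: `LimitTable`, `read_limit`, `effective_limit` and `kUnlimited` are hypothetical names, the interface files are modelled as an in-memory map rather than read from disk, and `max` is represented by the largest 64-bit value.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <map>
#include <string>

// "max" (unlimited) is modelled as the largest representable value.
constexpr std::uint64_t kUnlimited = std::numeric_limits<std::uint64_t>::max();

// Hypothetical stand-in for the <path>/memory.max interface files.
using LimitTable = std::map<std::string, std::uint64_t>;

// A path without an entry is treated as unlimited.
std::uint64_t read_limit(const LimitTable& table, const std::string& path) {
  auto it = table.find(path);
  return it == table.end() ? kUnlimited : it->second;
}

// Walk from the leaf towards the root and return the lowest limit seen.
std::uint64_t effective_limit(const LimitTable& table, std::string path) {
  std::uint64_t lowest = kUnlimited;
  while (!path.empty()) {
    std::uint64_t value = read_limit(table, path);
    if (value < lowest) {
      lowest = value;
    }
    std::size_t slash = path.rfind('/');
    if (slash == std::string::npos || slash == 0) {
      break;  // reached the top-level component
    }
    path.erase(slash);  // drop the last path component
  }
  return lowest;
}
```

With `/a/b/c` limited to `1048576000` and `/a/b/c/d` left at `max`, a lookup for `/a/b/c/d` yields the limit set one level up, which is the systemd slice case described above.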
------------- PR Comment: https://git.openjdk.org/jdk/pull/20280#issuecomment-2790835637 From amenkov at openjdk.org Wed Apr 9 20:18:33 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 9 Apr 2025 20:18:33 GMT Subject: RFR: 8353485: Jcms should allow to specify streaming_output mode In-Reply-To: References: Message-ID: On Mon, 7 Apr 2025 17:59:24 GMT, Alex Menkov wrote: > The fix adds `--streaming_output` jcmd option to manage attach command streaming output. > > Testing: tier1..tier4,hs-tier5-svc Streaming output for the attach client can be set with the "jdk.attach.allowStreamingOutput" property, passing the value via launcher VM args ("-J"). Closing the issue. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24494#issuecomment-2790896487 From amenkov at openjdk.org Wed Apr 9 20:18:33 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 9 Apr 2025 20:18:33 GMT Subject: Withdrawn: 8353485: Jcms should allow to specify streaming_output mode In-Reply-To: References: Message-ID: <_7PY4MvTI4Dn20hFlIvafw7ioRzjYOIcHpeWluuew2I=.220333cd-f3ea-4273-a3b3-6329cf47acbc@github.com> On Mon, 7 Apr 2025 17:59:24 GMT, Alex Menkov wrote: > The fix adds `--streaming_output` jcmd option to manage attach command streaming output. > > Testing: tier1..tier4,hs-tier5-svc This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/24494 From cjplummer at openjdk.org Wed Apr 9 20:28:33 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 9 Apr 2025 20:28:33 GMT Subject: RFR: 8349638: Build libjdwp with SIZE optimization In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 14:17:25 GMT, Matthias Baesken wrote: > Making hotspot `-Os` would certainly never make any sense, for example.) When the minimalVM was first created, there was quite a bit of benchmarking done to decide which hotspot files to build with -Os and which to build with -O3.
I'm not sure if this split survived the new build system, but for those wanting a truly minimal footprint with reasonable performance sacrifices, it still makes sense to build all or most of hotspot with -Os. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23563#issuecomment-2790923666 From cjplummer at openjdk.org Wed Apr 9 20:37:33 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 9 Apr 2025 20:37:33 GMT Subject: RFR: 8349638: Build libjdwp with SIZE optimization In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 10:41:34 GMT, Matthias Baesken wrote: > Wait, `LOW` is `-O2`? I thought it was like no optimization at all. I'm sooo confused with these levels. So maybe going from `LOW` to `SIZE` will actually lose more optimization than I thought. _sigh_ -Os is the same as -O2 minus some optimizations that increase footprint. https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html -O3 mostly adds more serious footprint-increasing optimizations like loop unrolling. I've seen a number of warnings that suggest in some cases the larger code can reduce performance, so you may be best off with -O2 or -Os for best performance. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23563#issuecomment-2790938638 From cjplummer at openjdk.org Wed Apr 9 20:45:32 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 9 Apr 2025 20:45:32 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 08:14:04 GMT, Serguei Spitsyn wrote: > As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a `ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`.
However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. > The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. > Some specific implementation details can be added to the first PR comment. > > Testing: > - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): > - the assert described above is fired if the fix of JDK-8352088 is removed > - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed > - Ran mach5 tiers 1-6 src/hotspot/share/prims/jvmtiEnvBase.cpp line 867: > 865: // This call collects the strong and weak groups > 866: JavaThread* THREAD = current_thread; > 867: JvmtiJavaUpcallMark jjum(current_thread); Add comment like you did above for the JvmtiEnv::InterruptThread case. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24539#discussion_r2036109049 From cjplummer at openjdk.org Wed Apr 9 20:52:48 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 9 Apr 2025 20:52:48 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 08:14:04 GMT, Serguei Spitsyn wrote: > As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. 
It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. > The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. > Some specific implementation details can be added to the first PR comment. > > Testing: > - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): > - the assert described above is fired if the fix of JDK-8352088 is removed > - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed > - Ran mach5 tiers 1-6 src/hotspot/share/prims/jvmtiExport.cpp line 1419: > 1417: // ClassPrepare events are important for JDWP agent but not expected during such upcalls. > 1418: // Catch if this invariant is not broken. > 1419: assert(!thread->is_in_java_upcall(), "unexpected ClassPrepare event during JVMTI upcall"); I think we should do this for ClassLoad also. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24539#discussion_r2036122548 From jlu at openjdk.org Wed Apr 9 21:28:41 2025 From: jlu at openjdk.org (Justin Lu) Date: Wed, 9 Apr 2025 21:28:41 GMT Subject: RFR: 8301991: Convert l10n properties resource bundles to UTF-8 native [v6] In-Reply-To: <_YOUyzMbSEXFduCKVgyis37kwTlGSjBbP8VlFu3xQpU=.9b668e2a-8f91-476d-8914-13dc33a0b9e5@github.com> References: <0MB7FLFNfaGEWssr9X54UJ_iZNFWBJkxQ1yusP7fsuY=.3f9f3de5-fe84-48e6-9449-626cac42da0b@github.com> <_YOUyzMbSEXFduCKVgyis37kwTlGSjBbP8VlFu3xQpU=.9b668e2a-8f91-476d-8914-13dc33a0b9e5@github.com> Message-ID: On Wed, 9 Apr 2025 15:06:32 GMT, Magnus Ihse Bursie wrote: >> Justin Lu has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 16 commits: >> >> - Convert the merged master changes to UTF-8 >> - Merge master and fix conflicts >> - Close streams when finished loading into props >> - Adjust CF test to read in with UTF-8 to fix failing test >> - Reconvert CS.properties to UTF-8 >> - Revert all changes to CurrencySymbols.properties >> - Bug6204853 should not be converted >> - Copyright year for CompileProperties >> - Redo translation for CS.properties >> - Spot convert CurrencySymbols.properties >> - ... and 6 more: https://git.openjdk.org/jdk/compare/4386d42d...f15b373a > > src/java.xml/share/classes/com/sun/org/apache/xml/internal/serializer/Encodings.properties line 22: > >> 20: # Peter Smolik >> 21: Cp1250 WINDOWS-1250 0x00FF >> 22: # Patch attributed to havardw at underdusken.no (H?vard Wigtil) > > This does not seem to have been a correct conversion. Right, that `?` looks to have been incorrectly converted during the ISO-8859-1 to UTF-8 conversion. (I can't find the script used for conversion as this change is from some time ago.) Since the change occurs in a comment (thankfully), it should be harmless and the next upstream update of this file would overwrite this incorrect change. However, this file does not seem to be updated that often, so I can also file an issue to correct this if you would prefer that. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12726#discussion_r2036165417 From sspitsyn at openjdk.org Wed Apr 9 22:37:27 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 9 Apr 2025 22:37:27 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 20:38:08 GMT, Chris Plummer wrote: >> As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. 
This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. >> The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. >> Some specific implementation details can be added to the first PR comment. >> >> Testing: >> - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): >> - the assert described above is fired if the fix of JDK-8352088 is removed >> - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed >> - Ran mach5 tiers 1-6 > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 867: > >> 865: // This call collects the strong and weak groups >> 866: JavaThread* THREAD = current_thread; >> 867: JvmtiJavaUpcallMark jjum(current_thread); > > Add comment like you did above for the JvmtiEnv::InterruptThread case. Okay, thanks! Added now. > src/hotspot/share/prims/jvmtiExport.cpp line 1419: > >> 1417: // ClassPrepare events are important for JDWP agent but not expected during such upcalls. >> 1418: // Catch if this invariant is not broken. >> 1419: assert(!thread->is_in_java_upcall(), "unexpected ClassPrepare event during JVMTI upcall"); > > I think we should do this for ClassLoad also. Okay, thanks! Updated now. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24539#discussion_r2036227345 PR Review Comment: https://git.openjdk.org/jdk/pull/24539#discussion_r2036227287 From sspitsyn at openjdk.org Wed Apr 9 22:41:05 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 9 Apr 2025 22:41:05 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls [v2] In-Reply-To: References: Message-ID: > As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. > The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. > Some specific implementation details can be added to the first PR comment. 
> > Testing: > - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): > - the assert described above is fired if the fix of JDK-8352088 is removed > - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed > - Ran mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: 1) added an assert for ClassLoad events same as for ClassPrepare; 2) added one minor comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24539/files - new: https://git.openjdk.org/jdk/pull/24539/files/82e99fae..342e8a78 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24539&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24539&range=00-01 Stats: 5 lines in 2 files changed: 4 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24539.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24539/head:pull/24539 PR: https://git.openjdk.org/jdk/pull/24539 From sspitsyn at openjdk.org Wed Apr 9 22:41:05 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 9 Apr 2025 22:41:05 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls In-Reply-To: References: Message-ID: <-h8tEhjnCVtV4HP9CvhZZEnYUvyJyehZ8aZV44m1bD4=.9db1813e-225e-4352-b09f-2042fdf8fbdf@github.com> On Wed, 9 Apr 2025 08:14:04 GMT, Serguei Spitsyn wrote: > As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. 
It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. > The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. > Some specific implementation details can be added to the first PR comment. > > Testing: > - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): > - the assert described above is fired if the fix of JDK-8352088 is removed > - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed > - Ran mach5 tiers 1-6 Coleen and Leonid, thank you for review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24539#issuecomment-2791132661 From sspitsyn at openjdk.org Wed Apr 9 22:47:24 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 9 Apr 2025 22:47:24 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls [v2] In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 19:21:21 GMT, Coleen Phillimore wrote: > So this is another case where you have to ignore JVMTI event like in VTMS transitions? It looks like a good way to fix this in general. Yes. This is a long standing issue which is good to fix now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24539#issuecomment-2791140522 From cjplummer at openjdk.org Wed Apr 9 23:29:33 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 9 Apr 2025 23:29:33 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls [v2] In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 22:34:47 GMT, Serguei Spitsyn wrote: >> src/hotspot/share/prims/jvmtiExport.cpp line 1419: >> >>> 1417: // ClassPrepare events are important for JDWP agent but not expected during such upcalls. >>> 1418: // Catch if this invariant is not broken. 
>>> 1419: assert(!thread->is_in_java_upcall(), "unexpected ClassPrepare event during JVMTI upcall"); >> >> I think we should do this for ClassLoad also. > > Okay, thanks! Updated now. > Catch if this invariant is not broken I think you mean "Catch if this invariant is broken" ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24539#discussion_r2036269670 From fyang at openjdk.org Thu Apr 10 02:16:36 2025 From: fyang at openjdk.org (Fei Yang) Date: Thu, 10 Apr 2025 02:16:36 GMT Subject: RFR: 8352730: RISC-V: Disable tests in qemu-user [v3] In-Reply-To: <5sujqD7L_cmLUyDwYb4PhgOlEeiFwlkAV7RJoVMFTrM=.223437cd-bbb2-4ef3-a6fe-b13ce402e14b@github.com> References: <5sujqD7L_cmLUyDwYb4PhgOlEeiFwlkAV7RJoVMFTrM=.223437cd-bbb2-4ef3-a6fe-b13ce402e14b@github.com> Message-ID: On Mon, 31 Mar 2025 10:45:54 GMT, Robbin Ehn wrote: >> Hi, for you to consider. >> >> These tests constantly fails in qemu-user. >> Either the require host to be same arch explicit or implicit (sysroot). >> E.g. "ptrace(PTRACE_ATTACH, ..) failed for 405157: Function not implemented'" for SA tests. >> >> From bug: >>> qemu-user/rv64 sets uarch to "qemu" in /proc/cpuinfo (qemu-system do not do that). >>> We add this uarch to CPU feature string. >>> This means we can use jtreg 'require' with cpu string to filter out tests in qemu-user. >> >> Relevant qemu code: >> https://github.com/qemu/qemu/blob/170825d14d88a1ce7fae98d5a928480f2f329b22/linux-user/riscv/target_proc.h#L29 >> >> Relevant hotspot code: >> https://github.com/openjdk/jdk/blob/fa0b18bfde38ee2ffbab33a9eaac547fe8aa3c7c/src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp#L250 >> >> Tested that the require only filters out tests in qemu+riscv64. >> >> Thanks! >> >> /Robbin > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains seven additional commits since the last revision: > > - Merge branch 'master' into qemu-user-issues > - Revert > - Merge branch 'master' into qemu-user-issues > - Merge branch 'master' into qemu-user-issues > - more > - more > - native or very long > qemu-user, "uarch: qemu" in cpuinfo: `[0.084s][info ][os,cpu] CPU: total 28 (initial active 28) qemu rv64 rvi rvm rva rvf rvd rvc rvv zba zbb zbs zfh zfhmin zvbc zvfh zicond` Hence we know this is qemu-user (only qemu-user sets uarch to qemu on riscv). > > `/proc/cpuinfo` does not contain uarch: `[0.053s][info ][os,cpu] CPU: total 8 (initial active 8) rv64 rvi rvm rva rvf rvd rvc zba zbb zbs zfh zfhmin zvfh zicond` We have no clue whether this is emulated or real hardware, so the tests will be executed. > > Tests are only excluded if we know it's qemu-user. Sorry for not being clear enough. Yes, that's how it works with qemu-user for riscv. Just wondering if it makes sense to extend this to other CPU platforms. There are two cases. - Case 1: The tests are excluded as expected if we parse "qemu" in cpuinfo with qemu-user for another CPU, which is similar to qemu-user for riscv. But I am not sure if there is one for now. - Case 2: The tests are NOT excluded as there's no "qemu" in cpuinfo with qemu-user for another CPU. Then we still get test failures as before. But we are not causing any more regressions. I may consider that as a qemu-user issue for this CPU. 
And it could be fixed on the qemu-user side if it really helps people. Maybe I am demanding too much about qemu-user. What do you think? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24229#issuecomment-2791376594 From sspitsyn at openjdk.org Thu Apr 10 05:50:23 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 10 Apr 2025 05:50:23 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls [v3] In-Reply-To: References: Message-ID: > As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. > The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. > Some specific implementation details can be added to the first PR comment. 
> > Testing: > - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): > - the assert described above is fired if the fix of JDK-8352088 is removed > - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed > - Ran mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: minor tweak in two similar comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24539/files - new: https://git.openjdk.org/jdk/pull/24539/files/342e8a78..f0b70372 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24539&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24539&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/24539.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24539/head:pull/24539 PR: https://git.openjdk.org/jdk/pull/24539 From sspitsyn at openjdk.org Thu Apr 10 05:50:23 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 10 Apr 2025 05:50:23 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls [v3] In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 23:27:00 GMT, Chris Plummer wrote: >> Okay, thanks! Updated now. > >> Catch if this invariant is not broken > I think you mean "Catch if this invariant is broken" Okay, thanks! Fixed this comment now in both places. 
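The pattern discussed in this thread (a scoped mark that flags the current thread as being inside an upcall, ordinary events being suppressed while the mark is active, and an assert-style check that `ClassPrepare`/`ClassLoad` are never even attempted in that window) can be sketched as a Java analogy. This is not HotSpot code and makes no claim about how `JvmtiJavaUpcallMark` is actually implemented; `UpcallMarkSketch`, `UpcallMark`, `postEvent` and `postClassPrepare` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical Java analogy of a scoped "in upcall" mark; not HotSpot code. */
final class UpcallMarkSketch {
    // Per-thread nesting depth of active upcall marks.
    private static final ThreadLocal<Integer> DEPTH = ThreadLocal.withInitial(() -> 0);

    // Events that were actually delivered, recorded so the effect can be inspected.
    static final List<String> delivered = new ArrayList<>();

    /** Scope guard: marks the current thread as being in an upcall until closed. */
    static final class UpcallMark implements AutoCloseable {
        UpcallMark() {
            DEPTH.set(DEPTH.get() + 1);
        }

        @Override
        public void close() {
            DEPTH.set(DEPTH.get() - 1);
        }
    }

    static boolean inUpcall() {
        return DEPTH.get() > 0;
    }

    /** Ordinary events are silently dropped while a mark is active. */
    static boolean postEvent(String name) {
        if (inUpcall()) {
            return false;
        }
        delivered.add(name);
        return true;
    }

    /** Important events must never be attempted during an upcall, so fail loudly. */
    static void postClassPrepare(String className) {
        if (inUpcall()) {
            throw new IllegalStateException("unexpected ClassPrepare event during upcall");
        }
        delivered.add("ClassPrepare:" + className);
    }

    public static void main(String[] args) {
        try (UpcallMark mark = new UpcallMark()) {
            System.out.println("posted inside mark:  " + postEvent("MethodEntry"));
        }
        System.out.println("posted outside mark: " + postEvent("MethodEntry"));
    }
}
```

The try-with-resources block plays the role that a stack-allocated mark object plays in C++: the flag is cleared on every exit path, including exceptional ones.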
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24539#discussion_r2036550047 From aboldtch at openjdk.org Thu Apr 10 06:18:38 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 10 Apr 2025 06:18:38 GMT Subject: RFR: 8350441: ZGC: Overhaul Page Allocation In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 13:37:16 GMT, Joel Sikström wrote: >> Note that any reference to pages from here on out refers to the concept of a heap region in ZGC, not pages in the operating system (OS), unless stated otherwise. > > # Background > > This PR addresses fragmentation by introducing a Mapped Cache that replaces the Page Cache in ZGC. The largest limitation of the Page Cache is that it is constrained by the abstraction of what a page is. The proposed Mapped Cache removes this limitation by decoupling memory from pages, allowing it to merge and split memory in ways that the Page Cache is not suited for. To facilitate the transition, much of the Page Allocator has been redesigned to work with the Mapped Cache. > > In addition to fighting fragmentation, the new approach improves NUMA support and simplifies memory unmapping. Combined, these changes lay the foundation for even more improvements in ZGC, like replacing multi-mapped memory with anonymous memory. > > # Mapped Cache > > The main benefit of the Mapped Cache is that adjacent virtual memory ranges in the cache can be merged to create larger ranges, enabling larger allocation requests to succeed more easily. Most notably, it allows allocations to succeed more often without "harvesting" smaller, discontiguous ranges. Harvesting negatively impacts both fragmentation and latency, as it requires remapping memory into a new contiguous virtual address range. Fragmentation becomes especially problematic in long-running programs and in environments with limited address space, where finding large contiguous regions can be difficult and may lead to premature Out Of Memory Errors (OOME). 
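To make the merging idea in the paragraph above concrete, here is a minimal Java sketch of a cache of address ranges that coalesces adjacent ranges on insertion. It is only an illustration under stated assumptions: it uses `java.util.TreeMap` (a red-black tree) rather than the intrusive tree the PR describes, it assumes inserted ranges never overlap, and `RangeCacheSketch`, `insert`, `remove` and `rangeCount` are hypothetical names, not ZGC APIs.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a range cache that merges adjacent ranges; not ZGC code. */
final class RangeCacheSketch {
    // Maps range start -> range end (exclusive). Ranges are assumed never to overlap.
    private final TreeMap<Long, Long> ranges = new TreeMap<>();

    /** Inserts [start, start + size) and merges it with directly adjacent neighbours. */
    void insert(long start, long size) {
        long end = start + size;

        // Merge with a range that ends exactly where the new one starts.
        Map.Entry<Long, Long> before = ranges.lowerEntry(start);
        if (before != null && before.getValue() == start) {
            start = before.getKey();
            ranges.remove(start);
        }

        // Merge with a range that starts exactly where the new one ends.
        Long afterEnd = ranges.remove(end);
        if (afterEnd != null) {
            end = afterEnd;
        }

        ranges.put(start, end);
    }

    /** Removes a contiguous range of the given size (first fit); returns its start or -1. */
    long remove(long size) {
        Long found = null;
        for (Map.Entry<Long, Long> e : ranges.entrySet()) {
            if (e.getValue() - e.getKey() >= size) {
                found = e.getKey();
                break;
            }
        }
        if (found == null) {
            return -1;
        }
        long start = found;
        long end = ranges.remove(found);
        if (end - start > size) {
            ranges.put(start + size, end); // keep the unused tail in the cache
        }
        return start;
    }

    int rangeCount() {
        return ranges.size();
    }

    public static void main(String[] args) {
        RangeCacheSketch cache = new RangeCacheSketch();
        cache.insert(0, 2);
        cache.insert(4, 2);
        System.out.println("size-4 request before merge: " + cache.remove(4));
        cache.insert(2, 2); // bridges the two existing ranges into one
        System.out.println("ranges after merge: " + cache.rangeCount());
        System.out.println("size-4 request after merge: " + cache.remove(4));
    }
}
```

In this toy model a request for four units fails while the cache holds two separate two-unit ranges, and succeeds once the gap between them is returned and the three ranges coalesce, which is the effect the description attributes to merging instead of harvesting.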
> > The Mapped Cache uses a self-balancing binary search tree to store memory ranges. Since the ranges are unused when inside the cache, the tree can use this memory to store metadata about itself, referred to as intrusive storage. This approach eliminates the need for dynamic memory allocation (e.g., malloc), which could otherwise introduce a latency overhead. > > # Fragmentation > > Currently, ZGC has multiple strategies for dealing with fragmentation. In some edge cases, these strategies are not as efficient as we would like. By addressing fragmentation differently with the Mapped Cache, ZGC is in a better position to avoid edge cases, which are bad even if they occur only once. This is especially impactful for programs running with a large heap. > > ## Virtual Memory Shuffling > > In addition to the Mapped Cache, we have made some adjustments in how ZGC deals with virtual memory. When harvesting memory, whic... src/hotspot/share/gc/z/zVirtualMemoryManager.hpp line 89: > 87: ZVirtualMemoryManager(size_t max_capacity); > 88: > 89: void initialize_partitions(ZVirtualMemoryReserver* reserver, size_t reserved); Suggestion: void initialize_partitions(ZVirtualMemoryReserver* reserver, size_t size_for_partitions); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24547#discussion_r2036582513 From sspitsyn at openjdk.org Thu Apr 10 06:41:42 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 10 Apr 2025 06:41:42 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v4] In-Reply-To: References: Message-ID: > This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. > > Testing: Ran mach5 tiers 1-6. 
Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: added general comment about sync between suspend_thread and resume_thread ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24269/files - new: https://git.openjdk.org/jdk/pull/24269/files/4a92986a..df99ba15 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24269&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24269&range=02-03 Stats: 19 lines in 3 files changed: 16 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/24269.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24269/head:pull/24269 PR: https://git.openjdk.org/jdk/pull/24269 From sspitsyn at openjdk.org Thu Apr 10 06:41:42 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 10 Apr 2025 06:41:42 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v2] In-Reply-To: References: Message-ID: On Fri, 4 Apr 2025 06:21:12 GMT, Serguei Spitsyn wrote: >> src/hotspot/share/prims/jvmtiEnvBase.cpp line 1759: >> >>> 1757: Handle thread_h(current, thread_oop); >>> 1758: bool is_virtual = java_lang_VirtualThread::is_instance(thread_h()); >>> 1759: bool is_thread_carrying = is_thread_carrying_vthread(java_thread, thread_h()); >> >> I think that somewhere in this place there should be an explanation of suspend<->resume synchronization. As I understand it, the handshake can't be executed and clear the suspend state while suspend_thread is running for the same thread. How is it guaranteed that the suspend_thread flag can't be updated? >> It is not obvious, and it also puts some restrictions on the suspend_thread implementation to keep this behaviour. > > Thank you for reviewing and for this suggestion. > Yes, you are right. I'll try to find a good place to add such a comment. I've added the comment you requested. Please let me know if it is enough or if there are any further comments/suggestions. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24269#discussion_r2036612988 From rehn at openjdk.org Thu Apr 10 07:08:39 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 10 Apr 2025 07:08:39 GMT Subject: RFR: 8352730: RISC-V: Disable tests in qemu-user [v3] In-Reply-To: References: <5sujqD7L_cmLUyDwYb4PhgOlEeiFwlkAV7RJoVMFTrM=.223437cd-bbb2-4ef3-a6fe-b13ce402e14b@github.com> Message-ID: On Thu, 10 Apr 2025 02:13:46 GMT, Fei Yang wrote: > > qemu-user, "uarch: qemu" in cpuinfo: `[0.084s][info ][os,cpu] CPU: total 28 (initial active 28) qemu rv64 rvi rvm rva rvf rvd rvc rvv zba zbb zbs zfh zfhmin zvbc zvfh zicond` Hence we know this is qemu-user (only qemu-user sets uarch to qemu on riscv). > > `/proc/cpuinfo` does not contain uarch: `[0.053s][info ][os,cpu] CPU: total 8 (initial active 8) rv64 rvi rvm rva rvf rvd rvc zba zbb zbs zfh zfhmin zvfh zicond` We have no clue whether this is emulated or real hardware, so the tests will be executed. > > Tests are only excluded if we know it's qemu-user. > > Sorry for not being clear enough. Yes, that's how it works with qemu-user for riscv. Just wondering if it makes sense to extend this to other CPU platforms. There are two cases. > > * Case 1: The tests are excluded as expected if we parse "qemu" in cpuinfo with qemu-user for another CPU, which is similar to qemu-user for riscv. But I am not sure if there is one for now. 
> * Case 2: The tests are NOT excluded as there's no "qemu" in cpuinfo with qemu-user for another CPU. Then we still get test failures as before. But we are not causing any more regressions. I may consider that as a qemu-user issue for this CPU. And it could be fixed on the qemu-user side if it really helps people. > > Maybe I am demanding too much about qemu-user. What do you think? There is an additional step: the Linux CPU vm_version code also needs to parse /proc/cpuinfo and add that to the JVM CPU string. Right now only rv64 and aarch64 open that file, AFAICT. And in qemu-user, the only JVM-supported platforms that add qemu to cpuinfo are s390 and rv64. I'll ask the qemu folks and get a feel for whether I can upstream some changes addressing this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24229#issuecomment-2791764329 From mbaesken at openjdk.org Thu Apr 10 07:15:30 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 10 Apr 2025 07:15:30 GMT Subject: RFR: 8349638: Build libjdwp with SIZE optimization In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 20:26:13 GMT, Chris Plummer wrote: > > Making hotspot `-Os` would certainly never make any sense, for example.) > > When the minimalVM was first created, there was quite a bit of benchmarking done to decide which hotspot files to build with -Os and which to build with -O3. I'm not sure if this split survived the new build system, but for those wanting a truly minimal footprint with reasonable performance sacrifices, it still makes sense to build all or most of hotspot with -Os. The list that decides which hotspot files to build with -Os and which to build with -O3 (OPT_SPEED_SRC) is still present; see https://github.com/openjdk/jdk/blob/master/make/hotspot/lib/JvmFeatures.gmk#L197. The list is probably a bit outdated. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/23563#issuecomment-2791779866 From ihse at openjdk.org Thu Apr 10 07:34:37 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Thu, 10 Apr 2025 07:34:37 GMT Subject: RFR: 8301991: Convert l10n properties resource bundles to UTF-8 native [v6] In-Reply-To: References: <0MB7FLFNfaGEWssr9X54UJ_iZNFWBJkxQ1yusP7fsuY=.3f9f3de5-fe84-48e6-9449-626cac42da0b@github.com> <_YOUyzMbSEXFduCKVgyis37kwTlGSjBbP8VlFu3xQpU=.9b668e2a-8f91-476d-8914-13dc33a0b9e5@github.com> Message-ID: On Wed, 9 Apr 2025 21:26:15 GMT, Justin Lu wrote: >> src/java.xml/share/classes/com/sun/org/apache/xml/internal/serializer/Encodings.properties line 22: >> >>> 20: # Peter Smolik >>> 21: Cp1250 WINDOWS-1250 0x00FF >>> 22: # Patch attributed to havardw at underdusken.no (H?vard Wigtil) >> >> This does not seem to have been a correct conversion. > > Right, that `?` looks to have been incorrectly converted during the ISO-8859-1 to UTF-8 conversion. (I can't find the script used for conversion as this change is from some time ago.) > > Since the change occurs in a comment (thankfully), it should be harmless and the next upstream update of this file would overwrite this incorrect change. However, this file does not seem to be updated that often, so I can also file an issue to correct this if you would prefer that. You don't have to do that, I'm working on an omnibus UTF-8 fixing PR right now, where I will include a fix for this as well. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12726#discussion_r2036695622 From ihse at openjdk.org Thu Apr 10 07:34:37 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Thu, 10 Apr 2025 07:34:37 GMT Subject: RFR: 8301991: Convert l10n properties resource bundles to UTF-8 native [v6] In-Reply-To: References: <0MB7FLFNfaGEWssr9X54UJ_iZNFWBJkxQ1yusP7fsuY=.3f9f3de5-fe84-48e6-9449-626cac42da0b@github.com> <_YOUyzMbSEXFduCKVgyis37kwTlGSjBbP8VlFu3xQpU=.9b668e2a-8f91-476d-8914-13dc33a0b9e5@github.com> Message-ID: On Thu, 10 Apr 2025 07:31:37 GMT, Magnus Ihse Bursie wrote: >> Right, that `?` looks to have been incorrectly converted during the ISO-8859-1 to UTF-8 conversion. (I can't find the script used for conversion as this change is from some time ago.) >> >> Since the change occurs in a comment (thankfully), it should be harmless and the next upstream update of this file would overwrite this incorrect change. However, this file does not seem to be updated that often, so I can also file an issue to correct this if you would prefer that. > > You don't have to do that, I'm working on an omnibus UTF-8 fixing PR right now, where I will include a fix for this as well. If anything, I might be a bit worried that there are more incorrect conversions stemming from this PR, that my automated tools and manual scanning has not revealed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12726#discussion_r2036696723 From dholmes at openjdk.org Thu Apr 10 07:37:26 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 10 Apr 2025 07:37:26 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls [v3] In-Reply-To: References: Message-ID: On Thu, 10 Apr 2025 05:50:23 GMT, Serguei Spitsyn wrote: >> As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. 
This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. >> The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. >> Some specific implementation details can be added to the first PR comment. >> >> Testing: >> - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): >> - the assert described above is fired if the fix of JDK-8352088 is removed >> - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: minor tweak in two similar comments This looks reasonable to me. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24539#pullrequestreview-2755643289 From eirbjo at openjdk.org Thu Apr 10 08:10:42 2025 From: eirbjo at openjdk.org (Eirik Bjørsnøs) Date: Thu, 10 Apr 2025 08:10:42 GMT Subject: RFR: 8301991: Convert l10n properties resource bundles to UTF-8 native [v6] In-Reply-To: References: <0MB7FLFNfaGEWssr9X54UJ_iZNFWBJkxQ1yusP7fsuY=.3f9f3de5-fe84-48e6-9449-626cac42da0b@github.com> <_YOUyzMbSEXFduCKVgyis37kwTlGSjBbP8VlFu3xQpU=.9b668e2a-8f91-476d-8914-13dc33a0b9e5@github.com> Message-ID: <6c6DqyCqyPonBZgUU8BpYJR3JQvMXjWm9ulq4SN25Do=.77775825-716d-4908-ae24-c4cf1ead78a5@github.com> On Thu, 10 Apr 2025 07:32:18 GMT, Magnus Ihse Bursie wrote: >> You don't have to do that, I'm working on an omnibus UTF-8 fixing PR right now, where I will include a fix for this as well. > > If anything, I might be a bit worried that there are more incorrect conversions stemming from this PR, that my automated tools and manual scanning has not revealed.

Some observations:

1: This PR seems to have been abandoned, so perhaps this discussion belongs in #15694?

2: The `å` (Unicode 'Latin small letter a with ring above', U+00E5) was correctly encoded as 0xE5 in ISO-8859-1 prior to this change.

3: The conversion changed this `0xE5` to the three-byte sequence `ef bf bd`.

4: This is as if the file was incorrectly decoded using UTF-8, then encoded using UTF-8:

```
byte[] origBytes = "å".getBytes(StandardCharsets.ISO_8859_1);
String decoded = new String(origBytes, StandardCharsets.UTF_8);
byte[] encoded = decoded.getBytes(StandardCharsets.UTF_8);
String hex = HexFormat.of().formatHex(encoded);
assertEquals("efbfbd", hex);
```

Like @magicus I'm worried that similar incorrect decoding could have been introduced by the same script in other files. 
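One way to follow up on that worry is to look for the byte sequence `ef bf bd` (U+FFFD, the replacement character, encoded as UTF-8) in converted files. The sketch below is a self-contained illustration, not the script that was used for the conversion (which, per the thread, could not be found). It reproduces the damage on the name from the comment in question, shows what the intended conversion would have produced, and counts replacement sequences in a byte array; `ReplacementCharSketch` and its method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.HexFormat;

/** Illustrative sketch only; the class and method names are hypothetical. */
final class ReplacementCharSketch {

    /** Reproduces the damage: ISO-8859-1 bytes wrongly decoded as UTF-8, then re-encoded. */
    static byte[] misconvert(byte[] latin1Bytes) {
        // This String constructor replaces malformed UTF-8 input with U+FFFD.
        String decoded = new String(latin1Bytes, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    /** The intended conversion: decode as ISO-8859-1, encode as UTF-8. */
    static byte[] convert(byte[] latin1Bytes) {
        String decoded = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    /** Counts occurrences of the byte sequence EF BF BD (U+FFFD encoded as UTF-8). */
    static int countReplacementChars(byte[] utf8Bytes) {
        int count = 0;
        for (int i = 0; i + 2 < utf8Bytes.length; i++) {
            if ((utf8Bytes[i] & 0xFF) == 0xEF
                    && (utf8Bytes[i + 1] & 0xFF) == 0xBF
                    && (utf8Bytes[i + 2] & 0xFF) == 0xBD) {
                count++;
                i += 2; // skip the rest of the matched sequence
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // "Håvard", written with an escape so the source file encoding does not matter.
        byte[] latin1 = "H\u00e5vard".getBytes(StandardCharsets.ISO_8859_1);
        HexFormat hex = HexFormat.of();
        System.out.println("intended: " + hex.formatHex(convert(latin1)));
        System.out.println("damaged:  " + hex.formatHex(misconvert(latin1)));
        System.out.println("U+FFFD sequences in damaged output: "
                + countReplacementChars(misconvert(latin1)));
    }
}
```

A non-zero count is only a hint: a file may contain U+FFFD on purpose (for example in a test for replacement behaviour), so each hit still needs a manual look.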
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12726#discussion_r2036767319 From ihse at openjdk.org Thu Apr 10 08:38:38 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Thu, 10 Apr 2025 08:38:38 GMT Subject: RFR: 8301991: Convert l10n properties resource bundles to UTF-8 native [v6] In-Reply-To: <6c6DqyCqyPonBZgUU8BpYJR3JQvMXjWm9ulq4SN25Do=.77775825-716d-4908-ae24-c4cf1ead78a5@github.com> References: <0MB7FLFNfaGEWssr9X54UJ_iZNFWBJkxQ1yusP7fsuY=.3f9f3de5-fe84-48e6-9449-626cac42da0b@github.com> <_YOUyzMbSEXFduCKVgyis37kwTlGSjBbP8VlFu3xQpU=.9b668e2a-8f91-476d-8914-13dc33a0b9e5@github.com> <6c6DqyCqyPonBZgUU8BpYJR3JQvMXjWm9ulq4SN25Do=.77775825-716d-4908-ae24-c4cf1ead78a5@github.com> Message-ID: On Thu, 10 Apr 2025 08:08:02 GMT, Eirik Bjørsnøs wrote: >> If anything, I might be a bit worried that there are more incorrect conversions stemming from this PR, that my automated tools and manual scanning has not revealed.
>
> Some observations:
>
> 1: This PR seems to have been abandoned, so perhaps this discussion belongs in #15694?
>
> 2: The `å` (Unicode 'Latin small letter a with ring above', U+00E5) was correctly encoded as 0xE5 in ISO-8859-1 prior to this change.
>
> 3: The conversion changed this `0xE5` to the three-byte sequence `ef bf bd`.
>
> 4: This is as if the file was incorrectly decoded using UTF-8, then encoded using UTF-8:
>
> ```
> byte[] origBytes = "å".getBytes(StandardCharsets.ISO_8859_1);
> String decoded = new String(origBytes, StandardCharsets.UTF_8);
> byte[] encoded = decoded.getBytes(StandardCharsets.UTF_8);
> String hex = HexFormat.of().formatHex(encoded);
> assertEquals("efbfbd", hex);
> ```
>
> Like @magicus I'm worried that similar incorrect decoding could have been introduced by the same script in other files.

> This PR seems to have been abandoned, so perhaps this discussion belongs in https://github.com/openjdk/jdk/pull/15694 ?

Oh, I didn't notice this was supplanted by another PR. 
It might be better to continue there, yes. Even if closed PRs seldom are the best places to conduct discussions, I think it might be a good idea to scrutinize all files modified by this script. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12726#discussion_r2036820765 From ihse at openjdk.org Thu Apr 10 08:41:45 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Thu, 10 Apr 2025 08:41:45 GMT Subject: RFR: 8301991: Convert l10n properties resource bundles to UTF-8 native [v2] In-Reply-To: References: Message-ID: On Wed, 13 Sep 2023 17:38:28 GMT, Justin Lu wrote: >> JDK .properties files still use ISO-8859-1 encoding with escape sequences. It would improve readability to see the native characters instead of escape sequences (especially for the L10n process). The majority of files changed are localized resource files. >> >> This change converts the Unicode escape sequences in the JDK .properties files (both in src and test) to UTF-8 native characters. Additionally, the build logic is adjusted to read the .properties files in UTF-8 while generating the ListResourceBundle files. >> >> The only escape sequence not converted was `\u0020` as this is used to denote intentional trailing white space. (E.g. `key=This is the value:\u0020`) >> >> The conversion was done using native2ascii with options `-reverse -encoding UTF-8`. >> >> If this PR is integrated, the IDE default encoding for .properties files need to be updated to UTF-8. (IntelliJ IDEA locks .properties files as ISO-8859-1 unless manually changed). > > Justin Lu has updated the pull request incrementally with one additional commit since the last revision: > > Replace InputStreamReader with BufferedReader Continuing the discussion that was started at a predecessor to this PR, https://github.com/openjdk/jdk/pull/12726#discussion_r2035582242. At least one incorrect conversion has been found in this PR. It might be worthwhile to double- and triple-check all the other conversions as well. 
As part of https://bugs.openjdk.org/browse/JDK-8301971 I am trying various ways of detecting files without UTF-8 encoding, but it is still a bit hit-and-miss, since there is no surefire way of telling which encoding a file has, only heuristics. So finding and following up on potential sources of error is important. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15694#issuecomment-2791991649 PR Comment: https://git.openjdk.org/jdk/pull/15694#issuecomment-2791997157 From eirbjo at openjdk.org Thu Apr 10 08:48:37 2025 From: eirbjo at openjdk.org (Eirik Bjørsnøs) Date: Thu, 10 Apr 2025 08:48:37 GMT Subject: RFR: 8301991: Convert l10n properties resource bundles to UTF-8 native [v2] In-Reply-To: References: Message-ID: <0q0gTsqIsYtmzAfNYbBXksUXKdZh2uzQ9yvSETKAP88=.137372e6-d63e-4539-b196-4bd9ef1ddd16@github.com> On Wed, 13 Sep 2023 17:38:28 GMT, Justin Lu wrote: >> JDK .properties files still use ISO-8859-1 encoding with escape sequences. It would improve readability to see the native characters instead of escape sequences (especially for the L10n process). The majority of files changed are localized resource files. >> >> This change converts the Unicode escape sequences in the JDK .properties files (both in src and test) to UTF-8 native characters. Additionally, the build logic is adjusted to read the .properties files in UTF-8 while generating the ListResourceBundle files. >> >> The only escape sequence not converted was `\u0020` as this is used to denote intentional trailing white space. (E.g. `key=This is the value:\u0020`) >> >> The conversion was done using native2ascii with options `-reverse -encoding UTF-8`. >> >> If this PR is integrated, the IDE default encoding for .properties files needs to be updated to UTF-8. (IntelliJ IDEA locks .properties files as ISO-8859-1 unless manually changed). 
> > Justin Lu has updated the pull request incrementally with one additional commit since the last revision: > > Replace InputStreamReader with BufferedReader FWIW, I checked out the revision just before this change and found the following: % git checkout b55e418a077791b39992042411cde97f68dc39fe^ % find src -name "*.properties" | xargs file | grep -v ASCII src/java.xml/share/classes/com/sun/org/apache/xml/internal/serializer/Encodings.properties: ISO-8859 text src/java.xml.crypto/share/classes/com/sun/org/apache/xml/internal/security/resource/xmlsecurity_de.properties: Unicode text, UTF-8 text, with very long lines (322) Which indicates that this is the only non-ASCII, non-UTF-8 property file. So we may be lucky. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15694#issuecomment-2792014164 From jsikstro at openjdk.org Thu Apr 10 09:20:58 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Thu, 10 Apr 2025 09:20:58 GMT Subject: RFR: 8350441: ZGC: Overhaul Page Allocation [v2] In-Reply-To: References: Message-ID: >> Note that any reference to pages from here on out refers to the concept of a heap region in ZGC, not pages in the operating system (OS), unless stated otherwise. > > # Background > > This PR addresses fragmentation by introducing a Mapped Cache that replaces the Page Cache in ZGC. The largest limitation of the Page Cache is that it is constrained by the abstraction of what a page is. The proposed Mapped Cache removes this limitation by decoupling memory from pages, allowing it to merge and split memory in ways that the Page Cache is not suited for. To facilitate the transition, much of the Page Allocator has been redesigned to work with the Mapped Cache. > > In addition to fighting fragmentation, the new approach improves NUMA support and simplifies memory unmapping. Combined, these changes lay the foundation for even more improvements in ZGC, like replacing multi-mapped memory with anonymous memory. 
> > # Mapped Cache > > The main benefit of the Mapped Cache is that adjacent virtual memory ranges in the cache can be merged to create larger ranges, enabling larger allocation requests to succeed more easily. Most notably, it allows allocations to succeed more often without "harvesting" smaller, discontiguous ranges. Harvesting negatively impacts both fragmentation and latency, as it requires remapping memory into a new contiguous virtual address range. Fragmentation becomes especially problematic in long-running programs and in environments with limited address space, where finding large contiguous regions can be difficult and may lead to premature Out Of Memory Errors (OOME). > > The Mapped Cache uses a self-balancing binary search tree to store memory ranges. Since the ranges are unused when inside the cache, the tree can use this memory to store metadata about itself, referred to as intrusive storage. This approach eliminates the need for dynamic memory allocation (e.g., malloc), which could otherwise introduce a latency overhead. > > # Fragmentation > > Currently, ZGC has multiple strategies for dealing with fragmentation. In some edge cases, these strategies are not as efficient as we would like. By addressing fragmentation differently with the Mapped Cache, ZGC is in a better position to avoid edge cases, which are bad even if they occur only once. This is especially impactful for programs running with a large heap. > > ## Virtual Memory Shuffling > > In addition to the Mapped Cache, we have made some adjustments in how ZGC deals with virtual memory. When harvesting memory, whic... 
Joel Sikström has updated the pull request incrementally with one additional commit since the last revision: Update src/hotspot/share/gc/z/zVirtualMemoryManager.hpp Co-authored-by: Axel Boldt-Christmas ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24547/files - new: https://git.openjdk.org/jdk/pull/24547/files/5f9caa7a..ea2b5f97 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24547&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24547&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24547.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24547/head:pull/24547 PR: https://git.openjdk.org/jdk/pull/24547 From aboldtch at openjdk.org Thu Apr 10 09:20:58 2025 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 10 Apr 2025 09:20:58 GMT Subject: RFR: 8350441: ZGC: Overhaul Page Allocation [v2] In-Reply-To: References: Message-ID: On Thu, 10 Apr 2025 09:16:56 GMT, Joel Sikström wrote: >>> Note that any reference to pages from here on out refers to the concept of a heap region in ZGC, not pages in the operating system (OS), unless stated otherwise. >> >> # Background >> >> This PR addresses fragmentation by introducing a Mapped Cache that replaces the Page Cache in ZGC. The largest limitation of the Page Cache is that it is constrained by the abstraction of what a page is. The proposed Mapped Cache removes this limitation by decoupling memory from pages, allowing it to merge and split memory in ways that the Page Cache is not suited for. To facilitate the transition, much of the Page Allocator has been redesigned to work with the Mapped Cache. >> >> In addition to fighting fragmentation, the new approach improves NUMA support and simplifies memory unmapping. Combined, these changes lay the foundation for even more improvements in ZGC, like replacing multi-mapped memory with anonymous memory. 
>> >> # Mapped Cache >> >> The main benefit of the Mapped Cache is that adjacent virtual memory ranges in the cache can be merged to create larger ranges, enabling larger allocation requests to succeed more easily. Most notably, it allows allocations to succeed more often without "harvesting" smaller, discontiguous ranges. Harvesting negatively impacts both fragmentation and latency, as it requires remapping memory into a new contiguous virtual address range. Fragmentation becomes especially problematic in long-running programs and in environments with limited address space, where finding large contiguous regions can be difficult and may lead to premature Out Of Memory Errors (OOME). >> >> The Mapped Cache uses a self-balancing binary search tree to store memory ranges. Since the ranges are unused when inside the cache, the tree can use this memory to store metadata about itself, referred to as intrusive storage. This approach eliminates the need for dynamic memory allocation (e.g., malloc), which could otherwise introduce a latency overhead. >> >> # Fragmentation >> >> Currently, ZGC has multiple strategies for dealing with fragmentation. In some edge cases, these strategies are not as efficient as we would like. By addressing fragmentation differently with the Mapped Cache, ZGC is in a better position to avoid edge cases, which are bad even if they occur only once. This is especially impactful for programs running with a large heap. >> >> ## Virtual Memory Shuffling >> >> In addition to the Mapped Cache, we have made some adjustments in how ZGC deals with vir... > > Joel Sikstr?m has updated the pull request incrementally with one additional commit since the last revision: > > Update src/hotspot/share/gc/z/zVirtualMemoryManager.hpp > > Co-authored-by: Axel Boldt-Christmas lgtm. ------------- Marked as reviewed by aboldtch (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24547#pullrequestreview-2755960462 From stefank at openjdk.org Thu Apr 10 09:20:58 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 10 Apr 2025 09:20:58 GMT Subject: RFR: 8350441: ZGC: Overhaul Page Allocation [v2] In-Reply-To: References: Message-ID: On Thu, 10 Apr 2025 09:16:56 GMT, Joel Sikstr?m wrote: >>> Note that any reference to pages from here on out refers to the concept of a heap region in ZGC, not pages in the operating system (OS), unless stated otherwise. >> >> # Background >> >> This PR addresses fragmentation by introducing a Mapped Cache that replaces the Page Cache in ZGC. The largest limitation of the Page Cache is that it is constrained by the abstraction of what a page is. The proposed Mapped Cache removes this limitation by decoupling memory from pages, allowing it to merge and split memory in ways that the Page Cache is not suited for. To facilitate the transition, much of the Page Allocator has been redesigned to work with the Mapped Cache. >> >> In addition to fighting fragmentation, the new approach improves NUMA-support and simplifies memory unampping. Combined, these changes lay the foundation for even more improvements in ZGC, like replacing multi-mapped memory with anonymous memory. >> >> # Mapped Cache >> >> The main benefit of the Mapped Cache is that adjacent virtual memory ranges in the cache can be merged to create larger ranges, enabling larger allocation requests to succeed more easily. Most notably, it allows allocations to succeed more often without "harvesting" smaller, discontiguous ranges. Harvesting negatively impacts both fragmentation and latency, as it requires remapping memory into a new contiguous virtual address range. Fragmentation becomes especially problematic in long-running programs and in environments with limited address space, where finding large contiguous regions can be difficult and may lead to premature Out Of Memory Errors (OOME). 
>> >> The Mapped Cache uses a self-balancing binary search tree to store memory ranges. Since the ranges are unused when inside the cache, the tree can use this memory to store metadata about itself, referred to as intrusive storage. This approach eliminates the need for dynamic memory allocation (e.g., malloc), which could otherwise introduce a latency overhead. >> >> # Fragmentation >> >> Currently, ZGC has multiple strategies for dealing with fragmentation. In some edge cases, these strategies are not as efficient as we would like. By addressing fragmentation differently with the Mapped Cache, ZGC is in a better position to avoid edge cases, which are bad even if they occur only once. This is especially impactful for programs running with a large heap. >> >> ## Virtual Memory Shuffling >> >> In addition to the Mapped Cache, we have made some adjustments in how ZGC deals with vir... > > Joel Sikstr?m has updated the pull request incrementally with one additional commit since the last revision: > > Update src/hotspot/share/gc/z/zVirtualMemoryManager.hpp > > Co-authored-by: Axel Boldt-Christmas Looks good! Marked as reviewed by stefank (Reviewer). ------------- Marked as reviewed by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24547#pullrequestreview-2755953546 PR Review: https://git.openjdk.org/jdk/pull/24547#pullrequestreview-2755957917 From eosterlund at openjdk.org Thu Apr 10 09:31:37 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 10 Apr 2025 09:31:37 GMT Subject: RFR: 8350441: ZGC: Overhaul Page Allocation [v2] In-Reply-To: References: Message-ID: On Thu, 10 Apr 2025 09:20:58 GMT, Joel Sikstr?m wrote: >>> Note that any reference to pages from here on out refers to the concept of a heap region in ZGC, not pages in the operating system (OS), unless stated otherwise. >> >> # Background >> >> This PR addresses fragmentation by introducing a Mapped Cache that replaces the Page Cache in ZGC. 
The largest limitation of the Page Cache is that it is constrained by the abstraction of what a page is. The proposed Mapped Cache removes this limitation by decoupling memory from pages, allowing it to merge and split memory in ways that the Page Cache is not suited for. To facilitate the transition, much of the Page Allocator has been redesigned to work with the Mapped Cache. >> >> In addition to fighting fragmentation, the new approach improves NUMA-support and simplifies memory unampping. Combined, these changes lay the foundation for even more improvements in ZGC, like replacing multi-mapped memory with anonymous memory. >> >> # Mapped Cache >> >> The main benefit of the Mapped Cache is that adjacent virtual memory ranges in the cache can be merged to create larger ranges, enabling larger allocation requests to succeed more easily. Most notably, it allows allocations to succeed more often without "harvesting" smaller, discontiguous ranges. Harvesting negatively impacts both fragmentation and latency, as it requires remapping memory into a new contiguous virtual address range. Fragmentation becomes especially problematic in long-running programs and in environments with limited address space, where finding large contiguous regions can be difficult and may lead to premature Out Of Memory Errors (OOME). >> >> The Mapped Cache uses a self-balancing binary search tree to store memory ranges. Since the ranges are unused when inside the cache, the tree can use this memory to store metadata about itself, referred to as intrusive storage. This approach eliminates the need for dynamic memory allocation (e.g., malloc), which could otherwise introduce a latency overhead. >> >> # Fragmentation >> >> Currently, ZGC has multiple strategies for dealing with fragmentation. In some edge cases, these strategies are not as efficient as we would like. 
By addressing fragmentation differently with the Mapped Cache, ZGC is in a better position to avoid edge cases, which are bad even if they occur only once. This is especially impactful for programs running with a large heap. >> >> ## Virtual Memory Shuffling >> >> In addition to the Mapped Cache, we have made some adjustments in how ZGC deals with vir... > > Joel Sikstr?m has updated the pull request incrementally with one additional commit since the last revision: > > Update src/hotspot/share/gc/z/zVirtualMemoryManager.hpp > > Co-authored-by: Axel Boldt-Christmas Looks good. ------------- Marked as reviewed by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24547#pullrequestreview-2756002581 From ihse at openjdk.org Thu Apr 10 09:45:56 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Thu, 10 Apr 2025 09:45:56 GMT Subject: RFR: 8301991: Convert l10n properties resource bundles to UTF-8 native [v2] In-Reply-To: References: Message-ID: On Wed, 13 Sep 2023 17:38:28 GMT, Justin Lu wrote: >> JDK .properties files still use ISO-8859-1 encoding with escape sequences. It would improve readability to see the native characters instead of escape sequences (especially for the L10n process). The majority of files changed are localized resource files. >> >> This change converts the Unicode escape sequences in the JDK .properties files (both in src and test) to UTF-8 native characters. Additionally, the build logic is adjusted to read the .properties files in UTF-8 while generating the ListResourceBundle files. >> >> The only escape sequence not converted was `\u0020` as this is used to denote intentional trailing white space. (E.g. `key=This is the value:\u0020`) >> >> The conversion was done using native2ascii with options `-reverse -encoding UTF-8`. >> >> If this PR is integrated, the IDE default encoding for .properties files need to be updated to UTF-8. (IntelliJ IDEA locks .properties files as ISO-8859-1 unless manually changed). 
> > Justin Lu has updated the pull request incrementally with one additional commit since the last revision: > > Replace InputStreamReader with BufferedReader Thanks for checking! ------------- PR Comment: https://git.openjdk.org/jdk/pull/15694#issuecomment-2792170460 From jsikstro at openjdk.org Thu Apr 10 11:40:51 2025 From: jsikstro at openjdk.org (Joel =?UTF-8?B?U2lrc3Ryw7Zt?=) Date: Thu, 10 Apr 2025 11:40:51 GMT Subject: RFR: 8350441: ZGC: Overhaul Page Allocation [v2] In-Reply-To: References: Message-ID: <2lZljOaRU4QYWP-iNYwwFNazf7Z6Jh9MrYxc4QSsNmE=.cae1bf9f-c0af-4c82-ac4b-2ca7a401f6c0@github.com> On Thu, 10 Apr 2025 09:20:58 GMT, Joel Sikstr?m wrote: >>> Note that any reference to pages from here on out refers to the concept of a heap region in ZGC, not pages in the operating system (OS), unless stated otherwise. >> >> # Background >> >> This PR addresses fragmentation by introducing a Mapped Cache that replaces the Page Cache in ZGC. The largest limitation of the Page Cache is that it is constrained by the abstraction of what a page is. The proposed Mapped Cache removes this limitation by decoupling memory from pages, allowing it to merge and split memory in ways that the Page Cache is not suited for. To facilitate the transition, much of the Page Allocator has been redesigned to work with the Mapped Cache. >> >> In addition to fighting fragmentation, the new approach improves NUMA-support and simplifies memory unampping. Combined, these changes lay the foundation for even more improvements in ZGC, like replacing multi-mapped memory with anonymous memory. >> >> # Mapped Cache >> >> The main benefit of the Mapped Cache is that adjacent virtual memory ranges in the cache can be merged to create larger ranges, enabling larger allocation requests to succeed more easily. Most notably, it allows allocations to succeed more often without "harvesting" smaller, discontiguous ranges. 
Harvesting negatively impacts both fragmentation and latency, as it requires remapping memory into a new contiguous virtual address range. Fragmentation becomes especially problematic in long-running programs and in environments with limited address space, where finding large contiguous regions can be difficult and may lead to premature Out Of Memory Errors (OOME). >> >> The Mapped Cache uses a self-balancing binary search tree to store memory ranges. Since the ranges are unused when inside the cache, the tree can use this memory to store metadata about itself, referred to as intrusive storage. This approach eliminates the need for dynamic memory allocation (e.g., malloc), which could otherwise introduce a latency overhead. >> >> # Fragmentation >> >> Currently, ZGC has multiple strategies for dealing with fragmentation. In some edge cases, these strategies are not as efficient as we would like. By addressing fragmentation differently with the Mapped Cache, ZGC is in a better position to avoid edge cases, which are bad even if they occur only once. This is especially impactful for programs running with a large heap. >> >> ## Virtual Memory Shuffling >> >> In addition to the Mapped Cache, we have made some adjustments in how ZGC deals with vir... > > Joel Sikstr?m has updated the pull request incrementally with one additional commit since the last revision: > > Update src/hotspot/share/gc/z/zVirtualMemoryManager.hpp > > Co-authored-by: Axel Boldt-Christmas Thank you to everyone who contributed to, helped out, and reviewed this patch. 
I am really happy with how this turned out and all the things I've learned along the way :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/24547#issuecomment-2792450995 From jsikstro at openjdk.org Thu Apr 10 11:40:51 2025 From: jsikstro at openjdk.org (Joel =?UTF-8?B?U2lrc3Ryw7Zt?=) Date: Thu, 10 Apr 2025 11:40:51 GMT Subject: Integrated: 8350441: ZGC: Overhaul Page Allocation In-Reply-To: References: Message-ID: <5hnAuZLpGYPMwR594K0ftHzhCa4PWvwrVMEneZnJ9v4=.f894e7f2-8733-49ae-a875-0e1a6b5ff075@github.com> On Wed, 9 Apr 2025 13:37:16 GMT, Joel Sikstr?m wrote: >> Note that any reference to pages from here on out refers to the concept of a heap region in ZGC, not pages in the operating system (OS), unless stated otherwise. > > # Background > > This PR addresses fragmentation by introducing a Mapped Cache that replaces the Page Cache in ZGC. The largest limitation of the Page Cache is that it is constrained by the abstraction of what a page is. The proposed Mapped Cache removes this limitation by decoupling memory from pages, allowing it to merge and split memory in ways that the Page Cache is not suited for. To facilitate the transition, much of the Page Allocator has been redesigned to work with the Mapped Cache. > > In addition to fighting fragmentation, the new approach improves NUMA-support and simplifies memory unampping. Combined, these changes lay the foundation for even more improvements in ZGC, like replacing multi-mapped memory with anonymous memory. > > # Mapped Cache > > The main benefit of the Mapped Cache is that adjacent virtual memory ranges in the cache can be merged to create larger ranges, enabling larger allocation requests to succeed more easily. Most notably, it allows allocations to succeed more often without "harvesting" smaller, discontiguous ranges. Harvesting negatively impacts both fragmentation and latency, as it requires remapping memory into a new contiguous virtual address range. 
Fragmentation becomes especially problematic in long-running programs and in environments with limited address space, where finding large contiguous regions can be difficult and may lead to premature Out Of Memory Errors (OOME). > > The Mapped Cache uses a self-balancing binary search tree to store memory ranges. Since the ranges are unused when inside the cache, the tree can use this memory to store metadata about itself, referred to as intrusive storage. This approach eliminates the need for dynamic memory allocation (e.g., malloc), which could otherwise introduce a latency overhead. > > # Fragmentation > > Currently, ZGC has multiple strategies for dealing with fragmentation. In some edge cases, these strategies are not as efficient as we would like. By addressing fragmentation differently with the Mapped Cache, ZGC is in a better position to avoid edge cases, which are bad even if they occur only once. This is especially impactful for programs running with a large heap. > > ## Virtual Memory Shuffling > > In addition to the Mapped Cache, we have made some adjustments in how ZGC deals with virtual memory. When harvesting memory, whic... This pull request has now been integrated. 
Changeset: 7e69b98e Author: Joel Sikström URL: https://git.openjdk.org/jdk/commit/7e69b98e0548803b85b04b518929c073f8ffaf8c Stats: 12052 lines in 118 files changed: 7936 ins; 3218 del; 898 mod 8350441: ZGC: Overhaul Page Allocation Co-authored-by: Axel Boldt-Christmas Co-authored-by: Erik Österlund Co-authored-by: Stefan Karlsson Co-authored-by: Stefan Johansson Reviewed-by: stefank, aboldtch, eosterlund ------------- PR: https://git.openjdk.org/jdk/pull/24547 From jwaters at openjdk.org Thu Apr 10 15:17:42 2025 From: jwaters at openjdk.org (Julian Waters) Date: Thu, 10 Apr 2025 15:17:42 GMT Subject: RFR: 8342682: Errors related to unused code on Windows after 8339120 in dt_shmem jdwp security and jpackage [v7] In-Reply-To: References: Message-ID: On Mon, 11 Nov 2024 09:51:35 GMT, Julian Waters wrote: >> After 8339120, gcc began catching many different instances of unused code in the Windows specific codebase. Some of these seem to be bugs. I've taken the effort to mark out all the relevant globals and locals that trigger the unused warnings and addressed all of them by commenting out the code as appropriate. I am confident that in many cases this simplistic approach of commenting out code does not fix the underlying issue, and the warning actually found a bug that should be fixed. In these instances, I will be aiming to fix these bugs with help from reviewers, so I recommend anyone reviewing who knows more about the code than I do to see whether there is indeed a bug that needs fixing in a different way than what I did > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains eight additional commits since the last revision: > > - Merge branch 'openjdk:master' into unused > - Neater warning silencer in proc_md.h > - Revert _WIN32 workaround in log_messages.c > - Copyright in VersionInfo.cpp > - Remove neutralLangId in VersionInfo.cpp > - Remove global memHandle, would've liked to keep it as a comment :( > - Merge branch 'master' into unused > - 8342682 Keep open ------------- PR Comment: https://git.openjdk.org/jdk/pull/21616#issuecomment-2794181603 From sspitsyn at openjdk.org Thu Apr 10 16:45:39 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 10 Apr 2025 16:45:39 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls [v3] In-Reply-To: References: Message-ID: On Thu, 10 Apr 2025 05:50:23 GMT, Serguei Spitsyn wrote: >> As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. >> The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. >> Some specific implementation details can be added to the first PR comment. 
>> >> Testing: >> - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): >> - the assert described above is fired if the fix of JDK-8352088 is removed >> - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: minor tweak in two similar comments David, thank you for review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24539#issuecomment-2794493038 From lmesnik at openjdk.org Thu Apr 10 17:40:25 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 10 Apr 2025 17:40:25 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls [v3] In-Reply-To: References: Message-ID: On Thu, 10 Apr 2025 05:50:23 GMT, Serguei Spitsyn wrote: >> As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. >> The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. >> Some specific implementation details can be added to the first PR comment. 
>> >> Testing: >> - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): >> - the assert described above is fired if the fix of JDK-8352088 is removed >> - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: minor tweak in two similar comments Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24539#pullrequestreview-2757715386 From sspitsyn at openjdk.org Thu Apr 10 17:59:15 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 10 Apr 2025 17:59:15 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls [v4] In-Reply-To: References: Message-ID: > As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. > The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. > Some specific implementation details can be added to the first PR comment. 
> > Testing: > - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): > - the assert described above is fired if the fix of JDK-8352088 is removed > - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed > - Ran mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: fix a build time error for minimal VM config ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24539/files - new: https://git.openjdk.org/jdk/pull/24539/files/f0b70372..908de6c0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24539&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24539&range=02-03 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24539.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24539/head:pull/24539 PR: https://git.openjdk.org/jdk/pull/24539 From sspitsyn at openjdk.org Thu Apr 10 17:59:15 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 10 Apr 2025 17:59:15 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls [v3] In-Reply-To: References: Message-ID: On Thu, 10 Apr 2025 05:50:23 GMT, Serguei Spitsyn wrote: >> As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. >> The `ClassPrepare` events are important for the debug agent. 
So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. >> Some specific implementation details can be added to the first PR comment. >> >> Testing: >> - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): >> - the assert described above is fired if the fix of JDK-8352088 is removed >> - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: minor tweak in two similar comments I've pushed a minor update to fix a build time error for minimal VM. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24539#issuecomment-2794701655 From jlu at openjdk.org Thu Apr 10 18:47:53 2025 From: jlu at openjdk.org (Justin Lu) Date: Thu, 10 Apr 2025 18:47:53 GMT Subject: RFR: 8301991: Convert l10n properties resource bundles to UTF-8 native [v2] In-Reply-To: <0q0gTsqIsYtmzAfNYbBXksUXKdZh2uzQ9yvSETKAP88=.137372e6-d63e-4539-b196-4bd9ef1ddd16@github.com> References: <0q0gTsqIsYtmzAfNYbBXksUXKdZh2uzQ9yvSETKAP88=.137372e6-d63e-4539-b196-4bd9ef1ddd16@github.com> Message-ID: <9aQcWun5KNgHgELVwkc3478_RtqfhRL1Cxvyn2Yl0Nw=.07ee596f-e738-4796-8d27-14621ed8860c@github.com> On Thu, 10 Apr 2025 08:44:28 GMT, Eirik Bj?rsn?s wrote: >> Justin Lu has updated the pull request incrementally with one additional commit since the last revision: >> >> Replace InputStreamReader with BufferedReader > > FWIW, I checked out the revision of the commit previous to this change and found the following: > > > % git checkout b55e418a077791b39992042411cde97f68dc39fe^ > % find src -name "*.properties" | xargs file | grep -v ASCII > src/java.xml/share/classes/com/sun/org/apache/xml/internal/serializer/Encodings.properties: > ISO-8859 text > 
src/java.xml.crypto/share/classes/com/sun/org/apache/xml/internal/security/resource/xmlsecurity_de.properties: > Unicode text, UTF-8 text, with very long lines (322) > > > Which indicates that that this is the only non-ASCII, non-UTF-8 property file. So we may be lucky. This conversion was performed under the assumption of ASCII set and Unicode escape sequences, which is the format we expect for the translation process for .properties files. That file should have been omitted from this change. Thank you @eirbjo and @magicus for the analysis and checking! ------------- PR Comment: https://git.openjdk.org/jdk/pull/15694#issuecomment-2794828598 From lmesnik at openjdk.org Thu Apr 10 20:08:42 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 10 Apr 2025 20:08:42 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls [v4] In-Reply-To: References: Message-ID: On Thu, 10 Apr 2025 17:59:15 GMT, Serguei Spitsyn wrote: >> As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. >> The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. >> Some specific implementation details can be added to the first PR comment. 
>> >> Testing: >> - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): >> - the assert described above is fired if the fix of JDK-8352088 is removed >> - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > fix a build time error for minimal VM config Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24539#pullrequestreview-2758233083 From cjplummer at openjdk.org Thu Apr 10 20:12:13 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 10 Apr 2025 20:12:13 GMT Subject: RFR: 8353953: con/sun/jdi tests should be fixed to not always require includevirtualthreads=y Message-ID: <0CcAQV1gET_93X0Z8Fkh9yy9xaEhXC1kGcS4QC90izg=.e29b1fc3-b00c-4f86-92cc-b3060c27de41@github.com> Don't use includevirtualthreads=y unless the test requires it. Debuggers don't usually use includevirtualthreads=y, so we should be doing most of our testing without it. The only reason tests use it is because some tests need it so they can find virtual threads in the debuggee by using vm.allThreads(). This change limits the use of includevirtualthreads=y to just those com/sun/jdi tests that need it. Tested by running com/sun/jdi tests on all supported platforms in both platform threads mode and virtual threads mode. Also tested with tier1 CI. ------------- Commit messages: - Don't use includevthreads=y unless the test requires it. 
Changes: https://git.openjdk.org/jdk/pull/24583/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24583&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8353953 Stats: 16 lines in 4 files changed: 9 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/24583.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24583/head:pull/24583 PR: https://git.openjdk.org/jdk/pull/24583 From cjplummer at openjdk.org Thu Apr 10 20:31:13 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 10 Apr 2025 20:31:13 GMT Subject: RFR: 8353953: con/sun/jdi tests should be fixed to not always require includevirtualthreads=y [v2] In-Reply-To: <0CcAQV1gET_93X0Z8Fkh9yy9xaEhXC1kGcS4QC90izg=.e29b1fc3-b00c-4f86-92cc-b3060c27de41@github.com> References: <0CcAQV1gET_93X0Z8Fkh9yy9xaEhXC1kGcS4QC90izg=.e29b1fc3-b00c-4f86-92cc-b3060c27de41@github.com> Message-ID: > Don't use includevirtualthreads=y unless the test requires it. Debuggers don't usually use includevirtualthreads=y, so we should be doing most of our testing without it. The only reason tests use it is because some tests need it so they can find virtual threads in the debuggee by using vm.allThreads(). This change limits the use of includevirtualthreads=y to just those com/sun/jdi tests that need it. > > Tested by running com/sun/jdi tests on all supported platforms in both platform threads mode and virtual threads mode. Also tested with tier1 CI. Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: Fix copyright. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/24583/files - new: https://git.openjdk.org/jdk/pull/24583/files/bcb0d8fa..69ce42c3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24583&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24583&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24583.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24583/head:pull/24583 PR: https://git.openjdk.org/jdk/pull/24583 From cjplummer at openjdk.org Thu Apr 10 20:32:34 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 10 Apr 2025 20:32:34 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls [v4] In-Reply-To: References: Message-ID: On Thu, 10 Apr 2025 17:59:15 GMT, Serguei Spitsyn wrote: >> As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. >> The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. >> Some specific implementation details can be added to the first PR comment. 
>> >> Testing: >> - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): >> - the assert described above is fired if the fix of JDK-8352088 is removed >> - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > fix a build time error for minimal VM config Marked as reviewed by cjplummer (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24539#pullrequestreview-2758302537 From dholmes at openjdk.org Thu Apr 10 21:02:42 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 10 Apr 2025 21:02:42 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls [v4] In-Reply-To: References: Message-ID: On Thu, 10 Apr 2025 17:59:15 GMT, Serguei Spitsyn wrote: >> As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. >> The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. >> Some specific implementation details can be added to the first PR comment. 
>> >> Testing: >> - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): >> - the assert described above is fired if the fix of JDK-8352088 is removed >> - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > fix a build time error for minimal VM config Marked as reviewed by dholmes (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24539#pullrequestreview-2758408271 From duke at openjdk.org Thu Apr 10 21:34:30 2025 From: duke at openjdk.org (Hendrik Schick) Date: Thu, 10 Apr 2025 21:34:30 GMT Subject: RFR: 8353953: con/sun/jdi tests should be fixed to not always require includevirtualthreads=y [v2] In-Reply-To: References: <0CcAQV1gET_93X0Z8Fkh9yy9xaEhXC1kGcS4QC90izg=.e29b1fc3-b00c-4f86-92cc-b3060c27de41@github.com> Message-ID: On Thu, 10 Apr 2025 20:31:13 GMT, Chris Plummer wrote: >> Don't use includevirtualthreads=y unless the test requires it. Debuggers don't usually use includevirtualthreads=y, so we should be doing most of our testing without it. The only reason tests use it is because some tests need it so they can find virtual threads in the debuggee by using vm.allThreads(). This change limits the use of includevirtualthreads=y to just those com/sun/jdi tests that need it. >> >> Tested by running com/sun/jdi tests on all supported platforms in both platform threads mode and virtual threads mode. Also tested with tier1 CI. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright. test/jdk/com/sun/jdi/RedefineNestmateAttr/TestNestmateAttr.java line 1: > 1: /* update copyright to 2025 here aswell? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24583#discussion_r2038371614 From sspitsyn at openjdk.org Thu Apr 10 22:37:27 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 10 Apr 2025 22:37:27 GMT Subject: RFR: 8352773: JVMTI should disable events during java upcalls [v4] In-Reply-To: References: Message-ID: On Thu, 10 Apr 2025 17:59:15 GMT, Serguei Spitsyn wrote: >> As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. >> The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. >> Some specific implementation details can be added to the first PR comment. >> >> Testing: >> - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): >> - the assert described above is fired if the fix of JDK-8352088 is removed >> - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > fix a build time error for minimal VM config Chris, thank you for review! Thank you for re-approvals guys! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24539#issuecomment-2795317111 From cjplummer at openjdk.org Thu Apr 10 23:28:24 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 10 Apr 2025 23:28:24 GMT Subject: RFR: 8353953: con/sun/jdi tests should be fixed to not always require includevirtualthreads=y [v3] In-Reply-To: <0CcAQV1gET_93X0Z8Fkh9yy9xaEhXC1kGcS4QC90izg=.e29b1fc3-b00c-4f86-92cc-b3060c27de41@github.com> References: <0CcAQV1gET_93X0Z8Fkh9yy9xaEhXC1kGcS4QC90izg=.e29b1fc3-b00c-4f86-92cc-b3060c27de41@github.com> Message-ID: > Don't use includevirtualthreads=y unless the test requires it. Debuggers don't usually use includevirtualthreads=y, so we should be doing most of our testing without it. The only reason tests use it is because some tests need it so they can find virtual threads in the debuggee by using vm.allThreads(). This change limits the use of includevirtualthreads=y to just those com/sun/jdi tests that need it. > > Tested by running com/sun/jdi tests on all supported platforms in both platform threads mode and virtual threads mode. Also tested with tier1 CI. Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: Fix copyright. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/24583/files - new: https://git.openjdk.org/jdk/pull/24583/files/69ce42c3..8193ccb5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24583&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24583&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24583.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24583/head:pull/24583 PR: https://git.openjdk.org/jdk/pull/24583 From cjplummer at openjdk.org Thu Apr 10 23:28:25 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 10 Apr 2025 23:28:25 GMT Subject: RFR: 8353953: con/sun/jdi tests should be fixed to not always require includevirtualthreads=y [v2] In-Reply-To: References: <0CcAQV1gET_93X0Z8Fkh9yy9xaEhXC1kGcS4QC90izg=.e29b1fc3-b00c-4f86-92cc-b3060c27de41@github.com> Message-ID: On Thu, 10 Apr 2025 21:32:05 GMT, Hendrik Schick wrote: >> Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix copyright. > > test/jdk/com/sun/jdi/RedefineNestmateAttr/TestNestmateAttr.java line 1: > >> 1: /* > > update copyright to 2025 here aswell? Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24583#discussion_r2038541169 From sspitsyn at openjdk.org Fri Apr 11 01:28:42 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 11 Apr 2025 01:28:42 GMT Subject: Integrated: 8352773: JVMTI should disable events during java upcalls In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 08:14:04 GMT, Serguei Spitsyn wrote: > As noted in [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088), JVMTI `GetThreadGroupChildren` does an upcall to java. This results in a`ClassPrepare` event the first time it does this, and these events can cause problems (deadlocks) for the debugger or debug agent. 
The [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088) was fixed to get rid of class loading during Java upcall from `GetThreadGroupChildren`. However, some other events can be generated as well. It is more safe to disable all JVMTI events during debugger-related upcalls originated by JVMTI. > The `ClassPrepare` events are important for the debug agent. So, an assert was added into `ClassPrepare` event generation to make sure there are no attempts to post this event during upcalls. > Some specific implementation details can be added to the first PR comment. > > Testing: > - Verified with the test `jdk/com/sun/jdi/EarlyThreadGroupChildrenTest.java` that was added with the fix of [JDK-8352088](https://bugs.openjdk.org/browse/JDK-8352088): > - the assert described above is fired if the fix of JDK-8352088 is removed > - the test is passed without if the fix of JDK-8352088 is removed and the assert is removed > - Ran mach5 tiers 1-6 This pull request has now been integrated. Changeset: 1c34f3cd Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/1c34f3cdb1df1b9bd01c6795e19a78753e3b555a Stats: 39 lines in 6 files changed: 37 ins; 0 del; 2 mod 8352773: JVMTI should disable events during java upcalls Reviewed-by: lmesnik, dholmes, cjplummer, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/24539 From cjplummer at openjdk.org Fri Apr 11 23:29:38 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 11 Apr 2025 23:29:38 GMT Subject: RFR: 8353955: nsk/jdi tests should be fixed to not always require includevirtualthreads=y Message-ID: This is just a preliminary review. I'd like to get some approval for the approach I'm taking. There are over 300 tests that need to be fixed. I've just fixed a handful in this PR to give a feel for the changes I plan on making. If they look ok to you, then I'll update this PR with the needed changes to the rest of the tests. 
This PR addresses the fact that all of our nsk/jdi testing is done with includevirtualthreads=y even though debuggers typically use the default includevirtualthreads=n. As a result we have a testing gap with includevirtualthreads=n. There are nearly 1200 nsk/jdi tests. Only about 350 actually need includevirtualthreads=y. I plan on making includevirtualthreads=n the default for nsk/jdi tests unless the test does something to override the default and request includevirtualthreads=y.

includevirtualthreads=y forces the debug agent to track all virtual threads so they are returned by vm.allThreads(). Some tests need this since they use vm.allThreads() to find the debuggee threads. Without includevirtualthreads=y, vm.allThreads() usually won't return any virtual threads (although it might return some for which events have been triggered).

-------------

Commit messages:
 - get rid of commented out change
 - Don't always use includevirtualthreads=y

Changes: https://git.openjdk.org/jdk/pull/24606/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24606&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8353955
  Stats: 36 lines in 18 files changed: 34 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/24606.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24606/head:pull/24606

PR: https://git.openjdk.org/jdk/pull/24606

From cjplummer at openjdk.org Fri Apr 11 23:29:38 2025
From: cjplummer at openjdk.org (Chris Plummer)
Date: Fri, 11 Apr 2025 23:29:38 GMT
Subject: RFR: 8353955: nsk/jdi tests should be fixed to not always require includevirtualthreads=y
In-Reply-To: References: Message-ID:

On Fri, 11 Apr 2025 23:23:52 GMT, Chris Plummer wrote:

> This is just a preliminary review. I'd like to get some approval for the approach I'm taking. There are over 300 tests that need to be fixed. I've just fixed a handful in this PR to give a feel for the changes I plan on making.
> If they look ok to you, then I'll update this PR with the needed changes to the rest of the tests.
>
> This PR addresses the fact that all of our nsk/jdi testing is done with includevirtualthreads=y even though debuggers typically use the default includevirtualthreads=n. As a result we have a testing gap with includevirtualthreads=n. There are nearly 1200 nsk/jdi tests. Only about 350 actually need includevirtualthreads=y. I plan on making includevirtualthreads=n the default for nsk/jdi tests unless the test does something to override the default and request includevirtualthreads=y.
>
> includevirtualthreads=y forces the debug agent to track all virtual threads so they are returned by vm.allThreads(). Some tests need this since they use vm.allThreads() to find the debuggee threads. Without includevirtualthreads=y, vm.allThreads() usually won't return any virtual threads (although it might return some for which events have been triggered).

Regarding the SerialExecutionDebugger change, some background might help. The following tests all use SerialExecutionDebugger:

vmTestbase/nsk/jdi/stress/serial/forceEarlyReturn001/TestDescription.java
vmTestbase/nsk/jdi/stress/serial/forceEarlyReturn002/TestDescription.java
vmTestbase/nsk/jdi/stress/serial/heapwalking001/TestDescription.java
vmTestbase/nsk/jdi/stress/serial/heapwalking002/TestDescription.java
vmTestbase/nsk/jdi/stress/serial/mixed001/TestDescription.java
vmTestbase/nsk/jdi/stress/serial/mixed002/TestDescription.java
vmTestbase/nsk/jdi/stress/serial/monitorEvents001/TestDescription.java
vmTestbase/nsk/jdi/stress/serial/monitorEvents002/TestDescription.java
vmTestbase/nsk/jdi/stress/serial/ownedMonitorsAndFrames001/TestDescription.java
vmTestbase/nsk/jdi/stress/serial/ownedMonitorsAndFrames002/TestDescription.java

Each of these tests has a set of existing tests (subtests) that will be executed using the same debugger and debuggee processes.
They are executed either in a random order (shuffle) or for a large number of iterations, both with a goal of being somewhat stressful. There is a .tests file that specifies the subtests to run and how to run them. For example, for vmTestbase/nsk/jdi/stress/serial/forceEarlyReturn002 we have:

OPTIONS:shuffle
nsk.jdi.ThreadReference.forceEarlyReturn.forceEarlyReturn001.forceEarlyReturn001
nsk.jdi.ThreadReference.forceEarlyReturn.forceEarlyReturn002.forceEarlyReturn002
nsk.jdi.ThreadReference.forceEarlyReturn.forceEarlyReturn003.forceEarlyReturn003
nsk.jdi.ThreadReference.forceEarlyReturn.forceEarlyReturn004.forceEarlyReturn004
nsk.jdi.ThreadReference.forceEarlyReturn.forceEarlyReturn005.forceEarlyReturn005
nsk.jdi.ThreadReference.forceEarlyReturn.forceEarlyReturn008.forceEarlyReturn008
nsk.jdi.ThreadReference.forceEarlyReturn.forceEarlyReturn013.forceEarlyReturn013
nsk.jdi.ThreadReference.forceEarlyReturn.forceEarlyReturn014.forceEarlyReturn014

Some of the subtests require launching the debug agent with includevirtualthreads=y. Usually the subtest would be managing this and making sure the debug agent is launched with includevirtualthreads=y. However, for subtests run by the SerialExecutionDebugger, it is the SerialExecutionDebugger class that is responsible for launching the debug agent, not the test, and therefore the subtest has no way to enforce its requirement for includevirtualthreads=y. It turns out there is only one subtest that needs includevirtualthreads=y:

nsk.jdi.ThreadReference.forceEarlyReturn.forceEarlyReturn001.forceEarlyReturn002

You can see the fix for this test in this PR, but as pointed out above, this will only work when this test is run individually, not when run by SerialExecutionDebugger. Since SerialExecutionDebugger launches the debuggee, it needs to know whether or not to specify includevirtualthreads=y. At first I had it just always do that, but it was unnecessary for most of the tests, and reduced test coverage with includevirtualthreads=n.
I then opted to add support for passing -includevirtualthreads to SerialExecutionDebugger. That way it is only needed for the tests that include forceEarlyReturn002. If I want to take this further, I could add an additional @run command that does not use -includevirtualthreads, and also has a separate .tests list that excludes forceEarlyReturn002, but I don't think this adds enough benefit to be worthwhile.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24606#issuecomment-2798204267

From sspitsyn at openjdk.org Tue Apr 15 08:20:45 2025
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Tue, 15 Apr 2025 08:20:45 GMT
Subject: RFR: 8353955: nsk/jdi tests should be fixed to not always require includevirtualthreads=y
In-Reply-To: References: Message-ID:

On Fri, 11 Apr 2025 23:23:52 GMT, Chris Plummer wrote:

> This is just a preliminary review. I'd like to get some approval for the approach I'm taking. There are over 300 tests that need to be fixed. I've just fixed a handful in this PR to give a feel for the changes I plan on making. If they look ok to you, then I'll update this PR with the needed changes to the rest of the tests.
>
> This PR addresses the fact that all of our nsk/jdi testing is done with includevirtualthreads=y even though debuggers typically use the default includevirtualthreads=n. As a result we have a testing gap with includevirtualthreads=n. There are nearly 1200 nsk/jdi tests. Only about 350 actually need includevirtualthreads=y. I plan on making includevirtualthreads=n the default for nsk/jdi tests unless the test does something to override the default and request includevirtualthreads=y.
>
> includevirtualthreads=y forces the debug agent to track all virtual threads so they are returned by vm.allThreads(). Some tests need this since they use vm.allThreads() to find the debuggee threads.
Without includevirtualthreads=y, vm.allThreads() usually won't return any virtual threads (although it might return some for which events have been triggered). Chris, the approach looks okay to me. I do not have a better idea on how to make it less intrusive. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24606#issuecomment-2804220293 From sspitsyn at openjdk.org Tue Apr 15 22:47:42 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 15 Apr 2025 22:47:42 GMT Subject: RFR: 8353953: con/sun/jdi tests should be fixed to not always require includevirtualthreads=y [v3] In-Reply-To: References: <0CcAQV1gET_93X0Z8Fkh9yy9xaEhXC1kGcS4QC90izg=.e29b1fc3-b00c-4f86-92cc-b3060c27de41@github.com> Message-ID: On Thu, 10 Apr 2025 23:28:24 GMT, Chris Plummer wrote: >> Don't use includevirtualthreads=y unless the test requires it. Debuggers don't usually use includevirtualthreads=y, so we should be doing most of our testing without it. The only reason tests use it is because some tests need it so they can find virtual threads in the debuggee by using vm.allThreads(). This change limits the use of includevirtualthreads=y to just those com/sun/jdi tests that need it. >> >> Tested by running com/sun/jdi tests on all supported platforms in both platform threads mode and virtual threads mode. Also tested with tier1 CI. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright. This looks okay. Are you going to tweak more tests with this flag? 
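
For readers unfamiliar with the option being discussed, here is a minimal sketch of how a test harness might add includevirtualthreads=y to the JDWP agent options only when a test asks for it. The class and method names are hypothetical (they are not the helpers used in the PR); only the `-agentlib:jdwp` sub-options `transport`, `server`, `suspend` and `includevirtualthreads` are taken from the JDWP agent itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the JDK test library.
final class JdwpAgentOptions {

    /**
     * Builds the -agentlib:jdwp option for a debuggee.
     * includevirtualthreads=y is appended only when the test needs
     * vm.allThreads() to report every virtual thread; otherwise the
     * agent default (includevirtualthreads=n) is left in effect.
     */
    static String build(String transport, boolean needsAllVirtualThreads) {
        List<String> opts = new ArrayList<>();
        opts.add("transport=" + transport);
        opts.add("server=y");
        opts.add("suspend=y");
        if (needsAllVirtualThreads) {
            opts.add("includevirtualthreads=y");
        }
        return "-agentlib:jdwp=" + String.join(",", opts);
    }

    public static void main(String[] args) {
        System.out.println(build("dt_socket", false));
        System.out.println(build("dt_socket", true));
    }
}
```

With this shape, most tests exercise the default that debuggers typically use, and only the tests that opt in pay for tracking all virtual threads.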
------------- PR Review: https://git.openjdk.org/jdk/pull/24583#pullrequestreview-2770096043 From cjplummer at openjdk.org Tue Apr 15 23:21:43 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 15 Apr 2025 23:21:43 GMT Subject: RFR: 8353953: con/sun/jdi tests should be fixed to not always require includevirtualthreads=y [v3] In-Reply-To: References: <0CcAQV1gET_93X0Z8Fkh9yy9xaEhXC1kGcS4QC90izg=.e29b1fc3-b00c-4f86-92cc-b3060c27de41@github.com> Message-ID: On Tue, 15 Apr 2025 22:44:40 GMT, Serguei Spitsyn wrote: > This looks okay. Are you going to tweak more tests with this flag? No. Only the one test needs includevirtualthreads=y. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24583#issuecomment-2807746541 From amenkov at openjdk.org Tue Apr 15 23:54:20 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 15 Apr 2025 23:54:20 GMT Subject: RFR: 8354461: Update tests to disable streaming output for attach tools Message-ID: The change is a preparation step to enable attach streaming output by default. The fix updates a number of tests which fail with timeout in tier1..tier7 when attach streaming output is enabled. Details in the first comment. 
Testing: tier1..7 with attach streaming output enabled

-------------

Commit messages:
 - tests

Changes: https://git.openjdk.org/jdk/pull/24672/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24672&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8354461
  Stats: 159 lines in 19 files changed: 44 ins; 56 del; 59 mod
  Patch: https://git.openjdk.org/jdk/pull/24672.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24672/head:pull/24672

PR: https://git.openjdk.org/jdk/pull/24672

From amenkov at openjdk.org Tue Apr 15 23:54:20 2025
From: amenkov at openjdk.org (Alex Menkov)
Date: Tue, 15 Apr 2025 23:54:20 GMT
Subject: RFR: 8354461: Update tests to disable streaming output for attach tools
In-Reply-To: References: Message-ID:

On Tue, 15 Apr 2025 23:30:35 GMT, Alex Menkov wrote:

> The change is a preparation step to enable attach streaming output by default.
> The fix updates a number of tests which fail with timeout in tier1..tier7 when attach streaming output is enabled.
> Details in the first comment.
> Testing: tier1..7 with attach streaming output enabled

The tests use the same scenario: the test runs a tool (jcmd/jstack/jmap) that attaches to the main test process, and redirects the tool output for analysis.

We have 2 buffers here:
1. attach channel (socket or pipe depending on platform) - the attach operation handler writes, the tool reads.
2. tool redirection buffer - the tool writes, the test reads.

When the attach operation executes at a safepoint, attach streaming output is enabled, and the tool output is lengthy, both buffers can fill up and the test hangs:
- the attach handler blocks on the write operation (the tool needs to read from the socket/pipe);
- the tool reads data from the attach channel and blocks writing the data to its stdout (the test needs to read from the redirected stream);
- the test's redirector reader cannot read because the VM is at a safepoint executing the attach operation.
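
The stall described above can be reproduced in miniature. The following is a simulation sketch only, under stated assumptions: two bounded queues stand in for the attach channel and the tool's redirected stdout, a latch stands in for "the VM is at a safepoint until the attach operation finishes", and all names are hypothetical. No real attach API is used.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical simulation; none of these names exist in the JDK.
final class AttachStallSketch {

    /** Returns true if handler, tool and reader all finish within about two seconds. */
    static boolean completes(int chunks, int channelCap, int stdoutCap, boolean streaming) {
        BlockingQueue<Integer> attachChannel = new ArrayBlockingQueue<>(channelCap);
        BlockingQueue<Integer> toolStdout = new ArrayBlockingQueue<>(stdoutCap);
        CountDownLatch operationDone = new CountDownLatch(1);

        // Attach operation handler: writes its output to the attach channel.
        Thread handler = new Thread(() -> {
            try {
                if (!streaming) {
                    // Buffered mode: the operation completes first, output is sent afterwards.
                    operationDone.countDown();
                }
                for (int i = 0; i < chunks; i++) {
                    attachChannel.put(i); // blocks when the channel is full
                }
                operationDone.countDown(); // streaming mode: done only after the last write
            } catch (InterruptedException e) {
                // a stalled run was cancelled
            }
        });

        // The tool: copies data from the attach channel to its stdout.
        Thread tool = new Thread(() -> {
            try {
                for (int i = 0; i < chunks; i++) {
                    toolStdout.put(attachChannel.take());
                }
            } catch (InterruptedException e) {
                // cancelled
            }
        });

        // Test-side reader: lives in the target VM, so it cannot run during the operation.
        Thread reader = new Thread(() -> {
            try {
                operationDone.await();
                for (int i = 0; i < chunks; i++) {
                    toolStdout.take();
                }
            } catch (InterruptedException e) {
                // cancelled
            }
        });

        Thread[] parties = {handler, tool, reader};
        for (Thread t : parties) {
            t.setDaemon(true);
            t.start();
        }

        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(2);
        boolean finished = true;
        try {
            for (Thread t : parties) {
                long leftMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                t.join(Math.max(1, leftMs)); // never pass 0, which would wait forever
                finished &= !t.isAlive();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            finished = false;
        }
        for (Thread t : parties) {
            t.interrupt(); // release anything still blocked
        }
        return finished;
    }
}
```

In this model a streaming run stalls once the output exceeds what the two queues can hold, while the same output completes in buffered mode, and short output completes either way.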
To avoid the deadlocks, the tests are updated to disable attach streaming output (the attach operation output is buffered and sent after the operation has completed). ------------- PR Comment: https://git.openjdk.org/jdk/pull/24672#issuecomment-2807765324 From stuefe at openjdk.org Wed Apr 16 05:45:45 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 16 Apr 2025 05:45:45 GMT Subject: RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock [v5] In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 13:43:01 GMT, Robert Toyonaga wrote: >> ### Update: >> After some discussion it was decided it's not necessary to expand the lock scope for reserve/commit. Instead, we are opting to add comments explaining the reasons for locking and the conditions to avoid, which could lead to races. Some of the new tests can be kept because they are general enough to be useful outside of this context. >> >> ### Summary: >> This PR makes memory operations atomic with NMT accounting. >> >> ### The problem: >> In memory-related functions like `os::commit_memory` and `os::reserve_memory` the OS memory operations are currently done before acquiring the NMT mutex, and the virtual memory accounting is done later in `MemTracker`, after the lock has been acquired. Doing the memory operations outside of the lock scope can lead to races. >> >> 1.1 Thread_1 releases range_A. >> 1.2 Thread_1 tells NMT "range_A has been released". >> >> 2.1 Thread_2 reserves (the now free) range_A. >> 2.2 Thread_2 tells NMT "range_A is reserved". >> >> Since the sequence (1.1) (1.2) is not atomic, if Thread_2 begins operating after (1.1), we can have (1.1) (2.1) (2.2) (1.2). The OS sees two valid subsequent calls (release range_A, followed by map range_A). But NMT sees "reserve range_A", "release range_A" and is now out of sync with the OS.
>> >> ### Solution: >> Where memory operations such as reserve, commit, or release virtual memory happen, I've expanded the scope of `NmtVirtualMemoryLocker` to protect both the NMT accounting and the memory operation itself. >> >> ### Other notes: >> I also simplified this pattern found in many places: >> >> if (MemTracker::enabled()) { >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_some_operation(addr, bytes); >> if (result != nullptr) { >> MemTracker::record_some_operation(addr, bytes); >> } >> } else { >> result = pd_unmap_memory(addr, bytes); >> } >> ``` >> To: >> >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_unmap_memory(addr, bytes); >> MemTracker::record_some_operation(addr, bytes); >> ``` >> This is possible because `NmtVirtualMemoryLocker` now checks `MemTracker::enabled()`. `MemTracker::record_some_operation` already checks `MemTracker::enabled()` and checks against nullptr. This refactoring previously wasn't possible because `ThreadCritical` was used before https://github.com/openjdk/jdk/pull/22745 introduced `NmtVirtualMemoryLocker`. >> >> I considered moving the locking and NMT accounting down into platform specific code: Ex. lock around { munmap() + MemTracker:... > > Robert Toyonaga has updated the pull request incrementally with one additional commit since the last revision: > > improve tests and comments Okay, thank you for doing this. I am fine with this version. ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24084#pullrequestreview-2770845493 From duke at openjdk.org Wed Apr 16 09:10:53 2025 From: duke at openjdk.org (duke) Date: Wed, 16 Apr 2025 09:10:53 GMT Subject: Withdrawn: 8340434: Excessive Young GCs Triggered by CodeCache GC Threshold In-Reply-To: References: Message-ID: On Thu, 19 Sep 2024 08:43:50 GMT, sli-x wrote: > The trigger of _codecache_GC_threshold in CodeCache::gc_on_allocation is the key to this problem. 
> > if (used_ratio > threshold) { > // After threshold is reached, scale it by free_ratio so that more aggressive > // GC is triggered as we approach code cache exhaustion > threshold *= free_ratio; > } > // If code cache has been allocated without any GC at all, let's make sure > // it is eventually invoked to avoid trouble. > if (allocated_since_last_ratio > threshold) { > // In case the GC is concurrent, we make sure only one thread requests the GC. > if (Atomic::cmpxchg(&_unloading_threshold_gc_requested, false, true) == false) { > log_info(codecache)("Triggering threshold (%.3f%%) GC due to allocating %.3f%% since last unloading (%.3f%% used -> %.3f%% used)", > threshold * 100.0, allocated_since_last_ratio * 100.0, last_used_ratio * 100.0, used_ratio * 100.0); > Universe::heap()->collect(GCCause::_codecache_GC_threshold); > } > } > > Here with the limited codecache size, the free_ratio will get lower and lower (so as the threshold) if no methods can be swept and thus leads to a more and more frequent collection behavior. Since the collection happens in stw, the whole performance of gc will also be degraded. > > So a simple solution is to delete the scaling logic here. However, I think here lies some problems worth further exploring. > > There're two options to control a code cache sweeper, StartAggressiveSweepingAt and SweeperThreshold. StartAggressiveSweepingAt is a sweeper triggered for little space in codeCache and does little harm. However, SweeperThreshold, first introduced by [JDK-8244660](https://bugs.openjdk.org/browse/JDK-8244660), was designed for a regular sweep for codecache, when codeCache sweeper and heap collection are actually individual. After [JDK-8290025](https://bugs.openjdk.org/browse/JDK-8290025) and some patches related, the old mechanism of codeCache sweeper is merged into a concurrent heap collection. So the Code cache sweeper heuristics and the unloading behavior will be promised by the concurrent collection. 
There's no longer any "zombie" methods to be counted. Considering it will introduce lots of useless collection jobs, I think SweeperThreshold should be deleted now. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/21084 From amenkov at openjdk.org Wed Apr 16 23:01:52 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 16 Apr 2025 23:01:52 GMT Subject: RFR: 8353953: con/sun/jdi tests should be fixed to not always require includevirtualthreads=y [v3] In-Reply-To: References: <0CcAQV1gET_93X0Z8Fkh9yy9xaEhXC1kGcS4QC90izg=.e29b1fc3-b00c-4f86-92cc-b3060c27de41@github.com> Message-ID: <3qs7hGYD8kX3tWkU476fS1mqjtb4blZIKExeV1MV6Hk=.1496d532-54c2-4355-87dd-d0d538ff4243@github.com> On Thu, 10 Apr 2025 23:28:24 GMT, Chris Plummer wrote: >> Don't use includevirtualthreads=y unless the test requires it. Debuggers don't usually use includevirtualthreads=y, so we should be doing most of our testing without it. The only reason tests use it is because some tests need it so they can find virtual threads in the debuggee by using vm.allThreads(). This change limits the use of includevirtualthreads=y to just those com/sun/jdi tests that need it. >> >> Tested by running com/sun/jdi tests on all supported platforms in both platform threads mode and virtual threads mode. Also tested with tier1 CI. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright. Marked as reviewed by amenkov (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/24583#pullrequestreview-2774074451 From sspitsyn at openjdk.org Wed Apr 16 23:57:41 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 16 Apr 2025 23:57:41 GMT Subject: RFR: 8353953: con/sun/jdi tests should be fixed to not always require includevirtualthreads=y [v3] In-Reply-To: References: <0CcAQV1gET_93X0Z8Fkh9yy9xaEhXC1kGcS4QC90izg=.e29b1fc3-b00c-4f86-92cc-b3060c27de41@github.com> Message-ID: On Thu, 10 Apr 2025 23:28:24 GMT, Chris Plummer wrote: >> Don't use includevirtualthreads=y unless the test requires it. Debuggers don't usually use includevirtualthreads=y, so we should be doing most of our testing without it. The only reason tests use it is because some tests need it so they can find virtual threads in the debuggee by using vm.allThreads(). This change limits the use of includevirtualthreads=y to just those com/sun/jdi tests that need it. >> >> Tested by running com/sun/jdi tests on all supported platforms in both platform threads mode and virtual threads mode. Also tested with tier1 CI. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright. Looks good. ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24583#pullrequestreview-2774151111 From amenkov at openjdk.org Thu Apr 17 00:57:44 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Thu, 17 Apr 2025 00:57:44 GMT Subject: RFR: 8353955: nsk/jdi tests should be fixed to not always require includevirtualthreads=y In-Reply-To: References: Message-ID: On Fri, 11 Apr 2025 23:23:52 GMT, Chris Plummer wrote: > This is just a preliminary review. I'd like to get some approval for the approach I'm taking. There are over 300 tests that need to be fixed. I've just fixed a handful in this PR to give a feel for the changes I plan on making. 
If they look ok to you, then I'll update this PR with the needed changes to the rest of the tests. > > What this PR is fixing is the issue with all of our nsk/jdi testing being done with includevirtualthreads=y even though debuggers typically use the default includevirtualthreads=n. As a result we have a testing gap with includevirtualthreads=n. There are nearly 1200 nsk/jdi tests. Only about 350 actually need includevirtualthreads=y. I plan making includevirtualthreads=n the default for nsk/jdi tests unless the test does something to override the default and request includevirtualthreads=y. > > includevirtualthreads=y forces the debug agent to track all virtual threads so they are returned by vm.allThreads(). Some tests need this since they use vm.allThreads() to find the debuggee threads. Without includevirtualthreads=y, vm.allThreads() usually won't return any virtual threads (although it might return some for which events have been triggered). Used approach (call a special method of Binder class) is different from standard way of nsk framework to customize test settings. Standard way assumes settings like this are specified in "@run" as an option (like "-includevirtualthreads=y" or "-includevirtualthreads"), ArgumentHandler parses it (in this case maybe it should be parsed by nsk/share/jpda/DebugeeArgumentHandler.java) and provides a method to get the value, Binder calls the method and sets connector argument. 
I'm not a fan of this approach, but I think that handling different settings in different ways would make the code even harder to understand. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24606#issuecomment-2811367443 From cjplummer at openjdk.org Thu Apr 17 02:07:00 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 17 Apr 2025 02:07:00 GMT Subject: RFR: 8353955: nsk/jdi tests should be fixed to not always require includevirtualthreads=y In-Reply-To: References: Message-ID: On Thu, 17 Apr 2025 00:55:10 GMT, Alex Menkov wrote: > Used approach (call a special method of Binder class) is different from standard way of nsk framework to customize test settings. > Standard way assumes settings like this are specified in "@run" as an option (like "-includevirtualthreads=y" or "-includevirtualthreads"), > ArgumentHandler parses it (in this case maybe it should be parsed by nsk/share/jpda/DebugeeArgumentHandler.java) and provides a method to get the value, > Binder calls the method and sets connector argument. > I'm not a fan of this approach, but I think that handling different settings in different ways would make the code even harder to understand. I had considered the -includevirtualthreads approach, but I find nsk argument parsing to be very opaque (abstract?), and the argument handling scattered in various places. It was a challenge to figure out how the -includevirtualthreads option would be made visible to Binder. I think with your suggestions I might now have enough info to proceed with that approach, so I'll give it a try.
------------- PR Comment: https://git.openjdk.org/jdk/pull/24606#issuecomment-2811534310 From rvansa at openjdk.org Thu Apr 17 07:11:54 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 17 Apr 2025 07:11:54 GMT Subject: RFR: 8352075: Perf regression accessing fields Message-ID: This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 Iteration through the field stream of several variable-length-encoded items is limited by an inherent dependency: a field must be fully decoded before the position of the next field is known. On the critical codepath addressed by this PR, only the name and signature indices are used, though. Inspired by the Varint-GB and Varint-G81U encodings, the idea of this change is to add a control stream to the stream of field infos; each field uses one byte of this control stream, and currently the lower 6 bits encode the length of the encoded field, allowing more efficient skipping through the stream. The most significant bit duplicates the injected field flag (name/signature lookup requires that), one bit is currently unused. My measurements on the attached reproducer hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] Range (min … max): 45.1 ms … 53.9 ms 100 runs hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] Range (min … max): 73.8 ms … 79.7 ms 100 runs (the `jdk25-master` above already contains JDK-8353175) hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' Benchmark 1: /path/to/jdk25-this-pr/bin/java -cp /tmp CCC Time (mean ± σ): 51.8 ms ± 2.1 ms [User: 48.9 ms, System: 16.8 ms] Range (min … max): 47.4 ms … 55.1 ms 100 runs
So in case of the synthetic reproducer we're already on the level of JDK 17. However, the undisclosed production-grade reproducer still shows a regression, so there is still room for optimization. JDK 17: 1.6 s JDK 21 (no patches): 22 s JDK25-master: 12.3 s JDK25-this-pr: 3.1 s About the downsides: This PR increases memory consumption by 1 byte for each field in loaded classes. I've executed some tests on a simple Spring Boot application with 6800 instance classes loaded -> about 16k fields in total, with NMT on; the memory usage (Class.Metadata.used) seems to have grown by 32kB; this discrepancy could be investigated later on. Architecturally, I had to remove `const`ness from some methods on `FieldStreamBase`. I think that the memory usage difference is negligible in practice; if this matters, though, I have implemented two more optimizations in https://github.com/rvansa/jdk/tree/JDK-8352075-dev that should give the byte back, on average. However, these optimizations have reduced performance in all of the benchmarks, hence I would prefer not to apply them.
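As a rough illustration of the control-stream idea described in this message, the following is a minimal sketch and not the code in the PR. The names `make_control`, `control_length`, `control_injected` and `field_offset` are hypothetical, the choice of bit 7 for the injected flag follows the "most significant bit" wording, and the assumption that an encoded field never exceeds 63 bytes is mine (the message does not say how longer records are handled).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical control-byte layout, following the description in the message:
// lower 6 bits = length of the encoded field, top bit = injected-field flag.
constexpr std::uint8_t kLengthMask   = 0x3F;
constexpr std::uint8_t kInjectedFlag = 0x80;

// Packs one control byte. Assumption: lengths above 63 do not occur here.
inline std::uint8_t make_control(std::size_t encoded_length, bool injected) {
  assert(encoded_length <= kLengthMask);
  return static_cast<std::uint8_t>(encoded_length | (injected ? kInjectedFlag : 0u));
}

inline std::size_t control_length(std::uint8_t c) { return c & kLengthMask; }
inline bool control_injected(std::uint8_t c) { return (c & kInjectedFlag) != 0; }

// Byte offset of field `index` in the data stream, found by summing the
// lengths of the preceding fields without decoding any field payload.
inline std::size_t field_offset(const std::vector<std::uint8_t>& control,
                                std::size_t index) {
  assert(index <= control.size());
  std::size_t offset = 0;
  for (std::size_t i = 0; i < index; ++i) {
    offset += control_length(control[i]);
  }
  return offset;
}
```

The point of the sketch is only that skipping a field becomes a mask and an add on one byte, instead of decoding every variable-length item in the field.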
------------- Commit messages: - 8352075: Perf regression accessing fields Changes: https://git.openjdk.org/jdk/pull/24713/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24713&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8352075 Stats: 132 lines in 9 files changed: 74 ins; 9 del; 49 mod Patch: https://git.openjdk.org/jdk/pull/24713.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24713/head:pull/24713 PR: https://git.openjdk.org/jdk/pull/24713 From djelinski at openjdk.org Thu Apr 17 12:36:22 2025 From: djelinski at openjdk.org (Daniel Jeliński) Date: Thu, 17 Apr 2025 12:36:22 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled Message-ID: Currently if loading a core file fails, the diagnostic information provided on different systems is different; on MacOS we produce an error message, while on Linux we only print information if debug logging is enabled. This PR adds some new messages on Linux to match MacOSX, and changes some of the diagnostic output to error level instead of debug. Additionally, if opening the core or the exe file fails, the system error message (strerror) is printed. Tier1-3 testing clean.
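To illustrate the strerror part of this description, here is a minimal sketch under stated assumptions; it is not the libsaproc code. `open_or_describe` is a hypothetical helper, and the message wording is invented for the example. It shows only the general pattern of capturing `errno` right after a failed `open()` and including the system error text in the reported message.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <fcntl.h>
#include <string>

// Hypothetical helper: opens `path` read-only. On failure it returns a
// negative value and fills `error` with a message that includes the system
// error text, so the caller can report it at error level.
inline int open_or_describe(const char* path, std::string& error) {
  assert(path != nullptr);
  int fd = open(path, O_RDONLY);
  if (fd < 0) {
    int saved_errno = errno;  // save before any other call can overwrite it
    error = std::string("can't open ") + path + ": " + std::strerror(saved_errno);
  }
  return fd;
}
```

A caller would print `error` unconditionally rather than only when debug logging is on, which is the behavior change the description asks for.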
------------- Commit messages: - Update copyright - Report more information on error Changes: https://git.openjdk.org/jdk/pull/24722/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24722&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8354920 Stats: 21 lines in 2 files changed: 8 ins; 0 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/24722.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24722/head:pull/24722 PR: https://git.openjdk.org/jdk/pull/24722 From rvansa at openjdk.org Thu Apr 17 12:59:56 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 17 Apr 2025 12:59:56 GMT Subject: RFR: 8352075: Perf regression accessing fields [v2] In-Reply-To: References: Message-ID: > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 > > Iteration through the field stream of several variable-length-encoded items is limited by an inherent dependency: a field must be fully decoded before the position of the next field is known. On the critical codepath addressed by this PR, only the name and signature indices are used, though. > Inspired by the Varint-GB and Varint-G81U encodings, the idea of this change is to add a control stream to the stream of field infos; each field uses one byte of this control stream, and currently the lower 6 bits encode the length of the encoded field, allowing more efficient skipping through the stream. The most significant bit duplicates the injected field flag (name/signature lookup requires that), one bit is currently unused. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min … max): 45.1 ms … 53.9 ms 100 runs
> > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min … max): 73.8 ms … 79.7 ms 100 runs > > (the `jdk25-master` above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/bin/java -cp /tmp CCC > Time (mean ± σ): 51.8 ms ± 2.1 ms [User: 48.9 ms, System: 16.8 ms] > Range (min … max): 47.4 ms … 55.1 ms 100 runs > > So in case of the synthetic reproducer we're already on the level of JDK 17. However, the undisclosed production-grade reproducer still shows a regression, so there is still room for optimization. > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 3.1 s > > > About the downsides: This PR increases memory consumption by 1 byte for each field in loaded classes. I've executed some tests on a simple Spring Boot application with 6800 instance classes loaded -> about 16k fields in total, with NMT on; the memory usage (Class.Metadata.used) seems to have grown by 32kB; this discrepancy could be investigated later on. > > Architecturally, I had to remove `const`ness from some methods on ...
Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Make skip_bytes private ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24713/files - new: https://git.openjdk.org/jdk/pull/24713/files/69a50374..930e47fd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24713&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24713&range=00-01 Stats: 5 lines in 1 file changed: 1 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/24713.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24713/head:pull/24713 PR: https://git.openjdk.org/jdk/pull/24713 From cjplummer at openjdk.org Thu Apr 17 15:20:42 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 17 Apr 2025 15:20:42 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled In-Reply-To: References: Message-ID: On Thu, 17 Apr 2025 12:27:55 GMT, Daniel Jeliński wrote: > Currently if loading a core file fails, the diagnostic information provided on different systems is different; on MacOS we produce an error message, while on Linux we only print information if debug logging is enabled. > > This PR adds some new messages on Linux to match MacOSX, and changes some of the diagnostic output to error level instead of debug. Additionally, if opening the core or the exe file fails, the system error message (strerror) is printed. > > Tier1-3 testing clean. This is a good start for getting a reasonable error message without having to first enable LIBSAPROC_DEBUG. A couple of suggestions for additional improvements below: You've covered the top level error message in Pgrab_core(), but there are many other print_debug messages that are for errors that are closer to the root cause of the error. For example, look in core_handle_note() and read_lib_info().
Also, Pgrab() needs updating, although it only has one debug message, but it also is missing detecting when add_thread_info() or read_lib_info() fails. add_map_info() and core_handle_prstatus() are lacking a print_error message when they fail. ------------- Changes requested by cjplummer (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24722#pullrequestreview-2776146648 From cjplummer at openjdk.org Thu Apr 17 15:39:55 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 17 Apr 2025 15:39:55 GMT Subject: RFR: 8353953: con/sun/jdi tests should be fixed to not always require includevirtualthreads=y [v3] In-Reply-To: References: <0CcAQV1gET_93X0Z8Fkh9yy9xaEhXC1kGcS4QC90izg=.e29b1fc3-b00c-4f86-92cc-b3060c27de41@github.com> Message-ID: On Thu, 10 Apr 2025 23:28:24 GMT, Chris Plummer wrote: >> Don't use includevirtualthreads=y unless the test requires it. Debuggers don't usually use includevirtualthreads=y, so we should be doing most of our testing without it. The only reason tests use it is because some tests need it so they can find virtual threads in the debuggee by using vm.allThreads(). This change limits the use of includevirtualthreads=y to just those com/sun/jdi tests that need it. >> >> Tested by running com/sun/jdi tests on all supported platforms in both platform threads mode and virtual threads mode. Also tested with tier1 CI. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright. Thank you for the reviews Alex and Serguei! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24583#issuecomment-2813349485 From cjplummer at openjdk.org Thu Apr 17 15:39:56 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 17 Apr 2025 15:39:56 GMT Subject: Integrated: 8353953: con/sun/jdi tests should be fixed to not always require includevirtualthreads=y In-Reply-To: <0CcAQV1gET_93X0Z8Fkh9yy9xaEhXC1kGcS4QC90izg=.e29b1fc3-b00c-4f86-92cc-b3060c27de41@github.com> References: <0CcAQV1gET_93X0Z8Fkh9yy9xaEhXC1kGcS4QC90izg=.e29b1fc3-b00c-4f86-92cc-b3060c27de41@github.com> Message-ID: <_wLi62f62ztkbdvMJrPKjfhef2pGnkp6bOH4305hYgI=.53ed5cc3-4049-45b3-bbe8-6c60f60b4f31@github.com> On Thu, 10 Apr 2025 20:06:25 GMT, Chris Plummer wrote: > Don't use includevirtualthreads=y unless the test requires it. Debuggers don't usually use includevirtualthreads=y, so we should be doing most of our testing without it. The only reason tests use it is because some tests need it so they can find virtual threads in the debuggee by using vm.allThreads(). This change limits the use of includevirtualthreads=y to just those com/sun/jdi tests that need it. > > Tested by running com/sun/jdi tests on all supported platforms in both platform threads mode and virtual threads mode. Also tested with tier1 CI. This pull request has now been integrated. Changeset: d1d81dd0 Author: Chris Plummer URL: https://git.openjdk.org/jdk/commit/d1d81dd01ca6f3fc1e4710e6055c5a3185f43d9a Stats: 17 lines in 4 files changed: 9 ins; 0 del; 8 mod 8353953: con/sun/jdi tests should be fixed to not always require includevirtualthreads=y Reviewed-by: sspitsyn, amenkov ------------- PR: https://git.openjdk.org/jdk/pull/24583 From cjplummer at openjdk.org Thu Apr 17 20:35:19 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 17 Apr 2025 20:35:19 GMT Subject: RFR: 8353955: nsk/jdi tests should be fixed to not always require includevirtualthreads=y [v2] In-Reply-To: References: Message-ID: > This is just a preliminary review. 
I'd like to get some approval for the approach I'm taking. There are over 300 tests that need to be fixed. I've just fixed a handful in this PR to give a feel for the changes I plan on making. If they look ok to you, then I'll update this PR with the needed changes to the rest of the tests. > > What this PR is fixing is the issue with all of our nsk/jdi testing being done with includevirtualthreads=y even though debuggers typically use the default includevirtualthreads=n. As a result we have a testing gap with includevirtualthreads=n. There are nearly 1200 nsk/jdi tests. Only about 350 actually need includevirtualthreads=y. I plan making includevirtualthreads=n the default for nsk/jdi tests unless the test does something to override the default and request includevirtualthreads=y. > > includevirtualthreads=y forces the debug agent to track all virtual threads so they are returned by vm.allThreads(). Some tests need this since they use vm.allThreads() to find the debuggee threads. Without includevirtualthreads=y, vm.allThreads() usually won't return any virtual threads (although it might return some for which events have been triggered). Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: Add support for -includevirtualthreads test argument. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/24606/files - new: https://git.openjdk.org/jdk/pull/24606/files/a3298d3f..d59e2ee9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24606&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24606&range=00-01 Stats: 74 lines in 32 files changed: 31 ins; 34 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/24606.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24606/head:pull/24606 PR: https://git.openjdk.org/jdk/pull/24606 From cjplummer at openjdk.org Thu Apr 17 20:37:41 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 17 Apr 2025 20:37:41 GMT Subject: RFR: 8353955: nsk/jdi tests should be fixed to not always require includevirtualthreads=y In-Reply-To: References: Message-ID: On Fri, 11 Apr 2025 23:23:52 GMT, Chris Plummer wrote: > This is just a preliminary review. I'd like to get some approval for the approach I'm taking. There are over 300 tests that need to be fixed. I've just fixed a handful in this PR to give a feel for the changes I plan on making. If they look ok to you, then I'll update this PR with the needed changes to the rest of the tests. > > What this PR is fixing is the issue with all of our nsk/jdi testing being done with includevirtualthreads=y even though debuggers typically use the default includevirtualthreads=n. As a result we have a testing gap with includevirtualthreads=n. There are nearly 1200 nsk/jdi tests. Only about 350 actually need includevirtualthreads=y. I plan making includevirtualthreads=n the default for nsk/jdi tests unless the test does something to override the default and request includevirtualthreads=y. > > includevirtualthreads=y forces the debug agent to track all virtual threads so they are returned by vm.allThreads(). Some tests need this since they use vm.allThreads() to find the debuggee threads. 
Without includevirtualthreads=y, vm.allThreads() usually won't return any virtual threads (although it might return some for which events have been triggered). Ok, I've switched to using -includevirtualthreads as a test argument on the @run command. This approach actually made things simpler for subclasses of TestDebuggerType2, including forceEarlyReturn002 and the SerialExecutionDebugger subclasses. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24606#issuecomment-2813969938 From amenkov at openjdk.org Thu Apr 17 21:47:41 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Thu, 17 Apr 2025 21:47:41 GMT Subject: RFR: 8353955: nsk/jdi tests should be fixed to not always require includevirtualthreads=y [v2] In-Reply-To: References: Message-ID: On Thu, 17 Apr 2025 20:35:19 GMT, Chris Plummer wrote: >> This is just a preliminary review. I'd like to get some approval for the approach I'm taking. There are over 300 tests that need to be fixed. I've just fixed a handful in this PR to give a feel for the changes I plan on making. If they look ok to you, then I'll update this PR with the needed changes to the rest of the tests. >> >> What this PR is fixing is the issue with all of our nsk/jdi testing being done with includevirtualthreads=y even though debuggers typically use the default includevirtualthreads=n. As a result we have a testing gap with includevirtualthreads=n. There are nearly 1200 nsk/jdi tests. Only about 350 actually need includevirtualthreads=y. I plan making includevirtualthreads=n the default for nsk/jdi tests unless the test does something to override the default and request includevirtualthreads=y. >> >> includevirtualthreads=y forces the debug agent to track all virtual threads so they are returned by vm.allThreads(). Some tests need this since they use vm.allThreads() to find the debuggee threads.
Without includevirtualthreads=y, vm.allThreads() usually won't return any virtual threads (although it might return some for which events have been triggered). > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Add support for -includevirtualthreads test argument. The update looks good to me. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24606#issuecomment-2814082464 From sspitsyn at openjdk.org Fri Apr 18 08:33:41 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 18 Apr 2025 08:33:41 GMT Subject: RFR: 8353955: nsk/jdi tests should be fixed to not always require includevirtualthreads=y [v2] In-Reply-To: References: Message-ID: On Thu, 17 Apr 2025 20:35:19 GMT, Chris Plummer wrote: >> This is just a preliminary review. I'd like to get some approval for the approach I'm taking. There are over 300 tests that need to be fixed. I've just fixed a handful in this PR to give a feel for the changes I plan on making. If they look ok to you, then I'll update this PR with the needed changes to the rest of the tests. >> >> What this PR is fixing is the issue with all of our nsk/jdi testing being done with includevirtualthreads=y even though debuggers typically use the default includevirtualthreads=n. As a result we have a testing gap with includevirtualthreads=n. There are nearly 1200 nsk/jdi tests. Only about 350 actually need includevirtualthreads=y. I plan making includevirtualthreads=n the default for nsk/jdi tests unless the test does something to override the default and request includevirtualthreads=y. >> >> includevirtualthreads=y forces the debug agent to track all virtual threads so they are returned by vm.allThreads(). Some tests need this since they use vm.allThreads() to find the debuggee threads. Without includevirtualthreads=y, vm.allThreads() usually won't return any virtual threads (although it might return some for which events have been triggered).
> > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Add support for -includevirtualthreads test argument. This looks good and somewhat simpler. Some copyright headers need to be updated. Should we expect more test updates in this PR? ------------- PR Review: https://git.openjdk.org/jdk/pull/24606#pullrequestreview-2778082880 From djelinski at openjdk.org Fri Apr 18 09:30:24 2025 From: djelinski at openjdk.org (Daniel Jeliński) Date: Fri, 18 Apr 2025 09:30:24 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled [v2] In-Reply-To: References: Message-ID: > Currently if loading a core file fails, the diagnostic information provided on different systems is different; on MacOS we produce an error message, while on Linux we only print information if debug logging is enabled. > > This PR adds some new messages on Linux to match MacOSX, and changes some of the diagnostic output to error level instead of debug. Additionally, if opening the core or the exe file fails, the system error message (strerror) is printed. > > Tier1-3 testing clean.
Daniel Jeli?ski has updated the pull request incrementally with eight additional commits since the last revision: - Update copyright - Add more error messages - Add more error messages - Add more error messages - Add more error messages - Add more error messages - Add more error messages - Add more error messages ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24722/files - new: https://git.openjdk.org/jdk/pull/24722/files/08a4731d..a9341c3a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24722&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24722&range=00-01 Stats: 92 lines in 6 files changed: 43 ins; 0 del; 49 mod Patch: https://git.openjdk.org/jdk/pull/24722.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24722/head:pull/24722 PR: https://git.openjdk.org/jdk/pull/24722 From djelinski at openjdk.org Fri Apr 18 09:37:46 2025 From: djelinski at openjdk.org (Daniel =?UTF-8?B?SmVsacWEc2tp?=) Date: Fri, 18 Apr 2025 09:37:46 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled [v2] In-Reply-To: References: Message-ID: On Thu, 17 Apr 2025 15:18:11 GMT, Chris Plummer wrote: >> Daniel Jeli?ski has updated the pull request incrementally with eight additional commits since the last revision: >> >> - Update copyright >> - Add more error messages >> - Add more error messages >> - Add more error messages >> - Add more error messages >> - Add more error messages >> - Add more error messages >> - Add more error messages > > This is a good start for getting a reasonable error message without having to first enable LIBSAPROC_DEBUG. A couple of suggestions for additional improvements below: > > You've covered the top level error message in Pgrab_core(), but there are many other print_debug messages that are for errors that are closer to the root cause of the error. For example, look in core_handle_note() and read_lib_info(). 
Also, Pgrab() needs updating: it has only one debug message, and it also fails to detect when add_thread_info() or read_lib_info() fails. > > add_map_info() and core_handle_prstatus() are lacking a print_error message when they fail. Thanks @plummercj for the review! I added a few more print_error messages, converted some printf and print_debug to print_error, and added a new print_debug log to add_class_share_map_info, because the failure of that method is not fatal. Let me know if that looks better. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24722#issuecomment-2815080329 From cjplummer at openjdk.org Fri Apr 18 16:32:49 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 18 Apr 2025 16:32:49 GMT Subject: RFR: 8353955: nsk/jdi tests should be fixed to not always require includevirtualthreads=y [v2] In-Reply-To: References: Message-ID: On Fri, 18 Apr 2025 08:30:44 GMT, Serguei Spitsyn wrote: > Should we expect more test updates in this PR? Yes. This is just a sampling of changes that will be needed to complete this PR. I may close it for a while to cut down on the noise while I continue to work on it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24606#issuecomment-2815807308 From cjplummer at openjdk.org Fri Apr 18 18:17:48 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 18 Apr 2025 18:17:48 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled [v2] In-Reply-To: References: Message-ID: On Fri, 18 Apr 2025 09:30:24 GMT, Daniel Jeliński wrote: >> Currently if loading a core file fails, the diagnostic information provided on different systems is different; on MacOS we produce an error message, while on Linux we only print information if debug logging is enabled. >> >> This PR adds some new messages on Linux to match MacOSX, and changes some of the diagnostic output to error level instead of debug.
Additionally, if opening the core or the exe file fails, the system error message (strerror) is printed. >> >> Tier1-3 testing clean. > > Daniel Jeli?ski has updated the pull request incrementally with eight additional commits since the last revision: > > - Update copyright > - Add more error messages > - Add more error messages > - Add more error messages > - Add more error messages > - Add more error messages > - Add more error messages > - Add more error messages Overall this looks great. Something I've wanted to do for a long time now but have never gotten around to. There are some places where I think you end up doubling up on error messages. For example, add_map_info() now prints an error message and so do its callers, but that's probably ok. Please run all the SA tests in test/hotspot/jtreg/serviceability/sa and test/jdk/sun/tools/jhsdb on all our supported platforms. Also test by hand attaching to both a core file and a process just to make sure you haven't turned any expected debug messages into error messages. Thanks! ------------- Marked as reviewed by cjplummer (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24722#pullrequestreview-2779161259 From lmesnik at openjdk.org Mon Apr 21 19:27:41 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Mon, 21 Apr 2025 19:27:41 GMT Subject: RFR: 8355071: Fix nsk/jdi test to not require lookup of main thread in order to set the breakpoint used for communication In-Reply-To: <1ypMGnP9qDksi_qbt7eM4rCUaF5aJRqwZW0Zf4Up-cA=.ea153c88-748f-454d-aa00-986ae71694b4@github.com> References: <1ypMGnP9qDksi_qbt7eM4rCUaF5aJRqwZW0Zf4Up-cA=.ea153c88-748f-454d-aa00-986ae71694b4@github.com> Message-ID: On Sat, 19 Apr 2025 05:19:22 GMT, Chris Plummer wrote: > Remove the need for many nsk/jdi tests to discover the main thread, resulting in the test needing to be run with includevirtualthreads=y. Details in first comment. 
> > Tested with all tier2, tier3, and tier5 svc tests Good to see more simplification of common code patterns. See my small comment inline. Also, you need to update the copyrights for the changed files. test/hotspot/jtreg/vmTestbase/nsk/share/jdi/JDIBase.java line 122: > 120: breakpRequest = eventRManager.createBreakpointRequest(lineLocation); > 121: breakpRequest.putProperty("number", property); > 122: if (thread != null) { The change might hide a failure if the thread is not set in the test. I would prefer to have a private settingBreakpoint(ThreadReference thread,.,.) that allows null and a protected final BreakpointRequest settingBreakpoint(ThreadReference thread, ReferenceType testedClass, String methodName, String bpLine, String property) that still fails early if thread is null. (I think it currently fails at the `breakpRequest.addThreadFilter(thread);` line.) So any test that doesn't have a proper thread fails early when setting the breakpoint, and not because it hasn't found it. ------------- Changes requested by lmesnik (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24768#pullrequestreview-2782060944 PR Review Comment: https://git.openjdk.org/jdk/pull/24768#discussion_r2052889087 From cjplummer at openjdk.org Mon Apr 21 20:03:28 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 21 Apr 2025 20:03:28 GMT Subject: RFR: 8355071: Fix nsk/jdi test to not require lookup of main thread in order to set the breakpoint used for communication [v2] In-Reply-To: <1ypMGnP9qDksi_qbt7eM4rCUaF5aJRqwZW0Zf4Up-cA=.ea153c88-748f-454d-aa00-986ae71694b4@github.com> References: <1ypMGnP9qDksi_qbt7eM4rCUaF5aJRqwZW0Zf4Up-cA=.ea153c88-748f-454d-aa00-986ae71694b4@github.com> Message-ID: > Remove the need for many nsk/jdi tests to discover the main thread, resulting in the test needing to be run with includevirtualthreads=y. Details in first comment.
> > Tested with all tier2, tier3, and tier5 svc tests Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: Add new settingBreakpoint() API for when thread is null ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24768/files - new: https://git.openjdk.org/jdk/pull/24768/files/934cde70..2b019b9d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24768&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24768&range=00-01 Stats: 22 lines in 1 file changed: 21 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24768.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24768/head:pull/24768 PR: https://git.openjdk.org/jdk/pull/24768 From cjplummer at openjdk.org Mon Apr 21 20:03:28 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 21 Apr 2025 20:03:28 GMT Subject: RFR: 8355071: Fix nsk/jdi test to not require lookup of main thread in order to set the breakpoint used for communication [v2] In-Reply-To: References: <1ypMGnP9qDksi_qbt7eM4rCUaF5aJRqwZW0Zf4Up-cA=.ea153c88-748f-454d-aa00-986ae71694b4@github.com> Message-ID: On Mon, 21 Apr 2025 19:20:35 GMT, Leonid Mesnik wrote: >> Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: >> >> Add new settingBreakpoint() API for when thread is null > > test/hotspot/jtreg/vmTestbase/nsk/share/jdi/JDIBase.java line 122: > >> 120: breakpRequest = eventRManager.createBreakpointRequest(lineLocation); >> 121: breakpRequest.putProperty("number", property); >> 122: if (thread != null) { > > The change might hide failure if thread is not set in the test. > I would prefer to have > private settingBreakpoint(ThreadReference thread,.,.) that allows null > and > > protected final BreakpointRequest settingBreakpoint(ThreadReference thread, > ReferenceType testedClass, > String methodName, > String bpLine, > String property) > > that still fails early if thread is null. 
(I think now it should fail in > ` > breakpRequest.addThreadFilter(thread); > ` > string. > So for any test that don't have proper thread - test fails early when setting breakpoint and not because it hasn't find it. Ok. I updated to use a different API that implies no thread. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24768#discussion_r2052929415 From cjplummer at openjdk.org Mon Apr 21 21:58:17 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 21 Apr 2025 21:58:17 GMT Subject: RFR: 8355071: Fix nsk/jdi test to not require lookup of main thread in order to set the breakpoint used for communication [v3] In-Reply-To: <1ypMGnP9qDksi_qbt7eM4rCUaF5aJRqwZW0Zf4Up-cA=.ea153c88-748f-454d-aa00-986ae71694b4@github.com> References: <1ypMGnP9qDksi_qbt7eM4rCUaF5aJRqwZW0Zf4Up-cA=.ea153c88-748f-454d-aa00-986ae71694b4@github.com> Message-ID: > Remove the need for many nsk/jdi tests to discover the main thread, resulting in the test needing to be run with includevirtualthreads=y. Details in first comment. 
> > Tested with all tier2, tier3, and tier5 svc tests Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: Update copyrights ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24768/files - new: https://git.openjdk.org/jdk/pull/24768/files/2b019b9d..0256b41d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24768&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24768&range=01-02 Stats: 189 lines in 189 files changed: 0 ins; 0 del; 189 mod Patch: https://git.openjdk.org/jdk/pull/24768.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24768/head:pull/24768 PR: https://git.openjdk.org/jdk/pull/24768 From kevinw at openjdk.org Tue Apr 22 09:21:53 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Tue, 22 Apr 2025 09:21:53 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled [v2] In-Reply-To: References: Message-ID: <8juvsEKibdh_PFrnwi1td0Ef22SjFp8iVffif7q4hPQ=.1f92c328-0d96-48cd-9b85-b13011bec536@github.com> On Fri, 18 Apr 2025 09:30:24 GMT, Daniel Jeli?ski wrote: >> Currently if loading a core file fails, the diagnostic information provided on different systems is different; on MacOS we produce an error message, while on Linux we only print information if debug logging is enabled. >> >> This PR adds some new messages on Linux to match MacOSX, and changes some of the diagnostic output to error level instead of debug. Additionally, if opening the core or the exe file fails, the system error message (strerror) is printed. >> >> Tier1-3 testing clean. 
> > Daniel Jeli?ski has updated the pull request incrementally with eight additional commits since the last revision: > > - Update copyright > - Add more error messages > - Add more error messages > - Add more error messages > - Add more error messages > - Add more error messages > - Add more error messages > - Add more error messages src/jdk.hotspot.agent/linux/native/libsaproc/ps_core.c line 321: > 319: ELF_PHDR* core_php = NULL; > 320: > 321: if ((phbuf = read_program_header_table(ph->core->core_fd, core_ehdr)) == NULL) { odd double space, an existing problem but nice to remove while we're here! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24722#discussion_r2053712711 From kevinw at openjdk.org Tue Apr 22 09:31:54 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Tue, 22 Apr 2025 09:31:54 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled [v2] In-Reply-To: References: Message-ID: <629bfxW7GyGhNwB7wreZckcLXeWCSukQoRPt_zDi6LM=.640f2be7-0ef7-410e-bb77-8bfd062f4ff0@github.com> On Fri, 18 Apr 2025 09:30:24 GMT, Daniel Jeli?ski wrote: >> Currently if loading a core file fails, the diagnostic information provided on different systems is different; on MacOS we produce an error message, while on Linux we only print information if debug logging is enabled. >> >> This PR adds some new messages on Linux to match MacOSX, and changes some of the diagnostic output to error level instead of debug. Additionally, if opening the core or the exe file fails, the system error message (strerror) is printed. >> >> Tier1-3 testing clean. 
> > Daniel Jeli?ski has updated the pull request incrementally with eight additional commits since the last revision: > > - Update copyright > - Add more error messages > - Add more error messages > - Add more error messages > - Add more error messages > - Add more error messages > - Add more error messages > - Add more error messages src/jdk.hotspot.agent/linux/native/libsaproc/ps_proc.c line 450: > 448: if ( (ph = (struct ps_prochandle*) calloc(1, sizeof(struct ps_prochandle))) == NULL) { > 449: snprintf(err_buf, err_buf_len, "can't allocate memory for ps_prochandle"); > 450: print_error("%s\n", err_buf); This might be one that would print it twice - the caller in LinuxDebuggerLocal.cpp will print what's in err_buf. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24722#discussion_r2053731498 From kevinw at openjdk.org Tue Apr 22 09:49:49 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Tue, 22 Apr 2025 09:49:49 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled [v2] In-Reply-To: <629bfxW7GyGhNwB7wreZckcLXeWCSukQoRPt_zDi6LM=.640f2be7-0ef7-410e-bb77-8bfd062f4ff0@github.com> References: <629bfxW7GyGhNwB7wreZckcLXeWCSukQoRPt_zDi6LM=.640f2be7-0ef7-410e-bb77-8bfd062f4ff0@github.com> Message-ID: On Tue, 22 Apr 2025 09:28:52 GMT, Kevin Walls wrote: >> Daniel Jeli?ski has updated the pull request incrementally with eight additional commits since the last revision: >> >> - Update copyright >> - Add more error messages >> - Add more error messages >> - Add more error messages >> - Add more error messages >> - Add more error messages >> - Add more error messages >> - Add more error messages > > src/jdk.hotspot.agent/linux/native/libsaproc/ps_proc.c line 450: > >> 448: if ( (ph = (struct ps_prochandle*) calloc(1, sizeof(struct ps_prochandle))) == NULL) { >> 449: snprintf(err_buf, err_buf_len, "can't allocate memory for ps_prochandle"); >> 450: print_error("%s\n", err_buf); > > This might be 
one that would print it twice - the caller in LinuxDebuggerLocal.cpp will print what's in err_buf. Looks like here in Pgrab the intent is to snprintf an error message in err_buf, and the caller prints it. So Pgrab should not need the print_error calls. Some functions which are called, like read_lib_info, may print their own specific error message, then Pgrab stores an error that "failed to read lib info".. These are reporting the same thing but I don't see that as a problem or duplication, it could be really useful to track where things fail. Yes it's a little annoying that Pgrab_core is different, it doesn't take an error_buf so needs to print_error about any problems. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24722#discussion_r2053763445 From duke at openjdk.org Tue Apr 22 12:59:48 2025 From: duke at openjdk.org (Robert Toyonaga) Date: Tue, 22 Apr 2025 12:59:48 GMT Subject: RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock [v5] In-Reply-To: References: Message-ID: On Wed, 9 Apr 2025 13:43:01 GMT, Robert Toyonaga wrote: >> ### Update: >> After some discussion it was decided it's not necessary to expand the lock scope for reserve/commit. Instead, we are opting to add comments explaining the reasons for locking and the conditions to avoid which could lead to races. Some of the new tests can be kept because they are general enough to be useful outside of this context. >> >> ### Summary: >> This PR makes memory operations atomic with NMT accounting. >> >> ### The problem: >> In memory related functions like `os::commit_memory` and `os::reserve_memory` the OS memory operations are currently done before acquiring the the NMT mutex. And the the virtual memory accounting is done later in `MemTracker`, after the lock has been acquired. Doing the memory operations outside of the lock scope can lead to races. >> >> 1.1 Thread_1 releases range_A. >> 1.2 Thread_1 tells NMT "range_A has been released". 
>> >> 2.1 Thread_2 reserves (the now free) range_A. >> 2.2 Thread_2 tells NMT "range_A is reserved". >> >> Since the sequence (1.1) (1.2) is not atomic, if Thread_2 begins operating after (1.1), we can have (1.1) (2.1) (2.2) (1.2). The OS sees two valid subsequent calls (release range_A, followed by map range_A). But NMT sees "reserve range_A", "release range_A" and is now out of sync with the OS. >> >> ### Solution: >> Where memory operations such as reserve, commit, or release virtual memory happen, I've expanded the scope of `NmtVirtualMemoryLocker` to protect both the NMT accounting and the memory operation itself. >> >> ### Other notes: >> I also simplified this pattern found in many places: >> >> if (MemTracker::enabled()) { >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_some_operation(addr, bytes); >> if (result != nullptr) { >> MemTracker::record_some_operation(addr, bytes); >> } >> } else { >> result = pd_unmap_memory(addr, bytes); >> } >> ``` >> To: >> >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_unmap_memory(addr, bytes); >> MemTracker::record_some_operation(addr, bytes); >> ``` >> This is possible because `NmtVirtualMemoryLocker` now checks `MemTracker::enabled()`. `MemTracker::record_some_operation` already checks `MemTracker::enabled()` and checks against nullptr. This refactoring previously wasn't possible because `ThreadCritical` was used before https://github.com/openjdk/jdk/pull/22745 introduced `NmtVirtualMemoryLocker`. >> >> I considered moving the locking and NMT accounting down into platform specific code: Ex. lock around { munmap() + MemTracker:... > > Robert Toyonaga has updated the pull request incrementally with one additional commit since the last revision: > > improve tests and comments Thank you everyone for the review! 
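
For illustration, the interleaving described in the quoted "The problem" section can be replayed deterministically. This is a minimal Java sketch and not HotSpot code: the class `NmtRaceModel`, its two sets, and the step labels are hypothetical stand-ins for the OS mapping state and the NMT bookkeeping, assumed here only to show why the order (1.1) (2.1) (2.2) (1.2) leaves the two views out of sync.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model, not HotSpot code: "os" is what the OS has mapped,
// "nmt" is what the tracker believes is reserved.
final class NmtRaceModel {
    final Set<String> os = new HashSet<>();
    final Set<String> nmt = new HashSet<>();

    // Starts with range_A reserved and tracked, then applies the steps in the given order.
    static boolean inSyncAfter(String... steps) {
        NmtRaceModel m = new NmtRaceModel();
        m.os.add("range_A");
        m.nmt.add("range_A");
        for (String step : steps) {
            switch (step) {
                case "1.1": m.os.remove("range_A"); break;   // Thread_1 releases range_A
                case "1.2": m.nmt.remove("range_A"); break;  // Thread_1 tells the tracker "released"
                case "2.1": m.os.add("range_A"); break;      // Thread_2 reserves range_A
                case "2.2": m.nmt.add("range_A"); break;     // Thread_2 tells the tracker "reserved"
                default: throw new IllegalArgumentException("unknown step " + step);
            }
        }
        return m.os.equals(m.nmt);
    }

    public static void main(String[] args) {
        // Each thread's operation and its accounting stay together.
        System.out.println("atomic order in sync: "
                + inSyncAfter("1.1", "1.2", "2.1", "2.2"));
        // Thread_2 runs between Thread_1's release and its accounting.
        System.out.println("interleaved order in sync: "
                + inSyncAfter("1.1", "2.1", "2.2", "1.2"));
    }
}
```

In the interleaved order the model ends with the range mapped by the OS but absent from the tracker, which matches the "reserve range_A", "release range_A" sequence that the description says NMT observes.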
------------- PR Comment: https://git.openjdk.org/jdk/pull/24084#issuecomment-2821246755 From duke at openjdk.org Tue Apr 22 12:59:48 2025 From: duke at openjdk.org (duke) Date: Tue, 22 Apr 2025 12:59:48 GMT Subject: RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock [v5] In-Reply-To: References: Message-ID: <0Cp3I07oKDZOqy364yOF4eCumeOvWsUJl1HT4NuRvOg=.042a81a2-f823-4459-90b3-ee611c5bbe03@github.com> On Wed, 9 Apr 2025 13:43:01 GMT, Robert Toyonaga wrote: >> ### Update: >> After some discussion it was decided it's not necessary to expand the lock scope for reserve/commit. Instead, we are opting to add comments explaining the reasons for locking and the conditions to avoid which could lead to races. Some of the new tests can be kept because they are general enough to be useful outside of this context. >> >> ### Summary: >> This PR makes memory operations atomic with NMT accounting. >> >> ### The problem: >> In memory related functions like `os::commit_memory` and `os::reserve_memory` the OS memory operations are currently done before acquiring the the NMT mutex. And the the virtual memory accounting is done later in `MemTracker`, after the lock has been acquired. Doing the memory operations outside of the lock scope can lead to races. >> >> 1.1 Thread_1 releases range_A. >> 1.2 Thread_1 tells NMT "range_A has been released". >> >> 2.1 Thread_2 reserves (the now free) range_A. >> 2.2 Thread_2 tells NMT "range_A is reserved". >> >> Since the sequence (1.1) (1.2) is not atomic, if Thread_2 begins operating after (1.1), we can have (1.1) (2.1) (2.2) (1.2). The OS sees two valid subsequent calls (release range_A, followed by map range_A). But NMT sees "reserve range_A", "release range_A" and is now out of sync with the OS. 
>> >> ### Solution: >> Where memory operations such as reserve, commit, or release virtual memory happen, I've expanded the scope of `NmtVirtualMemoryLocker` to protect both the NMT accounting and the memory operation itself. >> >> ### Other notes: >> I also simplified this pattern found in many places: >> >> if (MemTracker::enabled()) { >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_some_operation(addr, bytes); >> if (result != nullptr) { >> MemTracker::record_some_operation(addr, bytes); >> } >> } else { >> result = pd_unmap_memory(addr, bytes); >> } >> ``` >> To: >> >> MemTracker::NmtVirtualMemoryLocker nvml; >> result = pd_unmap_memory(addr, bytes); >> MemTracker::record_some_operation(addr, bytes); >> ``` >> This is possible because `NmtVirtualMemoryLocker` now checks `MemTracker::enabled()`. `MemTracker::record_some_operation` already checks `MemTracker::enabled()` and checks against nullptr. This refactoring previously wasn't possible because `ThreadCritical` was used before https://github.com/openjdk/jdk/pull/22745 introduced `NmtVirtualMemoryLocker`. >> >> I considered moving the locking and NMT accounting down into platform specific code: Ex. lock around { munmap() + MemTracker:... > > Robert Toyonaga has updated the pull request incrementally with one additional commit since the last revision: > > improve tests and comments @roberttoyonaga Your change (at version 7b7263b2c95571039482a1a62b3e3acaee2b7fcc) is now ready to be sponsored by a Committer. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24084#issuecomment-2821249748 From djelinski at openjdk.org Tue Apr 22 16:36:43 2025 From: djelinski at openjdk.org (Daniel =?UTF-8?B?SmVsacWEc2tp?=) Date: Tue, 22 Apr 2025 16:36:43 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled [v2] In-Reply-To: References: <629bfxW7GyGhNwB7wreZckcLXeWCSukQoRPt_zDi6LM=.640f2be7-0ef7-410e-bb77-8bfd062f4ff0@github.com> Message-ID: On Tue, 22 Apr 2025 09:47:02 GMT, Kevin Walls wrote: >> src/jdk.hotspot.agent/linux/native/libsaproc/ps_proc.c line 450: >> >>> 448: if ( (ph = (struct ps_prochandle*) calloc(1, sizeof(struct ps_prochandle))) == NULL) { >>> 449: snprintf(err_buf, err_buf_len, "can't allocate memory for ps_prochandle"); >>> 450: print_error("%s\n", err_buf); >> >> This might be one that would print it twice - the caller in LinuxDebuggerLocal.cpp will print what's in err_buf. > > Looks like here in Pgrab the intent is to snprintf an error message in err_buf, and the caller prints it. So Pgrab should not need the print_error calls. > > Some functions which are called, like read_lib_info, may print their own specific error message, then Pgrab stores an error that "failed to read lib info".. These are reporting the same thing but I don't see that as a problem or duplication, it could be really useful to track where things fail. > > Yes it's a little annoying that Pgrab_core is different, it doesn't take an error_buf so needs to print_error about any problems. ah, interesting. The native caller doesn't print the error message, it throws a Java exception with the message in the exception string, and the Java code then deals with the exception. This is different on MacOS, where the error message is printed, and then an exception with a generic message is thrown. I haven't checked what the Windows implementation does here. 
The Linux implementation with dynamic exception messages was introduced in 57d8a71115b8fc9a2eb2be876a396c474c207cf3. I'll modify the Linux Pgrab to report all errors through the exception message, and remove print_error from Pgrab. Should I additionally modify the MacOS's Java_sun_jvm_hotspot_debugger_bsd_BsdDebuggerLocal_attach0__I to report all errors through the exception message and remove print_error, or is it OK if the implementations behave differently? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24722#discussion_r2054467779 From kevinw at openjdk.org Tue Apr 22 19:19:41 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Tue, 22 Apr 2025 19:19:41 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled [v2] In-Reply-To: References: <629bfxW7GyGhNwB7wreZckcLXeWCSukQoRPt_zDi6LM=.640f2be7-0ef7-410e-bb77-8bfd062f4ff0@github.com> Message-ID: On Tue, 22 Apr 2025 16:34:07 GMT, Daniel Jeli?ski wrote: > I'll modify the Linux Pgrab to report all errors through the exception message, and remove print_error from Pgrab. That would be great. There is some existing fragmentation/platform differences here, so I don't think you need to make all the implementations the same for this change. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24722#discussion_r2054715960 From djelinski at openjdk.org Tue Apr 22 21:29:01 2025 From: djelinski at openjdk.org (Daniel =?UTF-8?B?SmVsacWEc2tp?=) Date: Tue, 22 Apr 2025 21:29:01 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled [v3] In-Reply-To: References: Message-ID: > Currently if loading a core file fails, the diagnostic information provided on different systems is different; on MacOS we produce an error message, while on Linux we only print information if debug logging is enabled. 
> > This PR adds some new messages on Linux to match MacOSX, and changes some of the diagnostic output to error level instead of debug. Additionally, if opening the core or the exe file fails, the system error message (strerror) is printed. > > Tier1-3 testing clean. Daniel Jeli?ski has updated the pull request incrementally with two additional commits since the last revision: - Report Pgrab errors in exception message - Remove odd double space ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24722/files - new: https://git.openjdk.org/jdk/pull/24722/files/a9341c3a..1062e68b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24722&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24722&range=01-02 Stats: 10 lines in 2 files changed: 2 ins; 1 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/24722.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24722/head:pull/24722 PR: https://git.openjdk.org/jdk/pull/24722 From djelinski at openjdk.org Tue Apr 22 21:33:43 2025 From: djelinski at openjdk.org (Daniel =?UTF-8?B?SmVsacWEc2tp?=) Date: Tue, 22 Apr 2025 21:33:43 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled [v2] In-Reply-To: References: <629bfxW7GyGhNwB7wreZckcLXeWCSukQoRPt_zDi6LM=.640f2be7-0ef7-410e-bb77-8bfd062f4ff0@github.com> Message-ID: On Tue, 22 Apr 2025 19:16:39 GMT, Kevin Walls wrote: >> ah, interesting. The native caller doesn't print the error message, it throws a Java exception with the message in the exception string, and the Java code then deals with the exception. >> >> This is different on MacOS, where the error message is printed, and then an exception with a generic message is thrown. I haven't checked what the Windows implementation does here. >> >> The Linux implementation with dynamic exception messages was introduced in 57d8a71115b8fc9a2eb2be876a396c474c207cf3. 
>> >> I'll modify the Linux Pgrab to report all errors through the exception message, and remove print_error from Pgrab. >> >> Should I additionally modify the MacOS's Java_sun_jvm_hotspot_debugger_bsd_BsdDebuggerLocal_attach0__I to report all errors through the exception message and remove print_error, or is it OK if the implementations behave differently? > >> I'll modify the Linux Pgrab to report all errors through the exception message, and remove print_error from Pgrab. > > That would be great. There is some existing fragmentation/platform differences here, so I don't think you need to make all the implementations the same for this change. I ended up not removing print_error; I found that ptrace_attach was reporting the failures using both print_error and the exception message, so it looked reasonable to use both mechanisms in other places too. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24722#discussion_r2054914278 From djelinski at openjdk.org Tue Apr 22 21:40:50 2025 From: djelinski at openjdk.org (Daniel =?UTF-8?B?SmVsacWEc2tp?=) Date: Tue, 22 Apr 2025 21:40:50 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled [v2] In-Reply-To: References: Message-ID: On Fri, 18 Apr 2025 18:15:21 GMT, Chris Plummer wrote: > Please run all the SA tests in test/hotspot/jtreg/serviceability/sa and test/jdk/sun/tools/jhsdb on all our supported platforms. Local testing on Linux completed successfully; had to reconfigure ptrace, otherwise many tests were skipped. Mach5 still in progress. > Also test by hand attaching to both a core file and a process just to make sure you haven't turned any expected debug messages into error messages. Done (Linux only, I don't have access to a Mac). I didn't observe any unintended error messages. Also, I only added print_error in places where it guaranteed a failure of Pgrab or Pgrab_core. 
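
The two reporting styles compared in this thread can be sketched as follows. This is a simplified Java model, not the C/JNI code in libsaproc: `AttachErrorModel`, `printError`, the exception type, and the generic message text are all hypothetical names chosen for illustration. Only the "can't allocate memory for ps_prochandle" string comes from the quoted ps_proc.c snippet.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the two reporting styles; the real code is C/JNI in libsaproc.
final class AttachErrorModel {
    // Stand-in for the output of print_error(); collected so it can be inspected.
    static final List<String> printed = new ArrayList<>();

    static void printError(String msg) {
        printed.add(msg);
    }

    // Style described for Linux Pgrab: the specific reason travels in the exception message.
    static void attachWithMessage(boolean fail) {
        if (fail) {
            throw new IllegalStateException("can't allocate memory for ps_prochandle");
        }
    }

    // Style described for macOS: print the reason, then throw a generic exception.
    static void attachWithPrint(boolean fail) {
        if (fail) {
            printError("can't allocate memory for ps_prochandle");
            throw new IllegalStateException("attach failed"); // generic text is an assumption
        }
    }

    public static void main(String[] args) {
        try {
            attachWithMessage(true);
        } catch (IllegalStateException e) {
            System.out.println("exception carries reason: " + e.getMessage());
        }
        try {
            attachWithPrint(true);
        } catch (IllegalStateException e) {
            System.out.println("exception is generic: " + e.getMessage());
            System.out.println("reason was printed: " + printed);
        }
    }
}
```

Using both mechanisms together, as the v3 update reportedly does, means the reason shows up in the printed output and in the exception message.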
------------- PR Comment: https://git.openjdk.org/jdk/pull/24722#issuecomment-2822539612 From amenkov at openjdk.org Tue Apr 22 21:41:51 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 22 Apr 2025 21:41:51 GMT Subject: RFR: 8355071: Fix nsk/jdi test to not require lookup of main thread in order to set the breakpoint used for communication [v3] In-Reply-To: References: <1ypMGnP9qDksi_qbt7eM4rCUaF5aJRqwZW0Zf4Up-cA=.ea153c88-748f-454d-aa00-986ae71694b4@github.com> Message-ID: <2d4EneQDzLQOgfqSdrJM8tMlLRIMa5S7UIsL6ysf3WU=.7532f4dd-8484-4cf9-bc71-7e6caeb624d9@github.com> On Mon, 21 Apr 2025 21:58:17 GMT, Chris Plummer wrote: >> Remove the need for many nsk/jdi tests to discover the main thread, resulting in the test needing to be run with includevirtualthreads=y. Details in first comment. >> >> Tested with all tier2, tier3, and tier5 svc tests > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Update copyrights Marked as reviewed by amenkov (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24768#pullrequestreview-2785445453 From lmesnik at openjdk.org Tue Apr 22 21:48:48 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 22 Apr 2025 21:48:48 GMT Subject: RFR: 8355071: Fix nsk/jdi test to not require lookup of main thread in order to set the breakpoint used for communication [v3] In-Reply-To: References: <1ypMGnP9qDksi_qbt7eM4rCUaF5aJRqwZW0Zf4Up-cA=.ea153c88-748f-454d-aa00-986ae71694b4@github.com> Message-ID: On Mon, 21 Apr 2025 21:58:17 GMT, Chris Plummer wrote: >> Remove the need for many nsk/jdi tests to discover the main thread, resulting in the test needing to be run with includevirtualthreads=y. Details in first comment. >> >> Tested with all tier2, tier3, and tier5 svc tests > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Update copyrights Marked as reviewed by lmesnik (Reviewer). 
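
The API split requested in the JDIBase.java review comment can be sketched as below. This is a minimal sketch under stated assumptions, not the code that was committed: `BreakpointApiSketch`, `Request`, and `settingBreakpointImpl` are hypothetical, and plain strings stand in for the JDI `ThreadReference` and location types so that the example runs without a debuggee.

```java
import java.util.Objects;

// Hypothetical stand-ins for the JDI types; the real helper lives in nsk/share/jdi/JDIBase.java.
final class BreakpointApiSketch {
    // Minimal stand-in for a breakpoint request: remembers the optional thread filter.
    static final class Request {
        final String location;
        final String threadFilter; // null means "no thread filter"

        Request(String location, String threadFilter) {
            this.location = location;
            this.threadFilter = threadFilter;
        }
    }

    // Private worker: tolerates a null thread and simply leaves the filter unset.
    private static Request settingBreakpointImpl(String thread, String location) {
        return new Request(location, thread);
    }

    // Thread-specific entry point: fails early if the caller has no thread.
    static Request settingBreakpoint(String thread, String location) {
        Objects.requireNonNull(thread, "thread");
        return settingBreakpointImpl(thread, location);
    }

    // Thread-less entry point: the signature itself states that no thread is involved.
    static Request settingBreakpoint(String location) {
        return settingBreakpointImpl(null, location);
    }

    public static void main(String[] args) {
        System.out.println(settingBreakpoint("Debuggee:42").threadFilter);
        System.out.println(settingBreakpoint("main", "Debuggee:42").threadFilter);
    }
}
```

With this shape, a test that forgets to look up its thread gets an exception at the point where it sets the breakpoint, while tests that never needed a thread call the thread-less overload explicitly.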
------------- PR Review: https://git.openjdk.org/jdk/pull/24768#pullrequestreview-2785458942 From cjplummer at openjdk.org Tue Apr 22 22:04:41 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 22 Apr 2025 22:04:41 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled [v3] In-Reply-To: References: Message-ID: <4JrRAsydQYpVr4De27FYLcWwEkjDVmvimSl7wolBqmQ=.09061b3c-6758-4cea-a1b5-b1dd49e441df@github.com> On Tue, 22 Apr 2025 21:29:01 GMT, Daniel Jeliński wrote: >> Currently if loading a core file fails, the diagnostic information provided on different systems is different; on MacOS we produce an error message, while on Linux we only print information if debug logging is enabled. >> >> This PR adds some new messages on Linux to match MacOSX, and changes some of the diagnostic output to error level instead of debug. Additionally, if opening the core or the exe file fails, the system error message (strerror) is printed. >> >> Tier1-3 testing clean. > > Daniel Jeliński has updated the pull request incrementally with two additional commits since the last revision: > > - Report Pgrab errors in exception message > - Remove odd double space Looks good. Thanks for all the added cleanup. ------------- Marked as reviewed by cjplummer (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24722#pullrequestreview-2785487091 From kevinw at openjdk.org Tue Apr 22 22:17:41 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Tue, 22 Apr 2025 22:17:41 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled [v3] In-Reply-To: References: Message-ID: On Tue, 22 Apr 2025 21:29:01 GMT, Daniel Jeliński wrote: >> Currently if loading a core file fails, the diagnostic information provided on different systems is different; on MacOS we produce an error message, while on Linux we only print information if debug logging is enabled.
>> >> This PR adds some new messages on Linux to match MacOSX, and changes some of the diagnostic output to error level instead of debug. Additionally, if opening the core or the exe file fails, the system error message (strerror) is printed. >> >> Tier1-3 testing clean. > > Daniel Jeliński has updated the pull request incrementally with two additional commits since the last revision: > > - Report Pgrab errors in exception message > - Remove odd double space Looks good! ------------- Marked as reviewed by kevinw (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24722#pullrequestreview-2785506352 From kevinw at openjdk.org Tue Apr 22 22:17:41 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Tue, 22 Apr 2025 22:17:41 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled [v2] In-Reply-To: References: <629bfxW7GyGhNwB7wreZckcLXeWCSukQoRPt_zDi6LM=.640f2be7-0ef7-410e-bb77-8bfd062f4ff0@github.com> Message-ID: On Tue, 22 Apr 2025 21:30:47 GMT, Daniel Jeliński wrote: >>> I'll modify the Linux Pgrab to report all errors through the exception message, and remove print_error from Pgrab. >> >> That would be great. There is some existing fragmentation/platform differences here, so I don't think you need to make all the implementations the same for this change. > > I ended up not removing print_error; I found that ptrace_attach was reporting the failures using both print_error and the exception message, so it looked reasonable to use both mechanisms in other places too. ok sounds fair!
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24722#discussion_r2054963010 From cjplummer at openjdk.org Tue Apr 22 23:23:56 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 22 Apr 2025 23:23:56 GMT Subject: Integrated: 8355071: Fix nsk/jdi test to not require lookup of main thread in order to set the breakpoint used for communication In-Reply-To: <1ypMGnP9qDksi_qbt7eM4rCUaF5aJRqwZW0Zf4Up-cA=.ea153c88-748f-454d-aa00-986ae71694b4@github.com> References: <1ypMGnP9qDksi_qbt7eM4rCUaF5aJRqwZW0Zf4Up-cA=.ea153c88-748f-454d-aa00-986ae71694b4@github.com> Message-ID: On Sat, 19 Apr 2025 05:19:22 GMT, Chris Plummer wrote: > Remove the need for many nsk/jdi tests to discover the main thread, resulting in the test needing to be run with includevirtualthreads=y. Details in first comment. > > Tested with all tier2, tier3, and tier5 svc tests This pull request has now been integrated. Changeset: b7e8952a Author: Chris Plummer URL: https://git.openjdk.org/jdk/commit/b7e8952ad6def4ebae8c8c3c04cf6793f472b029 Stats: 1996 lines in 189 files changed: 73 ins; 1534 del; 389 mod 8355071: Fix nsk/jdi test to not require lookup of main thread in order to set the breakpoint used for communication Reviewed-by: lmesnik, amenkov ------------- PR: https://git.openjdk.org/jdk/pull/24768 From cjplummer at openjdk.org Tue Apr 22 23:23:55 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 22 Apr 2025 23:23:55 GMT Subject: RFR: 8355071: Fix nsk/jdi test to not require lookup of main thread in order to set the breakpoint used for communication [v3] In-Reply-To: References: <1ypMGnP9qDksi_qbt7eM4rCUaF5aJRqwZW0Zf4Up-cA=.ea153c88-748f-454d-aa00-986ae71694b4@github.com> Message-ID: On Mon, 21 Apr 2025 21:58:17 GMT, Chris Plummer wrote: >> Remove the need for many nsk/jdi tests to discover the main thread, resulting in the test needing to be run with includevirtualthreads=y. Details in first comment. 
>> >> Tested with all tier2, tier3, and tier5 svc tests > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Update copyrights Thank you for the reviews Leonid and Alex! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24768#issuecomment-2822690892 From cjplummer at openjdk.org Wed Apr 23 03:19:20 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 23 Apr 2025 03:19:20 GMT Subject: RFR: 8355211: nsk/jdi/EventRequest/disable/disable001.java should use JDIBase superclass Message-ID: There is an nsk/jdi superclass called JDIBase that many tests inherit from. It provides common functionality like log support, basic event handling, support for setting a breakpoint, and support for the communication breakpoint that the debugger and debuggee use to synchronize. disable001 does not inherit from JDIBase and instead implements all this functionality in the test. The reason is that it provides a slightly modified version of the JDIBase.settingBreakpoint() method. The main change is the presence of the following in settingBreakpoint(): // this is only for this test to get Location object location = lineLocation; And then in the test: case 11: log2(".....setting up BreakpointRequest"); eventRequest1 = eventRManager.createBreakpointRequest(location); break; However, JDIBase.settingBreakpoint() now contains: lineLocation = (Location) alllineLocations.get(n); breakpLocation = lineLocation; So the test can just use breakpLocation instead. This pattern is already replicated in a large number of tests. The other difference in the disable001 version of settingBreakpoint() is that it uses SUSPEND_ALL rather than SUSPEND_EVENT_THREAD.
This is easily fixed after setting up the breakpoint request: BreakpointRequest bpRequest = settingBreakpoint(mainThread, debuggeeClass, bPointMethod, lineForComm, "zero"); bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); The test should remove all this replicated code and use JDIBase instead. It should also be updated to use the new JDIBase.setupBreakpointForCommunication() method. ------------- Commit messages: - Use JDIBase class instead of implementing as part of the test. Changes: https://git.openjdk.org/jdk/pull/24811/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24811&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8355211 Stats: 154 lines in 1 file changed: 0 ins; 147 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/24811.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24811/head:pull/24811 PR: https://git.openjdk.org/jdk/pull/24811 From cjplummer at openjdk.org Wed Apr 23 04:14:51 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 23 Apr 2025 04:14:51 GMT Subject: RFR: 8355214: nsk/jdi/ThreadStartRequest/addThreadFilter/addthreadfilter001.java should use JDIBase superclass Message-ID: <7r6pbOLpG8tdFvMZC3ws_2umcgqFB_e_0m_Za3xQDXU=.1e994b7c-90f3-4f53-bdd5-38d9bf6c266e@github.com> There is an nsk/jdi superclass called JDIBase that many tests inherit from. It provides common functionality like log support, basic event handling, support for setting a breakpoint, and support for the communication breakpoint that the debugger and debuggee use to synchronize. addthreadfilter001 does not inherit from JDIBase and instead implements all this functionality in the test. The reason is that it provides a slightly modified version of the JDIBase.breakpointForCommunication() method. } else if (event instanceof ThreadStartEvent) { // It might be the case that while the thread start request was enabled // some threads not related to the test ( e.g. JVMCI threads) were started // and generated thread start events.
We ignore these thread start events // and keep waiting for a breakpoint event. ThreadStartEvent tse = (ThreadStartEvent) event; log2("ThreadStartEvent is received while waiting for a breakpoint" + " event, thread: : " + tse.thread().name()); continue; } However, this code seems to predate adding similar code to the JDIBase version of breakpointForCommunication(): if (EventFilters.filtered(event)) { // We filter out spurious ThreadStartEvents continue; } The test now uses JDIBase and gets rid of the replicated code that is already in JDIBase. It is also updated to use the new JDIBase.breakpointForCommunication() method. Note: The code in the test mentions JVMCI and the CR mentions Graal, so this test likely originally failed due to the creation of a Graal compiler thread. EventFilters.filtered() was not handling this thread, whose name varies but always contains "Compiler", so I updated the method to also filter out "Compiler" threads. ------------- Commit messages: - Print a message when ThreadStartEvent is filtered out. - Use JDIBase superclass rather than implementing the same functionality in the test. Changes: https://git.openjdk.org/jdk/pull/24812/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24812&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8355214 Stats: 167 lines in 3 files changed: 5 ins; 157 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/24812.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24812/head:pull/24812 PR: https://git.openjdk.org/jdk/pull/24812 From sspitsyn at openjdk.org Wed Apr 23 04:20:41 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 23 Apr 2025 04:20:41 GMT Subject: RFR: 8355069: Allocation::check_out_of_memory() should support CheckUnhandledOops mode [v2] In-Reply-To: References: Message-ID: On Sat, 19 Apr 2025 02:25:33 GMT, Leonid Mesnik wrote: >> The >> CheckUnhandledOops >> cause failure if JvmtiExport::post_resource_exhausted(...)
>> is called in >> MemAllocator::Allocation::check_out_of_memory() >> The obj is null so it is not a real bug. >> >> I am fixing it to reduce noise for CheckUnhandledOops mode for jvmti tests execution. >> The vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted002/TestDescription.java >> failed with -XX:+CheckUnhandledOops >> >> If define are unwelcome here, the >> ``` PreserveObj obj_h(_thread, _obj_ptr);``` >> might be added instead with comment why it is needed for null obj. > > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > typo fixes Looks okay to me. ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24766#pullrequestreview-2785885049 From djelinski at openjdk.org Wed Apr 23 05:19:51 2025 From: djelinski at openjdk.org (Daniel =?UTF-8?B?SmVsacWEc2tp?=) Date: Wed, 23 Apr 2025 05:19:51 GMT Subject: RFR: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled [v3] In-Reply-To: References: Message-ID: On Tue, 22 Apr 2025 21:29:01 GMT, Daniel Jeliński wrote: >> Currently if loading a core file fails, the diagnostic information provided on different systems is different; on MacOS we produce an error message, while on Linux we only print information if debug logging is enabled. >> >> This PR adds some new messages on Linux to match MacOSX, and changes some of the diagnostic output to error level instead of debug. Additionally, if opening the core or the exe file fails, the system error message (strerror) is printed. >> >> Tier1-3 testing clean. > > Daniel Jeliński has updated the pull request incrementally with two additional commits since the last revision: > > - Report Pgrab errors in exception message > - Remove odd double space Thanks for the reviews! Mach5 testing came back clean.
------------- PR Comment: https://git.openjdk.org/jdk/pull/24722#issuecomment-2823085507 From djelinski at openjdk.org Wed Apr 23 05:19:52 2025 From: djelinski at openjdk.org (Daniel =?UTF-8?B?SmVsacWEc2tp?=) Date: Wed, 23 Apr 2025 05:19:52 GMT Subject: Integrated: 8354920: SA core file support on Linux only prints error messages when debug logging is enabled In-Reply-To: References: Message-ID: On Thu, 17 Apr 2025 12:27:55 GMT, Daniel Jeliński wrote: > Currently if loading a core file fails, the diagnostic information provided on different systems is different; on MacOS we produce an error message, while on Linux we only print information if debug logging is enabled. > > This PR adds some new messages on Linux to match MacOSX, and changes some of the diagnostic output to error level instead of debug. Additionally, if opening the core or the exe file fails, the system error message (strerror) is printed. > > Tier1-3 testing clean. This pull request has now been integrated. Changeset: 9a2b425b Author: Daniel Jeliński URL: https://git.openjdk.org/jdk/commit/9a2b425b13cc468d8627c1548d1d39015ce17af1 Stats: 114 lines in 6 files changed: 52 ins; 0 del; 62 mod 8354920: SA core file support on Linux only prints error messages when debug logging is enabled Reviewed-by: cjplummer, kevinw ------------- PR: https://git.openjdk.org/jdk/pull/24722 From cjplummer at openjdk.org Wed Apr 23 05:29:17 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 23 Apr 2025 05:29:17 GMT Subject: RFR: 8355221: Get rid of unnecessary override of JDIBase.breakpointForCommunication in nsk/jdi tests Message-ID: The following tests all override the JDIBase.breakpointForCommunication() method, but no longer need to: vmTestbase/nsk/jdi/ClassPrepareRequest/addClassExclusionFilter/filter003.java vmTestbase/nsk/jdi/ClassPrepareRequest/addClassFilter_rt/filter_rt002.java vmTestbase/nsk/jdi/ClassPrepareRequest/addClassFilter_s/filter_s002.java
vmTestbase/nsk/jdi/EventRequestManager/classPrepareRequests/clsprepreq002.java vmTestbase/nsk/jdi/EventRequestManager/methodExitRequests/methexitreq002.java vmTestbase/nsk/jdi/MethodEntryRequest/addClassExclusionFilter/filter002.java vmTestbase/nsk/jdi/MethodExitRequest/addClassExclusionFilter/filter002.java The override was to fix [JDK-8203809](https://bugs.openjdk.org/browse/JDK-8203809). It filters out unexpected events due to system threads (namely Graal java compiler threads). However, there is now a JDIBase.breakpointForCommunication() version that does the same, so the override can be removed. The new version takes the debuggeeName argument that the override versions currently use to filter events based on it. ------------- Commit messages: - Remove unneeded breakpointForCommunication() override. Changes: https://git.openjdk.org/jdk/pull/24813/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24813&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8355221 Stats: 176 lines in 7 files changed: 0 ins; 169 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/24813.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24813/head:pull/24813 PR: https://git.openjdk.org/jdk/pull/24813 From duke at openjdk.org Wed Apr 23 11:55:56 2025 From: duke at openjdk.org (Robert Toyonaga) Date: Wed, 23 Apr 2025 11:55:56 GMT Subject: Integrated: 8341491: Reserve and commit memory operations should be protected by NMT lock In-Reply-To: References: Message-ID: On Mon, 17 Mar 2025 16:20:42 GMT, Robert Toyonaga wrote: > ### Update: > After some discussion it was decided it's not necessary to expand the lock scope for reserve/commit. Instead, we are opting to add comments explaining the reasons for locking and the conditions to avoid which could lead to races. Some of the new tests can be kept because they are general enough to be useful outside of this context. > > ### Summary: > This PR makes memory operations atomic with NMT accounting. 
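The interleaving described under "The problem" in this message can be replayed deterministically. The following is a minimal sketch, assuming a toy Java model rather than HotSpot code: the class `NmtRaceSketch`, its two sets, and every method name are hypothetical stand-ins for the OS view and the NMT view of reserved ranges, and `synchronized` merely stands in for holding the NMT lock across both the memory operation and its accounting.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model (hypothetical names): the "OS" and the tracker each keep their own view. */
public class NmtRaceSketch {
    final Set<String> osReserved = new HashSet<>();      // what the OS believes is reserved
    final Set<String> trackerReserved = new HashSet<>(); // what the tracker believes is reserved

    void osReserve(String range)     { osReserved.add(range); }
    void osRelease(String range)     { osReserved.remove(range); }
    void recordReserve(String range) { trackerReserved.add(range); }
    void recordRelease(String range) { trackerReserved.remove(range); }

    boolean inSync() { return osReserved.equals(trackerReserved); }

    /** Starting state for both scenarios: Thread_1 holds range_A and the tracker knows it. */
    static NmtRaceSketch initial() {
        NmtRaceSketch m = new NmtRaceSketch();
        m.osReserve("range_A");
        m.recordReserve("range_A");
        return m;
    }

    /** Replays (1.1) (2.1) (2.2) (1.2): nothing ties an OS call to its accounting. */
    static NmtRaceSketch racyInterleaving() {
        NmtRaceSketch m = initial();
        m.osRelease("range_A");     // 1.1 Thread_1 releases range_A
        m.osReserve("range_A");     // 2.1 Thread_2 reserves the now free range_A
        m.recordReserve("range_A"); // 2.2 Thread_2 tells the tracker "reserved"
        m.recordRelease("range_A"); // 1.2 Thread_1 tells the tracker "released"
        return m;
    }

    /** Same work, but each operation and its accounting happen under one lock. */
    static NmtRaceSketch atomicInterleaving() {
        NmtRaceSketch m = initial();
        synchronized (m) {          // Thread_1: (1.1) and (1.2) as one unit
            m.osRelease("range_A");
            m.recordRelease("range_A");
        }
        synchronized (m) {          // Thread_2: (2.1) and (2.2) as one unit
            m.osReserve("range_A");
            m.recordReserve("range_A");
        }
        return m;
    }

    public static void main(String[] args) {
        System.out.println("racy in sync:   " + racyInterleaving().inSync());
        System.out.println("atomic in sync: " + atomicInterleaving().inSync());
    }
}
```

In the racy replay the model's OS view still contains range_A while the tracker's view is empty, which is the drift the message describes. Note that the "Update" at the top of the message says the wider lock scope was ultimately judged unnecessary, so this only illustrates the scenario that was discussed, not what was integrated.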
> > ### The problem: > In memory-related functions like `os::commit_memory` and `os::reserve_memory` the OS memory operations are currently done before acquiring the NMT mutex, and the virtual memory accounting is done later in `MemTracker`, after the lock has been acquired. Doing the memory operations outside of the lock scope can lead to races. > > 1.1 Thread_1 releases range_A. > 1.2 Thread_1 tells NMT "range_A has been released". > > 2.1 Thread_2 reserves (the now free) range_A. > 2.2 Thread_2 tells NMT "range_A is reserved". > > Since the sequence (1.1) (1.2) is not atomic, if Thread_2 begins operating after (1.1), we can have (1.1) (2.1) (2.2) (1.2). The OS sees two valid subsequent calls (release range_A, followed by map range_A). But NMT sees "reserve range_A", "release range_A" and is now out of sync with the OS. > > ### Solution: > Where memory operations such as reserve, commit, or release virtual memory happen, I've expanded the scope of `NmtVirtualMemoryLocker` to protect both the NMT accounting and the memory operation itself. > > ### Other notes: > I also simplified this pattern found in many places: > > ``` > if (MemTracker::enabled()) { > MemTracker::NmtVirtualMemoryLocker nvml; > result = pd_some_operation(addr, bytes); > if (result != nullptr) { > MemTracker::record_some_operation(addr, bytes); > } > } else { > result = pd_unmap_memory(addr, bytes); > } > ``` > To: > > ``` > MemTracker::NmtVirtualMemoryLocker nvml; > result = pd_unmap_memory(addr, bytes); > MemTracker::record_some_operation(addr, bytes); > ``` > This is possible because `NmtVirtualMemoryLocker` now checks `MemTracker::enabled()`. `MemTracker::record_some_operation` already checks `MemTracker::enabled()` and checks against nullptr. This refactoring previously wasn't possible because `ThreadCritical` was used before https://github.com/openjdk/jdk/pull/22745 introduced `NmtVirtualMemoryLocker`. > > I considered moving the locking and NMT accounting down into platform specific code: Ex.
lock around { munmap() + MemTracker::record }. The hope was that this would help reduce the size of the critical section... This pull request has now been integrated. Changeset: 44c5aca5 Author: Robert Toyonaga Committer: Thomas Stuefe URL: https://git.openjdk.org/jdk/commit/44c5aca54d1e0aaf0616f77845c5b3b1e2fccf5a Stats: 78 lines in 2 files changed: 78 ins; 0 del; 0 mod 8341491: Reserve and commit memory operations should be protected by NMT lock Reviewed-by: stuefe, stefank ------------- PR: https://git.openjdk.org/jdk/pull/24084 From shade at openjdk.org Wed Apr 23 17:19:14 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 23 Apr 2025 17:19:14 GMT Subject: RFR: 8355432: Remove CompileTask from SA Message-ID: A lot of SA infrastructure was added to support SA compiler replay with [JDK-7088955](https://bugs.openjdk.org/browse/JDK-7088955). With [JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488), we got rid of most of it. `CompileTask` seems to be left behind. Nothing uses it in SA now. Now, for Leyden, we want to massage `CompileTask` for better performance and reliability ([JDK-8231269](https://bugs.openjdk.org/browse/JDK-8231269)), and keeping `CompileTask` in SA would require us to implement a whole bunch of complicated but unnecessary code. So, it would be good to purge `CompileTask` from SA. Note that I left the related `vmStructs` definitions, because async-profiler uses those; I think it uses them to see which methods the current compiler is compiling. That use looks safe, as it polls the task from the already set up ciEnv. Anyway, it is async-profiler's headache to deal with `vmStructs` changes.
------------- Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/24832/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24832&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8355432 Stats: 73 lines in 1 file changed: 0 ins; 73 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24832.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24832/head:pull/24832 PR: https://git.openjdk.org/jdk/pull/24832 From cjplummer at openjdk.org Wed Apr 23 18:03:43 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 23 Apr 2025 18:03:43 GMT Subject: RFR: 8355432: Remove CompileTask from SA In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 17:14:17 GMT, Aleksey Shipilev wrote: > A lot of SA infrastructure was added to support SA compiler replay with [JDK-7088955](https://bugs.openjdk.org/browse/JDK-7088955). With [JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488), we got rid from the most of it. `CompileTask` seems to be left behind. Nothing uses it in SA now. > > Now, for Leyden, we want to massage `CompileTask` for better performance and reliability ([JDK-8231269](https://bugs.openjdk.org/browse/JDK-8231269)), and keeping `CompileTask` in SA would require us to implement a whole bunch of complicated, but unnecessary code. > > So, it would be good to purge `CompileTask` from SA. > > Note that I left the related `vmStructs` definitions, because async-profiler uses those; I think to see which methods current compiler is compiling. That use looks safe, as it polls the task from the already set up ciEnv. async-profiler would need to re-adjust after [JDK-8231269](https://bugs.openjdk.org/browse/JDK-8231269) makes relevant changes in `vmStructs`. This PR frees us from doing the same thing in SA. Looks good. ------------- Marked as reviewed by cjplummer (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24832#pullrequestreview-2788288929 From lmesnik at openjdk.org Wed Apr 23 18:10:50 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 23 Apr 2025 18:10:50 GMT Subject: RFR: 8355221: Get rid of unnecessary override of JDIBase.breakpointForCommunication in nsk/jdi tests In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 05:24:50 GMT, Chris Plummer wrote: > The following tests all override the JDIBase.breakpointForCommunication() method, but no longer need too: > > vmTestbase/nsk/jdi/ClassPrepareRequest/addClassExclusionFilter/filter003.java > vmTestbase/nsk/jdi/ClassPrepareRequest/addClassFilter_rt/filter_rt002.java > vmTestbase/nsk/jdi/ClassPrepareRequest/addClassFilter_s/filter_s002.java > vmTestbase/nsk/jdi/EventRequestManager/classPrepareRequests/clsprepreq002.java > vmTestbase/nsk/jdi/EventRequestManager/methodExitRequests/methexitreq002.java > vmTestbase/nsk/jdi/MethodEntryRequest/addClassExclusionFilter/filter002.java > vmTestbase/nsk/jdi/MethodExitRequest/addClassExclusionFilter/filter002.java > > The override was to fix [JDK-8203809](https://bugs.openjdk.org/browse/JDK-8203809). It filters out unexpected events due to system threads (namely Graal java compiler threads). However, there is now a JDIBase.breakpointForCommunication() version that does the same, so the override can be removed. The new version takes the debuggeeName argument that the override versions currently use to filter events based on it. Marked as reviewed by lmesnik (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/24813#pullrequestreview-2788305490 From lmesnik at openjdk.org Wed Apr 23 18:11:47 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 23 Apr 2025 18:11:47 GMT Subject: RFR: 8355214: nsk/jdi/ThreadStartRequest/addThreadFilter/addthreadfilter001.java should use JDIBase superclass In-Reply-To: <7r6pbOLpG8tdFvMZC3ws_2umcgqFB_e_0m_Za3xQDXU=.1e994b7c-90f3-4f53-bdd5-38d9bf6c266e@github.com> References: <7r6pbOLpG8tdFvMZC3ws_2umcgqFB_e_0m_Za3xQDXU=.1e994b7c-90f3-4f53-bdd5-38d9bf6c266e@github.com> Message-ID: On Wed, 23 Apr 2025 04:10:24 GMT, Chris Plummer wrote: > There is nsk/jdi superclass called JDIBase that many tests inherit from. It provides common functionality like log support, basic event handling, support for setting a breakpoint, and support for the communcation breakpoint that the debugger and debuggee used to synchronize with. addthreadfilter001 does not inherit from JDIBase and instead implements all this functionality in the test. The reason is because it provides a slightly modified version of the JDIBase.breakpointForCommunication() method. > > > } else if (event instanceof ThreadStartEvent) { > // It might be the case that while the thread start request was enabled > // some threads not related to the test ( e.g. JVMCI threads) were started > // and generated thread start events. We ignore these thread start events > // and keep waiting for a breakpoint event. > ThreadStartEvent tse = (ThreadStartEvent) event; > log2("ThreadStartEvent is received while waiting for a breakpoint" + > " event, thread: : " + tse.thread().name()); > continue; > } > > > However, this code seems to predate adding similar code to the JDIBase version of breakpointForCommunication(): > > > if (EventFilters.filtered(event)) { > // We filter out spurious ThreadStartEvents > continue; > } > > > The test now uses JDIBase and gets rid of the replicated code that is already in JDIBase. 
It is also updated to use the new JDIBase.breakpointForCommunication() method. > > Note: The code in the test mentions JVMCI and the CR mentions graal, so this test likely originally failed due to the creation of a graal compiler thread. EventFilters.filtered() was not handling this thread name, which can have various names but always contains "Compiler", so I updated it the method to also filter out "Compiler" methods. Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24812#pullrequestreview-2788308908 From lmesnik at openjdk.org Wed Apr 23 18:13:56 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 23 Apr 2025 18:13:56 GMT Subject: RFR: 8355211: nsk/jdi/EventRequest/disable/disable001.java should use JDIBase superclass In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 03:14:52 GMT, Chris Plummer wrote: > There is an nsk/jdi superclass called JDIBase that many tests inherit from. It provides common functionality like log support, basic event handling, support for setting a breakpoint, and support for the communication breakpoint that the debugger and debuggee use to synchronize with. disable001 does not inherit from JDIBase and instead implements all this functionality in the test. The reason is because it provides a slightly modified version of the JDIBase.settingBreakpoint() method. The main change is the presence of the following in settingBreakpoint(): > > > // this is only for this test to get Location object > location = lineLocation; > > > And then in the test: > > > case 11: > log2(".....setting up BreakpointRequest"); > eventRequest1 = eventRManager.createBreakpointRequest(location); > break; > > > However, now JDIBase.settingBreakpoint() now contains: > > > lineLocation = (Location) alllineLocations.get(n); > breakpLocation = lineLocation; > > > So the test can just use breakpLocation instead. This pattern is already replicated in a large number of tests. 
> > The other difference in the disable001 version of settingBreakpoint() is that is uses SUSPEND_ALL rather than SUSPEND_EVENT_THREAD. This is easily fixed after setting up the breakpoint request: > > BreakpointRequest bpRequest = settingBreakpoint(mainThread, > debuggeeClass, > bPointMethod, lineForComm, "zero"); > bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); > > The test uses JDIBase now and all the code replicated from JDIBase has been stripped out. It is also updated to use the new JDIBase.setupBreakpointForCommunication() method. Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24811#pullrequestreview-2788312319 From lmesnik at openjdk.org Wed Apr 23 18:13:46 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 23 Apr 2025 18:13:46 GMT Subject: RFR: 8355432: Remove CompileTask from SA In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 17:14:17 GMT, Aleksey Shipilev wrote: > A lot of SA infrastructure was added to support SA compiler replay with [JDK-7088955](https://bugs.openjdk.org/browse/JDK-7088955). With [JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488), we got rid from the most of it. `CompileTask` seems to be left behind. Nothing uses it in SA now. > > Now, for Leyden, we want to massage `CompileTask` for better performance and reliability ([JDK-8231269](https://bugs.openjdk.org/browse/JDK-8231269)), and keeping `CompileTask` in SA would require us to implement a whole bunch of complicated, but unnecessary code. > > So, it would be good to purge `CompileTask` from SA. > > Note that I left the related `vmStructs` definitions, because async-profiler uses those; I think to see which methods current compiler is compiling. That use looks safe, as it polls the task from the already set up ciEnv. async-profiler would need to re-adjust after [JDK-8231269](https://bugs.openjdk.org/browse/JDK-8231269) makes relevant changes in `vmStructs`. 
This PR frees us from doing the same thing in SA. Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24832#pullrequestreview-2788313616 From amenkov at openjdk.org Wed Apr 23 19:39:45 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 23 Apr 2025 19:39:45 GMT Subject: RFR: 8355221: Get rid of unnecessary override of JDIBase.breakpointForCommunication in nsk/jdi tests In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 05:24:50 GMT, Chris Plummer wrote: > The following tests all override the JDIBase.breakpointForCommunication() method, but no longer need too: > > vmTestbase/nsk/jdi/ClassPrepareRequest/addClassExclusionFilter/filter003.java > vmTestbase/nsk/jdi/ClassPrepareRequest/addClassFilter_rt/filter_rt002.java > vmTestbase/nsk/jdi/ClassPrepareRequest/addClassFilter_s/filter_s002.java > vmTestbase/nsk/jdi/EventRequestManager/classPrepareRequests/clsprepreq002.java > vmTestbase/nsk/jdi/EventRequestManager/methodExitRequests/methexitreq002.java > vmTestbase/nsk/jdi/MethodEntryRequest/addClassExclusionFilter/filter002.java > vmTestbase/nsk/jdi/MethodExitRequest/addClassExclusionFilter/filter002.java > > The override was to fix [JDK-8203809](https://bugs.openjdk.org/browse/JDK-8203809). It filters out unexpected events due to system threads (namely Graal java compiler threads). However, there is now a JDIBase.breakpointForCommunication() version that does the same, so the override can be removed. The new version takes the debuggeeName argument that the override versions currently use to filter events based on it. Marked as reviewed by amenkov (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/24813#pullrequestreview-2788515261 From amenkov at openjdk.org Wed Apr 23 20:03:51 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 23 Apr 2025 20:03:51 GMT Subject: RFR: 8355211: nsk/jdi/EventRequest/disable/disable001.java should use JDIBase superclass In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 03:14:52 GMT, Chris Plummer wrote: > There is an nsk/jdi superclass called JDIBase that many tests inherit from. It provides common functionality like log support, basic event handling, support for setting a breakpoint, and support for the communication breakpoint that the debugger and debuggee use to synchronize with. disable001 does not inherit from JDIBase and instead implements all this functionality in the test. The reason is because it provides a slightly modified version of the JDIBase.settingBreakpoint() method. The main change is the presence of the following in settingBreakpoint(): > > > // this is only for this test to get Location object > location = lineLocation; > > > And then in the test: > > > case 11: > log2(".....setting up BreakpointRequest"); > eventRequest1 = eventRManager.createBreakpointRequest(location); > break; > > > However, now JDIBase.settingBreakpoint() now contains: > > > lineLocation = (Location) alllineLocations.get(n); > breakpLocation = lineLocation; > > > So the test can just use breakpLocation instead. This pattern is already replicated in a large number of tests. > > The other difference in the disable001 version of settingBreakpoint() is that is uses SUSPEND_ALL rather than SUSPEND_EVENT_THREAD. This is easily fixed after setting up the breakpoint request: > > BreakpointRequest bpRequest = settingBreakpoint(mainThread, > debuggeeClass, > bPointMethod, lineForComm, "zero"); > bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); > > The test uses JDIBase now and all the code replicated from JDIBase has been stripped out. 
It is also updated to use the new JDIBase.setupBreakpointForCommunication() method. test/hotspot/jtreg/vmTestbase/nsk/jdi/EventRequest/disable/disable001.java line 250: > 248: setupBreakpointForCommunication(debuggeeClass); > 249: bpRequest.disable(); > 250: bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); Could you please add a comment that setupBreakpointForCommunication creates request with SUSPEND_EVENT_THREAD policy, need to change it to SUSPEND_ALL ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24811#discussion_r2056795724 From cjplummer at openjdk.org Wed Apr 23 21:12:23 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 23 Apr 2025 21:12:23 GMT Subject: RFR: 8355211: nsk/jdi/EventRequest/disable/disable001.java should use JDIBase superclass [v2] In-Reply-To: References: Message-ID: > There is an nsk/jdi superclass called JDIBase that many tests inherit from. It provides common functionality like log support, basic event handling, support for setting a breakpoint, and support for the communication breakpoint that the debugger and debuggee use to synchronize with. disable001 does not inherit from JDIBase and instead implements all this functionality in the test. The reason is that it provides a slightly modified version of the JDIBase.settingBreakpoint() method. The main change is the presence of the following in settingBreakpoint(): > > > // this is only for this test to get Location object > location = lineLocation; > > > And then in the test: > > > case 11: > log2(".....setting up BreakpointRequest"); > eventRequest1 = eventRManager.createBreakpointRequest(location); > break; > > > However, JDIBase.settingBreakpoint() now contains: > > > lineLocation = (Location) alllineLocations.get(n); > breakpLocation = lineLocation; > > > So the test can just use breakpLocation instead. This pattern is already replicated in a large number of tests. 
> > The other difference in the disable001 version of settingBreakpoint() is that it uses SUSPEND_ALL rather than SUSPEND_EVENT_THREAD. This is easily fixed after setting up the breakpoint request: > > BreakpointRequest bpRequest = settingBreakpoint(mainThread, > debuggeeClass, > bPointMethod, lineForComm, "zero"); > bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); > > The test uses JDIBase now and all the code replicated from JDIBase has been stripped out. It is also updated to use the new JDIBase.setupBreakpointForCommunication() method. Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: Add comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24811/files - new: https://git.openjdk.org/jdk/pull/24811/files/4531a969..6376fadc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24811&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24811&range=00-01 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24811.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24811/head:pull/24811 PR: https://git.openjdk.org/jdk/pull/24811 From cjplummer at openjdk.org Wed Apr 23 21:12:24 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 23 Apr 2025 21:12:24 GMT Subject: RFR: 8355211: nsk/jdi/EventRequest/disable/disable001.java should use JDIBase superclass [v2] In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 20:01:00 GMT, Alex Menkov wrote: >> Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: >> >> Add comment > > test/hotspot/jtreg/vmTestbase/nsk/jdi/EventRequest/disable/disable001.java line 250: > >> 248: setupBreakpointForCommunication(debuggeeClass); >> 249: bpRequest.disable(); >> 250: bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); > > Could you please add a comment that setupBreakpointForCommunication creates request with SUSPEND_EVENT_THREAD 
policy, need to change it to SUSPEND_ALL Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24811#discussion_r2056875088 From duke at openjdk.org Wed Apr 23 21:16:46 2025 From: duke at openjdk.org (Abdelhak Zaaim) Date: Wed, 23 Apr 2025 21:16:46 GMT Subject: RFR: 8355221: Get rid of unnecessary override of JDIBase.breakpointForCommunication in nsk/jdi tests In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 05:24:50 GMT, Chris Plummer wrote: > The following tests all override the JDIBase.breakpointForCommunication() method, but no longer need to: > > vmTestbase/nsk/jdi/ClassPrepareRequest/addClassExclusionFilter/filter003.java > vmTestbase/nsk/jdi/ClassPrepareRequest/addClassFilter_rt/filter_rt002.java > vmTestbase/nsk/jdi/ClassPrepareRequest/addClassFilter_s/filter_s002.java > vmTestbase/nsk/jdi/EventRequestManager/classPrepareRequests/clsprepreq002.java > vmTestbase/nsk/jdi/EventRequestManager/methodExitRequests/methexitreq002.java > vmTestbase/nsk/jdi/MethodEntryRequest/addClassExclusionFilter/filter002.java > vmTestbase/nsk/jdi/MethodExitRequest/addClassExclusionFilter/filter002.java > > The override was to fix [JDK-8203809](https://bugs.openjdk.org/browse/JDK-8203809). It filters out unexpected events due to system threads (namely Graal java compiler threads). However, there is now a JDIBase.breakpointForCommunication() version that does the same, so the override can be removed. The new version takes the debuggeeName argument that the override versions currently use to filter events based on it. Marked as reviewed by abdelhak-zaaim at github.com (no known OpenJDK username). 
------------- PR Review: https://git.openjdk.org/jdk/pull/24813#pullrequestreview-2788698876 From amenkov at openjdk.org Wed Apr 23 21:54:45 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 23 Apr 2025 21:54:45 GMT Subject: RFR: 8355211: nsk/jdi/EventRequest/disable/disable001.java should use JDIBase superclass [v2] In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 21:12:23 GMT, Chris Plummer wrote: >> There is an nsk/jdi superclass called JDIBase that many tests inherit from. It provides common functionality like log support, basic event handling, support for setting a breakpoint, and support for the communication breakpoint that the debugger and debuggee use to synchronize with. disable001 does not inherit from JDIBase and instead implements all this functionality in the test. The reason is that it provides a slightly modified version of the JDIBase.settingBreakpoint() method. The main change is the presence of the following in settingBreakpoint(): >> >> >> // this is only for this test to get Location object >> location = lineLocation; >> >> >> And then in the test: >> >> >> case 11: >> log2(".....setting up BreakpointRequest"); >> eventRequest1 = eventRManager.createBreakpointRequest(location); >> break; >> >> >> However, JDIBase.settingBreakpoint() now contains: >> >> >> lineLocation = (Location) alllineLocations.get(n); >> breakpLocation = lineLocation; >> >> >> So the test can just use breakpLocation instead. This pattern is already replicated in a large number of tests. >> >> The other difference in the disable001 version of settingBreakpoint() is that it uses SUSPEND_ALL rather than SUSPEND_EVENT_THREAD. 
This is easily fixed after setting up the breakpoint request: >> >> BreakpointRequest bpRequest = settingBreakpoint(mainThread, >> debuggeeClass, >> bPointMethod, lineForComm, "zero"); >> bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); >> >> The test uses JDIBase now and all the code replicated from JDIBase has been stripped out. It is also updated to use the new JDIBase.setupBreakpointForCommunication() method. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Add comment Marked as reviewed by amenkov (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24811#pullrequestreview-2788756114 From jiangli at openjdk.org Wed Apr 23 22:46:20 2025 From: jiangli at openjdk.org (Jiangli Zhou) Date: Wed, 23 Apr 2025 22:46:20 GMT Subject: RFR: 8355439: Some hotspot/jtreg/serviceability/sa/* tests fail on static JDK due to explicit checks for shared libraries in process memory map Message-ID: Please review the change for following hotspot/jtreg/serviceability/sa/* tests to not check for JDK/VM shared libraries when running on static JDK. - test/hotspot/jtreg/serviceability/sa/ClhsdbPmap.java - test/hotspot/jtreg/serviceability/sa/sadebugd/PmapOnDebugdTest.java - test/hotspot/jtreg/serviceability/sa/sadebugd/RunCommandOnServerTest.java I tested the change on a `static-jdk` binary with a `static-jdk/bin/jhsdb` symlink to the `jhsdb` in a regular JDK binary. The tests pass with the change. Example test command: $ make test TEST="serviceability/sa/sadebugd/RunCommandOnServerTest.java" JDK_UNDER_TEST=//JDK-8355439/build/linux-x86_64-server-fastdebug/images/static-jdk JDK_FOR_COMPILE=//JDK-8355439/build/linux-x86_64-server-fastdebug/images/jdk ------------- Commit messages: - Don't check for JDK/VM shared libraries if Platform.isStatic() is true. 
Changes: https://git.openjdk.org/jdk/pull/24836/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24836&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8355439 Stats: 22 lines in 3 files changed: 16 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/24836.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24836/head:pull/24836 PR: https://git.openjdk.org/jdk/pull/24836 From lmesnik at openjdk.org Thu Apr 24 01:38:45 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 24 Apr 2025 01:38:45 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v4] In-Reply-To: References: Message-ID: On Thu, 10 Apr 2025 06:41:42 GMT, Serguei Spitsyn wrote: >> This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. >> >> Testing: Ran mach5 tiers 1-6. > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: added general comment about sync between suspend_thread and resume_thread Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24269#pullrequestreview-2789116075 From cjplummer at openjdk.org Thu Apr 24 05:33:22 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 24 Apr 2025 05:33:22 GMT Subject: RFR: 8355453: nsk.share.jdi.Debugee.waitingEvent() does not timeout properly Message-ID: nsk.share.jdi.Debugee.waitingEvent() takes a timeout argument. However, if the call to EventQueue.remove() times out and returns null, waitingEvent() just keeps on retrying. The logic is incorrect for a null result. It should immediately throw an exception. The retry path is only meant for filtering out other events, and it updates "timeLeft" before retrying to avoid endlessly repeating the loop. 
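A minimal sketch of the intended behavior, not the actual nsk code: a `BlockingQueue` stands in for the JDI `EventQueue`, `poll()` returning null models `EventQueue.remove(timeout)` timing out, and the class and method names (`TimedEventWait`, `waitFor`) are hypothetical. A null result fails immediately, while an unwanted event retries with only the remaining time.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

public class TimedEventWait {
    // Hypothetical stand-in for a waitingEvent()-style loop.
    static <E> E waitFor(BlockingQueue<E> queue, Predicate<E> wanted, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        long timeLeft = timeoutMillis;
        while (timeLeft > 0) {
            E event;
            try {
                event = queue.poll(timeLeft, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for event", e);
            }
            if (event == null) {
                // The queue timed out: fail right away instead of retrying.
                throw new IllegalStateException("timed out waiting for event");
            }
            if (wanted.test(event)) {
                return event;
            }
            // Unwanted event: retry, but only with the time that remains.
            timeLeft = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
        }
        throw new IllegalStateException("timed out while filtering events");
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("ThreadStart");   // filtered out
        queue.add("Breakpoint");    // the event we want
        System.out.println(waitFor(queue, "Breakpoint"::equals, 500));
        try {
            waitFor(queue, "Breakpoint"::equals, 100); // queue is now empty
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Tracking a fixed deadline rather than decrementing the timeout by a guess keeps the total wait bounded no matter how many unrelated events are filtered out.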
Tested by running all of nsk/jdi locally on linux-x64-debug ------------- Commit messages: - Throw exception when EventQueue.remove() times out Changes: https://git.openjdk.org/jdk/pull/24839/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24839&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8355453 Stats: 4 lines in 1 file changed: 0 ins; 1 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/24839.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24839/head:pull/24839 PR: https://git.openjdk.org/jdk/pull/24839 From kevinw at openjdk.org Thu Apr 24 09:11:31 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Thu, 24 Apr 2025 09:11:31 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 Message-ID: This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. 
------------- Commit messages: - JDK-8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 Changes: https://git.openjdk.org/jdk/pull/24843/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24843&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318730 Stats: 10 lines in 1 file changed: 5 ins; 3 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/24843.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24843/head:pull/24843 PR: https://git.openjdk.org/jdk/pull/24843 From rvansa at openjdk.org Thu Apr 24 10:43:20 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 24 Apr 2025 10:43:20 GMT Subject: RFR: 8352075: Perf regression accessing fields Message-ID: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. 
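The lookup scheme described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the HotSpot implementation: field infos are modeled as a sorted `String[]` rather than a variable-length-encoded stream, the table stores array positions instead of stream offsets, fields that share a name with different signatures are ignored, and the names `FieldSkipTable`, `STRIDE` and `find` are hypothetical.

```java
import java.util.Arrays;

public class FieldSkipTable {
    static final int STRIDE = 16;   // one table record per 16 fields

    final String[] names;           // field names, sorted
    final int[] table;              // positions of fields 16, 32, ...

    FieldSkipTable(String[] fieldNames) {
        names = fieldNames.clone();
        Arrays.sort(names);
        // Classes with at most 16 fields get an empty table.
        table = new int[Math.max(0, names.length - 1) / STRIDE];
        for (int i = 0; i < table.length; i++) {
            table[i] = (i + 1) * STRIDE;
        }
    }

    // Returns the position of the field in sorted order, or -1 if absent.
    int find(String name) {
        int start = 0;
        // Walk the table to find the last record whose name is <= the target.
        for (int pos : table) {
            if (names[pos].compareTo(name) > 0) {
                break;
            }
            start = pos;
        }
        // Field-by-field scan over at most STRIDE entries.
        int end = Math.min(start + STRIDE, names.length);
        for (int i = start; i < end; i++) {
            if (names[i].equals(name)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] fields = new String[40];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = String.format("f%02d", i);
        }
        FieldSkipTable lookup = new FieldSkipTable(fields);
        System.out.println(lookup.find("f17"));     // 17
        System.out.println(lookup.find("missing")); // -1
    }
}
```

With 40 fields the table holds two records (positions 16 and 32), so a lookup compares at most two table names and then at most 16 field names.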
My measurements on the attached reproducer hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] Range (min … max): 45.1 ms … 53.9 ms 100 runs hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] Range (min … max): 73.8 ms … 79.7 ms 100 runs (the jdk25-master above already contains JDK-8353175) hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] Range (min … max): 37.7 ms … 42.1 ms 100 runs While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: JDK 17: 1.6 s JDK 21 (no patches): 22 s JDK25-master: 12.3 s JDK25-this-pr: 0.5 s ------------- Commit messages: - 8352075: Perf regression accessing fields Changes: https://git.openjdk.org/jdk/pull/24847/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8352075 Stats: 208 lines in 14 files changed: 151 ins; 13 del; 44 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From rvansa at openjdk.org Thu Apr 24 10:44:49 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 24 Apr 2025 10:44:49 GMT Subject: Withdrawn: 8352075: Perf regression accessing fields In-Reply-To: References: Message-ID: On Thu, 17 Apr 2025 07:07:55 GMT, Radim Vansa wrote: > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying 
to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 > > Iteration through the field stream of several variable-length-encoded items is limited by inherent dependency of fully decoding the field before knowing the position of next field. On the critical codepath addressed by this PR, only the name and signature indices from are used though. > Inspired by the Varint-GB and Varint-G81U encodings, the idea of this change is in adding a control stream to the stream of field infos; each field uses one byte of this control stream, and currently the lower 6 bits encode the length of the encoded field, allowing more efficient skipping through the stream. The most significant bit duplicates the injected field flag (name/signature lookup requires that), one bit is currently unused. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min … max): 45.1 ms … 53.9 ms 100 runs > > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min … max): 73.8 ms … 79.7 ms 100 runs > > (the `jdk25-master` above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/bin/java -cp /tmp CCC > Time (mean ± σ): 51.8 ms ± 2.1 ms [User: 48.9 ms, System: 16.8 ms] > Range (min … max): 47.4 ms … 55.1 ms 100 runs > > So in case of the synthetic reproducer we're already on the level of JDK 17. However, the undisclosed production-grade reproducer still shows regression, so there is still space for optimization. 
> > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 3.1 s > > > About the downsides: This PR increases the consumption by 1 byte for each field in loaded classes. I've executed some tests on a simple Spring Boot application with 6800 instance classes loaded -> about 16k fields in total, with NMT on; the memory usage (Class.Metadata.used) seems to have grown by 32kB; this discrepancy could be investigated later on. > > Architecturally, I had to remove `const`ness from some methods on ... This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/24713 From rvansa at openjdk.org Thu Apr 24 10:44:49 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 24 Apr 2025 10:44:49 GMT Subject: RFR: 8352075: Perf regression accessing fields [v2] In-Reply-To: References: Message-ID: On Thu, 17 Apr 2025 12:59:56 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 >> >> Iteration through the field stream of several variable-length-encoded items is limited by inherent dependency of fully decoding the field before knowing the position of next field. On the critical codepath addressed by this PR, only the name and signature indices from are used though. >> Inspired by the Varint-GB and Varint-G81U encodings, the idea of this change is in adding a control stream to the stream of field infos; each field uses one byte of this control stream, and currently the lower 6 bits encode the length of the encoded field, allowing more efficient skipping through the stream. The most significant bit duplicates the injected field flag (name/signature lookup requires that), one bit is currently unused. 
>> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the `jdk25-master` above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.8 ms ± 2.1 ms [User: 48.9 ms, System: 16.8 ms] >> Range (min … max): 47.4 ms … 55.1 ms 100 runs >> >> So in case of the synthetic reproducer we're already on the level of JDK 17. However, the undisclosed production-grade reproducer still shows regression, so there is still space for optimization. >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 3.1 s >> >> >> About the downsides: This PR increases the consumption by 1 byte for each field in loaded classes. I've executed some tests on a simple Spring Boot application with 6800 instance classes loaded -> about 16k fields in total, with NMT on; the memory usage (Class.Metadata.used) seems to have grown by 32kB; this discrepancy could be investigated later on.... 
> > Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: > > Make skip_bytes private Withdrawing in favor of https://github.com/openjdk/jdk/pull/24847 ------------- PR Comment: https://git.openjdk.org/jdk/pull/24713#issuecomment-2827160936 From rvansa at openjdk.org Thu Apr 24 12:30:05 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Thu, 24 Apr 2025 12:30:05 GMT Subject: RFR: 8352075: Perf regression accessing fields [v2] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . > > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. 
> > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min … max): 45.1 ms … 53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min … max): 73.8 ms … 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min … max): 37.7 ms … 42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! 
This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with one additional commit since the last revision: Add include for other platforms ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/0a2f968e..fe798710 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=00-01 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From jsjolen at openjdk.org Thu Apr 24 14:06:16 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 24 Apr 2025 14:06:16 GMT Subject: RFR: 8355490: Make VM_RedefineClasses::merge_constant_pools only take reference arguments Message-ID: Hi, I've changed two pointer arguments to be references instead, as we bail if they are null and `assert(false)` on top of that. There are no other calls to this function than the one I fixed. Testing: GHA Tier1 ------------- Commit messages: - Oops - Also fix the header declaration! 
- Can't be null so make it a reference Changes: https://git.openjdk.org/jdk/pull/24848/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24848&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8355490 Stats: 39 lines in 2 files changed: 0 ins; 8 del; 31 mod Patch: https://git.openjdk.org/jdk/pull/24848.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24848/head:pull/24848 PR: https://git.openjdk.org/jdk/pull/24848 From shade at openjdk.org Thu Apr 24 14:50:57 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 24 Apr 2025 14:50:57 GMT Subject: RFR: 8355432: Remove CompileTask from SA In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 17:14:17 GMT, Aleksey Shipilev wrote: > A lot of SA infrastructure was added to support SA compiler replay with [JDK-7088955](https://bugs.openjdk.org/browse/JDK-7088955). With [JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488), we got rid of most of it. `CompileTask` seems to be left behind. Nothing uses it in SA now. > > Now, for Leyden, we want to massage `CompileTask` for better performance and reliability ([JDK-8231269](https://bugs.openjdk.org/browse/JDK-8231269)), and keeping `CompileTask` in SA would require us to implement a whole bunch of complicated, but unnecessary code. > > So, it would be good to purge `CompileTask` from SA. > > Note that I left the related `vmStructs` definitions, because async-profiler uses those; I think to see which methods the current compiler is compiling. That use looks safe, as it polls the task from the already set up ciEnv. async-profiler would need to re-adjust after [JDK-8231269](https://bugs.openjdk.org/browse/JDK-8231269) makes relevant changes in `vmStructs`. This PR frees us from doing the same thing in SA. Thanks for reviews! I think we are good to go. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24832#issuecomment-2827914359 From shade at openjdk.org Thu Apr 24 14:50:58 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 24 Apr 2025 14:50:58 GMT Subject: Integrated: 8355432: Remove CompileTask from SA In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 17:14:17 GMT, Aleksey Shipilev wrote: > A lot of SA infrastructure was added to support SA compiler replay with [JDK-7088955](https://bugs.openjdk.org/browse/JDK-7088955). With [JDK-8315488](https://bugs.openjdk.org/browse/JDK-8315488), we got rid of most of it. `CompileTask` seems to be left behind. Nothing uses it in SA now. > > Now, for Leyden, we want to massage `CompileTask` for better performance and reliability ([JDK-8231269](https://bugs.openjdk.org/browse/JDK-8231269)), and keeping `CompileTask` in SA would require us to implement a whole bunch of complicated, but unnecessary code. > > So, it would be good to purge `CompileTask` from SA. > > Note that I left the related `vmStructs` definitions, because async-profiler uses those; I think to see which methods the current compiler is compiling. That use looks safe, as it polls the task from the already set up ciEnv. async-profiler would need to re-adjust after [JDK-8231269](https://bugs.openjdk.org/browse/JDK-8231269) makes relevant changes in `vmStructs`. This PR frees us from doing the same thing in SA. This pull request has now been integrated. 
Changeset: 0edd018a Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/0edd018a48c202a6da4afe80e245799b47000885 Stats: 73 lines in 1 file changed: 0 ins; 73 del; 0 mod 8355432: Remove CompileTask from SA Reviewed-by: cjplummer, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/24832 From lmesnik at openjdk.org Thu Apr 24 15:37:50 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 24 Apr 2025 15:37:50 GMT Subject: RFR: 8355453: nsk.share.jdi.Debugee.waitingEvent() does not timeout properly In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 05:28:09 GMT, Chris Plummer wrote: > nsk.share.jdi.Debugee.waitingEvent() takes a timeout argument. However, if the call to EventQueue.remove() times out and returns null, waitingEvent() just keep on retrying. The logic is incorrect for a null result. It should immediately throw an exception. The retry path is only meant for filtering out other events, and it updates "timeLeft' before retrying to avoid endlessly repeating the loop. > > Tested by running all of nsk/jdi locally on linux-x64-debug Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24839#pullrequestreview-2791673104 From lmesnik at openjdk.org Thu Apr 24 15:38:55 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 24 Apr 2025 15:38:55 GMT Subject: RFR: 8355211: nsk/jdi/EventRequest/disable/disable001.java should use JDIBase superclass [v2] In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 21:12:23 GMT, Chris Plummer wrote: >> There is an nsk/jdi superclass called JDIBase that many tests inherit from. It provides common functionality like log support, basic event handling, support for setting a breakpoint, and support for the communication breakpoint that the debugger and debuggee use to synchronize with. disable001 does not inherit from JDIBase and instead implements all this functionality in the test. 
The reason is that it provides a slightly modified version of the JDIBase.settingBreakpoint() method. The main change is the presence of the following in settingBreakpoint(): >> >> >> // this is only for this test to get Location object >> location = lineLocation; >> >> >> And then in the test: >> >> >> case 11: >> log2(".....setting up BreakpointRequest"); >> eventRequest1 = eventRManager.createBreakpointRequest(location); >> break; >> >> >> However, JDIBase.settingBreakpoint() now contains: >> >> >> lineLocation = (Location) alllineLocations.get(n); >> breakpLocation = lineLocation; >> >> >> So the test can just use breakpLocation instead. This pattern is already replicated in a large number of tests. >> >> The other difference in the disable001 version of settingBreakpoint() is that it uses SUSPEND_ALL rather than SUSPEND_EVENT_THREAD. This is easily fixed after setting up the breakpoint request: >> >> BreakpointRequest bpRequest = settingBreakpoint(mainThread, >> debuggeeClass, >> bPointMethod, lineForComm, "zero"); >> bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); >> >> The test uses JDIBase now and all the code replicated from JDIBase has been stripped out. It is also updated to use the new JDIBase.setupBreakpointForCommunication() method. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Add comment Marked as reviewed by lmesnik (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/24811#pullrequestreview-2791677682 From lmesnik at openjdk.org Thu Apr 24 15:40:46 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 24 Apr 2025 15:40:46 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 08:53:42 GMT, Kevin Walls wrote: > This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. > > While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. > This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24843#pullrequestreview-2791681489 From coleenp at openjdk.org Thu Apr 24 16:04:46 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 24 Apr 2025 16:04:46 GMT Subject: RFR: 8355490: Make VM_RedefineClasses::merge_constant_pools only take reference arguments In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 12:09:36 GMT, Johan Sjölen wrote: > Hi, > > I've changed two pointer arguments to be references instead, as we bail if they are null and `assert(false)` on top of that. There are no other calls to this function than the one I fixed. > > Testing: GHA Tier1 Looks fine. ------------- Marked as reviewed by coleenp (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/24848#pullrequestreview-2791756299 From cjplummer at openjdk.org Thu Apr 24 17:17:53 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 24 Apr 2025 17:17:53 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 08:53:42 GMT, Kevin Walls wrote: > This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. > > While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. > This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. test/jdk/sun/jvmstat/monitor/MonitoredVm/MonitorVmStartTerminate.java line 254: > 252: String timeoutFactorText = System.getProperty("test.timeout.factor", "1.0"); > 253: double timeoutFactor = Double.parseDouble(timeoutFactorText); > 254: long timeoutNanos = 1000_000_000L*(long)(REMOVE_TIMEOUT * timeoutFactor); I don't understand the need for this already large timeout, and making it larger (now 2000 seconds instead of 1000). Is it because failures in hasMainArgs() takes time to timeout? If so, I thought attach failures only took 10 seconds to time out, and we are at most retrying 5 times. 
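For reference, the arithmetic behind the quoted line can be sketched as below. This is only an illustration, not the test's code: the class name `TimeoutSketch`, the helper `scaledTimeoutNanos`, and the base value of 2000 seconds passed in `main` are assumptions made for the example (the 2000 figure is taken from the comment above, not from the test source). Only the `test.timeout.factor` property and the shape of the expression come from the quoted snippet.

```java
public class TimeoutSketch {
    // Hypothetical helper mirroring the quoted expression:
    // 1000_000_000L * (long)(REMOVE_TIMEOUT * timeoutFactor)
    static long scaledTimeoutNanos(long baseSeconds, String timeoutFactorText) {
        double timeoutFactor = Double.parseDouble(timeoutFactorText);
        // The cast truncates to whole seconds before converting to nanoseconds.
        return 1_000_000_000L * (long) (baseSeconds * timeoutFactor);
    }

    public static void main(String[] args) {
        // Same property and default as in the quoted test code.
        String text = System.getProperty("test.timeout.factor", "1.0");
        // 2000 is an assumed base value, used here only for illustration.
        System.out.println(scaledTimeoutNanos(2000, text) + " ns");
    }
}
```

By comparison, five attach attempts at roughly 10 seconds each would add up to about 50 seconds, which is the gap the question above is pointing at.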
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24843#discussion_r2058903444 From cjplummer at openjdk.org Thu Apr 24 17:20:52 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 24 Apr 2025 17:20:52 GMT Subject: RFR: 8355211: nsk/jdi/EventRequest/disable/disable001.java should use JDIBase superclass [v2] In-Reply-To: References: Message-ID: <7kFO-TgcN7KdSWNhOAFsLfoMBn9VcMi7KyWyh3dUwn0=.1705a38e-3819-48bf-8933-a1606315aad5@github.com> On Wed, 23 Apr 2025 21:12:23 GMT, Chris Plummer wrote: >> There is an nsk/jdi superclass called JDIBase that many tests inherit from. It provides common functionality like log support, basic event handling, support for setting a breakpoint, and support for the communication breakpoint that the debugger and debuggee use to synchronize with. disable001 does not inherit from JDIBase and instead implements all this functionality in the test. The reason is because it provides a slightly modified version of the JDIBase.settingBreakpoint() method. The main change is the presence of the following in settingBreakpoint(): >> >> >> // this is only for this test to get Location object >> location = lineLocation; >> >> >> And then in the test: >> >> >> case 11: >> log2(".....setting up BreakpointRequest"); >> eventRequest1 = eventRManager.createBreakpointRequest(location); >> break; >> >> >> However, now JDIBase.settingBreakpoint() now contains: >> >> >> lineLocation = (Location) alllineLocations.get(n); >> breakpLocation = lineLocation; >> >> >> So the test can just use breakpLocation instead. This pattern is already replicated in a large number of tests. >> >> The other difference in the disable001 version of settingBreakpoint() is that is uses SUSPEND_ALL rather than SUSPEND_EVENT_THREAD. 
This is easily fixed after setting up the breakpoint request: >> >> BreakpointRequest bpRequest = settingBreakpoint(mainThread, >> debuggeeClass, >> bPointMethod, lineForComm, "zero"); >> bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); >> >> The test uses JDIBase now and all the code replicated from JDIBase has been stripped out. It is also updated to use the new JDIBase.setupBreakpointForCommunication() method. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > Add comment Thank you for the reviews Alex and Leonid! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24811#issuecomment-2828325235 From cjplummer at openjdk.org Thu Apr 24 17:20:53 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 24 Apr 2025 17:20:53 GMT Subject: Integrated: 8355211: nsk/jdi/EventRequest/disable/disable001.java should use JDIBase superclass In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 03:14:52 GMT, Chris Plummer wrote: > There is an nsk/jdi superclass called JDIBase that many tests inherit from. It provides common functionality like log support, basic event handling, support for setting a breakpoint, and support for the communication breakpoint that the debugger and debuggee use to synchronize with. disable001 does not inherit from JDIBase and instead implements all this functionality in the test. The reason is because it provides a slightly modified version of the JDIBase.settingBreakpoint() method. 
The main change is the presence of the following in settingBreakpoint(): > > > // this is only for this test to get Location object > location = lineLocation; > > > And then in the test: > > > case 11: > log2(".....setting up BreakpointRequest"); > eventRequest1 = eventRManager.createBreakpointRequest(location); > break; > > > However, now JDIBase.settingBreakpoint() now contains: > > > lineLocation = (Location) alllineLocations.get(n); > breakpLocation = lineLocation; > > > So the test can just use breakpLocation instead. This pattern is already replicated in a large number of tests. > > The other difference in the disable001 version of settingBreakpoint() is that is uses SUSPEND_ALL rather than SUSPEND_EVENT_THREAD. This is easily fixed after setting up the breakpoint request: > > BreakpointRequest bpRequest = settingBreakpoint(mainThread, > debuggeeClass, > bPointMethod, lineForComm, "zero"); > bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); > > The test uses JDIBase now and all the code replicated from JDIBase has been stripped out. It is also updated to use the new JDIBase.setupBreakpointForCommunication() method. This pull request has now been integrated. 
Changeset: 29f10700 Author: Chris Plummer URL: https://git.openjdk.org/jdk/commit/29f10700e7c76d94db00e48b98a9c6dfedffac0d Stats: 154 lines in 1 file changed: 0 ins; 145 del; 9 mod 8355211: nsk/jdi/EventRequest/disable/disable001.java should use JDIBase superclass Reviewed-by: lmesnik, amenkov ------------- PR: https://git.openjdk.org/jdk/pull/24811 From cjplummer at openjdk.org Thu Apr 24 17:21:58 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 24 Apr 2025 17:21:58 GMT Subject: Integrated: 8355221: Get rid of unnecessary override of JDIBase.breakpointForCommunication in nsk/jdi tests In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 05:24:50 GMT, Chris Plummer wrote: > The following tests all override the JDIBase.breakpointForCommunication() method, but no longer need too: > > vmTestbase/nsk/jdi/ClassPrepareRequest/addClassExclusionFilter/filter003.java > vmTestbase/nsk/jdi/ClassPrepareRequest/addClassFilter_rt/filter_rt002.java > vmTestbase/nsk/jdi/ClassPrepareRequest/addClassFilter_s/filter_s002.java > vmTestbase/nsk/jdi/EventRequestManager/classPrepareRequests/clsprepreq002.java > vmTestbase/nsk/jdi/EventRequestManager/methodExitRequests/methexitreq002.java > vmTestbase/nsk/jdi/MethodEntryRequest/addClassExclusionFilter/filter002.java > vmTestbase/nsk/jdi/MethodExitRequest/addClassExclusionFilter/filter002.java > > The override was to fix [JDK-8203809](https://bugs.openjdk.org/browse/JDK-8203809). It filters out unexpected events due to system threads (namely Graal java compiler threads). However, there is now a JDIBase.breakpointForCommunication() version that does the same, so the override can be removed. The new version takes the debuggeeName argument that the override versions currently use to filter events based on it. This pull request has now been integrated. 
Changeset: 370e6113 Author: Chris Plummer URL: https://git.openjdk.org/jdk/commit/370e6113de30fd1bc596b5fbf7bd00f97e689f4f Stats: 176 lines in 7 files changed: 0 ins; 169 del; 7 mod 8355221: Get rid of unnecessary override of JDIBase.breakpointForCommunication in nsk/jdi tests Reviewed-by: lmesnik, amenkov ------------- PR: https://git.openjdk.org/jdk/pull/24813 From cjplummer at openjdk.org Thu Apr 24 17:21:57 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 24 Apr 2025 17:21:57 GMT Subject: RFR: 8355221: Get rid of unnecessary override of JDIBase.breakpointForCommunication in nsk/jdi tests In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 05:24:50 GMT, Chris Plummer wrote: > The following tests all override the JDIBase.breakpointForCommunication() method, but no longer need too: > > vmTestbase/nsk/jdi/ClassPrepareRequest/addClassExclusionFilter/filter003.java > vmTestbase/nsk/jdi/ClassPrepareRequest/addClassFilter_rt/filter_rt002.java > vmTestbase/nsk/jdi/ClassPrepareRequest/addClassFilter_s/filter_s002.java > vmTestbase/nsk/jdi/EventRequestManager/classPrepareRequests/clsprepreq002.java > vmTestbase/nsk/jdi/EventRequestManager/methodExitRequests/methexitreq002.java > vmTestbase/nsk/jdi/MethodEntryRequest/addClassExclusionFilter/filter002.java > vmTestbase/nsk/jdi/MethodExitRequest/addClassExclusionFilter/filter002.java > > The override was to fix [JDK-8203809](https://bugs.openjdk.org/browse/JDK-8203809). It filters out unexpected events due to system threads (namely Graal java compiler threads). However, there is now a JDIBase.breakpointForCommunication() version that does the same, so the override can be removed. The new version takes the debuggeeName argument that the override versions currently use to filter events based on it. Thank you for the reviews Alex and Leonid! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24813#issuecomment-2828326886 From kevinw at openjdk.org Thu Apr 24 19:30:22 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Thu, 24 Apr 2025 19:30:22 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 [v2] In-Reply-To: References: Message-ID: > This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. > > While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. > This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: Less unnecessary logging of expected exceptions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24843/files - new: https://git.openjdk.org/jdk/pull/24843/files/eed5b6ee..137f6cba Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24843&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24843&range=00-01 Stats: 4 lines in 1 file changed: 3 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24843.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24843/head:pull/24843 PR: https://git.openjdk.org/jdk/pull/24843 From kevinw at openjdk.org Thu Apr 24 19:30:22 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Thu, 24 Apr 2025 19:30:22 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 08:53:42 GMT, Kevin Walls wrote: > This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the 
main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. > > While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. > This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. Update: Additionally, the log can still get huge and truncated due to many many lines like: hasMainArgs(21400): sun.jvmstat.monitor.MonitorException: Could not attach to 21400 ...so, only log such an Exception if it's NOT saying "Could not attach", as they are expected to be very common. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24843#issuecomment-2828646290 From amenkov at openjdk.org Thu Apr 24 23:03:44 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Thu, 24 Apr 2025 23:03:44 GMT Subject: RFR: 8355490: Make VM_RedefineClasses::merge_constant_pools only take reference arguments In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 12:09:36 GMT, Johan Sjölen wrote: > Hi, > > I've changed two pointer arguments to be references instead, as we bail if they are null and `assert(false)` on top of that. There are no other calls to this function than the one I fixed. > > Testing: GHA Tier1 Please update copyright year in .hpp file ------------- Marked as reviewed by amenkov (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24848#pullrequestreview-2792695546 From amenkov at openjdk.org Thu Apr 24 23:05:55 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Thu, 24 Apr 2025 23:05:55 GMT Subject: RFR: 8355453: nsk.share.jdi.Debugee.waitingEvent() does not timeout properly In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 05:28:09 GMT, Chris Plummer wrote: > nsk.share.jdi.Debugee.waitingEvent() takes a timeout argument.
However, if the call to EventQueue.remove() times out and returns null, waitingEvent() just keep on retrying. The logic is incorrect for a null result. It should immediately throw an exception. The retry path is only meant for filtering out other events, and it updates "timeLeft' before retrying to avoid endlessly repeating the loop. > > Tested by running all of nsk/jdi locally on linux-x64-debug Marked as reviewed by amenkov (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24839#pullrequestreview-2792696947 From cjplummer at openjdk.org Thu Apr 24 23:20:44 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 24 Apr 2025 23:20:44 GMT Subject: RFR: 8355453: nsk.share.jdi.Debugee.waitingEvent() does not timeout properly In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 15:34:40 GMT, Leonid Mesnik wrote: >> nsk.share.jdi.Debugee.waitingEvent() takes a timeout argument. However, if the call to EventQueue.remove() times out and returns null, waitingEvent() just keep on retrying. The logic is incorrect for a null result. It should immediately throw an exception. The retry path is only meant for filtering out other events, and it updates "timeLeft' before retrying to avoid endlessly repeating the loop. >> >> Tested by running all of nsk/jdi locally on linux-x64-debug > > Marked as reviewed by lmesnik (Reviewer). @lmesnik @alexmenkov I just noticed an inconsistency with my fix. If the code falls out of the while loop, it returns null, whereas the fix I added throws an exception if eventQueue.remove() times out. Initially I had fixed it by returning null here also, but the result of returning null meant an NPE in the caller in the one case I had where this code was timing out. However, that was new code I'm experimenting with that added a waitingEvent() call and didn't check the result, so it got an NPE. It seems that all other callers do check the result (via instanceof), so they would not result in an NPE. 
So perhaps the better choice here is to return null instead of throwing an exception. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24839#issuecomment-2829055633 From sspitsyn at openjdk.org Fri Apr 25 02:01:50 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 25 Apr 2025 02:01:50 GMT Subject: RFR: 8355490: Make VM_RedefineClasses::merge_constant_pools only take reference arguments In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 12:09:36 GMT, Johan Sjölen wrote: > Hi, > > I've changed two pointer arguments to be references instead, as we bail if they are null and `assert(false)` on top of that. There are no other calls to this function than the one I fixed. > > Testing: GHA Tier1 Looks good. ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24848#pullrequestreview-2792835010 From sspitsyn at openjdk.org Fri Apr 25 02:23:54 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 25 Apr 2025 02:23:54 GMT Subject: RFR: 8355214: nsk/jdi/ThreadStartRequest/addThreadFilter/addthreadfilter001.java should use JDIBase superclass In-Reply-To: <7r6pbOLpG8tdFvMZC3ws_2umcgqFB_e_0m_Za3xQDXU=.1e994b7c-90f3-4f53-bdd5-38d9bf6c266e@github.com> References: <7r6pbOLpG8tdFvMZC3ws_2umcgqFB_e_0m_Za3xQDXU=.1e994b7c-90f3-4f53-bdd5-38d9bf6c266e@github.com> Message-ID: On Wed, 23 Apr 2025 04:10:24 GMT, Chris Plummer wrote: > There is an nsk/jdi superclass called JDIBase that many tests inherit from. It provides common functionality like log support, basic event handling, support for setting a breakpoint, and support for the communication breakpoint that the debugger and debuggee use to synchronize with. addthreadfilter001 does not inherit from JDIBase and instead implements all this functionality in the test. The reason is because it provides a slightly modified version of the JDIBase.breakpointForCommunication() method.
> > > } else if (event instanceof ThreadStartEvent) { > // It might be the case that while the thread start request was enabled > // some threads not related to the test ( e.g. JVMCI threads) were started > // and generated thread start events. We ignore these thread start events > // and keep waiting for a breakpoint event. > ThreadStartEvent tse = (ThreadStartEvent) event; > log2("ThreadStartEvent is received while waiting for a breakpoint" + > " event, thread: : " + tse.thread().name()); > continue; > } > > > However, this code seems to predate adding similar code to the JDIBase version of breakpointForCommunication(): > > > if (EventFilters.filtered(event)) { > // We filter out spurious ThreadStartEvents > continue; > } > > > The test now uses JDIBase and gets rid of the replicated code that is already in JDIBase. It is also updated to use the new JDIBase.breakpointForCommunication() method. > > Note: The code in the test mentions JVMCI and the CR mentions graal, so this test likely originally failed due to the creation of a graal compiler thread. EventFilters.filtered() was not handling this thread name, which can have various names but always contains "Compiler", so I updated it the method to also filter out "Compiler" methods. Nice cleanup. ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24812#pullrequestreview-2792854528 From sspitsyn at openjdk.org Fri Apr 25 02:28:58 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 25 Apr 2025 02:28:58 GMT Subject: RFR: 8355439: Some hotspot/jtreg/serviceability/sa/* tests fail on static JDK due to explicit checks for shared libraries in process memory map In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 22:40:16 GMT, Jiangli Zhou wrote: > Please review the change for following hotspot/jtreg/serviceability/sa/* tests to not check for JDK/VM shared libraries when running on static JDK. 
> > - test/hotspot/jtreg/serviceability/sa/ClhsdbPmap.java > - test/hotspot/jtreg/serviceability/sa/sadebugd/PmapOnDebugdTest.java > - test/hotspot/jtreg/serviceability/sa/sadebugd/RunCommandOnServerTest.java > > I tested the change on a `static-jdk` binary with a `static-jdk/bin/jhsdb` symlink to the `jhsdb` in a regular JDK binary. The tests pass with the change. Example test command: > > $ make test TEST="serviceability/sa/sadebugd/RunCommandOnServerTest.java" JDK_UNDER_TEST=//JDK-8355439/build/linux-x86_64-server-fastdebug/images/static-jdk JDK_FOR_COMPILE=//JDK-8355439/build/linux-x86_64-server-fastdebug/images/jdk This looks good. ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24836#pullrequestreview-2792866425 From sspitsyn at openjdk.org Fri Apr 25 02:30:51 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 25 Apr 2025 02:30:51 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v4] In-Reply-To: References: Message-ID: <-B85jN4mRr6rGaB78K6LxtfedmBy4CeQ4DB9hkdBd7o=.9a4f6c8b-a25d-4763-aac3-a7027a674dee@github.com> On Thu, 10 Apr 2025 06:41:42 GMT, Serguei Spitsyn wrote: >> This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. >> >> Testing: Ran mach5 tiers 1-6. > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: added general comment about sync between suspend_thread and resume_thread Thank you a lot for review, Leonid! PING: One more review would be nice to have. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24269#issuecomment-2829238947 PR Comment: https://git.openjdk.org/jdk/pull/24269#issuecomment-2829239722 From sspitsyn at openjdk.org Fri Apr 25 02:39:53 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 25 Apr 2025 02:39:53 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 [v2] In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 19:30:22 GMT, Kevin Walls wrote: >> This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. >> >> While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. >> This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. > > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > Less unnecessary logging of expected exceptions The fix looks reasonable. ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24843#pullrequestreview-2792878582 From sspitsyn at openjdk.org Fri Apr 25 02:45:43 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 25 Apr 2025 02:45:43 GMT Subject: RFR: 8355453: nsk.share.jdi.Debugee.waitingEvent() does not timeout properly In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 15:34:40 GMT, Leonid Mesnik wrote: >> nsk.share.jdi.Debugee.waitingEvent() takes a timeout argument. However, if the call to EventQueue.remove() times out and returns null, waitingEvent() just keep on retrying. The logic is incorrect for a null result. It should immediately throw an exception. 
The retry path is only meant for filtering out other events, and it updates "timeLeft" before retrying to avoid endlessly repeating the loop. >> >> Tested by running all of nsk/jdi locally on linux-x64-debug > > Marked as reviewed by lmesnik (Reviewer). > @lmesnik @alexmenkov I just noticed an inconsistency with my fix. If the code falls out of the while loop, it returns null, whereas the fix I added throws an exception if eventQueue.remove() times out. Initially I had fixed it by returning null here also, but the result of returning null meant an NPE in the caller in the one case I had where this code was timing out. However, that was new code I'm experimenting with that added a waitingEvent() call and didn't check the result, so it got an NPE. It seems that all other callers do check the result (via instanceof), so they would not result in an NPE. So perhaps the better choice here is to return null instead of throwing an exception. Is there a need to correct the PR description to reflect the updated approach? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24839#issuecomment-2829252325 From cjplummer at openjdk.org Fri Apr 25 04:34:46 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 25 Apr 2025 04:34:46 GMT Subject: RFR: 8355453: nsk.share.jdi.Debugee.waitingEvent() does not timeout properly In-Reply-To: References: Message-ID: On Fri, 25 Apr 2025 02:42:01 GMT, Serguei Spitsyn wrote: > Is there a need to correct the PR description to reflect the updated approach? I haven't implemented the new approach yet.
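The return-null behavior discussed in this thread can be sketched roughly as follows. This is not the nsk.share.jdi code: to stay self-contained it stands in a `BlockingQueue` for the JDI `EventQueue` and a `Predicate` for the event filter, and the names `WaitingEventSketch`, `waitingEvent`, and `wanted` are hypothetical. It only shows the two paths described above: a timed-out removal returns null at once, while the retry path recomputes the time left so that filtered-out events cannot keep the loop alive past the timeout.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

public class WaitingEventSketch {
    // Waits up to timeoutMillis for an event accepted by 'wanted'.
    // Returns null if no such event arrives in time.
    static <E> E waitingEvent(BlockingQueue<E> queue, Predicate<E> wanted, long timeoutMillis)
            throws InterruptedException {
        long timeLeft = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        long deadline = System.nanoTime() + timeLeft;
        while (timeLeft > 0) {
            // poll() plays the role of a timed remove(): null means it timed out.
            E event = queue.poll(timeLeft, TimeUnit.NANOSECONDS);
            if (event == null) {
                return null; // timed out: do not retry
            }
            if (wanted.test(event)) {
                return event;
            }
            // Some other event was filtered out: retry with the remaining time only.
            timeLeft = deadline - System.nanoTime();
        }
        return null; // fell out of the loop: the timeout was used up by other events
    }
}
```

With both exits returning null, a caller that checks the result (for example via instanceof, as mentioned above) sees the same outcome whichever way the timeout is reached.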
------------- PR Comment: https://git.openjdk.org/jdk/pull/24839#issuecomment-2829348046 From cjplummer at openjdk.org Fri Apr 25 04:36:46 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 25 Apr 2025 04:36:46 GMT Subject: RFR: 8355439: Some hotspot/jtreg/serviceability/sa/* tests fail on static JDK due to explicit checks for shared libraries in process memory map In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 22:40:16 GMT, Jiangli Zhou wrote: > Please review the change for following hotspot/jtreg/serviceability/sa/* tests to not check for JDK/VM shared libraries when running on static JDK. > > - test/hotspot/jtreg/serviceability/sa/ClhsdbPmap.java > - test/hotspot/jtreg/serviceability/sa/sadebugd/PmapOnDebugdTest.java > - test/hotspot/jtreg/serviceability/sa/sadebugd/RunCommandOnServerTest.java > > I tested the change on a `static-jdk` binary with a `static-jdk/bin/jhsdb` symlink to the `jhsdb` in a regular JDK binary. The tests pass with the change. Example test command: > > $ make test TEST="serviceability/sa/sadebugd/RunCommandOnServerTest.java" JDK_UNDER_TEST=//JDK-8355439/build/linux-x86_64-server-fastdebug/images/static-jdk JDK_FOR_COMPILE=//JDK-8355439/build/linux-x86_64-server-fastdebug/images/jdk Marked as reviewed by cjplummer (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24836#pullrequestreview-2792967984 From cjplummer at openjdk.org Fri Apr 25 06:16:37 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 25 Apr 2025 06:16:37 GMT Subject: RFR: 8355569: Some nsk/jdi tests can glean the "main" thread by using the ClassPrepareEvent for the debuggee main class Message-ID: As part of the work for [JDK-8353955](https://bugs.openjdk.org/browse/JDK-8353955) I am reducing the number of tests that need to be run with includevirtualhreads=y due to using JDI to lookup threads in the debuggee. There are many tests that lookup the "main" thread. 
They can instead glean the "main" thread from the ClassPrepareEvent for the debuggee main class. Some tests already wait for this ClassPrepareEvent, and can take advantage of it. Most do not, but can be made to do so. The easiest way to do this for many of the tests is to wait for the event in Debuggee.prepareDebuggee() after having called Debuggee.bindToDebuggee(). For tests that don't call Debuggee.prepareDebuggee(), waiting for the ClassPrepareEvent was added after the bind. This doesn't seem to have any ill affect on the tests. Tested by running all nsk/jdi tests, including all tier2, tier3, and tier5 nsk/jdi testing. ------------- Commit messages: - update copyrights - Get debuggee main thread from ClassPrepareEvent for debuggee main class. Changes: https://git.openjdk.org/jdk/pull/24867/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24867&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8355569 Stats: 162 lines in 36 files changed: 74 ins; 15 del; 73 mod Patch: https://git.openjdk.org/jdk/pull/24867.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24867/head:pull/24867 PR: https://git.openjdk.org/jdk/pull/24867 From cjplummer at openjdk.org Fri Apr 25 06:17:38 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 25 Apr 2025 06:17:38 GMT Subject: RFR: 8355453: nsk.share.jdi.Debugee.waitingEvent() does not timeout properly [v2] In-Reply-To: References: Message-ID: > nsk.share.jdi.Debugee.waitingEvent() takes a timeout argument. However, if the call to EventQueue.remove() times out and returns null, waitingEvent() just keep on retrying. The logic is incorrect for a null result. It should immediately return null. The retry path is only meant for filtering out other events, and it updates "timeLeft' before retrying to avoid endlessly repeating the loop. 
> > Tested by running all of nsk/jdi locally on linux-x64-debug Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: return null instead of throwing an exception ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24839/files - new: https://git.openjdk.org/jdk/pull/24839/files/7fbefd22..6a7e79ed Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24839&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24839&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24839.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24839/head:pull/24839 PR: https://git.openjdk.org/jdk/pull/24839 From cjplummer at openjdk.org Fri Apr 25 06:26:50 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 25 Apr 2025 06:26:50 GMT Subject: RFR: 8355569: Some nsk/jdi tests can glean the "main" thread by using the ClassPrepareEvent for the debuggee main class In-Reply-To: References: Message-ID: On Fri, 25 Apr 2025 06:10:46 GMT, Chris Plummer wrote: > As part of the work for [JDK-8353955](https://bugs.openjdk.org/browse/JDK-8353955) I am reducing the number of tests that need to be run with includevirtualhreads=y due to using JDI to lookup threads in the debuggee. There are many tests that lookup the "main" thread. They can instead glean the "main" thread from the ClassPrepareEvent for the debuggee main class. Some tests already wait for this ClassPrepareEvent, and can take advantage of it. Most do not, but can be made to do so. The easiest way to do this for many of the tests is to wait for the event in Debuggee.prepareDebuggee() after having called Debuggee.bindToDebuggee(). For tests that don't call Debuggee.prepareDebuggee(), waiting for the ClassPrepareEvent was added after the bind. This doesn't seem to have any ill affect on the tests. > > Tested by running all nsk/jdi tests, including all tier2, tier3, and tier5 nsk/jdi testing. 
test/hotspot/jtreg/vmTestbase/nsk/jdi/Accessible/isPackagePrivate/accipp001.java line 111: > 109: Binder binder = new Binder(argsHandler, logHandler); > 110: > 111: debugee = Debugee.prepareDebugee(argsHandler, logHandler, debugeeName); Adding -vbs to the debugeeName was messing up the changes made in prepareDebuggee(). Adding it is also redundant. If verbose is enabled, -verbose is already passed to the debuggee (-vbs and -verbose are both handled the same). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24867#discussion_r2059632487 From cjplummer at openjdk.org Fri Apr 25 06:31:10 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 25 Apr 2025 06:31:10 GMT Subject: RFR: 8355569: Some nsk/jdi tests can glean the "main" thread by using the ClassPrepareEvent for the debuggee main class [v2] In-Reply-To: References: Message-ID: > As part of the work for [JDK-8353955](https://bugs.openjdk.org/browse/JDK-8353955) I am reducing the number of tests that need to be run with includevirtualthreads=y due to using JDI to look up threads in the debuggee. There are many tests that look up the "main" thread. They can instead glean the "main" thread from the ClassPrepareEvent for the debuggee main class. Some tests already wait for this ClassPrepareEvent, and can take advantage of it. Most do not, but can be made to do so. The easiest way to do this for many of the tests is to wait for the event in Debuggee.prepareDebuggee() after having called Debuggee.bindToDebuggee(). For tests that don't call Debuggee.prepareDebuggee(), waiting for the ClassPrepareEvent was added after the bind. This doesn't seem to have any ill effect on the tests. > > Tested by running all nsk/jdi tests, including all tier2, tier3, and tier5 nsk/jdi testing.
Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: glean main thread from ClassPrepareEvent ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24867/files - new: https://git.openjdk.org/jdk/pull/24867/files/c467d8c9..060c8d3d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24867&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24867&range=00-01 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24867.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24867/head:pull/24867 PR: https://git.openjdk.org/jdk/pull/24867 From jsjolen at openjdk.org Fri Apr 25 09:39:49 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 25 Apr 2025 09:39:49 GMT Subject: RFR: 8355490: Make VM_RedefineClasses::merge_constant_pools only take reference arguments [v2] In-Reply-To: References: Message-ID: > Hi, > > I've changed two pointer arguments to be references instead, as we bail if they are null and `assert(false)` on top of that. There are no other calls to this function than the one I fixed.
> > Testing: GHA Tier1 Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Copyright ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24848/files - new: https://git.openjdk.org/jdk/pull/24848/files/55d8994d..abf2fc71 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24848&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24848&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24848.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24848/head:pull/24848 PR: https://git.openjdk.org/jdk/pull/24848 From jsjolen at openjdk.org Fri Apr 25 09:39:49 2025 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 25 Apr 2025 09:39:49 GMT Subject: RFR: 8355490: Make VM_RedefineClasses::merge_constant_pools only take reference arguments In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 12:09:36 GMT, Johan Sjölen wrote: > Hi, > > I've changed two pointer arguments to be references instead, as we bail if they are null and `assert(false)` on top of that. There are no other calls to this function than the one I fixed. > > Testing: GHA Tier1 Thank you for the reviews, I fixed the copyright. Please reapprove so that I can integrate. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24848#issuecomment-2829897843 From cnorrbin at openjdk.org Fri Apr 25 10:45:10 2025 From: cnorrbin at openjdk.org (Casper Norrbin) Date: Fri, 25 Apr 2025 10:45:10 GMT Subject: RFR: 8241678: Remove PerfData sampling via StatSampler Message-ID: Hi everyone, This change removes the legacy `PerfData` sampling mechanism implemented through the `StatSampler`, an always-on periodic task that runs every 50ms by default. The sampling feature was originally introduced to collect performance counters and timestamps, but has since seen very little use. For G1/ZGC, the only sampled value is a timestamp (`sun.os.hrt.ticks`).
For Serial/Parallel, it also samples some heap space counters, but these are already updated after each GC cycle, making the sampling redundant. With sampling removed, the `PerfDataSamplingInterval` flag becomes obsolete, as it no longer serves any purpose. The only thing relying on the sampled timestamps is `jstat`: running `jstat -t` prints an extra column with the time since VM start. To preserve this functionality, we can calculate the timestamps as an offset from the already existing `sun.rt.createVmBeginTime` instead. ------------- Commit messages: - removed the PerfDataSamplingInterval flag - calculate timestamp in jstat instead of sampling - StatSampler + sampling code removed Changes: https://git.openjdk.org/jdk/pull/24872/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24872&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8241678 Stats: 864 lines in 23 files changed: 158 ins; 652 del; 54 mod Patch: https://git.openjdk.org/jdk/pull/24872.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24872/head:pull/24872 PR: https://git.openjdk.org/jdk/pull/24872 From kevinw at openjdk.org Fri Apr 25 12:25:49 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Fri, 25 Apr 2025 12:25:49 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 [v2] In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 17:15:22 GMT, Chris Plummer wrote: >> Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: >> >> Less unnecessary logging of expected exceptions > > test/jdk/sun/jvmstat/monitor/MonitoredVm/MonitorVmStartTerminate.java line 254: > >> 252: String timeoutFactorText = System.getProperty("test.timeout.factor", "1.0"); >> 253: double timeoutFactor = Double.parseDouble(timeoutFactorText); >> 254: long timeoutNanos = 1000_000_000L*(long)(REMOVE_TIMEOUT * timeoutFactor); > > I don't understand the need for this already large timeout, and making it larger (now 2000 seconds
instead of 1000). Is it because failures in hasMainArgs() takes time to timeout? If so, I thought attach failures only took 10 seconds to time out, and we are at most retrying 5 times. Yes two problems with that timeout change, so I'll update it again: (1) I think I misread the 1000 as milliseconds. It just can't be intentionally a 1000 second timeout, 17 minutes, and it does get multiplied by timeout factor (which can be 10 in some runs). I expect it was meant to be 1 second many years ago, as it was multiplied by timeout factor. Even my local run with -Xcomp -ea -esa -XX:CompileThreshold=100 -XX:+UnlockExperimentalVMOptions -server -XX:-TieredCompilation ..this part usually takes about 1.5 seconds and the test works reliably due to timeout factor getting set to 10. This may also explain some very very long delays when it has failed 8-) Updating! 300 seconds * timeout factor would be very cautious. I think this and the main args fetching timeout work together to make the test fail/timeout should something clearly not be working, and the Xcomp and other slow compiler stress test settings test these choices. (2) This particular timeout which was 1000 seconds x timeout.factor was a big part of why the test filled and truncated its logs when timing out. With this change we don't log about our wait every iteration, just when done. As we aren't logging, we shouldn't fill and truncate the log so it doesn't matter very much how long we wait. But with this hint that it was actually 1000 seconds, it should go down, not up! 
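To make the arithmetic in this exchange concrete, here is a small sketch of the scaling being discussed. It is an illustration only: `TimeoutFactorSketch` and `timeoutNanos` are hypothetical names, and the computation mirrors the quoted lines 252-254 rather than reproducing the test's final code.

```java
import java.util.concurrent.TimeUnit;

final class TimeoutFactorSketch {
    // Value proposed in the discussion; treated here as an assumption.
    static final int REMOVE_TIMEOUT_SECONDS = 300;

    // Scales a base timeout in seconds by the jtreg-style timeout factor
    // and converts the result to nanoseconds.
    static long timeoutNanos(int baseSeconds, String timeoutFactorText) {
        double timeoutFactor = Double.parseDouble(timeoutFactorText);
        long scaledSeconds = (long) (baseSeconds * timeoutFactor);
        return TimeUnit.SECONDS.toNanos(scaledSeconds);
    }

    public static void main(String[] args) {
        String factorText = System.getProperty("test.timeout.factor", "1.0");
        long nanos = timeoutNanos(REMOVE_TIMEOUT_SECONDS, factorText);
        System.out.println("timeout seconds: " + TimeUnit.NANOSECONDS.toSeconds(nanos));
        // The old base of 1000 with a factor of 10 gives 10000 seconds.
        System.out.println("old worst case seconds: "
                + TimeUnit.NANOSECONDS.toSeconds(timeoutNanos(1000, "10")));
    }
}
```

With a base of 1000 seconds and a timeout factor of 10, the wait could reach 10,000 seconds (about 2.8 hours), which is consistent with the very long delays mentioned above.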
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24843#discussion_r2060144129 From kevinw at openjdk.org Fri Apr 25 12:38:27 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Fri, 25 Apr 2025 12:38:27 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 [v3] In-Reply-To: References: Message-ID: > This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. > > While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. > This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: REMOVE_TIMEOUT_SECONDS ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24843/files - new: https://git.openjdk.org/jdk/pull/24843/files/137f6cba..9423fc01 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24843&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24843&range=01-02 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/24843.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24843/head:pull/24843 PR: https://git.openjdk.org/jdk/pull/24843 From shade at openjdk.org Fri Apr 25 15:23:34 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 25 Apr 2025 15:23:34 GMT Subject: RFR: 8355617: Remove historical debug_only macro in favor of DEBUG_ONLY Message-ID: I noticed in `macros.hpp`: #ifdef ASSERT #define DEBUG_ONLY(code) code #define NOT_DEBUG(code) #define NOT_DEBUG_RETURN /*next token must be ;*/ // Historical. 
#define debug_only(code) code #else // ASSERT There are 350+ instances of `debug_only`, and 1600+ instances of `DEBUG_ONLY`. We can cleanup `debug_only` and rewrite all uses to "DEBUG_ONLY". I think we can do this in one shot. This was a mechanical rename. I have looked through the `grep -R debug_only src/ | nl`, and there are no hits anymore. Additional testing: - [ ] GHA ------------- Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/24879/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24879&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8355617 Stats: 371 lines in 119 files changed: 0 ins; 3 del; 368 mod Patch: https://git.openjdk.org/jdk/pull/24879.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24879/head:pull/24879 PR: https://git.openjdk.org/jdk/pull/24879 From cjplummer at openjdk.org Fri Apr 25 16:30:08 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 25 Apr 2025 16:30:08 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 [v3] In-Reply-To: References: Message-ID: <2eKkbh51jTro6ENnv5g06ZxUkdYp-kLO2b64x2TWAuI=.a6c737c6-70c1-4bfd-b2ab-6762949e55f7@github.com> On Fri, 25 Apr 2025 12:38:27 GMT, Kevin Walls wrote: >> This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. >> >> While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. >> This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. 
> > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > REMOVE_TIMEOUT_SECONDS test/jdk/sun/jvmstat/monitor/MonitoredVm/MonitorVmStartTerminate.java line 78: > 76: private static final int PROCESS_COUNT = 10; > 77: private static final int ARGS_ATTEMPTS = 5; > 78: private static final int REMOVE_TIMEOUT_SECONDS = 300; I'm still not convinced we need a large timeout here. The default timeout is 120 seconds, so jtreg will time out the test before waitForRemoval() times out itself. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24843#discussion_r2060546385 From cjplummer at openjdk.org Fri Apr 25 16:34:03 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 25 Apr 2025 16:34:03 GMT Subject: Integrated: 8355214: nsk/jdi/ThreadStartRequest/addThreadFilter/addthreadfilter001.java should use JDIBase superclass In-Reply-To: <7r6pbOLpG8tdFvMZC3ws_2umcgqFB_e_0m_Za3xQDXU=.1e994b7c-90f3-4f53-bdd5-38d9bf6c266e@github.com> References: <7r6pbOLpG8tdFvMZC3ws_2umcgqFB_e_0m_Za3xQDXU=.1e994b7c-90f3-4f53-bdd5-38d9bf6c266e@github.com> Message-ID: On Wed, 23 Apr 2025 04:10:24 GMT, Chris Plummer wrote: > There is an nsk/jdi superclass called JDIBase that many tests inherit from. It provides common functionality like log support, basic event handling, support for setting a breakpoint, and support for the communication breakpoint that the debugger and debuggee use to synchronize with. addthreadfilter001 does not inherit from JDIBase and instead implements all this functionality in the test. The reason is that it provides a slightly modified version of the JDIBase.breakpointForCommunication() method. > > > } else if (event instanceof ThreadStartEvent) { > // It might be the case that while the thread start request was enabled > // some threads not related to the test ( e.g. JVMCI threads) were started > // and generated thread start events.
We ignore these thread start events > // and keep waiting for a breakpoint event. > ThreadStartEvent tse = (ThreadStartEvent) event; > log2("ThreadStartEvent is received while waiting for a breakpoint" + > " event, thread: : " + tse.thread().name()); > continue; > } > > > However, this code seems to predate adding similar code to the JDIBase version of breakpointForCommunication(): > > > if (EventFilters.filtered(event)) { > // We filter out spurious ThreadStartEvents > continue; > } > > > The test now uses JDIBase and gets rid of the replicated code that is already in JDIBase. It is also updated to use the new JDIBase.breakpointForCommunication() method. > > Note: The code in the test mentions JVMCI and the CR mentions graal, so this test likely originally failed due to the creation of a graal compiler thread. EventFilters.filtered() was not handling this thread name, which can have various names but always contains "Compiler", so I updated the method to also filter out "Compiler" threads. This pull request has now been integrated.
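The name-based filtering mentioned in the note can be sketched roughly as below. This is a hypothetical illustration, not the `EventFilters` implementation: the class `SpuriousThreadFilterSketch`, the method `isSpuriousThreadName`, and the sample thread names are all assumptions; only the "Compiler" substring comes from the message.

```java
import java.util.List;

final class SpuriousThreadFilterSketch {
    // Name fragments that mark a thread as unrelated to the test.
    // Only "Compiler" is taken from the message; a real filter may have more.
    private static final List<String> SPURIOUS_NAME_FRAGMENTS = List.of("Compiler");

    // Returns true if a ThreadStartEvent for a thread with this name
    // should be ignored while waiting for the communication breakpoint.
    static boolean isSpuriousThreadName(String threadName) {
        for (String fragment : SPURIOUS_NAME_FRAGMENTS) {
            if (threadName.contains(fragment)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Illustrative thread names, not taken from a real run.
        System.out.println(isSpuriousThreadName("JVMCI-native CompilerThread0"));
        System.out.println(isSpuriousThreadName("main"));
    }
}
```

Matching on a substring rather than an exact name fits the observation that the compiler thread can have various names that all contain "Compiler".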
Changeset: 77f5a246 Author: Chris Plummer URL: https://git.openjdk.org/jdk/commit/77f5a24648758cb1adc74056ca58f880af4a8e84 Stats: 167 lines in 3 files changed: 5 ins; 157 del; 5 mod 8355214: nsk/jdi/ThreadStartRequest/addThreadFilter/addthreadfilter001.java should use JDIBase superclass Reviewed-by: lmesnik, sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/24812 From cjplummer at openjdk.org Fri Apr 25 16:34:03 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 25 Apr 2025 16:34:03 GMT Subject: RFR: 8355214: nsk/jdi/ThreadStartRequest/addThreadFilter/addthreadfilter001.java should use JDIBase superclass In-Reply-To: <7r6pbOLpG8tdFvMZC3ws_2umcgqFB_e_0m_Za3xQDXU=.1e994b7c-90f3-4f53-bdd5-38d9bf6c266e@github.com> References: <7r6pbOLpG8tdFvMZC3ws_2umcgqFB_e_0m_Za3xQDXU=.1e994b7c-90f3-4f53-bdd5-38d9bf6c266e@github.com> Message-ID: On Wed, 23 Apr 2025 04:10:24 GMT, Chris Plummer wrote: > There is an nsk/jdi superclass called JDIBase that many tests inherit from. It provides common functionality like log support, basic event handling, support for setting a breakpoint, and support for the communication breakpoint that the debugger and debuggee use to synchronize with. addthreadfilter001 does not inherit from JDIBase and instead implements all this functionality in the test. The reason is that it provides a slightly modified version of the JDIBase.breakpointForCommunication() method. > > > } else if (event instanceof ThreadStartEvent) { > // It might be the case that while the thread start request was enabled > // some threads not related to the test ( e.g. JVMCI threads) were started > // and generated thread start events. We ignore these thread start events > // and keep waiting for a breakpoint event.
> ThreadStartEvent tse = (ThreadStartEvent) event; > log2("ThreadStartEvent is received while waiting for a breakpoint" + > " event, thread: : " + tse.thread().name()); > continue; > } > > > However, this code seems to predate adding similar code to the JDIBase version of breakpointForCommunication(): > > > if (EventFilters.filtered(event)) { > // We filter out spurious ThreadStartEvents > continue; > } > > > The test now uses JDIBase and gets rid of the replicated code that is already in JDIBase. It is also updated to use the new JDIBase.breakpointForCommunication() method. > > Note: The code in the test mentions JVMCI and the CR mentions graal, so this test likely originally failed due to the creation of a graal compiler thread. EventFilters.filtered() was not handling this thread name, which can have various names but always contains "Compiler", so I updated the method to also filter out "Compiler" threads. Thank you for the reviews Serguei and Leonid! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24812#issuecomment-2830877713 From jiangli at openjdk.org Fri Apr 25 17:14:59 2025 From: jiangli at openjdk.org (Jiangli Zhou) Date: Fri, 25 Apr 2025 17:14:59 GMT Subject: RFR: 8355439: Some hotspot/jtreg/serviceability/sa/* tests fail on static JDK due to explicit checks for shared libraries in process memory map In-Reply-To: References: Message-ID: On Fri, 25 Apr 2025 02:26:11 GMT, Serguei Spitsyn wrote: >> Please review the change for following hotspot/jtreg/serviceability/sa/* tests to not check for JDK/VM shared libraries when running on static JDK. >> >> - test/hotspot/jtreg/serviceability/sa/ClhsdbPmap.java >> - test/hotspot/jtreg/serviceability/sa/sadebugd/PmapOnDebugdTest.java >> - test/hotspot/jtreg/serviceability/sa/sadebugd/RunCommandOnServerTest.java >> >> I tested the change on a `static-jdk` binary with a `static-jdk/bin/jhsdb` symlink to the `jhsdb` in a regular JDK binary. The tests pass with the change.
Example test command: >> >> $ make test TEST="serviceability/sa/sadebugd/RunCommandOnServerTest.java" JDK_UNDER_TEST=//JDK-8355439/build/linux-x86_64-server-fastdebug/images/static-jdk JDK_FOR_COMPILE=//JDK-8355439/build/linux-x86_64-server-fastdebug/images/jdk > > This looks good. @sspitsyn @plummercj Thanks for your reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24836#issuecomment-2830977002 From jiangli at openjdk.org Fri Apr 25 17:14:59 2025 From: jiangli at openjdk.org (Jiangli Zhou) Date: Fri, 25 Apr 2025 17:14:59 GMT Subject: Integrated: 8355439: Some hotspot/jtreg/serviceability/sa/* tests fail on static JDK due to explicit checks for shared libraries in process memory map In-Reply-To: References: Message-ID: On Wed, 23 Apr 2025 22:40:16 GMT, Jiangli Zhou wrote: > Please review the change for following hotspot/jtreg/serviceability/sa/* tests to not check for JDK/VM shared libraries when running on static JDK. > > - test/hotspot/jtreg/serviceability/sa/ClhsdbPmap.java > - test/hotspot/jtreg/serviceability/sa/sadebugd/PmapOnDebugdTest.java > - test/hotspot/jtreg/serviceability/sa/sadebugd/RunCommandOnServerTest.java > > I tested the change on a `static-jdk` binary with a `static-jdk/bin/jhsdb` symlink to the `jhsdb` in a regular JDK binary. The tests pass with the change. Example test command: > > $ make test TEST="serviceability/sa/sadebugd/RunCommandOnServerTest.java" JDK_UNDER_TEST=//JDK-8355439/build/linux-x86_64-server-fastdebug/images/static-jdk JDK_FOR_COMPILE=//JDK-8355439/build/linux-x86_64-server-fastdebug/images/jdk This pull request has now been integrated. 
Changeset: 4b880299 Author: Jiangli Zhou URL: https://git.openjdk.org/jdk/commit/4b880299881c9413038d647123e3b658999c6f8f Stats: 22 lines in 3 files changed: 16 ins; 0 del; 6 mod 8355439: Some hotspot/jtreg/serviceability/sa/* tests fail on static JDK due to explicit checks for shared libraries in process memory map Reviewed-by: sspitsyn, cjplummer ------------- PR: https://git.openjdk.org/jdk/pull/24836 From stefank at openjdk.org Fri Apr 25 17:22:49 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 25 Apr 2025 17:22:49 GMT Subject: RFR: 8355617: Remove historical debug_only macro in favor of DEBUG_ONLY In-Reply-To: References: Message-ID: On Fri, 25 Apr 2025 15:17:17 GMT, Aleksey Shipilev wrote: > I noticed in `macros.hpp`: > > > #ifdef ASSERT > #define DEBUG_ONLY(code) code > #define NOT_DEBUG(code) > #define NOT_DEBUG_RETURN /*next token must be ;*/ > // Historical. > #define debug_only(code) code > #else // ASSERT > > > There are 350+ instances of `debug_only`, and 1600+ instances of `DEBUG_ONLY`. We can cleanup `debug_only` and rewrite all uses to `DEBUG_ONLY`. > > I think we can do this in one shot. This was a mechanical rename. I have looked through the `grep -R debug_only src/ | nl`, and there are no hits anymore. > > Additional testing: > - [ ] GHA Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24879#pullrequestreview-2794815649 From kbarrett at openjdk.org Fri Apr 25 18:06:14 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 25 Apr 2025 18:06:14 GMT Subject: RFR: 8355617: Remove historical debug_only macro in favor of DEBUG_ONLY In-Reply-To: References: Message-ID: On Fri, 25 Apr 2025 15:17:17 GMT, Aleksey Shipilev wrote: > I noticed in `macros.hpp`: > > > #ifdef ASSERT > #define DEBUG_ONLY(code) code > #define NOT_DEBUG(code) > #define NOT_DEBUG_RETURN /*next token must be ;*/ > // Historical. 
> #define debug_only(code) code > #else // ASSERT > > > There are 350+ instances of `debug_only`, and 1600+ instances of `DEBUG_ONLY`. We can cleanup `debug_only` and rewrite all uses to `DEBUG_ONLY`. > > I think we can do this in one shot. This was a mechanical rename. I have looked through the `grep -R debug_only src/ | nl`, and there are no hits anymore. > > Additional testing: > - [x] GHA Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24879#pullrequestreview-2794899494 From amenkov at openjdk.org Fri Apr 25 18:15:59 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 25 Apr 2025 18:15:59 GMT Subject: RFR: 8355490: Make VM_RedefineClasses::merge_constant_pools only take reference arguments [v2] In-Reply-To: References: Message-ID: <2eEevGRdwBQMpaU3SF4yPLzXmEVCeENCb15IeiyvWCQ=.38bfeeea-4de6-4d62-9646-c59efbb9080b@github.com> On Fri, 25 Apr 2025 09:39:49 GMT, Johan Sj?len wrote: >> Hi, >> >> I've changed two pointer arguments to be references instead, as we bail if they are null and `assert(false)` on top of that. There are no other calls to this function than the one I fixed. >> >> Testing: GHA Tier1 > > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > Copyright Marked as reviewed by amenkov (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24848#pullrequestreview-2794918781 From iveresov at openjdk.org Fri Apr 25 20:32:28 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Fri, 25 Apr 2025 20:32:28 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling Message-ID: Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. 
More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 ------------- Commit messages: - Merge branch 'master' into pp - Add AOTCompileEagerly flag to control compilation after clinit - Port 8355334: [leyden] Missing type profile info in archived training data - Port 8355296: [leyden] Some methods are stuck at level=0 with -XX:-TieredCompilation - Use ENABLE_IF macro - Missing part of the last commit - Fix value of CompLevel_count - Add test - Don't compile trainingData.cpp without CDS (part 2) - Don't compile trainingData.cpp without CDS (part 1) - ... and 18 more: https://git.openjdk.org/jdk/compare/5c067232...3ec132e7 Changes: https://git.openjdk.org/jdk/pull/24886/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8355003 Stats: 3202 lines in 57 files changed: 2976 ins; 103 del; 123 mod Patch: https://git.openjdk.org/jdk/pull/24886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24886/head:pull/24886 PR: https://git.openjdk.org/jdk/pull/24886 From sspitsyn at openjdk.org Fri Apr 25 20:52:48 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 25 Apr 2025 20:52:48 GMT Subject: RFR: 8355490: Make VM_RedefineClasses::merge_constant_pools only take reference arguments [v2] In-Reply-To: References: Message-ID: On Fri, 25 Apr 2025 09:39:49 GMT, Johan Sj?len wrote: >> Hi, >> >> I've changed two pointer arguments to be references instead, as we bail if they are null and `assert(false)` on top of that. There are no other calls to this function than the one I fixed. >> >> Testing: GHA Tier1 > > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > Copyright Marked as reviewed by sspitsyn (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/24848#pullrequestreview-2795253529 From ayang at openjdk.org Sat Apr 26 11:33:51 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Sat, 26 Apr 2025 11:33:51 GMT Subject: RFR: 8241678: Remove PerfData sampling via StatSampler In-Reply-To: References: Message-ID: <30koYWQ8Z8s6wId_9EmUxpAmyeBVeOYEP-u-nKzm_OQ=.10a36bef-4f65-4b53-a526-a0dc60b18de9@github.com> On Fri, 25 Apr 2025 10:38:39 GMT, Casper Norrbin wrote: > Hi everyone, > > This change removes the legacy `PerfData` sampling mechanism implemented through the `StatSampler`, an always-on periodic task that runs every 50ms by default. The sampling feature was originally introduced to collect performance counters and timestamps, but has since seen very little use. > > For G1/ZGC, the only sampled value is a timestamp (`sun.os.hrt.ticks`). For Serial/Parallel, it also samples some heap space counters, but these are already updated after each GC cycle, making the sampling redundant. With sampling removed, the `PerfDataSamplingInterval` flag becomes obsolete, as it no longer serves any purpose. > > The only thing relying on the sampled timestamps is `jstat`: running `jstat -t` prints an extra column with the time since VM start. To preserve this functionality, we can calculate the timestamps as an offset from the already existing `sun.rt.createVmBeginTime` instead. There are still a few matches for "hrt.ticks"; don't know if they should be removed (in this PR or a followup). src/hotspot/share/runtime/arguments.cpp line 538: > 536: { "ZGenerational", JDK_Version::jdk(23), JDK_Version::jdk(24), JDK_Version::undefined() }, > 537: { "ZMarkStackSpaceLimit", JDK_Version::undefined(), JDK_Version::jdk(25), JDK_Version::undefined() }, > 538: { "PerfDataSamplingInterval", JDK_Version::undefined(), JDK_Version::jdk(25), JDK_Version::undefined() }, Since the CSR says it will be removed in 26, the final arg should reflect that, IMO.
src/hotspot/share/runtime/perfData.cpp line 419: > 417: */ > 418: void PerfDataManager::assert_system_property(const char* name, const char* value, TRAPS) { > 419: #ifdef ASSERT The indentation seems odd. (I recall `#ifdef` itself should not be indented.) src/hotspot/share/runtime/perfData.cpp line 455: > 453: assert(value != nullptr, "property name should be have a value: %s", name); > 454: assert_system_property(name, value, CHECK); > 455: if (value != nullptr) { Why check for null again? Didn't we just assert that 2 lines above? src/hotspot/share/runtime/threads.cpp line 852: > 850: #endif // INCLUDE_MANAGEMENT > 851: > 852: PerfDataManager::create_misc_perfdata(); Should this be guarded by `UsePerfData`? ------------- PR Review: https://git.openjdk.org/jdk/pull/24872#pullrequestreview-2795935371 PR Review Comment: https://git.openjdk.org/jdk/pull/24872#discussion_r2061262368 PR Review Comment: https://git.openjdk.org/jdk/pull/24872#discussion_r2061263070 PR Review Comment: https://git.openjdk.org/jdk/pull/24872#discussion_r2061263372 PR Review Comment: https://git.openjdk.org/jdk/pull/24872#discussion_r2061264533 From kvn at openjdk.org Sat Apr 26 22:49:52 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 26 Apr 2025 22:49:52 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: References: Message-ID: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> On Fri, 25 Apr 2025 20:18:41 GMT, Igor Veresov wrote: > Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. > > More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 In general it is okay. Please, use UL with `(aot, training)` tags for your code.
I noticed (and added a comment) that you use `guarantee()`, which crashes the VM when you call the new `verify` methods. Can you disable TD instead and continue execution? src/hotspot/share/cds/archiveBuilder.cpp line 770: > 768: relocate_embedded_pointers(&_rw_src_objs); > 769: relocate_embedded_pointers(&_ro_src_objs); > 770: log_info(cds)("Relocating %zu pointers, %zu tagged, %zu nulled", `log_info(aot)` if it is Leyden related. src/hotspot/share/cds/dumpAllocStats.hpp line 151: > 149: } > 150: > 151: void record_dynamic_proxy_class() { This is not called. This code seems unrelated. src/hotspot/share/ci/ciInstanceKlass.hpp line 47: > 45: friend class ciField; > 46: friend class ciReplay; > 47: friend class CompileTrainingData; Not referenced here src/hotspot/share/ci/ciMethod.cpp line 1147: > 1145: // heuristic (e.g. post call nop instructions; see InlineSkippedInstructionsCounter) > 1146: int ciMethod::inline_instructions_size() { > 1147: if (_inline_instructions_size == -1) { Why repeat this condition and not put the new code under the existing one? src/hotspot/share/ci/ciMethodData.cpp line 71: > 69: > 70: bool is_live(Method* m) { > 71: Klass* holder = m->method_holder(); Changes in this file seem unrelated and can be pushed/tested separately. If they are related, there should be a condition for the additional checks. src/hotspot/share/ci/ciObjectFactory.hpp line 2: > 1: /* > 2: * Copyright (c) 1999, 2020, Oracle and/or its affiliates. All rights reserved. Wrong year src/hotspot/share/compiler/compileBroker.hpp line 456: > 454: }; > 455: > 456: class TrainingReplayThread : public JavaThread { Add a comment explaining what this thread does exactly. src/hotspot/share/memory/allocation.cpp line 89: > 87: } > 88: > 89: // Work-around -- see JDK-8331086 We should not use a bug ID in a comment (here and in the .hpp file). Please explain in the comment why you do that.
src/hotspot/share/oops/methodCounters.cpp line 57:

> 55: 
> 56: MethodCounters::MethodCounters() {
> 57: assert(CDSConfig::is_dumping_static_archive() || UseSharedSpaces, "only for CDS");

The same comment as for `MethodData()`.

src/hotspot/share/oops/methodCounters.cpp line 82:

> 80: 
> 81: void MethodCounters::metaspace_pointers_do(MetaspaceClosure* it) {
> 82: log_trace(cds)("Iter(MethodCounters): %p", this);

Use `log_trace(aot, training)`

src/hotspot/share/oops/methodCounters.hpp line 129:

> 127: void set_highest_osr_comp_level(int level) { _highest_osr_comp_level = (u1)level; }
> 128: 
> 129: 

Extra empty line

src/hotspot/share/oops/methodData.cpp line 1296:

> 1294: 
> 1295: MethodData::MethodData() {
> 1296: assert(CDSConfig::is_dumping_static_archive() || UseSharedSpaces, "only for CDS");

1. Should its code be guarded by `#if INCLUDE_CDS`?
2. Comment where/how it is used.
3. Is it used in all phases or only during TRAINING and ASSEMBLY?
4. Can you add query methods into `CDSConfig` which you can call here and in other places?:

    is_dumping_training_data()
    is_using_training_data()

src/hotspot/share/oops/methodData.cpp line 1434:

> 1432: 
> 1433: bool MethodData::is_mature() const {
> 1434: return CompilationPolicy::is_mature((MethodData*)this);

Why do you need the cast?

src/hotspot/share/oops/methodData.cpp line 1796:

> 1794: 
> 1795: void MethodData::metaspace_pointers_do(MetaspaceClosure* it) {
> 1796: log_trace(cds)("Iter(MethodData): %p for %p %s", this, _method, _method->name_and_sig_as_C_string());

We discussed yesterday and we need to use `aot` instead of `cds` for Leyden. `aot` tag is already in mainline. I suggest to use new tag `(aot, training)` to separate your output from general CDS/AOT output.

src/hotspot/share/oops/trainingData.cpp line 2:

> 1: /*
> 2: * Copyright (c) 2023, 2025, Oracle and/or its affiliates. All rights reserved.

Use one year

src/hotspot/share/oops/trainingData.cpp line 54:

> 52: 
> 53: MethodTrainingData::MethodTrainingData() {
> 54: assert(CDSConfig::is_dumping_static_archive() || UseSharedSpaces, "only for CDS");

Consider adding and using `CDSConfig::is_dumping_training_data()` or something.

src/hotspot/share/oops/trainingData.cpp line 76:

> 74: 
> 75: static void verify_archived_entry(TrainingData* td, const TrainingData::Key* k) {
> 76: guarantee(TrainingData::Key::can_compute_cds_hash(k), "");

Should we gracefully disable using TD instead of crashing VM?

src/hotspot/share/oops/trainingData.cpp line 545:

> 543: if (is_excluded) {
> 544: ResourceMark rm;
> 545: log_debug(cds)("Cleanup KTD %s", name()->as_klass_external_name());

`log_debug(aot, training)`

src/hotspot/share/oops/trainingData.hpp line 2:

> 1: /*
> 2: * Copyright (c) 2023, 2025, Oracle and/or its affiliates. All rights reserved.

This is debatable, but I suggest using only one year (2025) for new files.

src/hotspot/share/oops/trainingData.hpp line 756:

> 754: int highest_level() const { return highest_level(_level_mask); }
> 755: int highest_top_level() const { return _highest_top_level; }
> 756: MethodData* final_profile() const { return _final_profile; }

`never_inlined()`, `highest_level()` are not used.

src/hotspot/share/runtime/init.cpp line 189:

> 187: #endif
> 188: 
> 189: if (TrainingData::have_data() || TrainingData::need_data()) {

Why 2 checks? Comment please.

test/hotspot/jtreg/runtime/cds/appcds/aotProfile/AOTProfileFlags.java line 30:

> 28: * @requires vm.cds
> 29: * @comment work around JDK-8345635
> 30: * @requires !vm.jvmci.enabled

Consider adding:

    * @requires vm.cds.supports.aot.class.linking
    * @requires vm.flagless

-------------

Changes requested by kvn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24886#pullrequestreview-2796434540
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061680543
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061639984
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061668601
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061637565
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061635488
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061668094
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061631348
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061629193
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061626756
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061626460
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061626301
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061626021
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061625757
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061625222
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061687468
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061690598
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061694717
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061697540
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061627798
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061664186
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061612509
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061611330

From iveresov at openjdk.org Sat Apr 26 23:38:46 2025
From: iveresov at openjdk.org (Igor Veresov)
Date: Sat, 26 Apr 2025 23:38:46 GMT
Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> Message-ID: <__a0S14OZpuwaQX6jFAlJhZJy2fZ9flb_1LW9gUD_sI=.b2b7851a-0aed-421b-9755-89fc63e404da@github.com> On Sat, 26 Apr 2025 21:07:04 GMT, Vladimir Kozlov wrote: >> Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. >> >> More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 > > src/hotspot/share/runtime/init.cpp line 189: > >> 187: #endif >> 188: >> 189: if (TrainingData::have_data() || TrainingData::need_data()) { > > Why 2 checks? Comment please. Because it's only needed if we're recording or replaying. I'll add a comment. > test/hotspot/jtreg/runtime/cds/appcds/aotProfile/AOTProfileFlags.java line 30: > >> 28: * @requires vm.cds >> 29: * @comment work around JDK-8345635 >> 30: * @requires !vm.jvmci.enabled > > Consider adding: > > * @requires vm.cds.supports.aot.class.linking > * @requires vm.flagless Could you please explain why? @iklam, what do you think? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061836271 PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061832468 From iveresov at openjdk.org Sat Apr 26 23:46:47 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Sat, 26 Apr 2025 23:46:47 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> Message-ID: On Sat, 26 Apr 2025 21:28:12 GMT, Vladimir Kozlov wrote: >> Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. >> >> More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 > > src/hotspot/share/oops/methodData.cpp line 1434: > >> 1432: >> 1433: bool MethodData::is_mature() const { >> 1434: return CompilationPolicy::is_mature((MethodData*)this); > > Why you need the cast? To remove constness. I'll make it a const_cast. 
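
For readers following the cast discussion, here is a minimal sketch of what replacing the C-style cast with `const_cast` could look like. It does not use the real HotSpot classes: `Profile` and `Policy` are hypothetical stand-ins for `MethodData` and `CompilationPolicy`, and the maturity threshold is made up for illustration.

```cpp
#include <cassert>

// Hypothetical stand-ins for MethodData and CompilationPolicy; not HotSpot code.
struct Profile;

struct Policy {
  // Takes a non-const pointer even though it only reads the object,
  // which is the situation that forces a cast in a const caller.
  static bool is_mature(Profile* p);
};

struct Profile {
  int invocations = 0;

  bool is_mature() const {
    // Inside a const member function, `this` has type `const Profile*`.
    // const_cast strips only the const qualifier; unlike a C-style cast,
    // it will not compile if the target type differs in any other way.
    return Policy::is_mature(const_cast<Profile*>(this));
  }
};

bool Policy::is_mature(Profile* p) {
  return p->invocations >= 100;  // arbitrary threshold for this sketch
}
```

The benefit over `(Profile*)this` is that the intent (dropping constness, nothing else) is visible and checked by the compiler. Whether the callee could instead accept a `const` pointer is a separate question that this thread does not settle.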
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061843635 From iveresov at openjdk.org Sat Apr 26 23:55:49 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Sat, 26 Apr 2025 23:55:49 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> Message-ID: On Sat, 26 Apr 2025 22:09:24 GMT, Vladimir Kozlov wrote: >> Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. >> >> More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 > > src/hotspot/share/ci/ciMethodData.cpp line 71: > >> 69: >> 70: bool is_live(Method* m) { >> 71: Klass* holder = m->method_holder(); > > Changes in this file seems not related and can be pushed/tested separately. If they are related - there should be condition for additional checks. You mean you want these checks to be done only if `TrainingData::have_data() == true` ? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061846315 From iveresov at openjdk.org Sun Apr 27 00:00:56 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Sun, 27 Apr 2025 00:00:56 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> Message-ID: On Sat, 26 Apr 2025 22:42:02 GMT, Vladimir Kozlov wrote: >> Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. >> >> More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 > > src/hotspot/share/oops/trainingData.cpp line 76: > >> 74: >> 75: static void verify_archived_entry(TrainingData* td, const TrainingData::Key* k) { >> 76: guarantee(TrainingData::Key::can_compute_cds_hash(k), ""); > > Should we gracefully disable using TD instead of crashing VM? But this is a verification code. That seems to be the usual strategy, is it not? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061847152 From iveresov at openjdk.org Sun Apr 27 00:05:46 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Sun, 27 Apr 2025 00:05:46 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> Message-ID: On Sat, 26 Apr 2025 22:15:27 GMT, Vladimir Kozlov wrote: >> Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. >> >> More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 > > src/hotspot/share/cds/dumpAllocStats.hpp line 151: > >> 149: } >> 150: >> 151: void record_dynamic_proxy_class() { > > This is not called. This code seems not related. True. @iklam, this came with a change you wanted me to take. Ok to cut this out? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061851887 From iveresov at openjdk.org Sun Apr 27 00:23:45 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Sun, 27 Apr 2025 00:23:45 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> Message-ID: On Sat, 26 Apr 2025 21:30:25 GMT, Vladimir Kozlov wrote: >> Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. 
Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. >> >> More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 > > src/hotspot/share/oops/methodData.cpp line 1296: > >> 1294: >> 1295: MethodData::MethodData() { >> 1296: assert(CDSConfig::is_dumping_static_archive() || UseSharedSpaces, "only for CDS"); > > 1. Should its code be guarded by `#if INCLUDE_CDS`? > 2. Comment where/how it is used. > 3. Is it used in all phases or only during TRAINING and ASSEMBLY? > 4. Can you add query methods into `CDSConfig` which you can call here and in other places?: > > is_dumping_training_data() > is_using_training_data() I think those are used for CDS serialization/deserialization, right, @iklam? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061856884 From iveresov at openjdk.org Sun Apr 27 00:26:45 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Sun, 27 Apr 2025 00:26:45 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> Message-ID: On Sat, 26 Apr 2025 22:32:00 GMT, Vladimir Kozlov wrote: >> Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. 
>> >> More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 > > src/hotspot/share/ci/ciInstanceKlass.hpp line 47: > >> 45: friend class ciField; >> 46: friend class ciReplay; >> 47: friend class CompileTrainingData; > > Not referenced here It allows `CompileTrainingData` to peek into the `ciInstanceKlass` internals. We need the klass ptr specially. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061864230 From iveresov at openjdk.org Sun Apr 27 00:34:45 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Sun, 27 Apr 2025 00:34:45 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> Message-ID: On Sat, 26 Apr 2025 22:11:58 GMT, Vladimir Kozlov wrote: >> Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. >> >> More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 > > src/hotspot/share/ci/ciMethod.cpp line 1147: > >> 1145: // heuristic (e.g. post call nop instructions; see InlineSkippedInstructionsCounter) >> 1146: int ciMethod::inline_instructions_size() { >> 1147: if (_inline_instructions_size == -1) { > > Why repeat this condition and not put new code under existing one? The caching logic sets the `_inline_instructions_size` if the value is found in the archive. The normal logic doesn't need to run if this happens. 
Something like:

    if (_value == -1) {
      _value = get_from_cache();
    }
    if (_value == -1) { // didn't get it from the cache
      _value = compute_value();
    }

So basically to delineate the caching logic from the normal path. Does it make sense?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061875051

From iklam at openjdk.org Sun Apr 27 00:54:47 2025
From: iklam at openjdk.org (Ioi Lam)
Date: Sun, 27 Apr 2025 00:54:47 GMT
Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling
In-Reply-To: <__a0S14OZpuwaQX6jFAlJhZJy2fZ9flb_1LW9gUD_sI=.b2b7851a-0aed-421b-9755-89fc63e404da@github.com>
References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com>
 <__a0S14OZpuwaQX6jFAlJhZJy2fZ9flb_1LW9gUD_sI=.b2b7851a-0aed-421b-9755-89fc63e404da@github.com>
Message-ID: 

On Sat, 26 Apr 2025 23:35:01 GMT, Igor Veresov wrote:

>> test/hotspot/jtreg/runtime/cds/appcds/aotProfile/AOTProfileFlags.java line 30:
>> 
>>> 28: * @requires vm.cds
>>> 29: * @comment work around JDK-8345635
>>> 30: * @requires !vm.jvmci.enabled
>> 
>> Consider adding:
>> 
>> * @requires vm.cds.supports.aot.class.linking
>> * @requires vm.flagless
> 
> Could you please explain why? @iklam, what do you think?

I think it's OK to test without these two additional flags. This will make sure that the two diagnostic flags don't have any bad side effect even if AOT class linking is disabled (due to flags like -XX:+UseZGC).
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061892882 From iklam at openjdk.org Sun Apr 27 01:06:00 2025 From: iklam at openjdk.org (Ioi Lam) Date: Sun, 27 Apr 2025 01:06:00 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> Message-ID: On Sun, 27 Apr 2025 00:02:41 GMT, Igor Veresov wrote: >> src/hotspot/share/cds/dumpAllocStats.hpp line 151: >> >>> 149: } >>> 150: >>> 151: void record_dynamic_proxy_class() { >> >> This is not called. This code seems not related. > > True. @iklam, this came with a change you wanted me to take. Ok to cut this out? Yes, this part is not needed. AOT dynamic proxy classes are only supported in the Leyden repo. >> src/hotspot/share/oops/methodData.cpp line 1296: >> >>> 1294: >>> 1295: MethodData::MethodData() { >>> 1296: assert(CDSConfig::is_dumping_static_archive() || UseSharedSpaces, "only for CDS"); >> >> 1. Should its code be guarded by `#if INCLUDE_CDS`? >> 2. Comment where/how it is used. >> 3. Is it used in all phases or only during TRAINING and ASSEMBLY? >> 4. Can you add query methods into `CDSConfig` which you can call here and in other places?: >> >> is_dumping_training_data() >> is_using_training_data() > > I think those are used for CDS serialization/deserialization, right, @iklam? This constructor is used by cppVtables.cpp to calculate the size of the vtables for MethodData, and also for finding the address of the vtable of MethodData. All types in `CPP_VTABLE_TYPES_DO` must have such an empty constructor. E.g., `InstanceKlass::InstanceKlass()`. 

We have not been very consistent with comments around these constructors, but I think we can do this:

    #if INCLUDE_CDS
    MethodData::MethodData() {
      // Used by cppVtables.cpp only
      assert(CDSConfig::is_dumping_static_archive() || UseSharedSpaces, "only for CDS");
    }
    #endif

This method is called even if we are not dumping training data. The vtables of all types in `CPP_VTABLE_TYPES_DO` are unconditionally computed.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061901400
PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061898996

From kvn at openjdk.org Sun Apr 27 01:14:46 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Sun, 27 Apr 2025 01:14:46 GMT
Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling
In-Reply-To: 
References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com>
 <__a0S14OZpuwaQX6jFAlJhZJy2fZ9flb_1LW9gUD_sI=.b2b7851a-0aed-421b-9755-89fc63e404da@github.com>
Message-ID: 

On Sun, 27 Apr 2025 00:52:37 GMT, Ioi Lam wrote:

>> Could you please explain why? @iklam, what do you think?
> 
> I think it's OK to test without these two additional flags. This will make sure that the two diagnostic flags don't have any bad side effect even if AOT class linking is disabled (due to flags like -XX:+UseZGC).

I see them in `aotClassLinking/AOTClassLinkingVMOptions.java` test. @veresov please, add link to your mach5 testing to JBS. And add to description what tests you already ran.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061907156 From kvn at openjdk.org Sun Apr 27 01:19:46 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sun, 27 Apr 2025 01:19:46 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> Message-ID: On Sun, 27 Apr 2025 00:24:31 GMT, Igor Veresov wrote: >> src/hotspot/share/ci/ciInstanceKlass.hpp line 47: >> >>> 45: friend class ciField; >>> 46: friend class ciReplay; >>> 47: friend class CompileTrainingData; >> >> Not referenced here > > It allows `CompileTrainingData` to peek into the `ciInstanceKlass` internals. We need the klass ptr specially. I missed that it is "friend" declaration. >> src/hotspot/share/ci/ciMethodData.cpp line 71: >> >>> 69: >>> 70: bool is_live(Method* m) { >>> 71: Klass* holder = m->method_holder(); >> >> Changes in this file seems not related and can be pushed/tested separately. If they are related - there should be condition for additional checks. > > You mean you want these checks to be done only if `TrainingData::have_data() == true` ? Yes, if it is related. Otherwise you may change default behavior when Leyden code is not used. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061910254 PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061908809 From kvn at openjdk.org Sun Apr 27 01:23:52 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sun, 27 Apr 2025 01:23:52 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> Message-ID: On Sat, 26 Apr 2025 23:58:04 GMT, Igor Veresov wrote: >> src/hotspot/share/oops/trainingData.cpp line 76: >> >>> 74: >>> 75: static void verify_archived_entry(TrainingData* td, const TrainingData::Key* k) { >>> 76: guarantee(TrainingData::Key::can_compute_cds_hash(k), ""); >> >> Should we gracefully disable using TD instead of crashing VM? > > But this is a verification code. That seems to be the usual strategy, is it not? I thought if we can not use AOT cache we issue warning and continue execution without it. Unless you are checking for damaged AOT cache which may affect execution without it. @iklam what do you think? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061910742 From iveresov at openjdk.org Sun Apr 27 01:39:55 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Sun, 27 Apr 2025 01:39:55 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> Message-ID: On Sun, 27 Apr 2025 01:20:45 GMT, Vladimir Kozlov wrote: >> But this is a verification code. That seems to be the usual strategy, is it not? > > I thought if we can not use AOT cache we issue warning and continue execution without it. > Unless you are checking for damaged AOT cache which may affect execution without it. > > @iklam what do you think? But this runs only with `-XX:+VerifyTrainingData`. It's a testing mode. 
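
The distinction being drawn here (abort on failure in a testing-only verification mode, versus warn and carry on in a normal run) can be sketched roughly as follows. This is an illustration only, under stated assumptions: every name below is hypothetical, `verify_training_data` merely models a flag like `-XX:+VerifyTrainingData`, and the "graceful" path shows the alternative raised in review, not what the patch does.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-ins; none of these names are HotSpot APIs.
static bool verify_training_data = false;  // models a testing-only flag
static bool training_data_usable = true;   // models "TD is still in use"

static bool key_is_hashable(int key) { return key >= 0; }

// Testing-mode check: fail loudly, in the spirit of guarantee().
void verify_archived_entry(int key) {
  if (!verify_training_data) {
    return;  // verification is skipped entirely in normal runs
  }
  if (!key_is_hashable(key)) {
    std::fprintf(stderr, "verification failed for key %d\n", key);
    std::abort();
  }
}

// Normal-mode alternative: warn, stop using the data, keep running.
bool use_archived_entry(int key) {
  if (!key_is_hashable(key)) {
    std::fprintf(stderr, "warning: ignoring training data\n");
    training_data_usable = false;
    return false;
  }
  return true;
}
```

With the flag off, `verify_archived_entry` is a no-op, so a hard failure there can only ever show up in test configurations that opt in.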
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061918596 From iveresov at openjdk.org Sun Apr 27 01:51:46 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Sun, 27 Apr 2025 01:51:46 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> Message-ID: <2ks6GsBxC4Ig0uyg6ufSmzxbU4pvbYcnRWF2IuB0-ac=.954c69b1-e8f3-4e69-abd3-0147ba798471@github.com> On Sat, 26 Apr 2025 21:30:25 GMT, Vladimir Kozlov wrote: >> Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. >> >> More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 > > src/hotspot/share/oops/methodData.cpp line 1296: > >> 1294: >> 1295: MethodData::MethodData() { >> 1296: assert(CDSConfig::is_dumping_static_archive() || UseSharedSpaces, "only for CDS"); > > 1. Should its code be guarded by `#if INCLUDE_CDS`? > 2. Comment where/how it is used. > 3. Is it used in all phases or only during TRAINING and ASSEMBLY? > 4. Can you add query methods into `CDSConfig` which you can call here and in other places?: > > is_dumping_training_data() > is_using_training_data() @vnkozlov Are you ok with Ioi's code change proposal? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061936106 From kvn at openjdk.org Sun Apr 27 01:59:45 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sun, 27 Apr 2025 01:59:45 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> Message-ID: On Sun, 27 Apr 2025 01:37:33 GMT, Igor Veresov wrote: >> I thought if we can not use AOT cache we issue warning and continue execution without it. >> Unless you are checking for damaged AOT cache which may affect execution without it. >> >> @iklam what do you think? > > But this runs only with `-XX:+VerifyTrainingData`. It's a testing mode. I missed that. Okay then. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061939418 From kvn at openjdk.org Sun Apr 27 02:03:46 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sun, 27 Apr 2025 02:03:46 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: <2ks6GsBxC4Ig0uyg6ufSmzxbU4pvbYcnRWF2IuB0-ac=.954c69b1-e8f3-4e69-abd3-0147ba798471@github.com> References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> <2ks6GsBxC4Ig0uyg6ufSmzxbU4pvbYcnRWF2IuB0-ac=.954c69b1-e8f3-4e69-abd3-0147ba798471@github.com> Message-ID: <_unPFSSYWqe0ldilT96MjUkRFTkiPsYhWkLvWNPj0Ek=.1745d82b-22e3-4fbe-9b46-6d0b3aa177f0@github.com> On Sun, 27 Apr 2025 01:49:20 GMT, Igor Veresov wrote: >> src/hotspot/share/oops/methodData.cpp line 1296: >> >>> 1294: >>> 1295: MethodData::MethodData() { >>> 1296: assert(CDSConfig::is_dumping_static_archive() || UseSharedSpaces, "only for CDS"); >> >> 1. Should its code be guarded by `#if INCLUDE_CDS`? >> 2. Comment where/how it is used. >> 3. Is it used in all phases or only during TRAINING and ASSEMBLY? >> 4. 
Can you add query methods into `CDSConfig` which you can call here and in other places?: >> >> is_dumping_training_data() >> is_using_training_data() > > @vnkozlov Are you ok with Ioi's code change proposal? Yes, I am ok. I assume you will change accordingly all other similar empty constructors in your code: `MethodCounters()` and `*TrainingData()`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061941118 From iklam at openjdk.org Sun Apr 27 02:22:56 2025 From: iklam at openjdk.org (Ioi Lam) Date: Sun, 27 Apr 2025 02:22:56 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> <__a0S14OZpuwaQX6jFAlJhZJy2fZ9flb_1LW9gUD_sI=.b2b7851a-0aed-421b-9755-89fc63e404da@github.com> Message-ID: On Sun, 27 Apr 2025 01:11:39 GMT, Vladimir Kozlov wrote: >> I think it's OK to test without these two additional flags. This will make sure that the two diagnostic flags don't have any bad side effect even if AOT class linking is disabled (due to flags like -XX:+UseZGC). > > I see them in `aotClassLinking/AOTClassLinkingVMOptions.java` test. > @veresov please, add link to your mach5 testing to JBS. And add to description what tests you already ran. OK, let's add those two `@requires` to be consistent with other tests. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2061952289 From iveresov at openjdk.org Sun Apr 27 05:20:46 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Sun, 27 Apr 2025 05:20:46 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling In-Reply-To: <_unPFSSYWqe0ldilT96MjUkRFTkiPsYhWkLvWNPj0Ek=.1745d82b-22e3-4fbe-9b46-6d0b3aa177f0@github.com> References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> <2ks6GsBxC4Ig0uyg6ufSmzxbU4pvbYcnRWF2IuB0-ac=.954c69b1-e8f3-4e69-abd3-0147ba798471@github.com> <_unPFSSYWqe0ldilT96MjUkRFTkiPsYhWkLvWNPj0Ek=.1745d82b-22e3-4fbe-9b46-6d0b3aa177f0@github.com> Message-ID: On Sun, 27 Apr 2025 02:00:48 GMT, Vladimir Kozlov wrote: >> @vnkozlov Are you ok with Ioi's code change proposal? > > Yes, I am ok. I assume you will change accordingly all other similar empty constructors in your code: `MethodCounters()` and `*TrainingData()`. Yes, will do. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2062292557 From jwaters at openjdk.org Sun Apr 27 15:06:46 2025 From: jwaters at openjdk.org (Julian Waters) Date: Sun, 27 Apr 2025 15:06:46 GMT Subject: RFR: 8355617: Remove historical debug_only macro in favor of DEBUG_ONLY In-Reply-To: References: Message-ID: On Fri, 25 Apr 2025 15:17:17 GMT, Aleksey Shipilev wrote: > I noticed in `macros.hpp`: > > > #ifdef ASSERT > #define DEBUG_ONLY(code) code > #define NOT_DEBUG(code) > #define NOT_DEBUG_RETURN /*next token must be ;*/ > // Historical. > #define debug_only(code) code > #else // ASSERT > > > There are 350+ instances of `debug_only`, and 1600+ instances of `DEBUG_ONLY`. We can cleanup `debug_only` and rewrite all uses to `DEBUG_ONLY`. > > I think we can do this in one shot. This was a mechanical rename. I have looked through the `grep -R debug_only src/ | nl`, and there are no hits anymore. 
> > Additional testing: > - [x] GHA Looks good ------------- Marked as reviewed by jwaters (Committer). PR Review: https://git.openjdk.org/jdk/pull/24879#pullrequestreview-2797664795 From kevinw at openjdk.org Sun Apr 27 20:59:45 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Sun, 27 Apr 2025 20:59:45 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 [v3] In-Reply-To: <2eKkbh51jTro6ENnv5g06ZxUkdYp-kLO2b64x2TWAuI=.a6c737c6-70c1-4bfd-b2ab-6762949e55f7@github.com> References: <2eKkbh51jTro6ENnv5g06ZxUkdYp-kLO2b64x2TWAuI=.a6c737c6-70c1-4bfd-b2ab-6762949e55f7@github.com> Message-ID: <_vXQwkdfHuFT1wfI2Z8GnR2fazaLof-gMPcMEOKo7aI=.9451dd92-1893-4f08-aa45-3924951c8bba@github.com> On Fri, 25 Apr 2025 16:27:13 GMT, Chris Plummer wrote: >> Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: >> >> REMOVE_TIMEOUT_SECONDS > > test/jdk/sun/jvmstat/monitor/MonitoredVm/MonitorVmStartTerminate.java line 78: > >> 76: private static final int PROCESS_COUNT = 10; >> 77: private static final int ARGS_ATTEMPTS = 5; >> 78: private static final int REMOVE_TIMEOUT_SECONDS = 300; > > I'm still not convinced we need a large timeout here. The default timeout is 120 seconds, so jtreg will time out the test before waitForRemoval() times out itself. Yes -- Maybe just take out this extra timeout. I'm being too respectful of things that have been in the test forever, but aren't of much use. Some tests have their own timeouts, when jtreg can just time them out. Checking again, the timeouts I've seen from this tests are from the jtreg harness, not the loop that had a 1000 second timeout, or even when I change it, it's still jtreg giving the timeout. Should remove it. ARGS_ATTEMPTS could be even longer. If it completely fails to read a process arguments, we are at risk of failing later due to timeout. 
We may as well try harder, this is the part that might fail and need retrying when the processes are slow to start. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24843#discussion_r2062720323 From kevinw at openjdk.org Sun Apr 27 21:05:34 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Sun, 27 Apr 2025 21:05:34 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 [v4] In-Reply-To: References: Message-ID: > This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. > > While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. > This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. 
Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: remove file removal timeout, rely on jtreg timeout ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24843/files - new: https://git.openjdk.org/jdk/pull/24843/files/9423fc01..a1074954 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24843&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24843&range=02-03 Stats: 10 lines in 1 file changed: 0 ins; 9 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24843.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24843/head:pull/24843 PR: https://git.openjdk.org/jdk/pull/24843 From iveresov at openjdk.org Mon Apr 28 04:22:32 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Mon, 28 Apr 2025 04:22:32 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v2] In-Reply-To: References: Message-ID: > Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. 
> > More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 Igor Veresov has updated the pull request incrementally with two additional commits since the last revision: - Remove the workaround of setting AOTRecordTraining during assembly - Address some of the review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24886/files - new: https://git.openjdk.org/jdk/pull/24886/files/3ec132e7..fea6171e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=00-01 Stats: 96 lines in 9 files changed: 30 ins; 49 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/24886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24886/head:pull/24886 PR: https://git.openjdk.org/jdk/pull/24886 From cjplummer at openjdk.org Mon Apr 28 04:31:46 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 28 Apr 2025 04:31:46 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 [v3] In-Reply-To: <_vXQwkdfHuFT1wfI2Z8GnR2fazaLof-gMPcMEOKo7aI=.9451dd92-1893-4f08-aa45-3924951c8bba@github.com> References: <2eKkbh51jTro6ENnv5g06ZxUkdYp-kLO2b64x2TWAuI=.a6c737c6-70c1-4bfd-b2ab-6762949e55f7@github.com> <_vXQwkdfHuFT1wfI2Z8GnR2fazaLof-gMPcMEOKo7aI=.9451dd92-1893-4f08-aa45-3924951c8bba@github.com> Message-ID: On Sun, 27 Apr 2025 20:56:59 GMT, Kevin Walls wrote: >> test/jdk/sun/jvmstat/monitor/MonitoredVm/MonitorVmStartTerminate.java line 78: >> >>> 76: private static final int PROCESS_COUNT = 10; >>> 77: private static final int ARGS_ATTEMPTS = 5; >>> 78: private static final int REMOVE_TIMEOUT_SECONDS = 300; >> >> I'm still not convinced we need a large timeout here. The default timeout is 120 seconds, so jtreg will time out the test before waitForRemoval() times out itself. > > Yes -- Maybe just take out this extra timeout. I'm being too respectful of things that have been in the test forever, but aren't of much use. 
Some tests have their own timeouts, when jtreg can just time them out. > > Checking again, the timeouts I've seen from this tests are from the jtreg harness, not the loop that had a 1000 second timeout, or even when I change it, it's still jtreg giving the timeout. Should remove it. > > ARGS_ATTEMPTS could be even longer. If it completely fails to read a process arguments, we are at risk of failing later due to timeout. We may as well try harder, this is the part that might fail and need retrying when the processes are slow to start. Yes, I think the issue is that hasMainArgs() does not succeed after ARGS_ATTEMPTS (3), which probably means 30 seconds given the 10 second attach timeout. Once this happens, waitForRemoval() will timeout no matter how long you wait. Setting ARGS_ATTEMPTS to 5 might be enough to alleviate this problem, although you might want to go with something more like 10 just to make sure we can eliminate this as the cause if it comes up again. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24843#discussion_r2062869665 From rvansa at openjdk.org Mon Apr 28 07:44:04 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 28 Apr 2025 07:44:04 GMT Subject: RFR: 8352075: Perf regression accessing fields [v3] In-Reply-To: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: > This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . 
> > This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). > > In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. > > My measurements on the attached reproducer > > hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC > Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] > Range (min … max): 45.1 ms … 53.9 ms 100 runs > > hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC > Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] > Range (min … max): 73.8 ms … 79.7 ms 100 runs > > (the jdk25-master above already contains JDK-8353175) > > hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' > Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC > Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] > Range (min … max): 37.7 ms … 42.1 ms 100 runs > > While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)!
This time, the undisclosed production-grade reproducer shows even higher improvement: > > JDK 17: 1.6 s > JDK 21 (no patches): 22 s > JDK25-master: 12.3 s > JDK25-this-pr: 0.5 s Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: - Fix VerifyRawIndexesTest - Fix reordering in layout and annotations - Use qsort_r for different platforms ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24847/files - new: https://git.openjdk.org/jdk/pull/24847/files/fe798710..ef69ec06 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=01-02 Stats: 89 lines in 7 files changed: 67 ins; 14 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/24847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847 PR: https://git.openjdk.org/jdk/pull/24847 From rvansa at openjdk.org Mon Apr 28 07:44:04 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Mon, 28 Apr 2025 07:44:04 GMT Subject: RFR: 8352075: Perf regression accessing fields [v3] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Mon, 28 Apr 2025 07:41:23 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. 
On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)!
This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: > > - Fix VerifyRawIndexesTest > - Fix reordering in layout and annotations > - Use qsort_r for different platforms src/hotspot/share/oops/fieldInfo.hpp line 290: > 288: static int compare_symbols(const Symbol *s1, const Symbol *s2); > 289: > 290: static Array* create_FieldInfoStream(ConstantPool* constants, GrowableArray* fields, int java_fields, int injected_fields, In the latest form the ConstantPool parameter is used only for assertion, though I think that it is rather important assertion. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2060206085 From shade at openjdk.org Mon Apr 28 08:46:10 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 28 Apr 2025 08:46:10 GMT Subject: RFR: 8355617: Remove historical debug_only macro in favor of DEBUG_ONLY In-Reply-To: References: Message-ID: <-qi-IEW4FBKxla0_NspwEjOUc5F-nxGbVMPhFSA0IyQ=.01d6456f-0c73-437a-a860-5b9dbca2c1d5@github.com> On Fri, 25 Apr 2025 15:17:17 GMT, Aleksey Shipilev wrote: > I noticed in `macros.hpp`: > > > #ifdef ASSERT > #define DEBUG_ONLY(code) code > #define NOT_DEBUG(code) > #define NOT_DEBUG_RETURN /*next token must be ;*/ > // Historical. > #define debug_only(code) code > #else // ASSERT > > > There are 350+ instances of `debug_only`, and 1600+ instances of `DEBUG_ONLY`. We can cleanup `debug_only` and rewrite all uses to `DEBUG_ONLY`. > > I think we can do this in one shot. This was a mechanical rename. I have looked through the `grep -R debug_only src/ | nl`, and there are no hits anymore. > > Additional testing: > - [x] GHA Thanks for reviews! I re-merged with current master locally. There were no conflicts, and no backsliding of `debug_only` either. 
Light testing (`hotspot:tier1`) still passes. So I think we are good to go. Here goes: ------------- PR Comment: https://git.openjdk.org/jdk/pull/24879#issuecomment-2834467628 From shade at openjdk.org Mon Apr 28 08:46:11 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 28 Apr 2025 08:46:11 GMT Subject: Integrated: 8355617: Remove historical debug_only macro in favor of DEBUG_ONLY In-Reply-To: References: Message-ID: On Fri, 25 Apr 2025 15:17:17 GMT, Aleksey Shipilev wrote: > I noticed in `macros.hpp`: > > > #ifdef ASSERT > #define DEBUG_ONLY(code) code > #define NOT_DEBUG(code) > #define NOT_DEBUG_RETURN /*next token must be ;*/ > // Historical. > #define debug_only(code) code > #else // ASSERT > > > There are 350+ instances of `debug_only`, and 1600+ instances of `DEBUG_ONLY`. We can cleanup `debug_only` and rewrite all uses to `DEBUG_ONLY`. > > I think we can do this in one shot. This was a mechanical rename. I have looked through the `grep -R debug_only src/ | nl`, and there are no hits anymore. > > Additional testing: > - [x] GHA This pull request has now been integrated. Changeset: db6fa592 Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/db6fa5923cd0394dfb44c7e46c3e7ccc102a933a Stats: 371 lines in 119 files changed: 0 ins; 3 del; 368 mod 8355617: Remove historical debug_only macro in favor of DEBUG_ONLY Reviewed-by: stefank, kbarrett, jwaters ------------- PR: https://git.openjdk.org/jdk/pull/24879 From iveresov at openjdk.org Mon Apr 28 08:48:14 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Mon, 28 Apr 2025 08:48:14 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v3] In-Reply-To: References: Message-ID: > Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. 
Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. > > More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: Fix class filtering ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24886/files - new: https://git.openjdk.org/jdk/pull/24886/files/fea6171e..2a490c66 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=01-02 Stats: 7 lines in 2 files changed: 5 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24886/head:pull/24886 PR: https://git.openjdk.org/jdk/pull/24886 From kevinw at openjdk.org Mon Apr 28 08:50:47 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Mon, 28 Apr 2025 08:50:47 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 [v3] In-Reply-To: References: <2eKkbh51jTro6ENnv5g06ZxUkdYp-kLO2b64x2TWAuI=.a6c737c6-70c1-4bfd-b2ab-6762949e55f7@github.com> <_vXQwkdfHuFT1wfI2Z8GnR2fazaLof-gMPcMEOKo7aI=.9451dd92-1893-4f08-aa45-3924951c8bba@github.com> Message-ID: On Mon, 28 Apr 2025 04:27:03 GMT, Chris Plummer wrote: >> Yes -- Maybe just take out this extra timeout. I'm being too respectful of things that have been in the test forever, but aren't of much use. Some tests have their own timeouts, when jtreg can just time them out. >> >> Checking again, the timeouts I've seen from this tests are from the jtreg harness, not the loop that had a 1000 second timeout, or even when I change it, it's still jtreg giving the timeout. Should remove it. >> >> ARGS_ATTEMPTS could be even longer. If it completely fails to read a process arguments, we are at risk of failing later due to timeout. 
We may as well try harder, this is the part that might fail and need retrying when the processes are slow to start. > > Yes, I think the issue is that hasMainArgs() does not succeed after ARGS_ATTEMPTS (3), which probably means 30 seconds given the 10 second attach timeout. Once this happens, waitForRemoval() will timeout no matter how long you wait. Setting ARGS_ATTEMPTS to 5 might be enough to alleviate this problem, although you might want to go with something more like 10 just to make sure we can eliminate this as the cause if it comes up again. Great, that's exactly what I did with this update 8-) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24843#discussion_r2063205340 From iveresov at openjdk.org Mon Apr 28 09:15:28 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Mon, 28 Apr 2025 09:15:28 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v4] In-Reply-To: References: Message-ID: > Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. > > More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 Igor Veresov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 32 commits: - Merge branch 'master' into pp2 - Fix class filtering - Remove the workaround of setting AOTRecordTraining during assembly - Address some of the review comments - Merge branch 'master' into pp - Add AOTCompileEagerly flag to control compilation after clinit - Port 8355334: [leyden] Missing type profile info in archived training data - Port 8355296: [leyden] Some methods are stuck at level=0 with -XX:-TieredCompilation - Use ENABLE_IF macro - Missing part of the last commit - ... 
and 22 more: https://git.openjdk.org/jdk/compare/2447b981...7fb7ae62 ------------- Changes: https://git.openjdk.org/jdk/pull/24886/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=03 Stats: 3185 lines in 57 files changed: 2960 ins; 103 del; 122 mod Patch: https://git.openjdk.org/jdk/pull/24886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24886/head:pull/24886 PR: https://git.openjdk.org/jdk/pull/24886 From jsikstro at openjdk.org Mon Apr 28 11:12:56 2025 From: jsikstro at openjdk.org (Joel Sikström) Date: Mon, 28 Apr 2025 11:12:56 GMT Subject: RFR: 8355692: Refactor stream indentation Message-ID: Hello, > To make it easier to review this PR, each commit that is prefixed with "FIX:" has the command/method I've used to verify that the output of prints is not changed/broken. The goal of this PR is to refactor the usage of streamIndentor and StreamAutoIndentor and combine them into a single StreamIndentor class to make it easier to apply indentation. [JDK-8354362](https://bugs.openjdk.org/browse/JDK-8354362) introduced a partial solution for the refactor I propose, by being able to add indentation with a StreamAutoIndentor. To use indentation, you currently need to do two separate things: 1) increment indentation and 2) apply indentation when printing. The indentation level can be incremented by using one of: streamIndentor, inc() or inc(int n), and automatic indentation can be enabled by using one of: StreamAutoIndentor, indent(), cr_indent(). I propose to make it easier to use and apply indentation by combining streamIndentor and StreamAutoIndentor into a new class called StreamIndentor. StreamIndentor has the functionality of both streamIndentor and StreamAutoIndentor, that is, enabling automatic indentation and also applying an optional amount of indentation. This means you only need to do one thing to make sure that indentation is incremented and applied, making it easier to use indentation.
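For illustration only, the combined RAII idea described above could be sketched roughly as follows. This is not the code from the PR: `MiniStream` is a hypothetical stand-in for `outputStream`, `demo()` is a made-up driver, and the member names and the details of `StreamIndentor` are assumptions chosen to mirror the description (one object that both enables automatic indentation and bumps the indentation level, and undoes both when it goes out of scope).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

class StreamIndentor;  // defined below

// Hypothetical stand-in for HotSpot's outputStream; only what the sketch needs.
class MiniStream {
  std::string _buf;
  int _indentation = 0;
  bool _autoindent = false;

  // Private, as proposed for outputStream::set_autoindent(); returns the old value.
  friend class StreamIndentor;
  bool set_autoindent(bool value) {
    bool old = _autoindent;
    _autoindent = value;
    return old;
  }

 public:
  void inc(int n) { _indentation += n; }
  void dec(int n) { assert(_indentation >= n); _indentation -= n; }

  // Each call prints one full line; the indentation is applied only if autoindent is on.
  void print_cr(const std::string& line) {
    if (_autoindent) {
      _buf.append(static_cast<std::size_t>(_indentation), ' ');
    }
    _buf += line;
    _buf += '\n';
  }

  const std::string& contents() const { return _buf; }
};

// Assumed shape of the combined class: enable autoindent and add `amount`
// levels on construction, restore both on destruction.
class StreamIndentor {
  MiniStream* const _st;
  const int _amount;
  const bool _old_autoindent;

 public:
  StreamIndentor(MiniStream* st, int amount)
      : _st(st), _amount(amount), _old_autoindent(st->set_autoindent(true)) {
    _st->inc(_amount);
  }
  ~StreamIndentor() {
    _st->dec(_amount);
    _st->set_autoindent(_old_autoindent);
  }
  StreamIndentor(const StreamIndentor&) = delete;
  StreamIndentor& operator=(const StreamIndentor&) = delete;
};

// Made-up driver: nested scopes indent further, leaving a scope restores the level.
std::string demo() {
  MiniStream st;
  st.print_cr("Heap");
  {
    StreamIndentor si(&st, 2);  // explicit amount, no default
    st.print_cr("region A");
    {
      StreamIndentor si2(&st, 2);
      st.print_cr("used 10K");
    }
    st.print_cr("region B");
  }
  st.print_cr("done");
  return st.contents();
}
```

In this sketch the caller does one thing per scope (construct the indentor) and never calls `indent()` by hand, which is the simplification the proposal is after.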
There are currently four (4) ways that indentation is applied in HotSpot: 1) The new method of enabling automatic indentation and applying indentation simultaneously (partially implemented already). Only in GC printing code via CollectedHeap. 2) Initially use a StreamAutoIndentor, then use streamIndentor to temporarily increment indentation. Indentation is automatically applied when printing. nmt/memReporter uses this principle, by having a StreamAutoIndentor as a member variable and applying indentation via streamIndentor when needed. 3) Use a streamIndentor and manually call indent() (or cr_indent()). Commonly used pattern. No need to manually apply indentation if automatically applied. 4) Increment/decrement indentation (using inc() and/or dec()) and manually call indent(). Only C1's CFGPrinterOutput does this. I leave the fourth case alone as it is fairly self-contained and requires a more extensive re-write, appropriate for a follow-up patch if that's what we want. In the third case, there is a possibility that there are cases where applying indentation automatically is undesirable. However, when refactoring and looking at the code I have not found any place where this is the case. All cases when using a streamIndentor and applying indentation manually using indent()/cr_indent() are able to be replaced by the new StreamIndentor without changing the output. After the changes in this PR there are only two ways that indentation is applied: 1) The new method of enabling automatic indentation and applying indentation simultaneously using StreamIndentor. 2) Increment/decrement indentation (using inc() and/or dec()) and manually call indent(). Only C1's CFGPrinterOutput does this. Some practical changes: * `outputStream::cr_indent()` has been removed since it is no longer used/needed. * `outputStream::set_autoindent()` is now private and StreamIndentor is a friend of outputStream to access this. 
* Made the indentation argument of StreamIndentor explicit, instead of the default of 2 which streamIndentor had. Sure, there are many cases which use an indentation of two, but I don't feel it's necessary to have this default only because it is common. If anything, making the argument explicit makes it easier to understand the code, in my opinion. * Renamed all uses of a StreamIndentor to `si[level]`. Testing: * GHA, Oracle's tier 1-4 * Manual inspection of printed content. See the commit message of commits prefixed with "FIX:". ------------- Commit messages: - Remove cr_indent() - Adapt gtest - Make outputStream::set_autoindent private - Rename to StreamIndentor si[level] - Combine streamIndentor and StreamAutoIndentor - FIX: memReporter - FIX: compilationMemoryStatistic.cpp - FIX: Trivial changes in compilationMemoryStatistic.cpp - FIX: metaspaceStatistics.cpp - FIX: printCLDMetaspaceInfoClosure.cpp - ... and 6 more: https://git.openjdk.org/jdk/compare/b41e0b17...a0c122ca Changes: https://git.openjdk.org/jdk/pull/24917/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24917&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8355692 Stats: 195 lines in 34 files changed: 27 ins; 50 del; 118 mod Patch: https://git.openjdk.org/jdk/pull/24917.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24917/head:pull/24917 PR: https://git.openjdk.org/jdk/pull/24917 From erikj at openjdk.org Mon Apr 28 13:35:52 2025 From: erikj at openjdk.org (Erik Joelsson) Date: Mon, 28 Apr 2025 13:35:52 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v4] In-Reply-To: References: Message-ID: <8Sz4UCfCW_5NTFSp3BuIe_V5F4Dh6aikcgdTOtwm18g=.c4d1fee3-c21a-4221-a075-dbebc7a5be82@github.com> On Mon, 28 Apr 2025 09:15:28 GMT, Igor Veresov wrote: >> Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. 
Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. >> >> More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 > > Igor Veresov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 32 commits: > > - Merge branch 'master' into pp2 > - Fix class filtering > - Remove the workaround of setting AOTRecordTraining during assembly > - Address some of the review comments > - Merge branch 'master' into pp > - Add AOTCompileEagerly flag to control compilation after clinit > - Port 8355334: [leyden] Missing type profile info in archived training data > - Port 8355296: [leyden] Some methods are stuck at level=0 with -XX:-TieredCompilation > - Use ENABLE_IF macro > - Missing part of the last commit > - ... and 22 more: https://git.openjdk.org/jdk/compare/2447b981...7fb7ae62 Build changes look trivially good. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24886#issuecomment-2835265306 From duke at openjdk.org Mon Apr 28 15:57:22 2025 From: duke at openjdk.org (PAWAN CHAWDHARY) Date: Mon, 28 Apr 2025 15:57:22 GMT Subject: RFR: 8354475: TestDockerMemoryMetricsSubgroup.java fails with exitValue = 1 Message-ID: <9N622Jo_e1ORLj-OmaYEWBss5S59JAHhEvEhMFmIt6A=.b950e4d8-bbc0-4e71-bc98-4c90aac9ce18@github.com> 8354475: TestDockerMemoryMetricsSubgroup.java fails with exitValue = 1 ------------- Commit messages: - 8354475: TestDockerMemoryMetricsSubgroup.java fails with exitValue = 1 Changes: https://git.openjdk.org/jdk/pull/24930/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24930&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8354475 Stats: 18 lines in 1 file changed: 17 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24930.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24930/head:pull/24930 PR: https://git.openjdk.org/jdk/pull/24930 From duke at openjdk.org Mon Apr 28 15:57:22 2025 From: duke at openjdk.org (PAWAN CHAWDHARY) Date: Mon, 28 Apr 2025 15:57:22 GMT Subject: RFR: 8354475: TestDockerMemoryMetricsSubgroup.java fails with exitValue = 1 In-Reply-To: <9N622Jo_e1ORLj-OmaYEWBss5S59JAHhEvEhMFmIt6A=.b950e4d8-bbc0-4e71-bc98-4c90aac9ce18@github.com> References: <9N622Jo_e1ORLj-OmaYEWBss5S59JAHhEvEhMFmIt6A=.b950e4d8-bbc0-4e71-bc98-4c90aac9ce18@github.com> Message-ID: On Mon, 28 Apr 2025 15:51:44 GMT, PAWAN CHAWDHARY wrote: > 8354475: TestDockerMemoryMetricsSubgroup.java fails with exitValue = 1 TestDockerMemoryMetricsSubgroup.java creates a directory under /sys/fs/cgroup by running the following command, so the test needs to be run as root: `mkdir -p /sys/fs/cgroup/memory/test` ------------- PR Comment: https://git.openjdk.org/jdk/pull/24930#issuecomment-2835722446 From cjplummer at openjdk.org Mon Apr 28 16:58:59 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 28 Apr 2025 16:58:59 GMT Subject: RFR: 8318730: 
MonitorVmStartTerminate.java still times out after JDK-8209595 [v4] In-Reply-To: References: Message-ID: On Sun, 27 Apr 2025 21:05:34 GMT, Kevin Walls wrote: >> This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. >> >> While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. >> This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. > > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > remove file removal timeout, rely on jtreg timeout Marked as reviewed by cjplummer (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24843#pullrequestreview-2800027799 From kvn at openjdk.org Mon Apr 28 17:37:54 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 28 Apr 2025 17:37:54 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v4] In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 09:15:28 GMT, Igor Veresov wrote: >> Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. >> >> More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 > > Igor Veresov has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 32 commits: > > - Merge branch 'master' into pp2 > - Fix class filtering > - Remove the workaround of setting AOTRecordTraining during assembly > - Address some of the review comments > - Merge branch 'master' into pp > - Add AOTCompileEagerly flag to control compilation after clinit > - Port 8355334: [leyden] Missing type profile info in archived training data > - Port 8355296: [leyden] Some methods are stuck at level=0 with -XX:-TieredCompilation > - Use ENABLE_IF macro > - Missing part of the last commit > - ... and 22 more: https://git.openjdk.org/jdk/compare/2447b981...7fb7ae62 Looks better. There are still places where UL is used specifically for TD processing. Consider using `(aot, training)` there instead of `(cds)`. ------------- PR Review: https://git.openjdk.org/jdk/pull/24886#pullrequestreview-2800131686 From iveresov at openjdk.org Mon Apr 28 18:20:48 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Mon, 28 Apr 2025 18:20:48 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v4] In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 17:35:13 GMT, Vladimir Kozlov wrote: > Looks better. There are still places where UL is used specifically for TD processing. Consider using `(aot, training)` there instead of `(cds)`. Right. I haven't addressed these review comments yet. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24886#issuecomment-2836100964 From cjplummer at openjdk.org Mon Apr 28 19:13:07 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 28 Apr 2025 19:13:07 GMT Subject: RFR: 8355569: Some nsk/jdi tests can glean the "main" thread by using the ClassPrepareEvent for the debuggee main class [v3] In-Reply-To: References: Message-ID: > As part of the work for [JDK-8353955](https://bugs.openjdk.org/browse/JDK-8353955) I am reducing the number of tests that need to be run with includevirtualthreads=y due to using JDI to look up threads in the debuggee.
There are many tests that look up the "main" thread. They can instead glean the "main" thread from the ClassPrepareEvent for the debuggee main class. Some tests already wait for this ClassPrepareEvent, and can take advantage of it. Most do not, but can be made to do so. The easiest way to do this for many of the tests is to wait for the event in Debuggee.prepareDebuggee() after having called Debuggee.bindToDebuggee(). For tests that don't call Debuggee.prepareDebuggee(), waiting for the ClassPrepareEvent was added after the bind. This doesn't seem to have any ill effect on the tests. > > Tested by running all nsk/jdi tests, including all tier2, tier3, and tier5 nsk/jdi testing. Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: found one more test to fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24867/files - new: https://git.openjdk.org/jdk/pull/24867/files/060c8d3d..a95698df Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24867&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24867&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/24867.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24867/head:pull/24867 PR: https://git.openjdk.org/jdk/pull/24867 From cjplummer at openjdk.org Mon Apr 28 19:13:07 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 28 Apr 2025 19:13:07 GMT Subject: RFR: 8355569: Some nsk/jdi tests can glean the "main" thread by using the ClassPrepareEvent for the debuggee main class [v3] In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 19:09:59 GMT, Chris Plummer wrote: >> As part of the work for [JDK-8353955](https://bugs.openjdk.org/browse/JDK-8353955) I am reducing the number of tests that need to be run with includevirtualthreads=y due to using JDI to look up threads in the debuggee. There are many tests that look up the "main" thread.
They can instead glean the "main" thread from the ClassPrepareEvent for the debuggee main class. Some tests already wait for this ClassPrepareEvent, and can take advantage of it. Most do not, but can be made to do so. The easiest way to do this for many of the tests is to wait for the event in Debuggee.prepareDebuggee() after having called Debuggee.bindToDebuggee(). For tests that don't call Debuggee.prepareDebuggee(), waiting for the ClassPrepareEvent was added after the bind. This doesn't seem to have any ill affect on the tests. >> >> Tested by running all nsk/jdi tests, including all tier2, tier3, and tier5 nsk/jdi testing. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > found one more test to fix test/hotspot/jtreg/vmTestbase/nsk/jdi/EventRequest/hashCode/hashcode001.java line 121: > 119: > 120: case 0: > 121: ThreadReference thread = debuggee.mainThread(); This one is not so obvious until you look earlier in the file and see: ``` private final static String methodName = "main";``` So `methodName` is "main". Probably should have always explicitly used "main" here instead, but with this change it's gone anyway. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24867#discussion_r2064350370 From kevinw at openjdk.org Mon Apr 28 19:22:47 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Mon, 28 Apr 2025 19:22:47 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 [v4] In-Reply-To: References: Message-ID: On Sun, 27 Apr 2025 21:05:34 GMT, Kevin Walls wrote: >> This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. 
>> >> While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. >> This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. > > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > remove file removal timeout, rely on jtreg timeout Thanks for all the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24843#issuecomment-2836271691 From lmesnik at openjdk.org Mon Apr 28 19:27:51 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Mon, 28 Apr 2025 19:27:51 GMT Subject: RFR: 8355453: nsk.share.jdi.Debugee.waitingEvent() does not timeout properly [v2] In-Reply-To: References: Message-ID: <84x3k5P2AP_Qq0cd6kapPm_YO_WjvrIXdqQlsJgjahA=.c69d8206-0093-422a-8772-5ffb4c0af190@github.com> On Fri, 25 Apr 2025 06:17:38 GMT, Chris Plummer wrote: >> nsk.share.jdi.Debugee.waitingEvent() takes a timeout argument. However, if the call to EventQueue.remove() times out and returns null, waitingEvent() just keep on retrying. The logic is incorrect for a null result. It should immediately return null. The retry path is only meant for filtering out other events, and it updates "timeLeft' before retrying to avoid endlessly repeating the loop. >> >> Tested by running all of nsk/jdi locally on linux-x64-debug > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > return null instead of throwing an exception Marked as reviewed by lmesnik (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/24839#pullrequestreview-2800470710 From kevinw at openjdk.org Mon Apr 28 19:30:27 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Mon, 28 Apr 2025 19:30:27 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 [v5] In-Reply-To: References: Message-ID: > This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. > > While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. > This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: remove unnecessary timeout related lines ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24843/files - new: https://git.openjdk.org/jdk/pull/24843/files/a1074954..bfa5bae8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24843&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24843&range=03-04 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24843.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24843/head:pull/24843 PR: https://git.openjdk.org/jdk/pull/24843 From ihse at openjdk.org Mon Apr 28 19:35:55 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 28 Apr 2025 19:35:55 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v4] In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 09:15:28 GMT, Igor Veresov wrote: >> Improve warm-up time by making profile data from a previous run of an application instantly available, when the 
HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. >> >> More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 > > Igor Veresov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 32 commits: > > - Merge branch 'master' into pp2 > - Fix class filtering > - Remove the workaround of setting AOTRecordTraining during assembly > - Address some of the review comments > - Merge branch 'master' into pp > - Add AOTCompileEagerly flag to control compilation after clinit > - Port 8355334: [leyden] Missing type profile info in archived training data > - Port 8355296: [leyden] Some methods are stuck at level=0 with -XX:-TieredCompilation > - Use ENABLE_IF macro > - Missing part of the last commit > - ... and 22 more: https://git.openjdk.org/jdk/compare/2447b981...7fb7ae62 Build changes are trivially fine. ------------- Marked as reviewed by ihse (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24886#pullrequestreview-2800489948 From kevinw at openjdk.org Mon Apr 28 19:48:49 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Mon, 28 Apr 2025 19:48:49 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 [v5] In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 19:30:27 GMT, Kevin Walls wrote: >> This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. >> >> While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. 
>> This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. > > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > remove unnecessary timeout related lines Oops, I noticed a couple of lines related to the timeout I removed, that also need removing, so will need a re-review ------------- PR Comment: https://git.openjdk.org/jdk/pull/24843#issuecomment-2836339566 From cjplummer at openjdk.org Mon Apr 28 20:24:46 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 28 Apr 2025 20:24:46 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 [v5] In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 19:30:27 GMT, Kevin Walls wrote: >> This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. >> >> While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. >> This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. > > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > remove unnecessary timeout related lines Marked as reviewed by cjplummer (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/24843#pullrequestreview-2800641600 From lmesnik at openjdk.org Mon Apr 28 20:48:46 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Mon, 28 Apr 2025 20:48:46 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 [v5] In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 19:30:27 GMT, Kevin Walls wrote: >> This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. >> >> While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. >> This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. > > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > remove unnecessary timeout related lines Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24843#pullrequestreview-2800788144 From sspitsyn at openjdk.org Mon Apr 28 23:20:46 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 28 Apr 2025 23:20:46 GMT Subject: RFR: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 [v5] In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 19:30:27 GMT, Kevin Walls wrote: >> This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. 
>> >> While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. >> This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. > > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > remove unnecessary timeout related lines Marked as reviewed by sspitsyn (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24843#pullrequestreview-2801280386 From sspitsyn at openjdk.org Mon Apr 28 23:38:50 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 28 Apr 2025 23:38:50 GMT Subject: RFR: 8355453: nsk.share.jdi.Debugee.waitingEvent() does not timeout properly [v2] In-Reply-To: References: Message-ID: <7kiR2u_hJ4_HBkh-ncyFB83LmBzYXO7Es2y-A10Oh64=.19b6846a-7417-4be0-8ec0-0135fa2a60cf@github.com> On Fri, 25 Apr 2025 06:17:38 GMT, Chris Plummer wrote: >> nsk.share.jdi.Debugee.waitingEvent() takes a timeout argument. However, if the call to EventQueue.remove() times out and returns null, waitingEvent() just keep on retrying. The logic is incorrect for a null result. It should immediately return null. The retry path is only meant for filtering out other events, and it updates "timeLeft' before retrying to avoid endlessly repeating the loop. >> >> Tested by running all of nsk/jdi locally on linux-x64-debug > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > return null instead of throwing an exception test/hotspot/jtreg/vmTestbase/nsk/share/jdi/Debugee.java line 463: > 461: EventSet eventSet = eventQueue.remove(timeLeft); > 462: if (eventSet == null) { > 463: return null; This does not match the comment at the line 438: * Wait for the requested event and skip other events. 
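The timeout behaviour under discussion can be sketched in isolation. The following is a minimal model, not the actual `Debugee.waitingEvent()` code: it assumes a plain `BlockingQueue<String>` standing in for the JDI `EventQueue`, and the class and method names (`TimedEventWait`, `waitForEvent`) are hypothetical. It shows the two paths described in the PR: a timed-out poll returns null immediately, while an unrelated event only shrinks the remaining time before the retry.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the event wait loop; not the nsk test library code.
final class TimedEventWait {

    // Waits up to timeoutMs for an event equal to `wanted`, skipping other events.
    // Returns null when the queue times out or the overall timeout expires.
    static String waitForEvent(BlockingQueue<String> queue, String wanted, long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        long timeLeft = timeoutMs;
        while (timeLeft > 0) {
            String event;
            try {
                event = queue.poll(timeLeft, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
            if (event == null) {
                return null; // the queue itself timed out: do not retry
            }
            if (event.equals(wanted)) {
                return event;
            }
            // An unrelated event was skipped: recompute the remaining time so the
            // loop cannot repeat endlessly.
            timeLeft = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
        }
        return null;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("other");
        queue.add("wanted");
        System.out.println(waitForEvent(queue, "wanted", 100));
        System.out.println(waitForEvent(queue, "wanted", 100));
    }
}
```

In this model the doc comment "wait for the requested event and skip other events" still holds, with the understanding that the wait is bounded by the timeout.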
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24839#discussion_r2065022560 From cjplummer at openjdk.org Tue Apr 29 01:46:45 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 29 Apr 2025 01:46:45 GMT Subject: RFR: 8355453: nsk.share.jdi.Debugee.waitingEvent() does not timeout properly [v2] In-Reply-To: <7kiR2u_hJ4_HBkh-ncyFB83LmBzYXO7Es2y-A10Oh64=.19b6846a-7417-4be0-8ec0-0135fa2a60cf@github.com> References: <7kiR2u_hJ4_HBkh-ncyFB83LmBzYXO7Es2y-A10Oh64=.19b6846a-7417-4be0-8ec0-0135fa2a60cf@github.com> Message-ID: On Mon, 28 Apr 2025 23:35:49 GMT, Serguei Spitsyn wrote: >> Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: >> >> return null instead of throwing an exception > > test/hotspot/jtreg/vmTestbase/nsk/share/jdi/Debugee.java line 463: > >> 461: EventSet eventSet = eventQueue.remove(timeLeft); >> 462: if (eventSet == null) { >> 463: return null; > > This does not match the comment at the line 438: > > * Wait for the requested event and skip other events. What is implied here is waiting for "timeout" ms, not indefinitely. Nothing has changed in that regard as the method was always intended to return after the timeout expires. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24839#discussion_r2065207036 From kevinw at openjdk.org Tue Apr 29 07:00:53 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Tue, 29 Apr 2025 07:00:53 GMT Subject: Integrated: 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 08:53:42 GMT, Kevin Walls wrote: > This issue was hard to reproduce or had stopped happening, but was likely a failure to capture the main arguments from a monitored process, caused by slow startups e.g. -Xcomp. Recent failures seen in Graal testing are most likely the same issue and increasing ARGS_ATTEMPTS has been seen to resolve it. 
> > While updating the test, fix the very verbose logging: it doesn't need to log, every second, how many nanos it has waited. > This has made logs from timeouts hardly usable as they are truncated and may miss critical info. Also, group the "tunable" constants together. This pull request has now been integrated. Changeset: 841989b2 Author: Kevin Walls URL: https://git.openjdk.org/jdk/commit/841989b2701b4ee0ec9be03d8007e6788edf56b4 Stats: 20 lines in 1 file changed: 6 ins; 12 del; 2 mod 8318730: MonitorVmStartTerminate.java still times out after JDK-8209595 Reviewed-by: lmesnik, sspitsyn, cjplummer ------------- PR: https://git.openjdk.org/jdk/pull/24843 From cnorrbin at openjdk.org Tue Apr 29 09:48:05 2025 From: cnorrbin at openjdk.org (Casper Norrbin) Date: Tue, 29 Apr 2025 09:48:05 GMT Subject: RFR: 8241678: Remove PerfData sampling via StatSampler [v2] In-Reply-To: References: Message-ID: > Hi everyone, > > This change removes the legacy `PerfData` sampling mechanism implemented through the `StatSampler`, an always-on periodic task that runs every 50ms by default. The sampling feature was originally introduced to collect performance counters and timestamps, but has since seen very little use. > > For G1/ZGC, the only sampled value is a timestamp (`sun.os.hrt.ticks`). For Serial/Parallel, it also samples some heap space counters, but these are already updated after each GC cycle, making the sampling redundant. With sampling removed, the `PerfDataSamplingInterval` flag becomes obsolete, as it no longer serves any purpose. > > The only thing relying on the sampled timestamps is `jstat`: running `jstat -t` prints an extra column with the time since VM start. To preserve this functionality, we can calculate the timestamps as an offset from the already existing `sun.rt.createVmBeginTime` instead.
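The offset calculation mentioned above amounts to simple arithmetic. The sketch below is only an illustration and not the `jstat` implementation: it assumes that `sun.rt.createVmBeginTime` holds a wall-clock value in milliseconds, and the class and method names (`VmUptime`, `elapsedSeconds`) are hypothetical.

```java
// Hypothetical helper; assumes the begin time is wall-clock milliseconds.
final class VmUptime {

    // Returns the seconds elapsed between VM creation and `nowMillis`.
    static double elapsedSeconds(long createVmBeginMillis, long nowMillis) {
        return (nowMillis - createVmBeginMillis) / 1000.0;
    }

    public static void main(String[] args) {
        // Pretend the VM was created 2.5 seconds ago.
        long now = System.currentTimeMillis();
        System.out.println(elapsedSeconds(now - 2500, now));
    }
}
```

With this approach no periodic sampling is needed: the reader of the counters computes the timestamp on demand from a value that is written once at startup.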
Casper Norrbin has updated the pull request incrementally with one additional commit since the last revision: feedback fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24872/files - new: https://git.openjdk.org/jdk/pull/24872/files/7f8141ba..ed3670eb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24872&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24872&range=00-01 Stats: 28 lines in 3 files changed: 1 ins; 5 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/24872.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24872/head:pull/24872 PR: https://git.openjdk.org/jdk/pull/24872 From cnorrbin at openjdk.org Tue Apr 29 09:53:48 2025 From: cnorrbin at openjdk.org (Casper Norrbin) Date: Tue, 29 Apr 2025 09:53:48 GMT Subject: RFR: 8241678: Remove PerfData sampling via StatSampler [v2] In-Reply-To: <30koYWQ8Z8s6wId_9EmUxpAmyeBVeOYEP-u-nKzm_OQ=.10a36bef-4f65-4b53-a526-a0dc60b18de9@github.com> References: <30koYWQ8Z8s6wId_9EmUxpAmyeBVeOYEP-u-nKzm_OQ=.10a36bef-4f65-4b53-a526-a0dc60b18de9@github.com> Message-ID: On Sat, 26 Apr 2025 11:31:39 GMT, Albert Mingkun Yang wrote: >> Casper Norrbin has updated the pull request incrementally with one additional commit since the last revision: >> >> feedback fixes > > There are still a few matches for "hrt.ticks"; don't know if they should be removed (in this PR or a followup). Thank you for reviewing @albertnetymk! I'll look into the last traces of "hrt.ticks" to see if they can be removed here. > src/hotspot/share/runtime/perfData.cpp line 455: > >> 453: assert(value != nullptr, "property name should be have a value: %s", name); >> 454: assert_system_property(name, value, CHECK); >> 455: if (value != nullptr) { > > Why checking null again? Didn't we just asserted that 2 lines above? This was from the original moved function, but I agree its redundant. Removed it now. 
> src/hotspot/share/runtime/threads.cpp line 852: > >> 850: #endif // INCLUDE_MANAGEMENT >> 851: >> 852: PerfDataManager::create_misc_perfdata(); > > Should this be guarded by `UsePerfData`? It should, thank you for spotting. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24872#issuecomment-2838158557 PR Review Comment: https://git.openjdk.org/jdk/pull/24872#discussion_r2065957329 PR Review Comment: https://git.openjdk.org/jdk/pull/24872#discussion_r2065955553 From ayang at openjdk.org Tue Apr 29 10:00:52 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 29 Apr 2025 10:00:52 GMT Subject: RFR: 8241678: Remove PerfData sampling via StatSampler [v2] In-Reply-To: References: Message-ID: On Tue, 29 Apr 2025 09:48:05 GMT, Casper Norrbin wrote: >> Hi everyone, >> >> This change removes the legacy `PerfData` sampling mechanism implemented through the `StatSampler`, an always-on periodic task that runs every 50ms by default. The sampling feature was originally introduced to collect performance counters and timestamps, but has since seen very little use. >> >> For G1/ZGC, the only sampled value is a timestamp (`sun.os.hrt.ticks`). For Serial/Parallel, it also samples some heap space counters, but these are already updated after each GC cycle, making the sampling redundant. With sampling removed, the `PerfDataSamplingInterval` flag becomes obsolete, as it no longer serves any purpose. >> >> The only thing relying on the sampled timestamps is `jstat`: running `jstat -t` prints an extra column with the time since VM start. To preserve this functionality, we can calculate the timestamps as an offset from the already existing `sun.rt.createVmBeginTime` instead. > > Casper Norrbin has updated the pull request incrementally with one additional commit since the last revision: > > feedback fixes Reviewed only `hotspot` changes. Not familiar with other parts. ------------- Marked as reviewed by ayang (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/24872#pullrequestreview-2802821239 From duke at openjdk.org Tue Apr 29 13:10:19 2025 From: duke at openjdk.org (PAWAN CHAWDHARY) Date: Tue, 29 Apr 2025 13:10:19 GMT Subject: RFR: 8352926: New test TestDockerMemoryMetricsSubgroup.java fails Message-ID: 8352926: New test TestDockerMemoryMetricsSubgroup.java fails ------------- Commit messages: - update comment - 8352926: New test TestDockerMemoryMetricsSubgroup.java fails Changes: https://git.openjdk.org/jdk/pull/24948/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24948&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8352926 Stats: 157 lines in 2 files changed: 157 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24948.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24948/head:pull/24948 PR: https://git.openjdk.org/jdk/pull/24948 From duke at openjdk.org Tue Apr 29 14:45:25 2025 From: duke at openjdk.org (PAWAN CHAWDHARY) Date: Tue, 29 Apr 2025 14:45:25 GMT Subject: RFR: 8352926: New test TestDockerMemoryMetricsSubgroup.java fails [v2] In-Reply-To: References: Message-ID: > 8352926: New test TestDockerMemoryMetricsSubgroup.java fails PAWAN CHAWDHARY has updated the pull request incrementally with one additional commit since the last revision: update reference of DockerVersion ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24948/files - new: https://git.openjdk.org/jdk/pull/24948/files/b2c2cc43..223a5d1b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24948&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24948&range=00-01 Stats: 9 lines in 1 file changed: 0 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/24948.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24948/head:pull/24948 PR: https://git.openjdk.org/jdk/pull/24948 From cjplummer at openjdk.org Tue Apr 29 17:46:27 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 29 Apr 2025 17:46:27 GMT Subject: RFR: 8355773: Some 
nsk/jdi tests can fetch ThreadReference from static field in the debuggee Message-ID: In an effort to get rid of calls to Debugee.threadByName() or Debugee.threadByNameOrThrow(), I found that many tests store the thread being looked up in a static field of the debuggee. The test can fetch the ThreadReference from the static field instead of looking it up using APIs that rely on vm.allThreads(). Most of the changes take advantage of the following common pattern: In the debugger: String threadName1 = "thread1"; thread1 = debuggee.threadByNameOrThrow(threadName1); In the debuggee: static Thread thread1 = null; thread1 = JDIThreadFactory.newThread(new Thread1addcountfilter001a("thread1")); Note that the static field name for the Thread is the same as the thread name. Thus we can easily switch from looking up by thread name to instead looking up by static field name since they both use the same name. ------------- Commit messages: - update copyright - update copyright - fix type in last commit - undo hashcode001 change - fix jcheck errors - Use static field in the debuggee to look up ThreadReference.
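The static-field lookup described in this PR can be sketched with the public JDI API alone. This is a hedged illustration rather than the helper actually added to the nsk test library: the class and method names (`StaticFieldThreadLookup`, `threadFromStaticField`) are hypothetical, and the caller is assumed to already hold the `ReferenceType` of the debuggee class. Only `ReferenceType.fieldByName(String)` and `ReferenceType.getValue(Field)` are used.

```java
import com.sun.jdi.Field;
import com.sun.jdi.ReferenceType;
import com.sun.jdi.ThreadReference;
import com.sun.jdi.Value;

// Hypothetical helper; not the method added by the PR.
final class StaticFieldThreadLookup {

    // Reads the static field `fieldName` of the debuggee class and returns its
    // value as a ThreadReference, without consulting vm.allThreads().
    static ThreadReference threadFromStaticField(ReferenceType debuggeeClass, String fieldName) {
        Field field = debuggeeClass.fieldByName(fieldName);
        if (field == null) {
            throw new IllegalArgumentException("no such field: " + fieldName);
        }
        Value value = debuggeeClass.getValue(field);
        if (!(value instanceof ThreadReference)) {
            throw new IllegalStateException(fieldName + " does not hold a thread");
        }
        return (ThreadReference) value;
    }
}
```

Because the field name matches the thread name in the pattern above, a call such as `threadFromStaticField(debuggeeClass, "thread1")` could replace the name-based lookup one for one.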
Changes: https://git.openjdk.org/jdk/pull/24935/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24935&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8355773 Stats: 79 lines in 31 files changed: 34 ins; 1 del; 44 mod Patch: https://git.openjdk.org/jdk/pull/24935.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24935/head:pull/24935 PR: https://git.openjdk.org/jdk/pull/24935 From cjplummer at openjdk.org Tue Apr 29 17:46:27 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 29 Apr 2025 17:46:27 GMT Subject: RFR: 8355773: Some nsk/jdi tests can fetch ThreadReference from static field in the debuggee In-Reply-To: References: Message-ID: <_Qu-D0wP4JGYH-Tx5xTDGwl0GdItSS2jW9P86W0AQu4=.5a94ca56-937f-40ac-b4c5-f231fd283acd@github.com> On Mon, 28 Apr 2025 18:27:26 GMT, Chris Plummer wrote: > In an effort to get rid of calls to Debugee.threadByName() or Debugee.threadByNameOrThrow(), I found that many tests store the thread being looked up in a static field of the debuggee. The test can fetch the ThreadReference from the static field instead of looking it up using APIs that rely on vm.allThreads(). > > Most of the changes take advantage of the following common pattern: > > In the debugger: > > String threadName1 = "thread1"; > thread1 = debuggee.threadByNameOrThrow(threadName1); > > In the debuggee: > > static Thread thread1 = null; > thread1 = JDIThreadFactory.newThread(new Thread1addcountfilter001a("thread1")); > > Note that the static field name for the Thread is the same as the thread name. Thus we can easily switch from looking up by thread name to instead looking up by static field name since they both use the same name. test/hotspot/jtreg/vmTestbase/nsk/jdi/EventRequest/hashCode/hashcode001.java line 121: > 119: > 120: case 0: > 121: ThreadReference thread = debuggee.mainThread(); This is one that was missed during the PR for #24867.
test/hotspot/jtreg/vmTestbase/nsk/jdi/EventRequestManager/createStepRequest/crstepreq003.java line 81: > 79: > 80: static final int lineForBreakInThread = 141; > 81: static final int[] checkedLines = { 142, 142, 182 }; Note the added lines in the debuggee below. test/hotspot/jtreg/vmTestbase/nsk/jdi/EventRequestManager/createStepRequest/crstepreq004.java line 82: > 80: static final int lineForBreakInThread = 149; > 81: static final int[] checkedLines = { 163, 163, 196 }; > 82: static final int[] checkedLinesAlt = { 164, 164, 196 }; Note the added lines in the debuggee below. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24935#discussion_r2064317075 PR Review Comment: https://git.openjdk.org/jdk/pull/24935#discussion_r2064320290 PR Review Comment: https://git.openjdk.org/jdk/pull/24935#discussion_r2064321920 From pchilanomate at openjdk.org Tue Apr 29 20:20:49 2025 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 29 Apr 2025 20:20:49 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v4] In-Reply-To: References: Message-ID: On Thu, 10 Apr 2025 06:41:42 GMT, Serguei Spitsyn wrote: >> This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. >> >> Testing: Ran mach5 tiers 1-6. > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: added general comment about sync between suspend_thread and resume_thread Hi Serguei, The fix looks correct but I think we could probably do something that avoids this overhead and also simplifies the code. Looking at each suspend case we have, in terms of lack of sync between self-suspend and resume: - unmounted vthread: should not be an issue, unmounted implies it's not a self-suspend.
- platform thread (not carrying vthread): Currently we have an issue because we are setting `_carrier_thread_suspended` always, not only for the carrying vthread case. We then assume that `_carrier_thread_suspended`=true implies suspension, but that will only be the case for the carrying vthread case which checks for this and suspends in `finish_VTMS_transition`. A normal platform thread needs to still call handshake code to suspend. This is the issue with this test because a resumer sees the target is already suspended and calls resume before the target actually calls handshake suspend. We should just set `_carrier_thread_suspended` on the carrying vthread case. - platform thread carrying vthread: we can keep using `_carrier_thread_suspended` but should use atomic cmpxchg when updating. - mounted vthread: Currently we have an issue because we are doing two things that are not atomic, adding/removing to the list and calling handshake code to suspend/resume. We could move the update of the list inside the handshake instead. I've been experimenting with the above changes here: https://github.com/pchilano/jdk/commit/7cce48ef14bdb977b9c65a33370144f3d61fc974 Notice that the `suspend_thread/resume_thread` methods are simpler to read because the cases are clearly separated. This would also fix the lack of sync between self-suspend and suspend, which we still have if we just change resume to be a handshake. What do you think? The `JvmtiVTMSTransitionDisabler` changes could be done in a separate fix.
------------- PR Comment: https://git.openjdk.org/jdk/pull/24269#issuecomment-2840140925 From schernyshev at openjdk.org Tue Apr 29 20:32:52 2025 From: schernyshev at openjdk.org (Sergey Chernyshev) Date: Tue, 29 Apr 2025 20:32:52 GMT Subject: RFR: 8352926: New test TestDockerMemoryMetricsSubgroup.java fails [v2] In-Reply-To: References: Message-ID: On Tue, 29 Apr 2025 14:45:25 GMT, PAWAN CHAWDHARY wrote: >> 8352926: New test TestDockerMemoryMetricsSubgroup.java fails > > PAWAN CHAWDHARY has updated the pull request incrementally with one additional commit since the last revision: > > update reference of DockerVersion test/hotspot/jtreg/containers/docker/TestMemoryWithSubgroups.java line 74: > 72: return; > 73: } > 74: if (IS_DOCKER && TestDockerMemoryMetricsSubgroup.DockerVersion.VERSION_20_10_0.compareTo(getDockerVersion()) > 0) { Should this change also cover Podman, in which the minimum version that supports `--cgroupns` is 3.0? test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetricsSubgroup.java line 149: > 147: } > 148: > 149: private static class DockerVersion implements Comparable { The same code is added to multiple tests. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24948#discussion_r2067344723 PR Review Comment: https://git.openjdk.org/jdk/pull/24948#discussion_r2067371874 From sspitsyn at openjdk.org Tue Apr 29 22:41:51 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 29 Apr 2025 22:41:51 GMT Subject: RFR: 8355569: Some nsk/jdi tests can glean the "main" thread by using the ClassPrepareEvent for the debuggee main class [v3] In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 19:13:07 GMT, Chris Plummer wrote: >> As part of the work for [JDK-8353955](https://bugs.openjdk.org/browse/JDK-8353955) I am reducing the number of tests that need to be run with includevirtualthreads=y due to using JDI to look up threads in the debuggee. There are many tests that look up the "main" thread.
They can instead glean the "main" thread from the ClassPrepareEvent for the debuggee main class. Some tests already wait for this ClassPrepareEvent, and can take advantage of it. Most do not, but can be made to do so. The easiest way to do this for many of the tests is to wait for the event in Debuggee.prepareDebuggee() after having called Debuggee.bindToDebuggee(). For tests that don't call Debuggee.prepareDebuggee(), waiting for the ClassPrepareEvent was added after the bind. This doesn't seem to have any ill effect on the tests. >> >> Tested by running all nsk/jdi tests, including all tier2, tier3, and tier5 nsk/jdi testing. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > found one more test to fix This looks okay. I've not found problems so far. Will do one more pass. ------------- PR Review: https://git.openjdk.org/jdk/pull/24867#pullrequestreview-2805224656 From sspitsyn at openjdk.org Tue Apr 29 22:49:47 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 29 Apr 2025 22:49:47 GMT Subject: RFR: 8355453: nsk.share.jdi.Debugee.waitingEvent() does not timeout properly [v2] In-Reply-To: References: Message-ID: On Fri, 25 Apr 2025 06:17:38 GMT, Chris Plummer wrote: >> nsk.share.jdi.Debugee.waitingEvent() takes a timeout argument. However, if the call to EventQueue.remove() times out and returns null, waitingEvent() just keeps on retrying. The logic is incorrect for a null result. It should immediately return null. The retry path is only meant for filtering out other events, and it updates "timeLeft" before retrying to avoid endlessly repeating the loop. >> >> Tested by running all of nsk/jdi locally on linux-x64-debug > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > return null instead of throwing an exception Marked as reviewed by sspitsyn (Reviewer).
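The suggestion to update the suspended flag with an atomic compare-and-exchange can be illustrated outside HotSpot. The sketch below is a Java analogy only, not the JVMTI code: the `SuspendFlag` class and its methods are hypothetical, and `AtomicBoolean.compareAndSet` stands in for the C++ cmpxchg. The point is that exactly one suspender and one resumer can win each transition, so a resumer cannot act on a state that another thread is still in the middle of setting.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of a per-thread "suspended" flag; not HotSpot code.
final class SuspendFlag {
    private final AtomicBoolean suspended = new AtomicBoolean(false);

    // Returns true only for the caller that flips the flag from false to true.
    boolean trySuspend() {
        return suspended.compareAndSet(false, true);
    }

    // Returns true only for the caller that flips the flag from true to false.
    boolean tryResume() {
        return suspended.compareAndSet(true, false);
    }

    boolean isSuspended() {
        return suspended.get();
    }

    public static void main(String[] args) {
        SuspendFlag flag = new SuspendFlag();
        System.out.println(flag.trySuspend()); // first suspend wins
        System.out.println(flag.trySuspend()); // second suspend is rejected
        System.out.println(flag.tryResume());  // resume succeeds once
    }
}
```

A failed `trySuspend` or `tryResume` tells the caller that another thread got there first, which corresponds to the JVMTI functions reporting that the target is already suspended or not suspended.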
------------- PR Review: https://git.openjdk.org/jdk/pull/24839#pullrequestreview-2805259636 From sspitsyn at openjdk.org Tue Apr 29 22:49:48 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 29 Apr 2025 22:49:48 GMT Subject: RFR: 8355453: nsk.share.jdi.Debugee.waitingEvent() does not timeout properly [v2] In-Reply-To: References: <7kiR2u_hJ4_HBkh-ncyFB83LmBzYXO7Es2y-A10Oh64=.19b6846a-7417-4be0-8ec0-0135fa2a60cf@github.com> Message-ID: On Tue, 29 Apr 2025 01:44:22 GMT, Chris Plummer wrote: >> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/Debugee.java line 463: >> >>> 461: EventSet eventSet = eventQueue.remove(timeLeft); >>> 462: if (eventSet == null) { >>> 463: return null; >> >> This does not match the comment at the line 438: >> >> * Wait for the requested event and skip other events. > > What is implied here is waiting for "timeout" ms, not indefinitely. Nothing has changed in that regard as the method was always intended to return after the timeout expires. Okay, thanks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24839#discussion_r2067542009 From sspitsyn at openjdk.org Tue Apr 29 23:00:54 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 29 Apr 2025 23:00:54 GMT Subject: RFR: 8355773: Some nsk/jdi tests can fetch ThreadReference from static field in the debuggee In-Reply-To: References: Message-ID: <8nGQnrMnwT8zm8RmsB3YhjVX7q35M1hsTxmTkWkEshI=.b6c1a74c-49b1-4b69-abfe-f21ae4b35892@github.com> On Mon, 28 Apr 2025 18:27:26 GMT, Chris Plummer wrote: > In an effort go get rid of calls to Debugee.threadByName() or Debugee.threadByNameOrThrow(), I found that many tests store the thread being looked up in a static field of the debuggee. The test can fetch the ThreadReference from the static field instead of looking it up using APIs that rely on vm.allThreads(). 
> > Most of the changes take advantage of the following common pattern: > > In the debugger: > > String threadName1 = "thread1"; > thread1 = debuggee.threadByNameOrThrow(threadName1); > > In the debuggee: > > static Thread thread1 = null; > thread1 = JDIThreadFactory.newThread(new Thread1addcountfilter001a("thread1")); > > Note that the static field name for the Thread is the same as the thread name. Thus we can easily switch from looking up by thread name to instead looking up by static field name since they both use the same name. Looks good. ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24935#pullrequestreview-2805279197 From lmesnik at openjdk.org Tue Apr 29 23:47:50 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 29 Apr 2025 23:47:50 GMT Subject: Integrated: 8355069: Allocation::check_out_of_memory() should support CheckUnhandledOops mode In-Reply-To: References: Message-ID: On Sat, 19 Apr 2025 00:39:00 GMT, Leonid Mesnik wrote: > The > CheckUnhandledOops > cause failure if JvmtiExport::post_resource_exhausted(...) > is called in > MemAllocator::Allocation::check_out_of_memory() > The obj is null so it is not a real bug. > > I am fixing it to reduce noise for CheckUnhandledOops mode for jvmti tests execution. > The vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted002/TestDescription.java > failed with -XX:+CheckUnhandledOops > > If define are unwelcome here, the > ``` PreserveObj obj_h(_thread, _obj_ptr);``` > might be added instead with comment why it is needed for null obj. This pull request has now been integrated. 
Changeset: 83d0bd85 Author: Leonid Mesnik URL: https://git.openjdk.org/jdk/commit/83d0bd85afaf1b5724c12f4d2f6e9c7087bab4e8 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod 8355069: Allocation::check_out_of_memory() should support CheckUnhandledOops mode Reviewed-by: sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/24766 From amenkov at openjdk.org Wed Apr 30 00:32:45 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 30 Apr 2025 00:32:45 GMT Subject: RFR: 8355569: Some nsk/jdi tests can glean the "main" thread by using the ClassPrepareEvent for the debuggee main class [v3] In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 19:13:07 GMT, Chris Plummer wrote: >> As part of the work for [JDK-8353955](https://bugs.openjdk.org/browse/JDK-8353955) I am reducing the number of tests that need to be run with includevirtualthreads=y due to using JDI to look up threads in the debuggee. There are many tests that look up the "main" thread. They can instead glean the "main" thread from the ClassPrepareEvent for the debuggee main class. Some tests already wait for this ClassPrepareEvent, and can take advantage of it. Most do not, but can be made to do so. The easiest way to do this for many of the tests is to wait for the event in Debuggee.prepareDebuggee() after having called Debuggee.bindToDebuggee(). For tests that don't call Debuggee.prepareDebuggee(), waiting for the ClassPrepareEvent was added after the bind. This doesn't seem to have any ill effect on the tests. >> >> Tested by running all nsk/jdi tests, including all tier2, tier3, and tier5 nsk/jdi testing. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > found one more test to fix Marked as reviewed by amenkov (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/24867#pullrequestreview-2805431965 From amenkov at openjdk.org Wed Apr 30 00:47:45 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 30 Apr 2025 00:47:45 GMT Subject: RFR: 8355773: Some nsk/jdi tests can fetch ThreadReference from static field in the debuggee In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 18:27:26 GMT, Chris Plummer wrote: > In an effort go get rid of calls to Debugee.threadByName() or Debugee.threadByNameOrThrow(), I found that many tests store the thread being looked up in a static field of the debuggee. The test can fetch the ThreadReference from the static field instead of looking it up using APIs that rely on vm.allThreads(). > > Most of the changes take advantage of the following common pattern: > > In the debugger: > > String threadName1 = "thread1"; > thread1 = debuggee.threadByNameOrThrow(threadName1); > > In the debuggee: > > static Thread thread1 = null; > thread1 = JDIThreadFactory.newThread(new Thread1addcountfilter001a("thread1")); > > Note that the static field name for the Thread is the same as the thread name. Thus we can easily switch from looking up by thread name to instead looking up by static field name since they both use the same name. Marked as reviewed by amenkov (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24935#pullrequestreview-2805450985 From amenkov at openjdk.org Wed Apr 30 01:29:53 2025 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 30 Apr 2025 01:29:53 GMT Subject: RFR: 8355453: nsk.share.jdi.Debugee.waitingEvent() does not timeout properly [v2] In-Reply-To: References: Message-ID: <877c4wF6d-plMXENz3K5lsuW0d0TfqcMFnZBBQsWm1A=.77b5fd56-2f07-4f98-b1dd-ec325f52f771@github.com> On Fri, 25 Apr 2025 06:17:38 GMT, Chris Plummer wrote: >> nsk.share.jdi.Debugee.waitingEvent() takes a timeout argument. However, if the call to EventQueue.remove() times out and returns null, waitingEvent() just keep on retrying. 
The logic is incorrect for a null result. It should immediately return null. The retry path is only meant for filtering out other events, and it updates "timeLeft' before retrying to avoid endlessly repeating the loop. >> >> Tested by running all of nsk/jdi locally on linux-x64-debug > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > return null instead of throwing an exception Marked as reviewed by amenkov (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24839#pullrequestreview-2805495983 From iveresov at openjdk.org Wed Apr 30 05:39:04 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 30 Apr 2025 05:39:04 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v5] In-Reply-To: References: Message-ID: > Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. > > More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 Igor Veresov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits: - Address review comments part 2 - Merge branch 'master' into pp2 - Merge branch 'master' into pp2 - Fix class filtering - Remove the workaround of setting AOTRecordTraining during assembly - Address some of the review comments - Merge branch 'master' into pp - Add AOTCompileEagerly flag to control compilation after clinit - Port 8355334: [leyden] Missing type profile info in archived training data - Port 8355296: [leyden] Some methods are stuck at level=0 with -XX:-TieredCompilation - ... 
and 24 more: https://git.openjdk.org/jdk/compare/6850757f...4514d032 ------------- Changes: https://git.openjdk.org/jdk/pull/24886/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=04 Stats: 3197 lines in 57 files changed: 2972 ins; 103 del; 122 mod Patch: https://git.openjdk.org/jdk/pull/24886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24886/head:pull/24886 PR: https://git.openjdk.org/jdk/pull/24886 From iveresov at openjdk.org Wed Apr 30 05:39:04 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 30 Apr 2025 05:39:04 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v5] In-Reply-To: References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> Message-ID: <2b3hUB0FEH3l0nU-C6rkeihJwHawxzfkHI3UVbGfO88=.dc289b11-8c83-4473-96fb-40381c986c59@github.com> On Sun, 27 Apr 2025 01:15:54 GMT, Vladimir Kozlov wrote: >> You mean you want these checks to be done only if `TrainingData::have_data() == true` ? > > Yes, if it is related. Otherwise you may change default behavior when Leyden code is not used. I simplified it and factored out the new semantics. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2067883564 From rvansa at openjdk.org Wed Apr 30 10:18:51 2025 From: rvansa at openjdk.org (Radim Vansa) Date: Wed, 30 Apr 2025 10:18:51 GMT Subject: RFR: 8352075: Perf regression accessing fields [v3] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Mon, 28 Apr 2025 07:44:04 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . 
>> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)!
This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: > > - Fix VerifyRawIndexesTest > - Fix reordering in layout and annotations > - Use qsort_r for different platforms @fparain Hi, could I bring your attention here? Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2841496084 From kevinw at openjdk.org Wed Apr 30 11:57:27 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 30 Apr 2025 11:57:27 GMT Subject: RFR: 8354407: Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java still fails on Windows Message-ID: <72oWIJeRPYV1kgfqEXQpSnzzgEhKhIOGldcnauQH4pI=.e6b4ad7a-4430-41a2-bac3-42ebbfa5073c@github.com> This is hard to reproduce, and at first I'd only seen -1 returned on the first calls to mbean.getProcessCpuLoad(). But eventually I observed a -1 at any time, including in middle of the iterations, or on the last iteration which makes the current test fail. Should fail on Windows only if we only ever see -1 returned from getProcessCpuLoad(). Remove the "exclusiveAccess.dirs=." (JDK-8353231 adding "exclusiveAccess.dirs=." did not fix this.) 
------------- Commit messages: - 8354407: Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java still fails on Windows Changes: https://git.openjdk.org/jdk/pull/24961/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24961&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8354407 Stats: 35 lines in 2 files changed: 5 ins; 24 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/24961.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24961/head:pull/24961 PR: https://git.openjdk.org/jdk/pull/24961 From rehn at openjdk.org Wed Apr 30 12:00:47 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 30 Apr 2025 12:00:47 GMT Subject: RFR: 8352730: RISC-V: Disable tests in qemu-user [v3] In-Reply-To: <5sujqD7L_cmLUyDwYb4PhgOlEeiFwlkAV7RJoVMFTrM=.223437cd-bbb2-4ef3-a6fe-b13ce402e14b@github.com> References: <5sujqD7L_cmLUyDwYb4PhgOlEeiFwlkAV7RJoVMFTrM=.223437cd-bbb2-4ef3-a6fe-b13ce402e14b@github.com> Message-ID: On Mon, 31 Mar 2025 10:45:54 GMT, Robbin Ehn wrote: >> Hi, for you to consider. >> >> These tests constantly fails in qemu-user. >> Either the require host to be same arch explicit or implicit (sysroot). >> E.g. "ptrace(PTRACE_ATTACH, ..) failed for 405157: Function not implemented'" for SA tests. >> >> From bug: >>> qemu-user/rv64 sets uarch to "qemu" in /proc/cpuinfo (qemu-system do not do that). >>> We add this uarch to CPU feature string. >>> This means we can use jtreg 'require' with cpu string to filter out tests in qemu-user. >> >> Relevant qemu code: >> https://github.com/qemu/qemu/blob/170825d14d88a1ce7fae98d5a928480f2f329b22/linux-user/riscv/target_proc.h#L29 >> >> Relevant hotspot code: >> https://github.com/openjdk/jdk/blob/fa0b18bfde38ee2ffbab33a9eaac547fe8aa3c7c/src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp#L250 >> >> Tested that the require only filters out tests in qemu+riscv64. >> >> Thanks! >> >> /Robbin > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: > > - Merge branch 'master' into qemu-user-issues > - Revert > - Merge branch 'master' into qemu-user-issues > - Merge branch 'master' into qemu-user-issues > - more > - more > - native or very long I don't see any alternatives - if this approach is not liked, any suggestions? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24229#issuecomment-2841743196 From rehn at openjdk.org Wed Apr 30 12:00:45 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 30 Apr 2025 12:00:45 GMT Subject: RFR: 8352730: RISC-V: Disable tests in qemu-user [v4] In-Reply-To: References: Message-ID: > Hi, for you to consider. > > These tests constantly fail in qemu-user. > They require the host to be the same arch, either explicitly or implicitly (sysroot). > E.g. "ptrace(PTRACE_ATTACH, ..) failed for 405157: Function not implemented" for SA tests. > > From bug: >> qemu-user/rv64 sets uarch to "qemu" in /proc/cpuinfo (qemu-system does not do that). >> We add this uarch to CPU feature string. >> This means we can use jtreg 'require' with cpu string to filter out tests in qemu-user. > > Relevant qemu code: > https://github.com/qemu/qemu/blob/170825d14d88a1ce7fae98d5a928480f2f329b22/linux-user/riscv/target_proc.h#L29 > > Relevant hotspot code: > https://github.com/openjdk/jdk/blob/fa0b18bfde38ee2ffbab33a9eaac547fe8aa3c7c/src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp#L250 > > Tested that the require only filters out tests in qemu+riscv64. > > Thanks! > > /Robbin Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase.
The pull request contains eight additional commits since the last revision: - Merge branch 'master' into qemu-user-issues - Merge branch 'master' into qemu-user-issues - Revert - Merge branch 'master' into qemu-user-issues - Merge branch 'master' into qemu-user-issues - more - more - native or very long ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24229/files - new: https://git.openjdk.org/jdk/pull/24229/files/73968ab8..5668fe13 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24229&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24229&range=02-03 Stats: 302323 lines in 2847 files changed: 92525 ins; 199451 del; 10347 mod Patch: https://git.openjdk.org/jdk/pull/24229.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24229/head:pull/24229 PR: https://git.openjdk.org/jdk/pull/24229 From fyang at openjdk.org Wed Apr 30 12:25:55 2025 From: fyang at openjdk.org (Fei Yang) Date: Wed, 30 Apr 2025 12:25:55 GMT Subject: RFR: 8352730: RISC-V: Disable tests in qemu-user [v4] In-Reply-To: References: Message-ID: On Wed, 30 Apr 2025 12:00:45 GMT, Robbin Ehn wrote: >> Hi, for you to consider. >> >> These tests constantly fails in qemu-user. >> Either the require host to be same arch explicit or implicit (sysroot). >> E.g. "ptrace(PTRACE_ATTACH, ..) failed for 405157: Function not implemented'" for SA tests. >> >> From bug: >>> qemu-user/rv64 sets uarch to "qemu" in /proc/cpuinfo (qemu-system do not do that). >>> We add this uarch to CPU feature string. >>> This means we can use jtreg 'require' with cpu string to filter out tests in qemu-user. >> >> Relevant qemu code: >> https://github.com/qemu/qemu/blob/170825d14d88a1ce7fae98d5a928480f2f329b22/linux-user/riscv/target_proc.h#L29 >> >> Relevant hotspot code: >> https://github.com/openjdk/jdk/blob/fa0b18bfde38ee2ffbab33a9eaac547fe8aa3c7c/src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp#L250 >> >> Tested that the require only filters out tests in qemu+riscv64. >> >> Thanks! 
>> >> /Robbin > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' into qemu-user-issues > - Merge branch 'master' into qemu-user-issues > - Revert > - Merge branch 'master' into qemu-user-issues > - Merge branch 'master' into qemu-user-issues > - more > - more > - native or very long Thanks for finding these tests! ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24229#pullrequestreview-2806869108 From fyang at openjdk.org Wed Apr 30 12:25:56 2025 From: fyang at openjdk.org (Fei Yang) Date: Wed, 30 Apr 2025 12:25:56 GMT Subject: RFR: 8352730: RISC-V: Disable tests in qemu-user [v3] In-Reply-To: References: <5sujqD7L_cmLUyDwYb4PhgOlEeiFwlkAV7RJoVMFTrM=.223437cd-bbb2-4ef3-a6fe-b13ce402e14b@github.com> Message-ID: <0WJqqpLhGRgq64qQ21_NsZG8JAKTqllk5d8Zevg9hBs=.347c68b1-8284-4095-b6e5-010e47bce796@github.com> On Wed, 30 Apr 2025 11:57:32 GMT, Robbin Ehn wrote: > I don't see any alternatives - if this approach is not liked, any suggestions? The changes seems fine to me. It's just a pity that this would only work for riscv for now. I guess we don't have a better choice given current status of qemu-user and the setting of jvm cpu string on different CPU platforms. 
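As a rough illustration of the detection idea in 8352730 (qemu-user on rv64 reporting a uarch of "qemu" in /proc/cpuinfo), the check can be sketched in Java as below. This is only a sketch under assumptions: the actual logic lives in the HotSpot C++ code linked in the PR, the `QemuUserCheck` class and `reportsQemuUarch` method are hypothetical, and the sample cpuinfo text in `main` is made up rather than captured from a real system.

```java
class QemuUserCheck {
    // Hypothetical helper: returns true if any "uarch" entry in the given
    // cpuinfo text has the value "qemu".
    static boolean reportsQemuUarch(String cpuinfo) {
        return cpuinfo.lines()
                .map(line -> line.split(":", 2))     // "key : value"
                .filter(kv -> kv.length == 2)        // ignore lines without a colon
                .anyMatch(kv -> kv[0].trim().equals("uarch")
                             && kv[1].trim().equals("qemu"));
    }

    public static void main(String[] args) {
        // Made-up sample in the general shape of a RISC-V /proc/cpuinfo entry.
        String sample = "processor\t: 0\n"
                      + "hart\t\t: 0\n"
                      + "isa\t\t: rv64imafdc\n"
                      + "uarch\t\t: qemu\n";
        System.out.println(reportsQemuUarch(sample));
    }
}
```

As Fei Yang notes, this only helps on riscv for now, since it depends on qemu-user exposing the marker and on the JVM carrying it into the CPU feature string.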
------------- PR Comment: https://git.openjdk.org/jdk/pull/24229#issuecomment-2841798807 From fparain at openjdk.org Wed Apr 30 13:30:45 2025 From: fparain at openjdk.org (Frederic Parain) Date: Wed, 30 Apr 2025 13:30:45 GMT Subject: RFR: 8352075: Perf regression accessing fields [v3] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: <5qFhMngklidPXVkmJEYiv_dk2z3J2DYcyVzqnm4uJ6U=.146e221f-bec9-4203-908e-17c04e352532@github.com> On Wed, 30 Apr 2025 10:16:06 GMT, Radim Vansa wrote: > @fparain Hi, could I bring your attention here? Thanks! Sure, I'll take a look at it. Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2841972655 From cjplummer at openjdk.org Wed Apr 30 17:12:51 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 30 Apr 2025 17:12:51 GMT Subject: RFR: 8355453: nsk.share.jdi.Debugee.waitingEvent() does not timeout properly [v2] In-Reply-To: References: Message-ID: <1UF5MgIhY-ufmCh4DRqyVru4ZrbZnvAl_ytaNkkJmiQ=.2cf91534-48cd-4fcb-b614-e66b8c8d4214@github.com> On Fri, 25 Apr 2025 06:17:38 GMT, Chris Plummer wrote: >> nsk.share.jdi.Debugee.waitingEvent() takes a timeout argument. However, if the call to EventQueue.remove() times out and returns null, waitingEvent() just keep on retrying. The logic is incorrect for a null result. It should immediately return null. The retry path is only meant for filtering out other events, and it updates "timeLeft' before retrying to avoid endlessly repeating the loop. >> >> Tested by running all of nsk/jdi locally on linux-x64-debug > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > return null instead of throwing an exception Thank you for the reviews Leonid, Serguei, and Alex! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24839#issuecomment-2842683217 From cjplummer at openjdk.org Wed Apr 30 17:12:52 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 30 Apr 2025 17:12:52 GMT Subject: Integrated: 8355453: nsk.share.jdi.Debugee.waitingEvent() does not timeout properly In-Reply-To: References: Message-ID: On Thu, 24 Apr 2025 05:28:09 GMT, Chris Plummer wrote: > nsk.share.jdi.Debugee.waitingEvent() takes a timeout argument. However, if the call to EventQueue.remove() times out and returns null, waitingEvent() just keep on retrying. The logic is incorrect for a null result. It should immediately return null. The retry path is only meant for filtering out other events, and it updates "timeLeft' before retrying to avoid endlessly repeating the loop. > > Tested by running all of nsk/jdi locally on linux-x64-debug This pull request has now been integrated. Changeset: 486acc06 Author: Chris Plummer URL: https://git.openjdk.org/jdk/commit/486acc06e0325d247a96df8f7fc88c9111c3315d Stats: 4 lines in 1 file changed: 0 ins; 1 del; 3 mod 8355453: nsk.share.jdi.Debugee.waitingEvent() does not timeout properly Reviewed-by: lmesnik, amenkov, sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/24839 From cjplummer at openjdk.org Wed Apr 30 17:18:51 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 30 Apr 2025 17:18:51 GMT Subject: RFR: 8355773: Some nsk/jdi tests can fetch ThreadReference from static field in the debuggee In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 18:27:26 GMT, Chris Plummer wrote: > In an effort go get rid of calls to Debugee.threadByName() or Debugee.threadByNameOrThrow(), I found that many tests store the thread being looked up in a static field of the debuggee. The test can fetch the ThreadReference from the static field instead of looking it up using APIs that rely on vm.allThreads(). 
> > Most of the changes take advantage of the following common pattern: > > In the debugger: > > String threadName1 = "thread1"; > thread1 = debuggee.threadByNameOrThrow(threadName1); > > In the debuggee: > > static Thread thread1 = null; > thread1 = JDIThreadFactory.newThread(new Thread1addcountfilter001a("thread1")); > > Note that the static field name for the Thread is the same as the thread name. Thus we can easily switch from looking up by thread name to instead looking up by static field name since they both use the same name. Thank you for the review Alex and Serguei! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24935#issuecomment-2842705536 From cjplummer at openjdk.org Wed Apr 30 17:18:51 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 30 Apr 2025 17:18:51 GMT Subject: Integrated: 8355773: Some nsk/jdi tests can fetch ThreadReference from static field in the debuggee In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 18:27:26 GMT, Chris Plummer wrote: > In an effort go get rid of calls to Debugee.threadByName() or Debugee.threadByNameOrThrow(), I found that many tests store the thread being looked up in a static field of the debuggee. The test can fetch the ThreadReference from the static field instead of looking it up using APIs that rely on vm.allThreads(). > > Most of the changes take advantage of the following common pattern: > > In the debugger: > > String threadName1 = "thread1"; > thread1 = debuggee.threadByNameOrThrow(threadName1); > > In the debuggee: > > static Thread thread1 = null; > thread1 = JDIThreadFactory.newThread(new Thread1addcountfilter001a("thread1")); > > Note that the static field name for the Thread is the same as the thread name. Thus we can easily switch from looking up by thread name to instead looking up by static field name since they both use the same name. This pull request has now been integrated. 
Changeset: 50145bb7 Author: Chris Plummer URL: https://git.openjdk.org/jdk/commit/50145bb74ad87f5b3f80ed910f6ebb95e406b802 Stats: 78 lines in 31 files changed: 34 ins; 1 del; 43 mod 8355773: Some nsk/jdi tests can fetch ThreadReference from static field in the debuggee Reviewed-by: sspitsyn, amenkov ------------- PR: https://git.openjdk.org/jdk/pull/24935 From cjplummer at openjdk.org Wed Apr 30 17:47:45 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 30 Apr 2025 17:47:45 GMT Subject: RFR: 8354407: Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java still fails on Windows In-Reply-To: <72oWIJeRPYV1kgfqEXQpSnzzgEhKhIOGldcnauQH4pI=.e6b4ad7a-4430-41a2-bac3-42ebbfa5073c@github.com> References: <72oWIJeRPYV1kgfqEXQpSnzzgEhKhIOGldcnauQH4pI=.e6b4ad7a-4430-41a2-bac3-42ebbfa5073c@github.com> Message-ID: <7ILfLmjYPgmU5h5c4J_tZIYppwR-EY534g47X4ZOGb8=.bb79b3b7-376f-414d-af1f-602202aaa8c5@github.com> On Wed, 30 Apr 2025 11:13:53 GMT, Kevin Walls wrote: > This is hard to reproduce, and at first I'd only seen -1 returned on the first calls to mbean.getProcessCpuLoad(). > But eventually I observed a -1 at any time, including in middle of the iterations, or on the last iteration which makes the current test fail. > > Should fail on Windows only if we only ever see -1 returned from getProcessCpuLoad(). > Remove the "exclusiveAccess.dirs=." (JDK-8353231 adding "exclusiveAccess.dirs=." did not fix this.) test/jdk/com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java line 60: > 58: // A good reading: forget any previous -1. > 59: ex = null; > 60: good++; The `ex = null` is a little misleading because actually it's irrelevant. An error on the next iteration will set it again, but it will be ignored because `good != 0`. test/jdk/com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java line 69: > 67: } > 68: > 69: if (good == 0 && ex != null) { The check for `ex != null` is not necessary. 
It should always be set if `good == 0` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24961#discussion_r2069176907 PR Review Comment: https://git.openjdk.org/jdk/pull/24961#discussion_r2069176999 From kevinw at openjdk.org Wed Apr 30 18:29:46 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 30 Apr 2025 18:29:46 GMT Subject: RFR: 8354407: Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java still fails on Windows In-Reply-To: <7ILfLmjYPgmU5h5c4J_tZIYppwR-EY534g47X4ZOGb8=.bb79b3b7-376f-414d-af1f-602202aaa8c5@github.com> References: <72oWIJeRPYV1kgfqEXQpSnzzgEhKhIOGldcnauQH4pI=.e6b4ad7a-4430-41a2-bac3-42ebbfa5073c@github.com> <7ILfLmjYPgmU5h5c4J_tZIYppwR-EY534g47X4ZOGb8=.bb79b3b7-376f-414d-af1f-602202aaa8c5@github.com> Message-ID: On Wed, 30 Apr 2025 17:44:13 GMT, Chris Plummer wrote: >> This is hard to reproduce, and at first I'd only seen -1 returned on the first calls to mbean.getProcessCpuLoad(). >> But eventually I observed a -1 at any time, including in middle of the iterations, or on the last iteration which makes the current test fail. >> >> Should fail on Windows only if we only ever see -1 returned from getProcessCpuLoad(). >> Remove the "exclusiveAccess.dirs=." (JDK-8353231 adding "exclusiveAccess.dirs=." did not fix this.) > > test/jdk/com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java line 69: > >> 67: } >> 68: >> 69: if (good == 0 && ex != null) { > > The check for `ex != null` is not necessary. It should always be set if `good == 0` Right, it does not really need to set ex = null as as soon as there is something in "good", we pass. But leaving the Exception hanging around seems wrong. If good is zero then yes checking for ex is redundant. I left the check mostly to avoid any future change breaking it, but maybe it's best to expose that and make sure we fail if good is zero. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24961#discussion_r2069234063 From kevinw at openjdk.org Wed Apr 30 18:35:46 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 30 Apr 2025 18:35:46 GMT Subject: RFR: 8354407: Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java still fails on Windows In-Reply-To: <7ILfLmjYPgmU5h5c4J_tZIYppwR-EY534g47X4ZOGb8=.bb79b3b7-376f-414d-af1f-602202aaa8c5@github.com> References: <72oWIJeRPYV1kgfqEXQpSnzzgEhKhIOGldcnauQH4pI=.e6b4ad7a-4430-41a2-bac3-42ebbfa5073c@github.com> <7ILfLmjYPgmU5h5c4J_tZIYppwR-EY534g47X4ZOGb8=.bb79b3b7-376f-414d-af1f-602202aaa8c5@github.com> Message-ID: On Wed, 30 Apr 2025 17:44:09 GMT, Chris Plummer wrote: >> This is hard to reproduce, and at first I'd only seen -1 returned on the first calls to mbean.getProcessCpuLoad(). >> But eventually I observed a -1 at any time, including in middle of the iterations, or on the last iteration which makes the current test fail. >> >> Should fail on Windows only if we only ever see -1 returned from getProcessCpuLoad(). >> Remove the "exclusiveAccess.dirs=." (JDK-8353231 adding "exclusiveAccess.dirs=." did not fix this.) > > test/jdk/com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java line 60: > >> 58: // A good reading: forget any previous -1. >> 59: ex = null; >> 60: good++; > > The `ex = null` is a little misleading because actually it's irrelevant. An error on the next iteration will set it again, but it will be ignored because `good != 0`. Yes, as long as there is a good value captured, Windows should pass. We could get into how many good values we should see, I might suggest 6 out of 10? But that seems like a guessing game. The breakage in this feature before meant it could never return a good value, that's what we need to guard against. 
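The pass/fail rule being discussed for GetProcessCpuLoad.java (tolerate individual -1 readings, fail only when no good reading is ever seen) can be sketched as follows. This is not the test's actual code: `CpuLoadCheck` and `countGoodReadings` are hypothetical names, the iteration count of 10 is arbitrary, and a `DoubleSupplier` is used so the counting logic can be exercised without a real MXBean.

```java
import java.lang.management.ManagementFactory;
import java.util.function.DoubleSupplier;

class CpuLoadCheck {
    // Hypothetical helper: counts readings in [0.0, 1.0]. A negative reading
    // means "not available" and is skipped; anything above 1.0 is an error.
    static int countGoodReadings(DoubleSupplier source, int iterations) {
        int good = 0;
        for (int i = 0; i < iterations; i++) {
            double load = source.getAsDouble();
            if (load < 0.0) {
                continue; // unavailable this time, keep sampling
            }
            if (load > 1.0) {
                throw new IllegalStateException("load out of range: " + load);
            }
            good++;
        }
        return good;
    }

    public static void main(String[] args) {
        com.sun.management.OperatingSystemMXBean mbean =
                ManagementFactory.getPlatformMXBean(
                        com.sun.management.OperatingSystemMXBean.class);
        int good = countGoodReadings(mbean::getProcessCpuLoad, 10);
        if (good == 0) {
            throw new RuntimeException("getProcessCpuLoad() never returned a valid value");
        }
        System.out.println("good readings: " + good);
    }
}
```

With the decision reduced to `good == 0`, there is no exception variable to reset or re-check, which is the simplification the review comments converge on.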
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24961#discussion_r2069241694 From kevinw at openjdk.org Wed Apr 30 18:42:27 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 30 Apr 2025 18:42:27 GMT Subject: RFR: 8354407: Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java still fails on Windows [v2] In-Reply-To: <72oWIJeRPYV1kgfqEXQpSnzzgEhKhIOGldcnauQH4pI=.e6b4ad7a-4430-41a2-bac3-42ebbfa5073c@github.com> References: <72oWIJeRPYV1kgfqEXQpSnzzgEhKhIOGldcnauQH4pI=.e6b4ad7a-4430-41a2-bac3-42ebbfa5073c@github.com> Message-ID: > This is hard to reproduce, and at first I'd only seen -1 returned on the first calls to mbean.getProcessCpuLoad(). > But eventually I observed a -1 at any time, including in middle of the iterations, or on the last iteration which makes the current test fail. > > Should fail on Windows only if we only ever see -1 returned from getProcessCpuLoad(). > Remove the "exclusiveAccess.dirs=." (JDK-8353231 adding "exclusiveAccess.dirs=." did not fix this.) 
Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: No need to check ex at end, when failing ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24961/files - new: https://git.openjdk.org/jdk/pull/24961/files/70cd88c3..d28b4c90 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24961&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24961&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24961.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24961/head:pull/24961 PR: https://git.openjdk.org/jdk/pull/24961 From cjplummer at openjdk.org Wed Apr 30 18:58:45 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 30 Apr 2025 18:58:45 GMT Subject: RFR: 8354407: Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java still fails on Windows [v2] In-Reply-To: References: <72oWIJeRPYV1kgfqEXQpSnzzgEhKhIOGldcnauQH4pI=.e6b4ad7a-4430-41a2-bac3-42ebbfa5073c@github.com> Message-ID: <1dRfqSsPbz7fSyE9FnyDF2KfDIPV8tRAbeQQLPv0xLo=.f8aa9452-4a2d-45a0-b564-5c45b5e1e8fd@github.com> On Wed, 30 Apr 2025 18:42:27 GMT, Kevin Walls wrote: >> This is hard to reproduce, and at first I'd only seen -1 returned on the first calls to mbean.getProcessCpuLoad(). >> But eventually I observed a -1 at any time, including in middle of the iterations, or on the last iteration which makes the current test fail. >> >> Should fail on Windows only if we only ever see -1 returned from getProcessCpuLoad(). >> Remove the "exclusiveAccess.dirs=." (JDK-8353231 adding "exclusiveAccess.dirs=." did not fix this.) > > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > No need to check ex at end, when failing Looks good. ------------- Marked as reviewed by cjplummer (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24961#pullrequestreview-2808110231 From sspitsyn at openjdk.org Wed Apr 30 19:51:49 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 30 Apr 2025 19:51:49 GMT Subject: RFR: 8316682: serviceability/jvmti/vthread/SelfSuspendDisablerTest timed out [v4] In-Reply-To: References: Message-ID: On Thu, 10 Apr 2025 06:41:42 GMT, Serguei Spitsyn wrote: >> This fixes the issue with lack of synchronization between JVMTI thread suspend and resume functions in a self-suspend case. More detailed fix description is in the first PR comment. >> >> Testing: Ran mach5 tiers 1-6. > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: added general comment about sync between suspend_thread and resume_thread Hi Patricio, Thank you for interesting analysis and suggestion! I think, it is worth to try. In fact, I did not want to bring that much knowledge about the JVMTI suspension state to the `HandshakeState` level but maybe is the smallest price we can pay to avoid more complexity. I'll try to go with the suggested changes, test them and then let you know the results. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24269#issuecomment-2843110235 From fparain at openjdk.org Wed Apr 30 20:14:46 2025 From: fparain at openjdk.org (Frederic Parain) Date: Wed, 30 Apr 2025 20:14:46 GMT Subject: RFR: 8352075: Perf regression accessing fields [v3] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Mon, 28 Apr 2025 07:44:04 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . 
>> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)!
This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: > > - Fix VerifyRawIndexesTest > - Fix reordering in layout and annotations > - Use qsort_r for different platforms src/hotspot/share/oops/fieldInfo.hpp line 223: > 221: }; > 222: > 223: #define JUMP_TABLE_STRIDE 16 How was the threshold of 16 determined? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2069376257 From fparain at openjdk.org Wed Apr 30 20:19:48 2025 From: fparain at openjdk.org (Frederic Parain) Date: Wed, 30 Apr 2025 20:19:48 GMT Subject: RFR: 8352075: Perf regression accessing fields [v3] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Fri, 25 Apr 2025 13:05:51 GMT, Radim Vansa wrote: >> Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: >> >> - Fix VerifyRawIndexesTest >> - Fix reordering in layout and annotations >> - Use qsort_r for different platforms > > src/hotspot/share/oops/fieldInfo.hpp line 290: > >> 288: static int compare_symbols(const Symbol *s1, const Symbol *s2); >> 289: >> 290: static Array* create_FieldInfoStream(ConstantPool* constants, GrowableArray* fields, int java_fields, int injected_fields, > > In the latest form the ConstantPool parameter is used only for assertion, though I think that it is rather important assertion. 
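For readers following the stride question, the lookup scheme from the PR description (sorted fields, one jump-table record for the field at position 16, 32, ..., then a linear scan of at most 16 fields) can be sketched roughly as below. This is a simplified model and not the HotSpot implementation: the class `JumpTableLookup` and its methods are hypothetical, plain array indices stand in for the variable-length-encoded stream offsets, and `String.compareTo` stands in for the PR's `compare_symbols` ordering.

```java
import java.util.Arrays;

// Hypothetical model of the jump-table lookup described in the PR text.
// Array indices replace stream offsets; String order replaces Symbol order.
final class JumpTableLookup {
    static final int STRIDE = 16; // mirrors JUMP_TABLE_STRIDE in the PR

    // One entry per field at position STRIDE, 2*STRIDE, ... so a class with
    // 16 or fewer fields gets an empty table.
    static int[] buildJumpTable(int numFields) {
        int[] table = new int[(numFields - 1) / STRIDE];
        for (int k = 0; k < table.length; k++) {
            table[k] = (k + 1) * STRIDE;
        }
        return table;
    }

    // Returns the index of `target` in `sortedNames`, or -1 if absent.
    static int find(String[] sortedNames, int[] jumpTable, String target) {
        int start = 0;
        for (int pos : jumpTable) {
            int cmp = sortedNames[pos].compareTo(target);
            if (cmp == 0) {
                return pos;
            }
            if (cmp > 0) {
                break; // target, if present, lies before this record
            }
            start = pos;
        }
        // Field-by-field scan, bounded by one stride.
        int end = Math.min(start + STRIDE, sortedNames.length);
        for (int i = start; i < end; i++) {
            if (sortedNames[i].equals(target)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] names = new String[40];
        for (int i = 0; i < names.length; i++) {
            names[i] = String.format("f%02d", i);
        }
        Arrays.sort(names);
        int[] table = buildJumpTable(names.length);
        System.out.println(find(names, table, "f20"));
    }
}
```

In this model a lookup costs at most one comparison per table record plus up to 16 comparisons in the final scan, so a larger stride shrinks the table but lengthens the scan.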
The ConstantPool parameter can be limited to debug builds (the ones with asserts) with the following patch: diff --git a/src/hotspot/share/classfile/classFileParser.cpp b/src/hotspot/share/classfile/classFileParser.cpp index 48646c0fb83..19471bbf7ee 100644 --- a/src/hotspot/share/classfile/classFileParser.cpp +++ b/src/hotspot/share/classfile/classFileParser.cpp @@ -5813,7 +5813,7 @@ void ClassFileParser::post_process_parsed_stream(const ClassFileStream* const st int injected_fields_count = _temp_field_info->length() - _java_fields_count; _fieldinfo_stream = - FieldInfoStream::create_FieldInfoStream(_cp, _temp_field_info, _java_fields_count, + FieldInfoStream::create_FieldInfoStream(DEBUG_ONLY(_cp COMMA) _temp_field_info, _java_fields_count, injected_fields_count, loader_data(), CHECK); _fields_status = MetadataFactory::new_array(_loader_data, _temp_field_info->length(), diff --git a/src/hotspot/share/classfile/javaClasses.cpp b/src/hotspot/share/classfile/javaClasses.cpp index f3fdf28b96b..80ee179576c 100644 --- a/src/hotspot/share/classfile/javaClasses.cpp +++ b/src/hotspot/share/classfile/javaClasses.cpp @@ -963,7 +963,7 @@ void java_lang_Class::fixup_mirror(Klass* k, TRAPS) { } Array* old_stream = ik->fieldinfo_stream(); assert(fields->length() == (java_fields + injected_fields), "Must be"); - Array* new_fis = FieldInfoStream::create_FieldInfoStream(ik->constants(), fields, java_fields, injected_fields, k->class_loader_data(), CHECK); + Array* new_fis = FieldInfoStream::create_FieldInfoStream(DEBUG_ONLY(ik->constants() COMMA) fields, java_fields, injected_fields, k->class_loader_data(), CHECK); ik->set_fieldinfo_stream(new_fis); MetadataFactory::free_array(k->class_loader_data(), old_stream); } diff --git a/src/hotspot/share/oops/fieldInfo.cpp b/src/hotspot/share/oops/fieldInfo.cpp index dd1fa11042d..eb90e6bdae8 100644 --- a/src/hotspot/share/oops/fieldInfo.cpp +++ b/src/hotspot/share/oops/fieldInfo.cpp @@ -66,7 +66,7 @@ int 
FieldInfoStream::compare_symbols(const Symbol *s1, const Symbol *s2) { } } -Array* FieldInfoStream::create_FieldInfoStream(ConstantPool* constants, GrowableArray* fields, int java_fields, int injected_fields, +Array* FieldInfoStream::create_FieldInfoStream(DEBUG_ONLY(ConstantPool* constants COMMA) GrowableArray* fields, int java_fields, int injected_fields, ClassLoaderData* loader_data, TRAPS) { // The stream format described in fieldInfo.hpp is: // FieldInfoStream := j=num_java_fields k=num_injected_fields JumpTable_offset(0/4 bytes) Field[j+k] JumpTable[(j - 1)/16 > 0] End diff --git a/src/hotspot/share/oops/fieldInfo.hpp b/src/hotspot/share/oops/fieldInfo.hpp index 1974e2e4117..b98d9230533 100644 --- a/src/hotspot/share/oops/fieldInfo.hpp +++ b/src/hotspot/share/oops/fieldInfo.hpp @@ -287,7 +287,7 @@ class FieldInfoStream : AllStatic { static int compare_symbols(const Symbol *s1, const Symbol *s2); - static Array* create_FieldInfoStream(ConstantPool* constants, GrowableArray* fields, int java_fields, int injected_fields, + static Array* create_FieldInfoStream(DEBUG_ONLY(ConstantPool* constants COMMA) GrowableArray* fields, int java_fields, int injected_fields, ClassLoaderData* loader_data, TRAPS); static GrowableArray* create_FieldInfoArray(const Array* fis, int* java_fields_count, int* injected_fields_count); static void print_from_fieldinfo_stream(Array* fis, outputStream* os, ConstantPool* cp); diff --git a/src/hotspot/share/prims/jvmtiRedefineClasses.cpp b/src/hotspot/share/prims/jvmtiRedefineClasses.cpp index 4d50d6cb637..46ffe767448 100644 --- a/src/hotspot/share/prims/jvmtiRedefineClasses.cpp +++ b/src/hotspot/share/prims/jvmtiRedefineClasses.cpp @@ -3555,7 +3555,7 @@ void VM_RedefineClasses::set_new_constant_pool( if (update_required) { Array* old_stream = scratch_class->fieldinfo_stream(); assert(fields->length() == (java_fields + injected_fields), "Must be"); - Array* new_fis = FieldInfoStream::create_FieldInfoStream(cp, fields, java_fields, 
injected_fields, scratch_class->class_loader_data(), CHECK); + Array* new_fis = FieldInfoStream::create_FieldInfoStream(DEBUG_ONLY(cp COMMA) fields, java_fields, injected_fields, scratch_class->class_loader_data(), CHECK); scratch_class->set_fieldinfo_stream(new_fis); MetadataFactory::free_array(scratch_class->class_loader_data(), old_stream); } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2069379965 From fparain at openjdk.org Wed Apr 30 20:19:49 2025 From: fparain at openjdk.org (Frederic Parain) Date: Wed, 30 Apr 2025 20:19:49 GMT Subject: RFR: 8352075: Perf regression accessing fields [v3] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Mon, 28 Apr 2025 07:44:04 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. 
The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms … 53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: > > - Fix VerifyRawIndexesTest > - Fix reordering in layout and annotations > - Use qsort_r for different platforms src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Field.java line 115: > 113: int numFields = numJavaFields + numInjectedFields; > 114: // JumpTable is generated only for classes with > 16 (non-injected) fields > 115: if (numJavaFields > 16) { The test should use the `JUMP_TABLE_STRIDE` constant.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2069381206 From fparain at openjdk.org Wed Apr 30 20:23:51 2025 From: fparain at openjdk.org (Frederic Parain) Date: Wed, 30 Apr 2025 20:23:51 GMT Subject: RFR: 8352075: Perf regression accessing fields [v3] In-Reply-To: References: <0FXlc_4Zi2WDj-f3MVkUT4farzZJqvCP1CIgRVjbkK8=.3acf7aab-8cd8-494d-962a-340447efe39a@github.com> Message-ID: On Mon, 28 Apr 2025 07:44:04 GMT, Radim Vansa wrote: >> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 . >> >> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields). >> >> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either. >> >> My measurements on the attached reproducer >> >> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC >> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms] >> Range (min … max): 45.1 ms …
53.9 ms 100 runs >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC >> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms] >> Range (min … max): 73.8 ms … 79.7 ms 100 runs >> >> (the jdk25-master above already contains JDK-8353175) >> >> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC' >> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC >> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms] >> Range (min … max): 37.7 ms … 42.1 ms 100 runs >> >> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement: >> >> JDK 17: 1.6 s >> JDK 21 (no patches): 22 s >> JDK25-master: 12.3 s >> JDK25-this-pr: 0.5 s > > Radim Vansa has updated the pull request incrementally with three additional commits since the last revision: > > - Fix VerifyRawIndexesTest > - Fix reordering in layout and annotations > - Use qsort_r for different platforms The current patch triggered some failures in our internal tests. I'm investigating them. src/hotspot/share/oops/fieldInfo.cpp line 52: > 50: > 51: int FieldInfoStream::compare_symbols(const Symbol *s1, const Symbol *s2) { > 52: // not lexicographical sort, since we need only total ordering If only a total ordering is required, why define a new method instead of reusing Symbol::fast_compare()?
------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2843192464 PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2069385261 From sspitsyn at openjdk.org Wed Apr 30 20:47:46 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 30 Apr 2025 20:47:46 GMT Subject: RFR: 8355569: Some nsk/jdi tests can glean the "main" thread by using the ClassPrepareEvent for the debuggee main class [v3] In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 19:13:07 GMT, Chris Plummer wrote: >> As part of the work for [JDK-8353955](https://bugs.openjdk.org/browse/JDK-8353955) I am reducing the number of tests that need to be run with includevirtualhreads=y due to using JDI to lookup threads in the debuggee. There are many tests that lookup the "main" thread. They can instead glean the "main" thread from the ClassPrepareEvent for the debuggee main class. Some tests already wait for this ClassPrepareEvent, and can take advantage of it. Most do not, but can be made to do so. The easiest way to do this for many of the tests is to wait for the event in Debuggee.prepareDebuggee() after having called Debuggee.bindToDebuggee(). For tests that don't call Debuggee.prepareDebuggee(), waiting for the ClassPrepareEvent was added after the bind. This doesn't seem to have any ill affect on the tests. >> >> Tested by running all nsk/jdi tests, including all tier2, tier3, and tier5 nsk/jdi testing. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > found one more test to fix Marked as reviewed by sspitsyn (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/24867#pullrequestreview-2808399256 From cjplummer at openjdk.org Wed Apr 30 20:55:48 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 30 Apr 2025 20:55:48 GMT Subject: RFR: 8355569: Some nsk/jdi tests can glean the "main" thread by using the ClassPrepareEvent for the debuggee main class [v3] In-Reply-To: References: Message-ID: <7BbFZljDv0H53V_2_O39b6soCbYsCJl-y6W6WsZyxmI=.97c2c258-f25f-4c84-ab54-51209a800f26@github.com> On Mon, 28 Apr 2025 19:13:07 GMT, Chris Plummer wrote: >> As part of the work for [JDK-8353955](https://bugs.openjdk.org/browse/JDK-8353955) I am reducing the number of tests that need to be run with includevirtualhreads=y due to using JDI to lookup threads in the debuggee. There are many tests that lookup the "main" thread. They can instead glean the "main" thread from the ClassPrepareEvent for the debuggee main class. Some tests already wait for this ClassPrepareEvent, and can take advantage of it. Most do not, but can be made to do so. The easiest way to do this for many of the tests is to wait for the event in Debuggee.prepareDebuggee() after having called Debuggee.bindToDebuggee(). For tests that don't call Debuggee.prepareDebuggee(), waiting for the ClassPrepareEvent was added after the bind. This doesn't seem to have any ill affect on the tests. >> >> Tested by running all nsk/jdi tests, including all tier2, tier3, and tier5 nsk/jdi testing. > > Chris Plummer has updated the pull request incrementally with one additional commit since the last revision: > > found one more test to fix Thank you for the reviews Serguei and Alex! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24867#issuecomment-2843252026 From iveresov at openjdk.org Wed Apr 30 20:56:32 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 30 Apr 2025 20:56:32 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v6] In-Reply-To: References: Message-ID: > Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. > > More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: Remove the proxy class counter ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24886/files - new: https://git.openjdk.org/jdk/pull/24886/files/4514d032..38a156f3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=04-05 Stats: 9 lines in 2 files changed: 0 ins; 9 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24886/head:pull/24886 PR: https://git.openjdk.org/jdk/pull/24886 From iveresov at openjdk.org Wed Apr 30 21:00:58 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 30 Apr 2025 21:00:58 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v6] In-Reply-To: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> References: <3JsetDO0CDSY3TVxn9pC7YfbNq8-BDZ2UwNo38qJuOc=.e5b30111-294e-45ba-a9aa-cf8d09e26d45@github.com> Message-ID: On Sat, 26 Apr 2025 22:36:11 GMT, Vladimir Kozlov wrote: >> Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove the 
proxy class counter > > src/hotspot/share/cds/archiveBuilder.cpp line 770: > >> 768: relocate_embedded_pointers(&_rw_src_objs); >> 769: relocate_embedded_pointers(&_ro_src_objs); >> 770: log_info(cds)("Relocating %zu pointers, %zu tagged, %zu nulled", > > `log_info(aot)` if it is Leyden related. This more like a generic cds message. I'll leave this one as is. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24886#discussion_r2069431184 From lmesnik at openjdk.org Wed Apr 30 21:02:04 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 30 Apr 2025 21:02:04 GMT Subject: RFR: 8354407: Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java still fails on Windows [v2] In-Reply-To: References: <72oWIJeRPYV1kgfqEXQpSnzzgEhKhIOGldcnauQH4pI=.e6b4ad7a-4430-41a2-bac3-42ebbfa5073c@github.com> <7ILfLmjYPgmU5h5c4J_tZIYppwR-EY534g47X4ZOGb8=.bb79b3b7-376f-414d-af1f-602202aaa8c5@github.com> Message-ID: On Wed, 30 Apr 2025 18:33:31 GMT, Kevin Walls wrote: >> test/jdk/com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java line 60: >> >>> 58: // A good reading: forget any previous -1. >>> 59: ex = null; >>> 60: good++; >> >> The `ex = null` is a little misleading because actually it's irrelevant. An error on the next iteration will set it again, but it will be ignored because `good != 0`. > > Yes, as long as there is a good value captured, Windows should pass. > We could get into how many good values we should see, I might suggest 6 out of 10? But that seems like a guessing game. > The breakage in this feature before meant it could never return a good value, that's what we need to guard against. I would simplify to for (int i = 0; i < TEST_COUNT; i++) { double load = mbean.getProcessCpuLoad(); if (load == -1.0 && Platform.isWindows()) { // Some Windows 2019 systems can return -1 for the first few reads. // Remember a -1 in case it never gets better. // Some Windows systems can return -1 occasionally, at any time. 
// Will fail if we never see good values. } else if (load < 0.0 || load > 1.0) { throw new RuntimeException("getProcessCpuLoad() returns " + load + " which is not in the [0.0,1.0] interval"); } else { // we got at least one load from 0.0 to 1.0, that's good to pass on Windows good++; } try { Thread.sleep(200); } catch (InterruptedException e) { e.printStackTrace(); } } if (good == 0 && Platform.isWindows()) { // Never get any good results on Windows 2019 throw new RuntimeException("getProcessCpuLoad() always returns -1.0 on Windows in 10 attempts."); } } Does that make sense? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24961#discussion_r2069432691 From cjplummer at openjdk.org Wed Apr 30 21:07:01 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 30 Apr 2025 21:07:01 GMT Subject: RFR: 8355569: Some nsk/jdi tests can glean the "main" thread by using the ClassPrepareEvent for the debuggee main class [v4] In-Reply-To: References: Message-ID: <_CgjtsQO23d1Ce2RvbkVa9AwcLR06KlfQK0r5sfB28A=.b9d99396-435a-4993-af22-d72cef7bbf7a@github.com> > As part of the work for [JDK-8353955](https://bugs.openjdk.org/browse/JDK-8353955) I am reducing the number of tests that need to be run with includevirtualhreads=y due to using JDI to lookup threads in the debuggee. There are many tests that lookup the "main" thread. They can instead glean the "main" thread from the ClassPrepareEvent for the debuggee main class. Some tests already wait for this ClassPrepareEvent, and can take advantage of it. Most do not, but can be made to do so. The easiest way to do this for many of the tests is to wait for the event in Debuggee.prepareDebuggee() after having called Debuggee.bindToDebuggee(). For tests that don't call Debuggee.prepareDebuggee(), waiting for the ClassPrepareEvent was added after the bind. This doesn't seem to have any ill effect on the tests. > > Tested by running all nsk/jdi tests, including all tier2, tier3, and tier5 nsk/jdi testing.
Chris Plummer has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: - Merge - found one more test to fix - glean main thread from ClassPrepareEvent - update copyrights - Get debuggee main thread from ClassPrepareEvent for debuggee main class. ------------- Changes: https://git.openjdk.org/jdk/pull/24867/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24867&range=03 Stats: 165 lines in 38 files changed: 75 ins; 15 del; 75 mod Patch: https://git.openjdk.org/jdk/pull/24867.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24867/head:pull/24867 PR: https://git.openjdk.org/jdk/pull/24867 From iveresov at openjdk.org Wed Apr 30 21:14:16 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 30 Apr 2025 21:14:16 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v7] In-Reply-To: References: Message-ID: > Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. 
> > More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: Fix log tags ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24886/files - new: https://git.openjdk.org/jdk/pull/24886/files/38a156f3..ef5dfcca Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=05-06 Stats: 10 lines in 3 files changed: 0 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/24886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24886/head:pull/24886 PR: https://git.openjdk.org/jdk/pull/24886 From kevinw at openjdk.org Wed Apr 30 21:22:45 2025 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 30 Apr 2025 21:22:45 GMT Subject: RFR: 8354407: Test com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java still fails on Windows [v2] In-Reply-To: References: <72oWIJeRPYV1kgfqEXQpSnzzgEhKhIOGldcnauQH4pI=.e6b4ad7a-4430-41a2-bac3-42ebbfa5073c@github.com> <7ILfLmjYPgmU5h5c4J_tZIYppwR-EY534g47X4ZOGb8=.bb79b3b7-376f-414d-af1f-602202aaa8c5@github.com> Message-ID: <9EsAzO09Flx3F1L9XQ8pvk3vcpmvl5CDtlCdqT1hGL0=.e0ad1482-11cd-4527-b8a8-bb883796fdc0@github.com> On Wed, 30 Apr 2025 20:59:23 GMT, Leonid Mesnik wrote: >> Yes, as long as there is a good value captured, Windows should pass. >> We could get into how many good values we should see, I might suggest 6 out of 10? But that seems like a guessing game. >> The breakage in this feature before meant it could never return a good value, that's what we need to guard against. > > I would simplify to > > for (int i = 0; i < TEST_COUNT; i++) { > double load = mbean.getProcessCpuLoad(); > if (load == -1.0 && Platform.isWindows()) { > // Some Windows 2019 systems can return -1 for the first few reads. > // Remember a -1 in case it never gets better. > // Some Windows systems can return -1 occasionally, at any time. 
> // Will fail if we never see good values. > > } else if (load < 0.0 || load > 1.0) { > throw new RuntimeException("getProcessCpuLoad() returns " + load > + " which is not in the [0.0,1.0] interval"); > } else { > // we got at least one load from 0.0 to 1.0, that's good enough to pass on Windows > good++; > } > try { > Thread.sleep(200); > } catch (InterruptedException e) { > e.printStackTrace(); > } > } > > > if (good == 0 && Platform.isWindows()) { > // Never got any good results on Windows 2019 > throw new RuntimeException("getProcessCpuLoad() always returns -1.0 on Windows in 10 attempts."); > } > } > > Does that make sense? Thanks Leonid, yes we could do it like that. There have to be a load of ways we could arrange this. Yes, it could be a little simpler than what we have now, although it's not _that_ complicated now... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24961#discussion_r2069458654 From cjplummer at openjdk.org Wed Apr 30 22:55:51 2025 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 30 Apr 2025 22:55:51 GMT Subject: Integrated: 8355569: Some nsk/jdi tests can glean the "main" thread by using the ClassPrepareEvent for the debuggee main class In-Reply-To: References: Message-ID: On Fri, 25 Apr 2025 06:10:46 GMT, Chris Plummer wrote: > As part of the work for [JDK-8353955](https://bugs.openjdk.org/browse/JDK-8353955) I am reducing the number of tests that need to be run with includevirtualthreads=y due to using JDI to look up threads in the debuggee. There are many tests that look up the "main" thread. They can instead glean the "main" thread from the ClassPrepareEvent for the debuggee main class. Some tests already wait for this ClassPrepareEvent, and can take advantage of it. Most do not, but can be made to do so. The easiest way to do this for many of the tests is to wait for the event in Debuggee.prepareDebuggee() after having called Debuggee.bindToDebuggee(). For tests that don't call Debuggee.prepareDebuggee(), waiting for the ClassPrepareEvent was added after the bind. 
This doesn't seem to have any ill effect on the tests. > > Tested by running all nsk/jdi tests, including all tier2, tier3, and tier5 nsk/jdi testing. This pull request has now been integrated. Changeset: e2ae50d8 Author: Chris Plummer URL: https://git.openjdk.org/jdk/commit/e2ae50d877b13b121912e2496af4b5209b315a05 Stats: 165 lines in 38 files changed: 75 ins; 15 del; 75 mod 8355569: Some nsk/jdi tests can glean the "main" thread by using the ClassPrepareEvent for the debuggee main class Reviewed-by: sspitsyn, amenkov ------------- PR: https://git.openjdk.org/jdk/pull/24867 From iveresov at openjdk.org Wed Apr 30 22:58:09 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 30 Apr 2025 22:58:09 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v8] In-Reply-To: References: Message-ID: > Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. 
> > More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: Fix flag behavior ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24886/files - new: https://git.openjdk.org/jdk/pull/24886/files/ef5dfcca..b937681e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24886&range=06-07 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/24886.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24886/head:pull/24886 PR: https://git.openjdk.org/jdk/pull/24886 From sspitsyn at openjdk.org Wed Apr 30 22:45:48 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 30 Apr 2025 22:45:48 GMT Subject: RFR: 8355569: Some nsk/jdi tests can glean the "main" thread by using the ClassPrepareEvent for the debuggee main class [v4] In-Reply-To: <_CgjtsQO23d1Ce2RvbkVa9AwcLR06KlfQK0r5sfB28A=.b9d99396-435a-4993-af22-d72cef7bbf7a@github.com> References: <_CgjtsQO23d1Ce2RvbkVa9AwcLR06KlfQK0r5sfB28A=.b9d99396-435a-4993-af22-d72cef7bbf7a@github.com> Message-ID: On Wed, 30 Apr 2025 21:07:01 GMT, Chris Plummer wrote: >> As part of the work for [JDK-8353955](https://bugs.openjdk.org/browse/JDK-8353955) I am reducing the number of tests that need to be run with includevirtualthreads=y due to using JDI to look up threads in the debuggee. There are many tests that look up the "main" thread. They can instead glean the "main" thread from the ClassPrepareEvent for the debuggee main class. Some tests already wait for this ClassPrepareEvent, and can take advantage of it. Most do not, but can be made to do so. The easiest way to do this for many of the tests is to wait for the event in Debuggee.prepareDebuggee() after having called Debuggee.bindToDebuggee(). 
For tests that don't call Debuggee.prepareDebuggee(), waiting for the ClassPrepareEvent was added after the bind. This doesn't seem to have any ill effect on the tests. >> >> Tested by running all nsk/jdi tests, including all tier2, tier3, and tier5 nsk/jdi testing. > > Chris Plummer has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge > - found one more test to fix > - glean main thread from ClassPrepareEvent > - update copyrights > - Get debuggee main thread from ClassPrepareEvent for debuggee main class. Marked as reviewed by sspitsyn (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/24867#pullrequestreview-2808720702 From iveresov at openjdk.org Wed Apr 30 23:00:50 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Wed, 30 Apr 2025 23:00:50 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v4] In-Reply-To: References: Message-ID: On Mon, 28 Apr 2025 17:35:13 GMT, Vladimir Kozlov wrote: >> Igor Veresov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 32 commits: >> >> - Merge branch 'master' into pp2 >> - Fix class filtering >> - Remove the workaround of setting AOTRecordTraining during assembly >> - Address some of the review comments >> - Merge branch 'master' into pp >> - Add AOTCompileEagerly flag to control compilation after clinit >> - Port 8355334: [leyden] Missing type profile info in archived training data >> - Port 8355296: [leyden] Some methods are stuck at level=0 with -XX:-TieredCompilation >> - Use ENABLE_IF macro >> - Missing part of the last commit >> - ... and 22 more: https://git.openjdk.org/jdk/compare/2447b981...7fb7ae62 > > Looks better. > There are still places where UL is used specifically for TD processing. Consider using `(aot, training)` there instead of `(cds)`. Vladimir (@vnkozlov), I made the changes that you requested. Could you please do another pass? 
Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24886#issuecomment-2843621301 From kvn at openjdk.org Wed Apr 30 23:40:47 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 30 Apr 2025 23:40:47 GMT Subject: RFR: 8355003: Implement Ahead-of-Time Method Profiling [v8] In-Reply-To: References: Message-ID: On Wed, 30 Apr 2025 22:58:09 GMT, Igor Veresov wrote: >> Improve warm-up time by making profile data from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. Specifically, enhance the [AOT cache](https://openjdk.org/jeps/483) to store method execution profiles from training runs, reducing profiling delays in subsequent production runs. >> >> More details in the JEP: https://bugs.openjdk.org/browse/JDK-8325147 > > Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: > > Fix flag behavior Looks good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24886#pullrequestreview-2808829130